

In the document Low Latency for Cloud Data Management (pages 151-154)

4.2 Cacheability Estimation: Whether and How Long to Cache

4.2.2 Constrained Adaptive TTL Estimation

The goal of the Constrained Adaptive TTL Estimator (CATE) is to minimize the cost function (see Equation 4.3), while constraining the size of the Cache Sketch to meet a good false positive rate. To this end, CATE adjusts TTLs to the cache miss rate λ_m and write rate λ_w instead of merely estimating the time to the next write. The estimation approach is illustrated in Figure 4.6a: write and cache miss metrics are aggregated in the server and fed into the estimator for each cache miss to retrieve a new TTL. The algorithm is based on four design choices:

1. Read-only objects yield TTL_max and write-only objects are not cached.

2. If the miss rate λ_m approximately equals the write rate λ_w, the object should be cached for its expected lifetime, expressed by the interarrival time median of writes Q(0.5, λ_w), i.e., the TTL is chosen so that the probability of a write before expiration is 50%.

3. A ratio function f: R → [0,1] expresses how the miss-write ratio impacts the estimated TTLs. It maps the imbalance between misses and writes to p_target, which gives the TTL as the quantile Q(p_target, λ_w). If, for instance, misses dominate writes, p_target = 0.9 would allow a 90% chance of a write before expiration, in order to increase cache hits. Using quantiles over TTLs for the ratio function has two advantages. First, the probability of a write happening before the expiration is easier to interpret than an abstract TTL. Second, the quantile scales with the write rate. The ratio function and its parameters can be tuned to reflect the weights in the cost function.

4. Constraints on the false positive rate of the Cache Sketch and the number of invalidations per time period are satisfied by lowering TTLs.
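For Poisson writes, the interarrival times T_i are exponentially distributed, so the quantile used in design choices 2 and 3 has a closed form, Q(p, λ_w) = −ln(1 − p)/λ_w. A minimal sketch (the function name is illustrative):

```python
import math

def write_quantile(p: float, write_rate: float) -> float:
    """Quantile Q(p, lam_w) of exponential write interarrival times:
    the TTL for which a write occurs before expiration with probability p."""
    return -math.log(1.0 - p) / write_rate

# Design choice 2: cache for the median interarrival time of writes,
# Q(0.5, lam_w) = ln(2) / lam_w.
ttl = write_quantile(0.5, 0.1)
```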

Algorithm 1 describes CATE. The ESTIMATE procedure is invoked for each cache miss. It requires three constants: the maximum TTL TTL_max, the ratio function f, and the slope, which defines how strongly f translates the imbalance between misses and writes into smaller or greater TTLs.

First, the miss-write imbalance is calculated: we define it to be 0 if λ_m = λ_w, x if λ_m is x times greater than λ_w, and −x if λ_w is x times greater than λ_m (line 4). Next, the ratio function maps the imbalance to the allowed probability p_target of a write (and invalidation) before the expiration date. p_target is capped at p_max = Pr[T_i < TTL_max], so that the estimated TTL never gets larger than TTL_max. We consider three types of ratio functions shown in

Algorithm 1 Constrained Adaptive TTL Estimation (CATE)

1: procedure ESTIMATE(λ_m: miss rate, λ_w: write rate) → TTL
2:   constants: TTL_max, slope, f: ratio function
3:   if λ_w = NIL then return TTL_max
4:   imbalance = λ_m/λ_w − 1 if λ_m ≥ λ_w, else −(λ_w/λ_m − 1)
5:   p_max ← Pr[T_i < TTL_max] = 1 − e^(−λ_w · TTL_max)
6:   if f is linear then p_target ← 0.5 + slope · imbalance
7:   else if f is logistic then p_target ← p_max / 2^(p_max · e^(−slope · imbalance))
8:   else if f is unweighted then p_target ← λ_m/(λ_m + λ_w)
9:   if Cache Sketch capacity exceeded then
10:    decrease p_target by a penalty proportional to the false positive rate
11:  if invalidation budget exceeded then
12:    decrease p_target
13:  TTL = 0 if p_target ≤ 0, TTL_max if p_target ≥ p_max, Q(p_target, λ_w) else
14:  return TTL

Figure 4.6c: a linear and a logistic function of the imbalance, as well as the unweighted fraction of misses in all operations (lines 6 to 8).

In order not to overfill the Cache Sketch, its current false positive rate is considered. If it exceeds a defined threshold, p_target is decreased to trade invalidations on non-expired objects against revalidations on expired objects (lines 9 to 10). By lowering the probability of writes on non-expired objects, Cache Sketch additions decrease, too. Invalidations are treated similarly: if the budget of allowed invalidations is exceeded, p_target is decreased (lines 11 to 12). In this way, Cache Sketch additions and invalidations are effectively rate-limited. The optimal amount to decrement depends on the severity a of a violation and can be computed as p_target = p_target · (1 − f)^a, where f is the degree of violation, for example the difference between the allowed and actual false positive rate. Last, the TTL derived as the quantile Q(p_target, λ_w) is returned (lines 13 to 14).
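Under the Poisson assumption used throughout (Pr[T_i < t] = 1 − e^(−λ_w·t), so Q(p, λ_w) = −ln(1 − p)/λ_w), Algorithm 1 can be sketched as follows. The function signature, default parameter values, and the way constraint violations are passed in are illustrative choices, not part of the original design:

```python
import math

def estimate_ttl(miss_rate, write_rate, ttl_max=600.0, slope=0.5, ratio="linear",
                 fp_violation=0.0, inval_violation=0.0, severity=1.0):
    """Sketch of Algorithm 1 (CATE). Rates are per second; fp_violation and
    inval_violation give the degree of constraint violation (0 = satisfied)."""
    if write_rate is None:                        # line 3: never written -> TTL_max
        return ttl_max
    if miss_rate >= write_rate:                   # line 4: miss-write imbalance
        imbalance = miss_rate / write_rate - 1
    else:
        imbalance = -(write_rate / miss_rate - 1)
    p_max = 1 - math.exp(-write_rate * ttl_max)   # line 5: Pr[write before TTL_max]
    if ratio == "linear":                         # lines 6-8: ratio function
        p_target = 0.5 + slope * imbalance
    elif ratio == "logistic":
        p_target = p_max / 2 ** (p_max * math.exp(-slope * imbalance))
    else:                                         # unweighted
        p_target = miss_rate / (miss_rate + write_rate)
    # lines 9-12: penalize violations via p_target * (1 - f)^a
    p_target *= (1 - fp_violation) ** severity
    p_target *= (1 - inval_violation) ** severity
    if p_target <= 0:                             # line 13: clamp, then quantile
        return 0.0
    if p_target >= p_max:
        return ttl_max
    return -math.log(1 - p_target) / write_rate   # Q(p_target, lam_w)
```

With balanced rates (λ_m = λ_w) the linear function yields p_target = 0.5, i.e., the median write interarrival time, matching design choice 2.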

Figure 4.6b gives an example of estimated TTLs for a read-heavy scenario, as well as the corresponding probability Pr[T_i < TTL] of a write before expiration. By construction, all three ratio functions yield a TTL that is higher than the median time between two writes in order to drive cache misses down. The magnitude of this TTL correction is determined by the ratio function and its slope. This makes it obvious that minimizing the cost function requires tuning of the ratio function in order to meet the relative weights between misses, invalidations, stale reads, and false positives. As finding the right TTL_max and slope in a running system is a cumbersome, manual, and error-prone process, we introduce a framework in Section 4 that chooses parameters using Monte Carlo simulations to find the best solution under a given workload and error function.

Figure 4.6: Constrained Adaptive TTL Estimation. (a) The TTL estimation process: caches report misses, clients issue reads and writes (~ Poisson), and the server collects per-record miss rate λ_m and write rate λ_w for the TTL estimator, whose objective is to maximize hits, minimize purges, minimize stale reads, and bound the Cache Sketch false positive rate. (b) TTL estimations for an example workload (E[T_M] = 19 000, E[T_W] = 30 000): median, unweighted, linear, and logistic estimates against the write CDF. (c) Ratio functions: linear (slope = 0.5), logistic (slope = 1), and unweighted, mapping the miss:write ratio to the invalidation probability. (d) p_target contour plot for the linear ratio function over miss rate and write rate [ops/time unit], ranging from no caching to the maximum TTL.

Figure 4.6d shows the effect of different miss and write rates as a contour plot of the linear ratio function. In the upper left area, writes clearly dominate misses, so the estimator opts to not cache the object at all, as frequent invalidations would clearly outweigh seldom cache hits. In the bottom right area, on the other hand, misses dominate writes, so the object is cached for TTL_max. The area in between gradually shifts to higher TTLs (values of p_target), with the steepness of the ascent varying with the slope.
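The three regions of the contour plot follow directly from the linear ratio function; a quick standalone check with hypothetical rates (the helper is illustrative, not from the original):

```python
def linear_p_target(miss_rate: float, write_rate: float, slope: float = 0.5) -> float:
    """p_target = 0.5 + slope * imbalance for the linear ratio function."""
    if miss_rate >= write_rate:
        imbalance = miss_rate / write_rate - 1
    else:
        imbalance = -(write_rate / miss_rate - 1)
    return 0.5 + slope * imbalance

linear_p_target(0.2, 1.8)   # writes dominate: p_target < 0, so no caching
linear_p_target(1.8, 0.2)   # misses dominate: p_target > p_max, so TTL_max
linear_p_target(1.0, 1.0)   # balanced: p_target = 0.5, the median interarrival TTL
```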

As explained above, estimating TTLs requires each Orestes server to have approximations of write and miss rates for each object. To this end, inter-arrival times are monitored and averaged over a time window using a simple moving average (SMA) or exponentially weighted moving average (EWMA). The space requirements of the SMA are high, as the latest arrival times for each object have to be tracked, whereas the EWMA only requires a single value. Similarly, a cumulative moving average (CMA) requires little space, but weighs older inter-arrival times as heavily as newer ones. While this assumption is optimal for Poisson processes, it fails for non-stationary workloads, e.g., when the popularity of objects decreases over time. To address the overall space requirements, sampling can be applied. More specifically, exponentially biased reservoir sampling is an appropriate stream sampling method that prefers newly observed values over older ones [Agg06].

The reservoir is a fixed-size stream sample, i.e., a map of object IDs to their write and miss moving averages. In the Orestes approach of load-balanced middleware service nodes, every server already sees an unbiased sample of operations; in the case that Cache Sketch maintenance is instead co-located with each partitioned database node, only local objects have to be tracked, lowering the space requirements.
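A per-object EWMA rate estimate needs only the timestamp of the last arrival and the running average of inter-arrival gaps, i.e., O(1) space per tracked object. A minimal sketch (class name and the smoothing factor α are illustrative choices):

```python
class EwmaRate:
    """Per-object rate estimator: EWMA over inter-arrival times."""

    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha            # weight of the newest inter-arrival gap
        self.last_arrival = None      # timestamp of the previous operation
        self.avg_interarrival = None  # running EWMA of gaps

    def observe(self, timestamp: float) -> None:
        """Record one operation (a miss or a write) at the given timestamp."""
        if self.last_arrival is not None:
            gap = timestamp - self.last_arrival
            if self.avg_interarrival is None:
                self.avg_interarrival = gap
            else:
                self.avg_interarrival = (self.alpha * gap
                                         + (1 - self.alpha) * self.avg_interarrival)
        self.last_arrival = timestamp

    def rate(self):
        """Estimated rate lam = 1 / mean inter-arrival time (None until two arrivals)."""
        if self.avg_interarrival is None:
            return None
        return 1.0 / self.avg_interarrival
```

One such estimator per object and metric (misses, writes), kept in the reservoir map, supplies the λ_m and λ_w inputs that ESTIMATE consumes.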
