
Cache Coherence for Query Results

In the document Low Latency for Cloud Data Management (pages 166-171)

To illustrate the value of a cache coherence mechanism, consider query caching with static TTLs as a straw-man solution. In that case, the server would assign a constant, application-defined TTL to each query, so that any web cache may serve the query with staleness bounded by the TTL. This does not require any query invalidation logic in the client or server, as the regular expiration-based semantics of HTTP web caching are used. The problem with this naive solution is that either many stale queries occur when the TTL is too high, or cache hit ratios suffer when the TTL is too low. As in object-based caching, the first step toward improving this scheme is adapting the purely static TTLs to the actual frequency of changes for each query. However, even with a better stochastic TTL estimation, stale query results occur for each deviation from the estimate. To address this, the Cache Sketch needs to be extended to capture stale query results.

4.5.1 Cache Sketches for Query Caching

The purpose of the extended Cache Sketch is to answer the question whether a given query is potentially stale. This information allows the server to compensate for TTLs of queries that change before their TTL expires.
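At its core, the extended Cache Sketch is a Bloom filter over normalized query strings: membership means "potentially stale", non-membership means "definitely fresh". The following minimal sketch illustrates that idea; the class and parameter names are assumptions for illustration, not the Orestes implementation.

```python
import hashlib

class QueryCacheSketch:
    """Minimal Bloom filter over normalized query strings (illustrative only)."""

    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k     # filter size in bits, number of hash functions
        self.bits = bytearray(m)  # one byte per bit, for simplicity

    def _positions(self, query):
        # Derive k positions from a single SHA-256 digest of the query string.
        digest = hashlib.sha256(query.encode()).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def add(self, query):
        # Server side: mark a query result as stale.
        for p in self._positions(query):
            self.bits[p] = 1

    def is_potentially_stale(self, query):
        # Client side: True may be a false positive (extra revalidation),
        # False is always correct (the cached result may be served).
        return all(self.bits[p] for p in self._positions(query))
```

The asymmetry of the answer is what makes the structure safe for caching: a false positive only costs an unnecessary revalidation, never a stale read.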

[Figure: the client SDK checks the client Cache Sketch ("is stale?") for each query and revalidates if so; requests pass through expiration-based and invalidation-based caches to the ORESTES server with Quaestor, which combines a TTL estimator predicting TTLs for query results, the server Cache Sketch tracking stale queries, a decision model choosing a cache-optimal result structure, and a distribution layer (publish/subscribe, active queries, capacity management, result states) over the backend database; the InvaliDB streaming layer determines which cached queries the after-image of an update u invalidates and triggers purges of invalidation-based caches.]

Figure 4.14: Query Caching architecture and request flow for providing cacheable query results.

Request Flow for Queries

Figure 4.14 gives a high-level overview of the query caching architecture with the role of the Cache Sketch for queries. From the perspective of a client performing a query, the request flow is as follows:

1. Upon connection, the client gets a client Cache Sketch (cf. Theorem 4.1) containing freshness information on potentially stale query results. During a client’s session, the Cache Sketch is renewed periodically.

2. Before issuing a query, the Cache Sketch is queried by the SDK to decide between a normal cached load and a revalidation request.

3. The caches either serve their cached copy or forward the query upstream.

4. For cache misses and revalidations, the server returns the query result from the database using an appropriate TTL through query TTL estimation (cf. Section 4.6.2) and an appropriate result structure (cf. Section 4.6.3) using a decision model. The query is registered in InvaliDB to detect changes to the delivered query result in real time. If operations on the database implicitly update the query result before the TTL is expired, the query is added to the server Cache Sketch and purged from invalidation-based caches.
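Step 2 of the flow above boils down to a simple client-side decision. The following sketch shows it under assumed names: `sketch` is a set-like view of the client Cache Sketch, `cache` stands for the local/web cache, and `origin` is a callback performing the upstream request.

```python
def load(query, sketch, cache, origin):
    """Decide between a normal cached load and a revalidation request
    (illustrative sketch; names are assumptions, not the Orestes SDK API)."""
    if sketch.get(query, False):
        # Potentially stale: bypass cached copies and fetch a fresh result.
        result = origin(query)
        cache[query] = result
    elif query in cache:
        # Fresh according to the sketch: serve the cached copy.
        result = cache[query]
    else:
        # Cache miss: forward the query upstream and cache the result.
        result = origin(query)
        cache[query] = result
    return result
```

Note that the sketch never has to be consulted for writes; it only steers reads between the two request types.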

Construction and Properties of the Query Cache Sketch

A query or read is performed by consulting the client Cache Sketch ct that was generated at time t. The key is the normalized query string, which is hashed into the underlying Bloom filter, similar to object IDs. Theorem 4.2 derives the guarantees of the Cache Sketch for queries by generalizing Theorem 4.1 (see page 128), which derived ∆-atomic semantics for object reads.

Definition 4.4. Let ct3 be the Expiring Bloom Filter generated at time t3. It contains the normalized query string q of every result result(q) that became stale before it expired in all caches. Formally, this is every q for which ∃ r(q, t1, TTL), w(x, t2) : t1 + TTL > t3 > t2 > t1 holds. The operation r(q, t1, TTL) is a query of q at time t1 with a TTL for the query result, and w(x, t2) is a write happening at t2 on a record x so that result(q) is invalidated (see the notification events add, change, and remove in Section 4.6).

Theorem 4.2. A query q performed at time t4 using the client Cache Sketch ct3 satisfies ∆-atomicity with ∆ = t4 − t3, i.e., the client is guaranteed to see only query results result(q) that are at most ∆ time units stale.

Proof. Analogous to the proof of Theorem 4.1, consider a query issued at time t4 using ct3 that returns a query result result(q) that was stale for ∆ > t4 − t3. Then q must have been invalidated at a time t2 < t3, as otherwise t4 − t2 < ∆. Hence, there must have been an earlier query r(q, t1, TTL) with t1 + TTL > t4 > t2 so that result(q) is still cached. By the construction of ct3, the query is contained in ct3 until t1 + TTL > t4 and therefore not stale at time t4 (proof by contradiction).

The Cache Sketch thus contains all stale queries for one point in time, i.e., queries that became invalid while still being stored in some cache.

Freshness Policies

The achieved freshness is linked to the age of the Cache Sketch. Similar to object caching, the basic way of utilizing the Cache Sketch is to fetch it on page load and use it for the initial resources of the application, e.g., stylesheets and images (cached initialization, see Definition 4.1). To maintain ∆-bounded staleness, the Cache Sketch is refreshed in a configurable interval of ∆. Clients can therefore precisely control the desired level of consistency for queries, objects, and files. This polling approach for the Cache Sketch resembles Pileus' [TPK+13] method, where clients poll timestamps from all replication sites to determine which replica can satisfy the demanded consistency level. However, the Cache Sketch is significantly more scalable as the freshness information is already aggregated and does not have to be assembled by clients from different caches or replicas.
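The refresh policy itself is a few lines of client logic: because staleness is bounded by the sketch's age, re-fetching the sketch whenever it is older than ∆ maintains the bound. A minimal sketch, assuming a `fetch_sketch` callback and an injectable clock (neither is part of the real Orestes SDK):

```python
import time

class FreshnessPolicy:
    """∆-bounded staleness via periodic Cache Sketch refresh (illustrative)."""

    def __init__(self, delta, fetch_sketch, clock=time.monotonic):
        self.delta = delta          # desired staleness bound ∆ in seconds
        self.fetch = fetch_sketch   # callback returning a fresh Cache Sketch
        self.clock = clock
        self.sketch = fetch_sketch()
        self.fetched_at = clock()

    def current_sketch(self):
        # Staleness of any query answered with this sketch is bounded by
        # the sketch's age, so refresh once the sketch is older than ∆.
        if self.clock() - self.fetched_at > self.delta:
            self.sketch = self.fetch()
            self.fetched_at = self.clock()
        return self.sketch
```

Lowering `delta` trades more sketch fetches for tighter consistency, which is exactly the knob the text describes.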

4.5.2 Consistency

The consistency levels provided by Quaestor are summarized in Figure 4.15. They can be grouped into default guarantees that are always met and opt-in guarantees that are associated with an overhead and can be enabled per request, session, or application13.

Default Consistency Guarantees

The central consistency level enabled by the Cache Sketch is ∆-atomicity, with the application and clients being able to choose ∆. Several additional session consistency guarantees

13In Section 2.2.4, formal definitions of the discussed consistency models are given.

are achieved. Monotonic writes, i.e., a global order of all writes from one client session, are assumed to be given by the database (e.g., MongoDB) and are not impeded by the Cache Sketch. Read-your-writes consistency is obtained by having the client cache its own writes within a session: after a write, the client is able to read her writes from the local cache. Monotonic read consistency guarantees that a client will only see monotonically increasing versions of data within a session. This is achieved by having clients cache the most recently seen versions and comparing any subsequent reads to the highest seen version. If a read returns an older version (e.g., from a different cache), the client resorts to the cached version, if it is not contained in the Cache Sketch, or triggers a revalidation otherwise. These session consistency guarantees are maintained by the SDK, transparently to the developers using it.
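The monotonic-reads rule above translates directly into client logic. A sketch under assumed names: `seen` is the per-session map of highest observed versions, `sketch` a set-like view of stale queries, and `revalidate` a callback fetching a fresh result.

```python
def monotonic_read(query, response_version, response_value,
                   seen, sketch, revalidate):
    """Monotonic-reads check as described above (illustrative names).
    seen maps query -> (highest version seen, its value)."""
    high_ver, high_val = seen.get(query, (-1, None))
    if response_version >= high_ver:
        # Version is monotonically increasing: accept and remember it.
        seen[query] = (response_version, response_value)
        return response_value
    if query not in sketch:
        # Older version from some cache, but the query is not stale:
        # resort to the locally cached highest version.
        return high_val
    # Older version AND the query is marked stale: force a revalidation.
    fresh_ver, fresh_val = revalidate(query)
    seen[query] = (fresh_ver, fresh_val)
    return fresh_val
```

Read-your-writes falls out of the same `seen` map if the client also records its own writes there.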

As discussed in Section 4.1.7, Orestes can expose an eventually consistent data store.

The inconsistency window ∆DB of the data store then lowers the ∆-atomicity guarantee.

The same holds true if invalidations are performed asynchronously with lag ∆c. However, as the probability that this violates consistency is low, it is a common choice to accept (∆ + ∆DB + ∆c)-atomicity. By choosing a lower ∆, developers can easily compensate for both effects. In practice, adjusting ∆ to ∆ − ∆c allows revalidation requests to be answered by invalidation-based caches instead of the origin servers. This optimization significantly offloads the backend. If, however, the exposed database system does not offer strong consistency, but potentially unbounded staleness (e.g., due to asynchronous replication), the Cache Sketch's guarantee becomes a probabilistic consistency level of (∆ + ∆DB, p)-atomicity (cf. discussion in Section 4.1.7).

Consistency Level | Realization | Mode
∆-atomicity (staleness never exceeds ∆ seconds) | Controlled by age (i.e., refresh interval) of the Cache Sketch | Always
Monotonic Writes | Guaranteed by the underlying database system | Always
Read-Your-Writes and Monotonic Reads | Written data and most recent read versions cached in the client | Always
Causal Consistency | Given if the read timestamp is older than the Cache Sketch, else revalidation required | Opt-in
Strong Consistency (Linearizability) | Explicit revalidation (cache miss at all levels) | Opt-in

Figure 4.15: Consistency levels provided by Quaestor: ∆-atomicity, monotonic writes, read-your-writes, and monotonic reads are given by default; causal consistency and strong consistency can be chosen per operation (with a performance penalty).

Opt-in Consistency Guarantees

By allowing additional cache misses, causal consistency and even strong consistency are possible as an opt-in by the client. With causal consistency, any causally related operations are observed in the same order by all clients [VV16]. With caching, causal consistency can be violated if, of two causally dependent writes, one is observed in the latest version while the other is served stale by a cache. Using the Cache Sketch, any causal dependency older than the Cache Sketch is observed by each client, as the Cache Sketch acts as a staleness barrier for the moment in time it was generated: any writes that happened before the generation of the Cache Sketch are visible along with their causal dependencies.

However, if a read is newer than the Cache Sketch, causal consistency might be violated on a subsequent second read. Therefore, the client has two options to maintain causal consistency after a read newer than the Cache Sketch is returned14. First, the Cache Sketch can be refreshed to reflect recent updates. Second, every read happening before the next Cache Sketch refresh is turned into a revalidation. For strong consistency within a client session, every read within that session is performed as a revalidation. In that case, latency is not reduced, but an unnecessary transfer of the object or query result is prevented if the data is still up-to-date.
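The two options above can be condensed into one decision function. A sketch with assumed names; `refresh_sketch` stands for re-fetching the client Cache Sketch, and the return value says whether subsequent reads must be revalidated:

```python
def after_fresh_read(read_timestamp, sketch_timestamp, refresh_sketch,
                     strategy="refresh"):
    """Preserve causal consistency once a read is newer than the
    Cache Sketch (illustrative sketch, not the Orestes SDK API)."""
    if read_timestamp <= sketch_timestamp:
        # The sketch already covers this read: nothing to do.
        return False
    if strategy == "refresh":
        # Option 1: fetch a newer sketch that reflects the recent update.
        refresh_sketch()
        return False
    # Option 2: revalidate every read until the next scheduled refresh.
    return True
```

Option 1 pays one sketch fetch immediately; option 2 spreads the cost over extra cache misses until the next refresh, which is cheaper when few reads follow.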

All default and opt-in consistency guarantees are identical for objects, files, and queries.

From the perspective of the Orestes middleware, a query result is simply a special type of object identified by a query string that changes based on invalidation rules. Therefore, the consistency guarantees provided through the combination of the Cache Sketch and server-side invalidations are the same for all types of cached data delivered by Orestes.

The strongest semantics Orestes can provide are ACID guarantees through distributed cache-aware transactions (see Section 4.8). These optimistic transactions exploit the fact that caching reduces transaction durations and can thereby achieve low abort rates with a variant of backward-oriented optimistic concurrency control. As described in detail in Section 4.8, the key idea is to collect read sets of transactions in the client and validate them at commit time to detect both violations of serializability and stale reads. The scheme is similar to externally consistent optimistic transactions in F1 and Spanner [CDE+13, SVS+13], but can leverage caching and the Cache Sketch to decrease transaction durations for clients connected via wide-area networks.

Additionally, clients can directly subscribe to query result change streams that are otherwise only used for the construction of the Cache Sketch. For this purpose, Orestes exposes a continuous query interface that leverages WebSockets [Gri13] to proactively push query result updates to connected end devices. Through this synchronization scheme, the application can define its critical data set through queries and keep it up-to-date in real time.

For applications with a well-defined scope of queries, this approach is preferable, while complex web applications will profit from using the Cache Sketch due to lower latency for the initial page load and lower resource usage in the backend.

14This can easily be observed based on the LastModified field provided in the response for each object.

4.5.3 Cache Sketch Maintenance for Queries

Query caching relies on the server Cache Sketch that stores the Bloom filter and tracks a separate mapping of queries to their respective TTLs. In this way, only non-expired queries are added to the Bloom filter upon invalidation. After their TTL has expired, queries are automatically removed from the Bloom filter. These removals are based on a distributed queue implementation storing the outstanding Bloom filter removals shared across Orestes servers. To achieve this without coordination overhead, the Orestes prototype relies on sorted sets in Redis.
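The maintenance logic above can be sketched with a priority queue standing in for the Redis sorted set used by the prototype (scores would be expiration timestamps there); all class and method names are assumptions for illustration.

```python
import heapq

class ServerCacheSketch:
    """Server-side Cache Sketch maintenance (illustrative sketch only)."""

    def __init__(self):
        self.stale = set()    # Bloom filter stand-in: stale query strings
        self.ttl_until = {}   # query -> latest expiration timestamp issued
        self.removals = []    # min-heap of (expiration, query), like a sorted set

    def record_ttl(self, query, now, ttl):
        # Track the highest TTL the server has ever issued for the query.
        self.ttl_until[query] = max(self.ttl_until.get(query, 0), now + ttl)

    def invalidate(self, query, now):
        # Only non-expired queries need to enter the sketch: expired
        # results have already vanished from all expiration-based caches.
        exp = self.ttl_until.get(query, 0)
        if exp > now:
            self.stale.add(query)
            heapq.heappush(self.removals, (exp, query))

    def expire(self, now):
        # Remove queries whose last-issued TTL has passed (the outstanding
        # removals queue shared across servers in the prototype).
        while self.removals and self.removals[0][0] <= now:
            _, q = heapq.heappop(self.removals)
            self.stale.discard(q)
```

In Redis terms, `invalidate` would be a ZADD with the expiration as score and `expire` a range query over scores up to the current time.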

The client-side usage of the Cache Sketch for queries is similar to that for objects. A stale query is contained in the Cache Sketch until the highest TTL that the server previously issued for that query has expired. While contained, the query always causes a cache miss. To maintain the Cache Sketch in the server, changes to cached query results have to be detected and added in real time, as described in the following section.
