HTTP Caching for APIs: Cache-Control, ETags, and Conditional Requests
Caching is one of the highest-leverage performance improvements available to an API, and one of the least consistently implemented. A response that is cached at the right layer (in a CDN, a reverse proxy, or the client itself) never reaches the origin server, and a response cached in the client avoids the network entirely. At scale, that saving compounds: fewer database queries, lower infrastructure cost, faster responses for every consumer. The HTTP specification provides a complete, standardized caching system. Most APIs use it only partially, leaving significant performance gains unrealized.
How HTTP Caching Works
HTTP caching is declarative. The server includes caching directives in its response headers, and compliant caches — browsers, CDNs, reverse proxies — respect those directives. The server controls the policy; the cache enforces it. This means you can implement sophisticated caching behavior without writing any cache management code in your clients.
The two primary mechanisms are expiration-based caching (the response is valid for a set duration) and validation-based caching (the client checks whether the cached response is still current before using it).
Cache-Control
Cache-Control is the primary header for expressing caching policy. It accepts a set of directives that define the scope and duration of caching:
Cache-Control: public, max-age=3600
max-age specifies the response’s freshness lifetime in seconds. A response with max-age=3600 can be served from cache without revalidation until it is one hour old; age is counted from when the origin generated the response, so time already spent in an upstream cache (reported in the Age header) counts against the hour. public permits any cache, including CDNs and shared proxies, to store and serve the response. private restricts storage to the end client (typically the browser), preventing shared caches from serving the response to other users.
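To make the directive semantics concrete, here is a minimal sketch of how a cache might read a Cache-Control value and decide whether a stored response is still fresh. The function names parse_cache_control and is_fresh are hypothetical, and the parser is deliberately simplified: it splits on commas and so does not handle quoted arguments that themselves contain commas.

```python
from typing import Optional


def parse_cache_control(value: str) -> dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; arguments may be quoted.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """True if a stored response may be served without revalidation."""
    directives = parse_cache_control(cache_control)
    # no-store should never have been stored; no-cache always revalidates.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and age_seconds < int(max_age)
```

With the header shown above, is_fresh("public, max-age=3600", 120) is true, and the same call with an age of 3600 is false because the freshness lifetime has elapsed.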
For responses that should never be cached:
Cache-Control: no-store
no-store prevents the response from being stored by any cache. no-cache is subtler — it allows storage but requires revalidation on every use (the client must confirm the cached version is still valid before serving it, which is useful when freshness matters but you want to avoid redundant body transfers).
For authentication-protected responses, use Cache-Control: private or no-store depending on sensitivity. A shared cache serving one user’s private data to another user is a security failure, not a caching misconfiguration.
ETags: Validation Without Timestamps
An ETag (entity tag) is a server-generated identifier representing the current version of a resource. It can be a hash of the response body, a version number, a modification timestamp hash — any value that changes when the resource changes and stays stable when it does not.
The server includes an ETag in the response:
ETag: "a3f8b2d1e7c4"
On subsequent requests for the same resource, the client sends the ETag back as a conditional header:
If-None-Match: "a3f8b2d1e7c4"
If the resource has not changed — the current ETag matches the provided value — the server returns 304 Not Modified with no response body. The client uses its cached version. If the resource has changed, the server returns 200 OK with the updated body and a new ETag.
The bandwidth saving is the point. A 304 response carries no body — just headers. For large responses, this is a substantial reduction in transfer cost. The server still processes the request (checking whether the resource changed), but the client and the network are spared the full body.
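The exchange described above can be sketched on the server side as follows. This is an illustration under stated assumptions, not a definitive implementation: make_etag, etag_matches, and respond are hypothetical names, the ETag is a truncated SHA-256 of the body (one of several valid choices), and the comma split assumes entity tags that do not themselves contain commas.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the body (assumption: content hash)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _opaque(tag: str) -> str:
    """Drop the weak prefix so W/"x" and "x" compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def etag_matches(if_none_match: str, current_etag: str) -> bool:
    """Weak comparison, which is what If-None-Match uses."""
    if if_none_match.strip() == "*":
        return True
    candidates = [_opaque(t) for t in if_none_match.split(",")]
    return _opaque(current_etag) in candidates


def respond(body: bytes, if_none_match: Optional[str]) -> tuple[int, dict, bytes]:
    """Return (status, headers, payload) for a GET of `body`."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None and etag_matches(if_none_match, etag):
        return 304, headers, b""  # client copy is current: no body sent
    return 200, headers, body
```

A first call such as respond(b"hello", None) returns 200 with an ETag header; repeating the call with that ETag as if_none_match returns 304 and an empty payload.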
Last-Modified: The Timestamp Alternative
Last-Modified provides similar functionality to ETags using a timestamp rather than an opaque identifier:
Last-Modified: Sat, 02 May 2026 08:00:00 GMT
The client sends this back as:
If-Modified-Since: Sat, 02 May 2026 08:00:00 GMT
The server returns 304 if the resource has not been modified since that timestamp, or 200 with the updated body if it has.
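A rough sketch of the timestamp comparison, using Python's standard email.utils helpers to format and parse HTTP dates. The function names http_date and not_modified_since are hypothetical, and the sketch assumes last_modified is a timezone-aware datetime. An unparseable header is ignored, so the caller would fall through to a normal 200 response.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(dt: datetime) -> str:
    """Format an aware datetime as an HTTP date in GMT."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def not_modified_since(if_modified_since: str, last_modified: datetime) -> bool:
    """True if the server may answer 304 for this If-Modified-Since value."""
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # invalid date: behave as if the header were absent
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates carry whole seconds only, so drop sub-second precision.
    return last_modified.replace(microsecond=0) <= since
```

The truncation to whole seconds is what makes this mechanism coarse: two modifications within the same second produce the same Last-Modified value.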
ETags are generally preferred. Timestamps have second-level granularity, which is insufficient for resources that change multiple times per second. ETags can express version identity without any connection to time, and they give conditional writes a precise version check (using If-Match to implement optimistic concurrency control), whereas the timestamp counterpart, If-Unmodified-Since, inherits the same one-second granularity.
Conditional Writes: Preventing Lost Updates
ETags are not only for reads. They can prevent concurrent write conflicts through If-Match:
PUT /articles/42 HTTP/1.1
If-Match: "a3f8b2d1e7c4"
The server only applies the write if the current ETag matches the provided value. If another client has modified the resource since the first client fetched it, the ETags will differ, and the server returns 412 Precondition Failed. The client knows its update would have overwritten changes it has not seen, and can fetch the current state before retrying.
This is optimistic concurrency control — no locks are held, but conflicting writes are detected and rejected rather than silently overwriting each other. For collaborative APIs or any resource edited by multiple parties, this pattern prevents a common class of data corruption.
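The If-Match check can be sketched with an in-memory store. Everything named here is hypothetical: the Article class, the put_article function, and the version-counter ETag. Rejecting a write that carries no If-Match header with 428 Precondition Required is a design choice assumed for this sketch, not something the pattern requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    body: str
    version: int = 1

    @property
    def etag(self) -> str:
        # Assumption: a version counter serves as a strong ETag.
        return f'"v{self.version}"'


def put_article(store: dict, article_id: int, new_body: str,
                if_match: Optional[str]) -> int:
    """Apply a conditional PUT and return the HTTP status code."""
    article = store.get(article_id)
    if article is None:
        return 404
    if if_match is None:
        return 428  # assumed policy: unconditional writes are refused
    tags = [t.strip() for t in if_match.split(",")]
    # If-Match uses strong comparison, so a W/ tag never equals the ETag.
    if if_match.strip() != "*" and article.etag not in tags:
        return 412  # someone else changed the resource first
    article.body = new_body
    article.version += 1
    return 204
```

If two clients both fetch version "v1" and then write, the first write succeeds and bumps the version, and the second receives 412 and leaves the stored body untouched.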
Vary: Cache Key Extension
The Vary header tells caches that the response may differ based on specific request headers, and those headers should be included in the cache key:
Vary: Accept-Encoding, Accept-Language
A cache seeing this response knows that a request with Accept-Language: fr should not be served the cached response for a request with Accept-Language: en. The Accept-Language header is part of the cache key.
For APIs returning different content based on Authorization, the correct directive is Cache-Control: private rather than Vary: Authorization — shared caches should not store user-specific responses at all. For APIs with response compression based on Accept-Encoding, Vary: Accept-Encoding ensures compressed and uncompressed versions are cached separately.
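One way to picture how Vary extends the cache key is the sketch below. The cache_key function is hypothetical, and real caches normalize header values more carefully than this; the point is only that the headers named in Vary become part of the lookup key.

```python
from typing import Optional


def cache_key(method: str, url: str, request_headers: dict[str, str],
              vary: str) -> Optional[tuple]:
    """Build a lookup key from the request plus the headers named in Vary.

    Returns None for "Vary: *", which means no stored response can be reused.
    """
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    if "*" in names:
        return None
    # Header names are case-insensitive, so compare them lowercased.
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    varied = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, varied)
```

Requests for the same URL with Accept-Language: en and Accept-Language: fr produce different keys under Vary: Accept-Encoding, Accept-Language, so neither is served the other's cached response.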
CDN Caching Considerations
CDNs provide a cache layer geographically close to users, reducing latency for cacheable responses. For CDN caching to work effectively, responses must be storable by shared caches (marked public, or at least not private or no-store) and carry a meaningful max-age, or an s-maxage value, which applies only to shared caches. A CDN that receives Cache-Control: no-store on every response is functioning as a reverse proxy, not a cache.
CDN caching requires care around cache invalidation. If a CDN caches a response for one hour and the underlying data changes after five minutes, users can receive stale data for up to 55 minutes. The tradeoff between freshness and cache efficiency is inherent. For data that changes unpredictably, shorter TTLs or stale-while-revalidate (serving stale content while asynchronously revalidating) reduce the impact of staleness without eliminating caching entirely.
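The stale-while-revalidate behavior can be sketched as a three-way decision on the age of a stored response. The function name cache_decision and its return labels are hypothetical; the numbers correspond to a header such as Cache-Control: public, max-age=300, stale-while-revalidate=60, chosen here purely for illustration.

```python
def cache_decision(age: int, max_age: int, stale_while_revalidate: int = 0) -> str:
    """Classify a stored response by its age in seconds."""
    if age < max_age:
        return "fresh"  # serve from cache, no origin contact
    if age < max_age + stale_while_revalidate:
        # Serve the stale copy now and refresh it in the background.
        return "serve-stale-and-revalidate"
    return "revalidate-first"  # too old: check with the origin before serving
```

With max_age=300 and stale_while_revalidate=60, a response aged 320 seconds is still served immediately while the cache refreshes it, and one aged 360 seconds or more waits for revalidation.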
For data that changes on a known schedule — daily reports, weekly summaries — set max-age to match the update frequency. For data that rarely changes — reference data, configuration — aggressive caching with explicit invalidation on change is appropriate.
Practical Application
Most REST API responses fall into three categories for caching purposes. Resource reads (GET by ID) are highly cacheable: an ETag combined with an appropriate max-age is correct for most cases. Collection reads (GET with filters) are more complex because invalidation spans many possible cache keys; shorter TTLs, or no-cache with ETags, are appropriate. Responses to write operations (POST, PUT, PATCH, DELETE) should carry Cache-Control: no-store so that no cache stores the result of a mutating request.
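As a starting point, the three categories could be encoded in a small policy function like the sketch below. The name cache_control_for, its parameters, and the five-minute max-age are all assumptions chosen for illustration; appropriate values depend on how often the underlying data changes.

```python
def cache_control_for(method: str, is_collection: bool = False,
                      user_specific: bool = False) -> str:
    """Pick a Cache-Control value for a response (illustrative defaults)."""
    if method.upper() not in ("GET", "HEAD"):
        return "no-store"  # responses to writes are never cached
    scope = "private" if user_specific else "public"
    if is_collection:
        # Store, but revalidate with the ETag on every use.
        return f"{scope}, no-cache"
    return f"{scope}, max-age=300"  # assumed TTL for single-resource reads
```

Under these assumptions, cache_control_for("GET") yields "public, max-age=300", a filtered collection read yields "public, no-cache", and any POST, PUT, PATCH, or DELETE yields "no-store".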
The baseline investment — adding Cache-Control, ETag, and Last-Modified headers to GET responses — requires modest implementation effort and compounds in performance benefit as traffic grows. It is the kind of infrastructure improvement that pays dividends for years without requiring additional maintenance.