Caching Strategies: Varnish, Redis and Browser Cache
Caching is the most impactful lever for fast server response times. A well-designed caching architecture reduces Time to First Byte (TTFB) from hundreds of milliseconds to under 10 milliseconds, lowers server load by 80 to 95 percent and keeps websites stable and responsive even under heavy traffic. Yet many websites only implement caching superficially -- a browser cache header here, a CDN there -- without coordinating the different caching layers. The result: cache misses, stale content and wasted performance potential. This article shows how Varnish as an HTTP accelerator, Redis as an application cache and browser caching together form a multi-tier architecture that enables response times under 100 milliseconds.
Why Multi-Tier Caching Makes the Difference
Every HTTP request passes through multiple stations on its way to the server: DNS resolution, TCP connection setup, TLS handshake, routing through load balancer and reverse proxy, processing by the application server, database queries and finally rendering the response. Without caching, this process typically takes 200 to 800 milliseconds for a dynamic page. According to Google, a website's TTFB should be under 800 milliseconds, with values under 200 milliseconds considered good (Source: Google, 2024).
Multi-tier caching intervenes at different points in this chain. The browser cache prevents known resources from being requested again at all. The CDN stores copies on edge servers close to the user and eliminates network latency to the origin server. Varnish as an HTTP accelerator serves cached pages in 1 to 5 milliseconds from memory without burdening the application server. Redis accelerates the application itself by storing database queries, sessions and computed results in RAM.
The strength of a multi-tier architecture lies in redundancy: even if one layer produces no cache hit, the next one takes over. With a typical website achieving a 95 percent Varnish hit rate, only 5 percent of requests reach the application server -- and there, Redis accelerates processing by a further 60 to 80 percent. The result is a consistently fast website that remains stable even during traffic spikes.
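The arithmetic behind this claim can be sketched in a few lines of Python. The figures are the ones quoted above (95 percent hit rate, 5 milliseconds per Varnish hit, 500 milliseconds for an uncached page, 60 percent Redis gain); the function name and the simple weighted-average model are assumptions made for illustration, not a measurement.

```python
def expected_response_ms(varnish_hit_rate, varnish_hit_ms, backend_ms, redis_speedup):
    """Average response time when Varnish misses fall through to a Redis-accelerated backend."""
    miss_rate = 1.0 - varnish_hit_rate
    # Redis is assumed to cut backend processing time by the given fraction.
    accelerated_backend_ms = backend_ms * (1.0 - redis_speedup)
    return varnish_hit_rate * varnish_hit_ms + miss_rate * accelerated_backend_ms


# 95% of requests cost 5 ms, the remaining 5% cost 500 ms * 0.4 = 200 ms.
average = expected_response_ms(0.95, 5.0, 500.0, 0.60)
print(f"{average:.2f} ms")  # prints "14.75 ms"
```

Even with pessimistic inputs, the average stays far below the uncached 500 milliseconds because only a small share of requests ever reaches the backend.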
Varnish Cache: HTTP Acceleration in the Fast Lane
Varnish is an HTTP accelerator that sits as a reverse proxy in front of the web server and caches HTTP responses in memory. On a cache hit, Varnish serves the stored response directly without contacting the underlying web or application server. Response time on a cache hit is typically 1 to 5 milliseconds -- compared to 200 to 500 milliseconds for a dynamically generated page. According to W3Techs, Varnish is used by 2.3 percent of all websites, including Wikipedia, The Guardian and many high-traffic e-commerce platforms (Source: W3Techs, 2025).
Varnish is configured via the Varnish Configuration Language (VCL), a domain-specific language that provides precise control over caching behavior. VCL defines which requests are cached, how long cache entries are valid, how cookies are handled and under what conditions the cache is invalidated. VCL's flexibility is both its strength and its challenge: a misconfigured VCL can result in either too little caching (low hit rate) or personalized content being served to the wrong users.
For content websites and blogs, VCL configuration is relatively straightforward: all GET requests without session cookies are cached, while POST requests and responses with Set-Cookie headers are passed through uncached. For Shopware-based shops and other e-commerce systems, configuration becomes more complex as it must distinguish between public pages (categories, product details for non-logged-in users), partially personalized pages (cart icon, mini-cart) and fully personalized pages (checkout, customer account).
Grace Mode
Serves expired cache entries while fetching a fresh version in the background. Prevents waiting times when an entry has just expired and protects the backend during traffic spikes.
Cache Tags
Enables targeted invalidation of individual resources instead of complete cache clearing. When a product changes, only the affected pages are invalidated.
Edge Side Includes
ESI splits pages into cacheable and dynamic fragments. The page frame is cached; personalized elements like the shopping cart are inserted per request.
Redis: The Application Cache for Database and Session
Redis (Remote Dictionary Server) is an in-memory data store that sits as an application cache between application logic and the database. Unlike Varnish, which stores complete HTTP responses, Redis caches individual data fragments: database query results, sessions, computed values and serialized objects. Access time is typically 0.1 to 0.5 milliseconds -- one to two orders of magnitude faster than a database query, which typically takes 5 to 50 milliseconds.
The most important use of Redis in web applications is the session cache. By default, PHP applications store sessions as files on disk, which leads to I/O bottlenecks with many concurrent users and does not scale in cluster environments with multiple application servers. Redis solves both problems: sessions are stored in RAM (fast) and are accessible from all application servers (scalable). For a professional server infrastructure, Redis as a session handler is standard.
The object cache stores the results of expensive computations and database queries. Instead of executing the same database query on every page load -- such as the category hierarchy, navigation structure or configuration values -- the result is computed once and stored in Redis. Subsequent requests read the result from Redis in under a millisecond. This technique is particularly effective for queries that are frequently executed and rarely change -- which applies to the majority of configuration queries.
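This compute-once, read-many pattern is often called cache-aside. The sketch below illustrates it in Python with a small in-memory stand-in instead of a real Redis connection, so it runs without a server. The class `InMemoryCache`, the helper `get_or_compute` and the loader `load_category_tree` are hypothetical names introduced for this example; with a real Redis client the stand-in's `get` and `set` would map to the client's read and write-with-expiry calls.

```python
import time


class InMemoryCache:
    """Stand-in for a Redis client: stores (value, expiry time) pairs in a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


def get_or_compute(cache, key, compute, ttl_seconds=300):
    """Cache-aside: return the cached value, or compute and store it on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = compute()
    cache.set(key, value, ttl_seconds)
    return value


query_count = 0


def load_category_tree():
    """Hypothetical expensive database query."""
    global query_count
    query_count += 1
    return {"shoes": ["sneakers", "boots"]}


cache = InMemoryCache()
first = get_or_compute(cache, "category_tree", load_category_tree)
second = get_or_compute(cache, "category_tree", load_category_tree)
```

After both calls, `query_count` is 1: the second request is answered from the cache without touching the hypothetical database.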
Redis also offers specialized data structures that go beyond simple key-value caching: Sorted Sets for real-time leaderboards and rankings, Pub/Sub for communication between application servers during cache invalidation, Streams for event sourcing and message queue functionality, and Lua scripting for atomic server-side operations. This versatility makes Redis a central infrastructure component that extends far beyond pure caching.
Browser Cache: The First and Fastest Caching Layer
The browser cache is the fastest caching layer because it stores resources locally on the user's device. A cache hit in the browser cache requires no network request -- the resource is loaded from local storage in microseconds. Correct configuration of browser cache headers is critical for performance on repeat visits and when navigating between pages.
The two most important HTTP headers for browser caching are Cache-Control and ETag. Cache-Control defines how long a resource is valid in the browser cache and whether it must be revalidated. For static assets like CSS, JavaScript and images that are versioned via content hashing, we recommend Cache-Control: public, max-age=31536000, immutable -- this allows the browser to cache the resource for a year without ever revalidating it. When changes occur, the filename changes (hash), so the new version is automatically loaded.
For HTML documents, the strategy differs: Cache-Control: public, max-age=0, must-revalidate combined with an ETag or Last-Modified header. On each request, the browser checks with the server whether the document has changed (conditional request). If it has not changed, the server responds with a lightweight 304 Not Modified instead of retransmitting the entire document. The validation itself takes only a few milliseconds -- significantly less than fully loading the document.
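The conditional request flow can be illustrated with a minimal Python sketch. It assumes a strong ETag derived from a hash of the response body and ignores weak validators and the `*` wildcard; `make_etag` and `respond` are hypothetical helper names, not part of any framework.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body content (one common choice among several)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET with an optional If-None-Match header."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Cache-Control": "public, max-age=0, must-revalidate",
    }
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            # Client copy is current: send headers only, no body.
            return 304, headers, b""
    return 200, headers, body


page = b"<html><body>Hello</body></html>"
status_first, headers_first, _ = respond(page, None)
status_repeat, _, payload_repeat = respond(page, headers_first["ETag"])
```

The first call returns 200 with the full document; the repeat call, which sends the stored ETag back, returns 304 with an empty payload.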
The stale-while-revalidate mechanism combines the best of both worlds: the browser displays cached content immediately (no waiting) and updates it in the background. Cache-Control: public, max-age=3600, stale-while-revalidate=86400 means: the resource is considered fresh for one hour. After that, it is still served from cache for 24 hours while a fresh version is fetched in the background. For content websites, this pattern is ideal as users never have to wait and still always see relatively current content.
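The two time windows can be made concrete with a short Python sketch that classifies a cached response by its age. The parser is deliberately simplified (it only understands comma-separated directives with optional integer values), and the function names and state labels are assumptions chosen for this example.

```python
def parse_cache_control(header):
    """Parse 'a, b=1' style directives into a dict; flags without a value map to True."""
    directives = {}
    for part in header.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = int(value) if value.isdigit() else True
    return directives


def classify(age_seconds, header):
    """Return 'fresh', 'stale-revalidate' or 'expired' for a cached response of a given age."""
    directives = parse_cache_control(header)
    max_age = directives.get("max-age", 0)
    swr = directives.get("stale-while-revalidate", 0)
    if age_seconds < max_age:
        return "fresh"  # serve from cache, no request needed
    if age_seconds < max_age + swr:
        return "stale-revalidate"  # serve from cache, refresh in the background
    return "expired"  # must fetch before serving


header = "public, max-age=3600, stale-while-revalidate=86400"
states = [classify(age, header) for age in (1800, 7200, 90000)]
```

For ages of 30 minutes, 2 hours and 25 hours, `states` is `['fresh', 'stale-revalidate', 'expired']`.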
CDN Caching: Global Delivery with Edge Nodes
A CDN (Content Delivery Network) stores copies of resources on edge servers geographically close to the user. While physical distance to the origin server causes network latency -- typically 50 to 200 milliseconds between continents -- an edge server delivers content with a latency of 5 to 20 milliseconds. According to W3Techs, 68 percent of the top 1,000 websites use a CDN (Source: W3Techs, 2025).
CDN configuration for a multi-tier caching architecture requires clear rules for interplay with Varnish and the browser cache. The CDN respects the origin server's Cache-Control headers and stores resources accordingly. For static assets with long cache times, the CDN acts as a global distributor. For HTML documents cached via Varnish, the CDN acts as an additional caching layer with its own TTL, which is typically shorter than the Varnish TTL.
A critical aspect is CDN cache invalidation. When content changes on the origin server, cached copies on all edge servers must be updated. Most CDN providers offer a purge API through which individual URLs or URL patterns can be invalidated. For frequently updated websites, we recommend a combination of short TTLs (5 to 60 minutes) and stale-while-revalidate, so edge servers serve expired content while fetching a fresh version in the background.
Cache Invalidation: The Hardest Problem in Caching
Phil Karlton is credited with the quote: 'There are only two hard things in computer science: cache invalidation and naming things.' Cache invalidation is so complex because it must reconcile two contradictory goals: content should be cached as long as possible (performance), but changes should become visible as quickly as possible (freshness).
The simplest invalidation strategy is time-based (TTL): cache entries automatically expire after a defined lifetime. This strategy is simple to implement but has the disadvantage that changes only become visible after the TTL expires. For content that rarely changes (legal pages, static pages), TTLs of hours or days are acceptable. For frequently updated content (product prices, stock availability), short TTLs of minutes or even seconds are necessary.
The more elegant solution is tag-based invalidation: each cache entry is tagged with descriptors of its dependencies. A product page receives tags like product-123, category-shoes and brand-nike. When product 123 changes, all cache entries with the tag product-123 are invalidated -- regardless of whether it is the product detail page, a category listing or a search result cache. Varnish does not ship this in its core, but supports it via the xkey module (a VMOD from the varnish-modules collection) that is loaded and used from VCL.
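The underlying data structure is a reverse index from tag to cache keys. The following Python sketch shows the idea independently of Varnish; `TaggedCache` and the URLs are hypothetical, and the sketch does not clean up old tag links when a key is stored again with different tags.

```python
from collections import defaultdict

_MISSING = object()


class TaggedCache:
    """Cache with a reverse index so entries can be invalidated by tag."""

    def __init__(self):
        self._entries = {}
        self._keys_by_tag = defaultdict(set)

    def store(self, key, value, tags):
        self._entries[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._entries.get(key)

    def invalidate_tag(self, tag):
        """Remove every entry carrying the tag; return how many were actually removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if self._entries.pop(key, _MISSING) is not _MISSING:
                removed += 1
        return removed


cache = TaggedCache()
cache.store("/product/123", "<detail page>", ["product-123", "category-shoes", "brand-nike"])
cache.store("/category/shoes", "<listing>", ["category-shoes", "product-123", "product-456"])
cache.store("/product/456", "<detail page>", ["product-456"])

removed = cache.invalidate_tag("product-123")
```

Invalidating `product-123` removes the detail page and the category listing (`removed` is 2) and leaves `/product/456` cached.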
For Shopware-based shops, the framework provides integrated cache invalidation via HTTP cache tags. On every change to products, categories or CMS pages, Shopware automatically sends invalidation requests to the reverse proxy, which deletes the affected cache entries. This integration significantly reduces configuration effort but requires correct setup of HTTP cache configuration and Varnish VCL.
| Strategy | Complexity | Freshness | Use Case |
|---|---|---|---|
| TTL (time-based) | Low | Delayed (TTL-dependent) | Static content, assets |
| Tag-based (xkey) | Medium | Immediate after invalidation | CMS pages, product data |
| Purge API (URL-based) | Low | Immediate after purge | Individual pages, emergency |
| Ban (regex-based) | High | Immediate (lazy evaluation) | Mass changes |
| stale-while-revalidate | Low | Background update | Content websites, blogs |
| Content hash (immutable) | Low | Immediate (new hash) | CSS, JS, images with hash |
Cache Header Strategy: The Right Headers for Every Resource Type
A consistent cache header strategy is the foundation for effective caching at all levels. Each resource type requires different cache headers tailored to its respective change frequency and sensitivity. Incorrect header configuration leads to either frequent unnecessary revalidations (too short TTLs) or stale content for users (too long TTLs without invalidation).
For static assets with content hashing (CSS, JavaScript, images with hash in filename), we recommend: Cache-Control: public, max-age=31536000, immutable. Since the filename changes with every modification, the browser can safely keep the file for the full year without revalidating it. For HTML documents: Cache-Control: public, max-age=0, must-revalidate with ETag header. For API responses with personalized data: Cache-Control: private, no-store. For semi-static content like blog articles: Cache-Control: public, max-age=3600, stale-while-revalidate=86400.
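One way to keep these rules consistent across an application is a single lookup table. The Python sketch below uses the header values recommended above; the resource type names and the conservative fallback for unknown types are assumptions made for this example.

```python
# Header values as recommended in the text; the dictionary keys are hypothetical labels.
CACHE_POLICIES = {
    "hashed_asset": "public, max-age=31536000, immutable",
    "html": "public, max-age=0, must-revalidate",
    "personalized_api": "private, no-store",
    "semi_static": "public, max-age=3600, stale-while-revalidate=86400",
}

# Assumption: anything unclassified is treated as uncacheable to avoid leaking content.
FALLBACK_POLICY = "private, no-store"


def cache_control_for(resource_type):
    """Return the Cache-Control value for a resource type."""
    return CACHE_POLICIES.get(resource_type, FALLBACK_POLICY)


asset_header = cache_control_for("hashed_asset")
unknown_header = cache_control_for("something_else")
```

Defaulting to the strictest policy means a forgotten resource type costs some performance instead of exposing personalized data through a shared cache.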
The Vary header is an often overlooked but critical component of cache strategy. It tells caches (browser, CDN, Varnish) that the response depends on specific request headers. Vary: Accept-Encoding is standard and ensures that Brotli and Gzip compressed versions are cached separately. Vary: Accept enables caching different image formats (WebP, AVIF) under the same URL. But caution: each additional request header listed in Vary multiplies the number of possible cache variants per URL and can reduce the hit rate.
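How Vary affects the cache key can be sketched in Python as follows. Real caches normalize header values more carefully; the function name and the key layout are assumptions for illustration only.

```python
def cache_key(url, vary_header, request_headers):
    """Build a cache key from the URL plus the request headers named in Vary."""
    names = sorted(n.strip().lower() for n in vary_header.split(",") if n.strip())
    normalized = {k.lower(): v for k, v in request_headers.items()}
    parts = [url] + [f"{name}={normalized.get(name, '')}" for name in names]
    return "|".join(parts)


vary = "Accept-Encoding"
key_brotli = cache_key("/style.css", vary, {"Accept-Encoding": "br", "User-Agent": "A"})
key_gzip = cache_key("/style.css", vary, {"Accept-Encoding": "gzip", "User-Agent": "A"})
key_other_agent = cache_key("/style.css", vary, {"Accept-Encoding": "br", "User-Agent": "B"})
```

The Brotli and Gzip requests get different keys, while the differing User-Agent has no effect because it is not listed in Vary. Adding it to Vary would create one variant per distinct browser string, which is why the list should stay short.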
Varnish VCL: Practical Examples for Common Scenarios
VCL configuration determines the effectiveness of the Varnish cache. The four most important VCL subroutines are vcl_recv (incoming request), vcl_backend_response (response from backend), vcl_deliver (delivery to client) and vcl_hash (cache key calculation). In vcl_recv, the decision is made whether a request can be cached or should be passed directly to the backend.
A typical scenario is cookie stripping: many websites set tracking cookies (analytics, advertising) that prevent caching even though they are irrelevant for page output. The VCL configuration removes these cookies from the request before the cache key is calculated, significantly increasing the hit rate. Only functional cookies like session IDs and consent status are preserved.
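The allowlist idea behind cookie stripping can be shown in Python rather than VCL. The cookie names `session_id` and `cookie_consent` are hypothetical placeholders for whatever functional cookies a site actually uses.

```python
# Hypothetical names of functional cookies that must survive stripping.
FUNCTIONAL_COOKIES = frozenset({"session_id", "cookie_consent"})


def strip_tracking_cookies(cookie_header, keep=FUNCTIONAL_COOKIES):
    """Return the Cookie header reduced to allowlisted cookies (empty string if none remain)."""
    kept = []
    for pair in cookie_header.split(";"):
        name, separator, value = pair.strip().partition("=")
        if separator and name in keep:
            kept.append(f"{name}={value}")
    return "; ".join(kept)


mixed = strip_tracking_cookies("_ga=GA1.2.3; session_id=abc; _fbp=fb.1; cookie_consent=yes")
anonymous = strip_tracking_cookies("_ga=GA1.2.3; _fbp=fb.1")
```

Here `mixed` becomes `session_id=abc; cookie_consent=yes`, and `anonymous` becomes an empty string. A request whose Cookie header ends up empty can be treated like any other anonymous request and served from the shared cache.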
Another common scenario is grace mode configuration: when a cache entry has expired and the backend server responds slowly or is unreachable, Varnish in grace mode serves the expired entry instead of making the user wait. The grace period is typically set to several hours to days, so the website remains functional from cache even during a complete backend outage. For business-critical websites, grace mode is an indispensable safeguard against outages.
Redis Configuration: Memory Management and Eviction Policies
Redis stores all data in memory, which explains its speed but also makes capacity planning critical. The maxmemory configuration limits Redis memory consumption. The eviction policy determines what happens when the memory limit is reached. For caching scenarios, allkeys-lru (Least Recently Used) is the recommended policy: Redis removes the entries that have not been accessed for the longest time to make room for new ones.
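The behavior of an LRU policy can be illustrated with a small Python class. This is a sketch of exact LRU bounded by entry count; Redis itself limits by memory size and approximates LRU by sampling keys, so the sketch shows the principle rather than Redis internals. The class name is an assumption.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU cache bounded by number of entries."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        # Reading an entry marks it as most recently used.
        self._data.move_to_end(key)
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            # Evict from the least recently used end.
            self._data.popitem(last=False)


cache = LRUCache(max_entries=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now more recently used than "b"
cache.set("c", 3)  # exceeds the limit, so "b" is evicted
```

After these calls, `a` and `c` are still cached while `b` has been removed, because it was the entry that had gone longest without access.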
Memory sizing depends on the use case. For a session cache with an average of 1 KB per session and 10,000 concurrent sessions, 16 MB is sufficient. For an object cache of a mid-sized website with 5,000 cached queries at an average of 5 KB per entry, 32 to 64 MB are needed. For a fragment cache in e-commerce shops with thousands of product fragments, we recommend 128 to 512 MB. Actual usage can be monitored with redis-cli info memory.
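The sizing figures above follow from simple multiplication plus headroom. The Python sketch below computes the raw payload size only; it treats the article's KB and MB as binary units (KiB, MiB) and ignores Redis per-key overhead, both of which are simplifying assumptions.

```python
def raw_cache_size_mib(entries, avg_entry_kib):
    """Raw payload size in MiB, without per-key overhead or fragmentation."""
    return entries * avg_entry_kib / 1024


sessions_mib = raw_cache_size_mib(10_000, 1)  # 10,000 sessions at 1 KiB each
objects_mib = raw_cache_size_mib(5_000, 5)    # 5,000 query results at 5 KiB each

print(f"sessions: {sessions_mib:.1f} MiB")  # prints "sessions: 9.8 MiB"
print(f"objects:  {objects_mib:.1f} MiB")   # prints "objects:  24.4 MiB"
```

The recommended limits of 16 MB and 32 to 64 MB sit above these raw values, which leaves room for key names, data structure overhead and growth.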
For production environments, we recommend a separate Redis instance per use case: one instance for sessions (with persistence via RDB snapshots), one instance for the object cache (without persistence, as data can be reconstructed from the database on demand) and optionally one instance for message queues. This separation prevents a memory bottleneck in the object cache from evicting session data and vice versa.
Monitoring and Performance Measurement of Caching Layers
The effectiveness of a caching architecture must be continuously measured. The most important metric for Varnish is the cache hit rate -- the proportion of requests answered from cache. A hit rate below 80 percent indicates configuration problems, such as faulty cookie handling or overly restrictive VCL rules. varnishstat provides real-time counters for hits, misses, backend requests and cache evictions, while varnishtop ranks the most frequent log entries, for example the URLs that most often reach the backend.
For Redis, the key metrics are memory usage (actual vs. configured memory), hit rate (keyspace_hits / (keyspace_hits + keyspace_misses)), eviction rate (number of evicted entries) and latency. redis-cli info stats and redis-cli --latency-history provide this data. A sudden increase in eviction rate signals that configured memory is no longer sufficient.
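The hit rate formula can be applied directly to the text that the stats command returns. The Python sketch below parses the `key:value` lines and computes the ratio; the sample output is shortened and its numbers are invented for illustration.

```python
def parse_info(text):
    """Parse 'key:value' lines from Redis INFO output, skipping section headers."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition(":")
        if separator:
            stats[key] = value
    return stats


def hit_rate(stats):
    """keyspace_hits / (keyspace_hits + keyspace_misses), or 0.0 without traffic."""
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


# Shortened sample with made-up numbers.
sample = "# Stats\r\nkeyspace_hits:9500\r\nkeyspace_misses:500\r\nevicted_keys:12\r\n"
stats = parse_info(sample)
rate = hit_rate(stats)
```

For the sample, `rate` is 0.95. Tracking this value together with `evicted_keys` over time shows whether a falling hit rate coincides with memory pressure.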
TTFB measurement on the user side shows the combined effect of all caching layers. The Chrome User Experience Report (CrUX) field data provides TTFB distribution for different device types and connection speeds. A continuous performance monitoring setup combines these server-side and client-side metrics and automatically alerts on regressions.
Caching for E-Commerce: Uniting Personalization and Performance
E-commerce websites present special requirements for caching architecture because many pages contain personalized elements: cart icon with item count, customer-specific prices, personalized recommendations and availability displays. This personalization prevents naive full-page caching since the same URL contains different content for different users.
The solution is Edge Side Includes (ESI). ESI splits the page into cacheable and dynamic fragments. The page frame -- navigation, footer, product description, static content -- is cached in Varnish. Dynamic elements like the mini-cart, the price for logged-in customers and stock availability are marked as ESI tags and inserted individually with each request. Varnish supports ESI natively and processes fragment assembly with minimal latency.
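Fragment assembly can be sketched in Python as a substitution over the cached frame. The sketch handles only self-closing `esi:include` tags with a quoted `src` attribute; the fragment path and the renderer are hypothetical, and Varnish performs this step internally rather than through code like this.

```python
import re

# Matches self-closing include tags such as <esi:include src="/path" />.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(cached_frame, render_fragment):
    """Replace each ESI include tag with the output of render_fragment(src)."""
    return ESI_INCLUDE.sub(lambda match: render_fragment(match.group(1)), cached_frame)


def render_for_user(src):
    """Hypothetical per-request renderer for personalized fragments."""
    fragments = {"/fragments/mini-cart": "<span>3 items</span>"}
    return fragments.get(src, "")


frame = '<header><esi:include src="/fragments/mini-cart" /></header><main>Product</main>'
page = assemble(frame, render_for_user)
```

The frame is cached once for all users, and only the small renderer runs per request, which keeps the personalized part of the work minimal.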
An alternative strategy is AJAX-based lazy loading: the base page is fully cached and personalized elements are loaded via JavaScript requests after the page renders in the browser. This approach is simpler to implement than ESI but generates additional HTTP requests and can cause brief flickering of personalized content. For Shopware-based shops, we recommend a combination: ESI for critical personalized elements (prices) and AJAX for non-critical ones (recently viewed, recommendations).
Caching as the Foundation of Sustainable Web Performance
A well-designed caching architecture is the foundation of every performant website and a key factor for good Core Web Vitals. Varnish as an HTTP accelerator serves cached pages in milliseconds, Redis accelerates the application layer through in-memory caching and the browser cache eliminates unnecessary network requests entirely. Combined with a CDN for global delivery and optimized frontend delivery, the result is a multi-tier architecture that enables response times under 100 milliseconds and remains stable even during traffic spikes.
The key to success lies in coordinating the layers: consistent cache headers that are respected equally by browser, CDN and Varnish, a thoughtful invalidation strategy that propagates changes quickly without unnecessarily reducing hit rates, and continuous monitoring that detects regressions before they impact user experience. With the right server optimization and caching strategy, every website becomes measurably faster -- and stays that way long-term.