Server Performance That Makes the Difference
The fastest website is useless if the server is the bottleneck. We optimize your entire server infrastructure from DNS resolution through TLS negotiation to database queries, reducing Time to First Byte to under 200 milliseconds.
87 ms
average TTFB after optimization (project experience)
50+
optimized server infrastructures
4x
typical response time improvement
99.9%
uptime across all managed projects
Server-side performance forms the foundation of every fast website. While frontend optimizations accelerate the visible layer, the server infrastructure determines how quickly the first byte reaches the browser. A slow Time to First Byte (TTFB) shifts all downstream metrics and directly impacts Core Web Vitals. We systematically analyze your server configuration and optimize every step of the delivery chain: DNS resolution, TLS handshake, reverse proxy, application logic, database queries and caching layers.
Why Time to First Byte Is Critical
Time to First Byte measures the time between the browser's HTTP request and the arrival of the first response byte. Google considers a TTFB under 800 milliseconds acceptable but recommends values under 200 milliseconds for optimal performance (web.dev, 2024). Every millisecond of TTFB delays the Largest Contentful Paint and thus the perceived loading speed. In our optimization projects, we typically reduce TTFB by 70 to 90 percent by eliminating bottlenecks across all layers of the server infrastructure. The impact is directly measurable: faster server responses improve not only the user experience but also crawling efficiency by search engines. A low TTFB signals to Google that your website is technically sound and can lead to more frequent and deeper indexing.
The Components of Server Response Time
Before the server can deliver an HTML page, every request passes through multiple stages. Each stage can become a bottleneck. Effective server optimization requires analyzing and improving all stages, not just the most obvious ones. The following components directly influence TTFB and form the framework of our optimization work.
DNS Resolution
DNS resolution translates the domain name into an IP address. Slow DNS servers or missing DNS caching strategies can add 50 to 300 milliseconds of latency. We configure fast DNS providers with anycast networks and implement DNS prefetching for critical third-party domains.
TLS Handshake
The TLS handshake for encrypted connections requires two roundtrips with TLS 1.2. With TLS 1.3, this reduces to one roundtrip, and for reconnections even to zero (0-RTT). We configure TLS 1.3, OCSP stapling and optimal cipher suites for minimal handshake times.
Reverse Proxy
A reverse proxy such as nginx or Varnish sits in front of the application and intercepts requests before they reach it. For cacheable content, the proxy delivers responses directly from memory with response times under 5 milliseconds, without PHP, Python or Node.js being invoked at all.
Application Logic
Application processing time depends on code efficiency, framework configuration and caching mechanisms in use. We profile the application with tools like Blackfire or Xdebug and identify function calls that consume disproportionate amounts of time.
Database Queries
Inefficient SQL queries are the most common cause of slow server responses. Missing indices, N+1 problems and unnecessary joins can delay individual page loads by seconds. We analyze every query with EXPLAIN and optimize systematically.
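The N+1 problem mentioned above is easiest to see by counting queries. The following is a minimal sketch, assuming a hypothetical `products`/`prices` schema and using SQLite from the Python standard library as a stand-in for MariaDB or MySQL; the `run` helper and all table names are illustrative.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the production database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE prices (product_id INTEGER, amount REAL);
""")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(i, f"Product {i}") for i in range(1, 51)])
conn.executemany("INSERT INTO prices VALUES (?, ?)",
                 [(i, i * 2.0) for i in range(1, 51)])

query_count = 0


def run(sql: str, params: tuple = ()) -> list:
    """Execute a query and count it (hypothetical helper)."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# N+1 pattern: one query for the list, then one more query per row.
query_count = 0
naive = {}
for product_id, name in run("SELECT id, name FROM products"):
    rows = run("SELECT amount FROM prices WHERE product_id = ?", (product_id,))
    naive[name] = rows[0][0]
n_plus_one = query_count

# Same result with a single JOIN.
query_count = 0
joined = dict(run(
    "SELECT p.name, pr.amount FROM products p "
    "JOIN prices pr ON pr.product_id = p.id"
))
single = query_count

print(f"N+1 pattern: {n_plus_one} queries, JOIN: {single} query")
```

With 50 products, the loop issues 51 queries where the JOIN needs one. In an ORM-based application the same effect is usually achieved with eager loading rather than hand-written SQL.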
Caching Layers
Multi-tier caching from OPcache through application cache to HTTP cache reduces the work the server must do on repeated requests. Properly configured, the server answers over 90 percent of all requests from cache.
HTTP/2 and HTTP/3: Modern Transfer Protocols
The transfer protocol determines how efficiently browser and server communicate. While HTTP/1.1 can process only one request per connection at a time, HTTP/2 enables simultaneous transfer of multiple resources over a single connection (multiplexing). HTTP/3 goes further, building on the QUIC protocol that better compensates for packet loss and accelerates connection establishment. According to the Web Almanac 2024, 37 percent of all websites already use HTTP/3 (HTTP Archive, 2024). We configure your server infrastructure for both protocols and ensure that older clients seamlessly fall back to HTTP/2 or HTTP/1.1.
| Property | HTTP/1.1 | HTTP/2 | HTTP/3 (QUIC) |
|---|---|---|---|
| Multiplexing | No (1 request/connection) | Yes (many streams/connection) | Yes (without head-of-line blocking) |
| Header Compression | None | HPACK | QPACK |
| Connection Setup | TCP + TLS (3 roundtrips) | TCP + TLS (2-3 roundtrips) | QUIC (1 roundtrip, 0-RTT possible) |
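| Server Push | Not available | Specified, but removed from major browsers | Specified, rarely supported |
| Packet Loss Handling | Stalls the affected connection | Blocks all streams on the connection (TCP head-of-line blocking) | Only affected stream blocked |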
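The roundtrip counts in the table translate directly into waiting time before the first request can be sent. A rough sketch of that arithmetic follows, assuming a hypothetical round-trip time of 40 ms and assuming TLS 1.2 for the HTTP/1.1 row and TLS 1.3 for the HTTP/2 row; the function and dictionary names are illustrative.

```python
# Roundtrips needed before the first request can be sent, taken from the
# table above. The protocol/TLS pairings are assumptions for illustration.
SETUP_ROUNDTRIPS = {
    "HTTP/1.1 (TCP + TLS 1.2)": 3,
    "HTTP/2 (TCP + TLS 1.3)": 2,
    "HTTP/3 (QUIC)": 1,
    "HTTP/3 (QUIC, 0-RTT resumption)": 0,
}


def setup_latency_ms(rtt_ms: float) -> dict:
    """Return the connection setup latency per protocol for a given RTT."""
    return {name: trips * rtt_ms for name, trips in SETUP_ROUNDTRIPS.items()}


# 40 ms is a hypothetical round-trip time, not a measured value.
for name, latency in setup_latency_ms(40.0).items():
    print(f"{name}: {latency:.0f} ms before the first request")
```

The absolute savings grow with distance: the higher the round-trip time between visitor and server, the more each avoided roundtrip is worth.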
Compression: Brotli vs. Gzip
Text compression reduces the amount of data transferred between server and browser. Gzip has been the standard for over two decades, but Brotli offers 15 to 25 percent better compression ratios at comparable decompression speeds (Google Research, 2015). We configure Brotli as the primary compression format with Gzip as a fallback for older clients. For static assets, we use Brotli level 11 (maximum compression), precomputed during the build process. For dynamic content, we use Brotli level 4 to 6, balancing compression ratio against CPU load. In practice, this configuration reduces the transferred data for HTML, CSS and JavaScript by 70 to 85 percent compared to uncompressed files.
A frequently overlooked aspect is the compression of JSON API responses and SVG files. Both formats are text-based and benefit significantly from Brotli compression. In a typical Shopware shop with extensive API calls for product data and filter options, compressing JSON responses alone can reduce transferred data by 80 percent and noticeably improve load times.
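The effect on JSON responses can be checked with a few lines of code. The sketch below uses gzip because Brotli bindings are not part of the Python standard library; the product payload is hypothetical and far more repetitive than real API data, so the printed savings are an illustration, not a benchmark.

```python
import gzip
import json

# Hypothetical product listing; real API payloads will differ.
products = [
    {"id": i, "name": f"Product {i}", "price": 19.99, "in_stock": True,
     "category": "accessories"}
    for i in range(500)
]
payload = json.dumps(products).encode("utf-8")

# Gzip from the standard library stands in for Brotli here.
compressed = gzip.compress(payload, compresslevel=6)
savings = 1 - len(compressed) / len(payload)

print(f"raw: {len(payload)} bytes, gzip: {len(compressed)} bytes, "
      f"saved: {savings:.0%}")
```

Running the same measurement against real responses from your own endpoints gives a more reliable picture than synthetic data.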
CDN Strategies for Optimal Delivery
A content delivery network distributes your content to edge servers worldwide, reducing the physical distance between server and user. For websites with a primarily German audience, this means edge servers in Frankfurt, Amsterdam and Zurich with response times under 20 milliseconds. For international projects, a CDN shortens load times for overseas visitors by 200 to 500 milliseconds. We implement CDN strategies that go beyond simple asset caching and intelligently cache dynamic content as well.
Static Asset Caching
CSS, JavaScript, images and fonts are served through edge servers. Content-hash-based cache busting enables aggressive cache headers with one-year lifetimes without delaying updates.
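Content-hash-based cache busting can be sketched as follows. This is a minimal illustration, assuming file names of the form `name.<hash>.ext`; build tools implement their own variants, and `hashed_name` is a hypothetical helper rather than part of any particular tool.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


# Cache-Control value suited to fingerprinted assets: one year, no revalidation.
LONG_LIVED = "public, max-age=31536000, immutable"

v1 = hashed_name("assets/app.css", b"body { color: black; }")
v2 = hashed_name("assets/app.css", b"body { color: navy; }")
print(v1)
print(v2)
print("Cache-Control:", LONG_LIVED)
```

Because any change to the file content produces a new file name, browsers fetch the new version immediately while the old URL can stay cached for a year.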
Dynamic Edge Caching
Even HTML pages can be cached at the edge when the invalidation strategy is sound. Cache tags and surrogate keys enable targeted invalidation of individual pages on content changes without flushing the entire cache.
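The idea behind cache tags can be illustrated with a small in-memory model. The class below is a sketch only: `TaggedCache` and its methods are hypothetical names, and real CDNs and Varnish expose tag-based purging through their own headers and APIs.

```python
from collections import defaultdict
from typing import Optional


class TaggedCache:
    """In-memory stand-in for an edge cache that supports surrogate keys."""

    def __init__(self) -> None:
        self._pages = {}                       # url -> cached HTML
        self._urls_by_tag = defaultdict(set)   # tag -> urls carrying it

    def store(self, url: str, html: str, tags: set) -> None:
        self._pages[url] = html
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[str]:
        return self._pages.get(url)

    def purge_tag(self, tag: str) -> int:
        """Drop every page carrying the tag; return how many were removed."""
        removed = 0
        for url in self._urls_by_tag.pop(tag, set()):
            if self._pages.pop(url, None) is not None:
                removed += 1
        return removed


cache = TaggedCache()
cache.store("/product/42", "<html>product</html>", {"product-42"})
cache.store("/category/7", "<html>listing</html>", {"category-7", "product-42"})
cache.store("/about", "<html>about</html>", {"static"})

# A price change on product 42 invalidates only the pages that show it.
print(cache.purge_tag("product-42"), "pages purged")
print("/about still cached:", cache.get("/about") is not None)
```

Tagging each page with the entities it renders is what makes targeted invalidation possible; the rest of the cache stays warm.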
Edge-Side Optimization
Modern CDNs offer image optimization, HTML minification and automatic protocol selection directly at the edge. This reduces load on the origin server and further shortens response times for end users.
Reverse Proxy: Varnish and nginx as Accelerators
A reverse proxy is the single most effective measure for reducing server response time. Varnish and nginx can hold pre-rendered HTML pages in memory and deliver them on repeated requests in under 5 milliseconds without even contacting the application. In our 50+ optimization projects, we regularly achieve cache hit rates exceeding 90 percent with reverse proxy configurations (project experience). This means nine out of ten page views are answered directly from the proxy cache.
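How strongly the hit rate drives the average response time can be estimated with a weighted mean. In the sketch below, the 5 ms per cache hit comes from the text above, while the 600 ms for a request that reaches the application is a hypothetical value; `mean_response_ms` is an illustrative name.

```python
def mean_response_ms(hit_rate: float, hit_ms: float, miss_ms: float) -> float:
    """Weighted average of cache hits and requests that reach the origin."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


# 5 ms per hit comes from the text; 600 ms per miss is a hypothetical value.
for rate in (0.0, 0.5, 0.9, 0.95):
    print(f"hit rate {rate:.0%}: {mean_response_ms(rate, 5, 600):.1f} ms")
```

Under these assumptions the average drops from 600 ms without a cache to roughly 65 ms at a 90 percent hit rate, which is why the remaining misses dominate once a proxy cache is in place.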
PHP-FPM Tuning: More Performance from the Runtime
PHP-FPM (FastCGI Process Manager) manages the PHP processes that execute your application. Misconfiguration leads either to wasted resources (too many processes) or wait times under load (too few processes). We size PHP-FPM pools based on actual memory consumption per request and expected concurrency. A typical PHP process requires 30 to 80 megabytes of memory, depending on the application and loaded extensions.
- Process Manager Mode: We choose between static (fixed number of workers), dynamic (scales between minimum and maximum) and ondemand (starts workers only when needed) based on your website's load profile
- Worker Sizing: The maximum number of concurrent PHP processes is calculated as: available RAM minus reserves for operating system, database and cache, divided by average memory consumption per process
- Timeout Configuration: request_terminate_timeout limits the maximum execution time per request and prevents individual hanging requests from permanently blocking workers
- Slow Log Analysis: The PHP-FPM slow log records all requests exceeding a configurable threshold, including stack traces. This data systematically identifies the slowest code paths
- Pool Separation: For websites with different requirement profiles, we configure separate PHP-FPM pools, e.g., one pool for frontend requests with a low memory limit and another for admin operations with higher resources
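The worker sizing rule from the list above can be written as a short calculation. This is a sketch with hypothetical server figures; the result is a starting point for `pm.max_children` that should still be validated under real load.

```python
def max_children(total_ram_mb: int, reserved_mb: int, per_process_mb: int) -> int:
    """Estimate pm.max_children from the RAM left after fixed reservations."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    return max(usable_mb // per_process_mb, 0)


# Hypothetical server: 16 GB RAM, 6 GB reserved for OS, database and cache,
# 60 MB per PHP process (within the 30 to 80 MB range mentioned above).
print(max_children(total_ram_mb=16384, reserved_mb=6144, per_process_mb=60))  # 170
```

Rounding down is deliberate: a pool that slightly underuses memory degrades gracefully, while one that overcommits pushes the server into swapping.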
OPcache: Compiled PHP in Memory
OPcache stores precompiled PHP bytecode in shared memory, eliminating the parsing and compilation step on every request. Without OPcache, PHP must re-read and compile all involved files on every page load. With properly configured OPcache, PHP execution time is reduced by 50 to 70 percent (php.net, 2024). We configure OPcache with sufficient memory for all PHP files in your application and enable timestamp validation (opcache.validate_timestamps) only in development environments. In production, this file check is disabled and the cache is explicitly flushed only during deployments.
PHP 8.x additionally offers JIT compilation (Just-In-Time), which translates bytecode into native machine code at runtime. For typical web applications, the gain from JIT is limited since most bottlenecks lie in I/O operations. For compute-intensive tasks such as image processing, PDF generation or complex price calculations, JIT can deliver noticeable improvements. We evaluate the JIT benefit individually and enable it only when measurable advantages exist. Furthermore, we use PHP preloading to load frequently used classes and frameworks into memory when the PHP-FPM process starts, further reducing initialization time per request.
MariaDB and MySQL: Optimizing Database Performance
The database is the primary performance bottleneck in most web applications. A single inefficient query can delay the entire page response by seconds. We systematically analyze your database layer and implement optimizations that improve both individual query performance and overall throughput.
Slow Query Analysis
The slow query log records all queries above a configurable threshold. We analyze the most frequent and slowest queries, evaluate their execution plans with EXPLAIN and implement targeted index strategies and query rewrites.
InnoDB Configuration
The InnoDB buffer pool is the most important tuning parameter. On a dedicated database server, it should comprise 70 to 80 percent of available RAM to keep the entire working dataset in memory; on servers that also run PHP and caches, the share must be correspondingly lower. We size buffer pool, redo log, flush method and I/O capacity for your specific workload.
Index Optimization
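Missing or suboptimal indices are the most common cause of slow queries. We create composite indices for frequent WHERE combinations, covering indexes for read-intensive queries and prefix indexes for long string columns. MariaDB and MySQL do not support partial (filtered) indexes, so selective filter criteria are covered by suitable composite indexes instead.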
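The effect of a composite index on an execution plan can be demonstrated in a few lines. The sketch below uses SQLite from the Python standard library as a stand-in, so the plan output differs in format from `EXPLAIN` in MariaDB or MySQL; the `orders` table, its columns and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "status TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status, total) VALUES (?, ?, ?)",
    [(i % 100, "open" if i % 3 else "paid", i * 1.5) for i in range(1000)],
)

QUERY = "SELECT id, total FROM orders WHERE customer_id = ? AND status = ?"


def plan(connection: sqlite3.Connection) -> str:
    """Return the planner's description of how QUERY would be executed."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (42, "open"))
    return "; ".join(row[-1] for row in rows)


before = plan(conn)  # without an index: full table scan
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)"
)
after = plan(conn)   # with the composite index: index lookup

print("before:", before)
print("after: ", after)
```

The column order in the index follows the WHERE clause: both equality filters can be resolved inside the index before any table row is read.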
Connection Pooling
Every connection establishment to the database costs time. Connection pooling keeps a defined number of connections open and assigns them to incoming requests. This eliminates the overhead of repeated connection setups and stabilizes performance under load.
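The mechanism can be sketched with a fixed-size pool built on a thread-safe queue. This is an illustration of the principle, not a replacement for a production pooler; `ConnectionPool` is a hypothetical class, and SQLite is used only so the example runs without a database server.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool that hands out already-open connections."""

    def __init__(self, factory, size: int) -> None:
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # pay the connection cost once, up front

    @contextmanager
    def connection(self, timeout: float = 5.0):
        conn = self._idle.get(timeout=timeout)  # waits while the pool is empty
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return the connection instead of closing it


# check_same_thread=False lets pooled SQLite connections be used by other threads.
pool = ConnectionPool(
    lambda: sqlite3.connect(":memory:", check_same_thread=False), size=3
)

with pool.connection() as conn:
    print(conn.execute("SELECT 1 + 1").fetchone()[0])
```

The pool size also acts as a concurrency limit: when all connections are in use, further requests wait instead of overwhelming the database with new connections.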
Read Replicas
For read-intensive applications, we distribute read operations to replica servers. The primary server is relieved for write operations, and read capacity scales horizontally. Automatic routing in the application directs queries to the appropriate instance.
Query Cache Strategies
MariaDB's built-in query cache is counterproductive for most workloads, and MySQL removed the feature entirely in version 8.0. Instead, we implement application-level caching with Redis for frequently queried datasets and configure intelligent invalidation strategies.
Redis and Memcached: In-Memory Caching
In-memory caches like Redis and Memcached store frequently queried data in memory and deliver responses in under one millisecond. Compared to database queries that typically take 5 to 50 milliseconds, a cache hit accelerates data access by a factor of 10 to 100. We decide on a project basis which system is the best choice and implement multi-tier caching strategies covering session management, application cache and fragment caching.
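The typical access pattern for an application cache is cache-aside: look in the cache first and query the database only on a miss. The sketch below models this with a dictionary in place of Redis or Memcached; `TTLCache`, `get_or_load` and `load_product` are hypothetical names for illustration.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Dictionary-backed stand-in for Redis or Memcached with expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, ttl_s: float, loader: Callable[[], Any]) -> Any:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                      # cache hit
        value = loader()                         # cache miss: ask the database
        self._entries[key] = (self._clock() + ttl_s, value)
        return value


calls = 0


def load_product() -> dict:
    """Hypothetical slow database lookup."""
    global calls
    calls += 1
    return {"id": 42, "name": "Example product"}


cache = TTLCache()
for _ in range(3):
    cache.get_or_load("product:42", ttl_s=60, loader=load_product)
print(f"database lookups: {calls}")
```

Three reads result in a single database lookup. The time-to-live bounds how stale an entry can become when no explicit invalidation is triggered.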
Horizontal and Vertical Scaling
When optimization of individual components reaches its limits, scaling becomes necessary. Vertical scaling (more CPU, RAM, faster disks) is simple to implement but hits physical and economic boundaries. Horizontal scaling (more servers) offers theoretically unlimited capacity but requires adjustments to the application architecture. We advise which scaling strategy suits your project and implement the infrastructure accordingly.
Vertical Scaling
More resources for existing servers: CPU upgrade, RAM expansion, NVMe SSDs. Ideal as a first step when current hardware is not fully utilized. No application code needs to change, and implementation often takes just hours.
Horizontal Scaling
Distribution of load across multiple identical servers behind a load balancer. Sessions are externalized to Redis and the application becomes stateless. Offers nearly unlimited scaling and resilience through redundancy.
Service Separation
Splitting monolithic applications into dedicated services: web server, worker server, database server, cache server and search server. Each component scales independently based on its specific load profile.
Container Orchestration
Docker and Kubernetes enable automated scaling, self-healing and zero-downtime deployments. We configure horizontal pod autoscalers, resource limits and readiness probes for reliable operation under varying load.
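The core scaling rule of the Kubernetes horizontal pod autoscaler is a simple ratio: the desired replica count is the current count multiplied by the ratio of observed to target metric, rounded up. The sketch below reproduces only this formula with minimum and maximum bounds; the real controller additionally applies a tolerance band and stabilization windows, and the example figures are hypothetical.

```python
import math


def desired_replicas(current: int, current_metric: float, target_metric: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Core autoscaler formula, clamped to the configured replica bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical load: 4 pods at 90 percent CPU with a 60 percent target.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))  # 6
```

The target value therefore determines headroom: a lower CPU target scales out earlier and leaves more reserve for sudden traffic spikes.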
SSL/TLS Optimization for Fast Connections
TLS configuration affects both the security and performance of your website. We implement a TLS configuration that combines maximum security with minimal latency. TLS 1.3 reduces the handshake from two roundtrips to one. Session tickets and session caching speed up repeated connections. OCSP stapling avoids the separate certificate status check with the certificate authority. The choice of modern cipher suites with ECDHE key exchange and ChaCha20-Poly1305 or AES-GCM encryption minimizes CPU load.
For HTTP/3, we configure QUIC, which integrates the TLS negotiation into the connection setup and thereby saves a complete roundtrip. On reconnections, 0-RTT enables immediate data transfer without a preceding handshake phase. We activate early data with appropriate replay protection measures to further reduce latency on return visits. The entire configuration is regularly reviewed against current best practices and known vulnerabilities.
DNS Optimization: The Often Overlooked Factor
DNS resolution stands at the beginning of every connection yet is frequently neglected. A DNS lookup can take between 5 and 300 milliseconds depending on the provider, geographic distance and cache state. We optimize DNS configuration on multiple levels.
- Fast DNS Providers: Migration to DNS providers with global anycast networks that answer queries from the nearest location, achieving resolution times under 20 milliseconds
- Optimal TTL Values: DNS records with sensible time-to-live values that enable caching without unnecessarily delaying changes. For stable records we recommend TTLs of 3600 seconds, for dynamic records shorter values
- DNS Prefetching: Via <link rel="dns-prefetch">, the browser resolves third-party domains before the resource is requested. This eliminates DNS latency for external scripts, fonts and API calls
- Preconnect: Via <link rel="preconnect">, the browser establishes the full connection (DNS + TCP + TLS) to critical domains in advance. Useful for CDN domains and analytics endpoints
- DNS Failover: Configuration of secondary DNS servers with automatic failover to ensure reachability even when the primary DNS provider experiences an outage
Server Monitoring and Performance Tracking
Performance optimization is not a one-time project but a continuous process. New features, traffic spikes, data imports and software updates can affect performance at any time. As our reference projects demonstrate, we implement monitoring systems that detect bottlenecks early and enable proactive action before users are affected.
Real-Time Metrics
CPU utilization, memory consumption, disk I/O, network throughput and PHP-FPM worker status are captured in real time. Dashboards show the current state and historical trends at a glance.
Application Performance Monitoring
APM tools like Blackfire or New Relic profile every request end-to-end and identify slow function calls, inefficient queries and memory-intensive operations directly in the application code.
Alerting and Escalation
Automatic notifications on threshold violations: TTFB above 500 ms, CPU above 80 percent, disk space below 10 percent. Escalation chains ensure that critical issues are addressed immediately.
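The thresholds named above can be expressed as a small rule check. This is a sketch of the evaluation logic only, assuming a hypothetical `Sample` structure; collecting the metrics and delivering notifications is left to the monitoring system in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One measurement taken from a server (hypothetical structure)."""
    ttfb_ms: float
    cpu_percent: float
    disk_free_percent: float


def violations(sample: Sample) -> list:
    """Return a message for every threshold from the text that is violated."""
    alerts = []
    if sample.ttfb_ms > 500:
        alerts.append(f"TTFB {sample.ttfb_ms:.0f} ms exceeds 500 ms")
    if sample.cpu_percent > 80:
        alerts.append(f"CPU {sample.cpu_percent:.0f}% exceeds 80%")
    if sample.disk_free_percent < 10:
        alerts.append(f"free disk space {sample.disk_free_percent:.0f}% below 10%")
    return alerts


for message in violations(Sample(ttfb_ms=620, cpu_percent=45, disk_free_percent=8)):
    print("ALERT:", message)
```

In practice, alerts are usually raised only when a threshold is exceeded over several consecutive samples, which avoids notifications for short spikes.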
The Server Optimization Process
Infrastructure Audit
Inventory of the current server configuration: operating system, web server, PHP version, database, caching layers, DNS and TLS. Measurement of baseline performance with synthetic and real-world tests as part of our technical analysis.
Before and After: Typical Server Optimization Results
The following values show typical improvements from our server optimization projects. Actual results depend on the initial state and application complexity (project experience).
| Metric | Before Optimization | After Optimization |
|---|---|---|
| Time to First Byte (TTFB) | 800 - 3,500 ms | 50 - 200 ms |
| PHP execution time per request | 400 - 1,200 ms | 60 - 180 ms |
| Database queries per page load | 80 - 250 | 15 - 45 |
| Cache hit rate (reverse proxy) | 0% (no cache) | 85 - 95% |
| Concurrent users (stable) | 100 - 300 | 1,000 - 5,000+ |
| Transferred data (compressed) | 1.2 - 3.5 MB | 250 - 600 KB |
Server Optimization as the Foundation for Frontend Performance