Cloudflare's Gen 13: 2x Edge Compute via Software Rewrite
Alps Wang
Mar 24, 2026
The Cache vs. Cores Conundrum
Cloudflare's announcement of its Gen 13 servers, powered by AMD EPYC™ 5th Gen Turin processors, highlights an industry trend: the shift from cache-heavy architectures to core-dense designs for edge computing. Moving to higher core counts leaves less L3 cache per core, which had previously been a significant source of added latency. Cloudflare mitigated this by completely rewriting its request handling layer in Rust (FL1 to FL2). This is hardware-software co-design in practice: a limitation of the new hardware is overcome by re-architecting the software rather than by incremental tuning.
The key change is the FL2 rewrite, built on the Pingora and Oxy frameworks, which altered memory access patterns and reduced reliance on a large L3 cache. This allowed Cloudflare to use the 2x core count of the Turin processors without the latency penalties that affected the NGINX/LuaJIT-based FL1. The resulting 2x throughput gain and 50% improvement in performance per watt are significant for an infrastructure provider operating at Cloudflare's scale. Cloudflare's article uses performance counters and profiling data to diagnose the problem and validate the solution, which adds technical credibility.
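To make the cache-versus-cores trade-off concrete, here is a minimal sketch of the reasoning, not Cloudflare's methodology. It assumes L3 is shared evenly across cores and uses a simple weighted-average latency model; every number in it (cache sizes, core counts, latencies, miss rates) is a hypothetical placeholder rather than a Gen 12 or Gen 13 figure.

```python
def l3_per_core_mb(total_l3_mb: float, cores: int) -> float:
    """L3 capacity per core, assuming the cache is shared evenly."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_l3_mb / cores


def avg_access_ns(l3_hit_ns: float, dram_ns: float, l3_miss_rate: float) -> float:
    """Average latency of an access that reaches L3.

    Hits cost l3_hit_ns; misses cost dram_ns (a simplified two-level model).
    """
    if not 0.0 <= l3_miss_rate <= 1.0:
        raise ValueError("l3_miss_rate must be between 0 and 1")
    return (1.0 - l3_miss_rate) * l3_hit_ns + l3_miss_rate * dram_ns


# Hypothetical: doubling cores with the same total L3 halves the per-core share.
before = l3_per_core_mb(total_l3_mb=256.0, cores=64)   # 4.0 MB per core
after = l3_per_core_mb(total_l3_mb=256.0, cores=128)   # 2.0 MB per core

# Hypothetical: a cache-hungry workload misses more often with the smaller share.
cache_heavy_before = avg_access_ns(l3_hit_ns=10.0, dram_ns=100.0, l3_miss_rate=0.10)
cache_heavy_after = avg_access_ns(l3_hit_ns=10.0, dram_ns=100.0, l3_miss_rate=0.30)

print(f"L3 per core: {before:.1f} MB -> {after:.1f} MB")
print(f"Average access: {cache_heavy_before:.1f} ns -> {cache_heavy_after:.1f} ns")
```

In this model, software whose working set fits in the smaller per-core share keeps a low miss rate and is largely unaffected by the core-dense design. That is the property the summary attributes to FL2.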
One limitation is the significant engineering effort the FL2 rewrite required. The performance gains and future-proofing justify it for Cloudflare, but re-architecture at this level is not feasible for every organization. The article also focuses heavily on Cloudflare's specific use case; the principles behind FL2 may transfer, but applying them to other domains would require further investigation. The reliance on AMD's Turin architecture also ties the approach to that specific hardware ecosystem. Nevertheless, the article is a compelling case study in how organizations can improve edge computing performance through deep hardware-software integration.
Key Points
- Cloudflare's Gen 13 servers leverage AMD EPYC™ 5th Gen Turin processors with a focus on increased core count over L3 cache.
- The legacy FL1 request handling layer (NGINX/LuaJIT) struggled with increased latency due to reduced cache per core on Gen 13 hardware.
- A complete rewrite of the request handling layer in Rust (FL2), built on the Pingora and Oxy frameworks, removed the dependence on a large L3 cache.
- FL2 enables linear scaling of performance with core count, achieving 2x edge compute performance and 50% better performance per watt compared to Gen 12.
- This hardware-software co-design approach demonstrates a strategic shift to prioritize throughput and efficiency by adapting software to hardware advancements.
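
The two headline figures also imply something about power draw. The short calculation below is a sketch of that arithmetic. It assumes both ratios are measured against Gen 12 on the same workload and at the same scope (for example, per server), which this summary does not state. The function name is illustrative.

```python
def implied_power_ratio(throughput_ratio: float, perf_per_watt_ratio: float) -> float:
    """Power ratio implied by a throughput ratio and a performance-per-watt ratio.

    perf_per_watt = throughput / power, so power_ratio = throughput_ratio / perf_per_watt_ratio.
    """
    if throughput_ratio <= 0 or perf_per_watt_ratio <= 0:
        raise ValueError("ratios must be positive")
    return throughput_ratio / perf_per_watt_ratio


# 2x throughput and 50% better performance per watt (1.5x), as quoted above.
ratio = implied_power_ratio(throughput_ratio=2.0, perf_per_watt_ratio=1.5)
print(f"Implied power ratio vs. Gen 12: {ratio:.2f}x")  # prints 1.33x
```

Under those assumptions, doubling throughput costs roughly a third more power, which is where the efficiency gain comes from.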

📖 Source: Launching Cloudflare’s Gen 13 servers: trading cache for cores for 2x edge compute performance
