Cloudflare's AI Data Agent: Town Lake & Skipper
Alps Wang
May 29, 2026
Unified Data, Intelligent Access
Cloudflare's approach to building Town Lake and Skipper is commendable for its ambition and its commitment to building on the company's own infrastructure. The 'governance by construction' model, where tables are inaccessible until reviewed, is a robust security and privacy measure, especially given Cloudflare's massive data scale. The multi-layered context for Skipper is crucial for mitigating LLM hallucinations, a common pitfall in AI data agents. The 'Code Mode' for Skipper, in which the LLM writes JavaScript that runs in sandboxed Workers, is a particularly innovative implementation detail that offers efficiency and auditability, blending LLM capabilities with serverless execution.
However, the inherent complexity of such a system, while necessary, presents a significant barrier to entry for smaller organizations. The reliance on extensive in-house tooling (Town Lake, Skipper, Workers AI, R2, etc.) means this solution is deeply tied to the Cloudflare ecosystem. While this is a strategic advantage for Cloudflare, it limits the design's direct applicability for companies not already invested in Cloudflare's platform. Furthermore, the success of 'governance by construction' hinges heavily on the efficiency and accuracy of the automated PII scanning (Skimmer) and the responsiveness of the review process. Any bottleneck in either could significantly impede data accessibility, even for legitimate use cases. The article also hints at the ongoing evolution of data models and the need for human annotations, suggesting that getting an AI agent to understand complex business data reliably remains an iterative and challenging process.
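
To make the 'governance by construction' idea concrete, here is a minimal TypeScript sketch of a default-deny catalog. It is an illustration under stated assumptions, not Cloudflare's implementation: the `Catalog` class, the `scanForPii` helper, and the rule that flagged columns stay hidden after approval are all hypothetical, and the name-based regex scan only stands in for whatever Skimmer actually does.

```typescript
// Hypothetical sketch of "governance by construction": a table is unreadable
// until a review explicitly approves it. All names here are illustrative.
type ReviewStatus = "pending" | "approved";

interface TableEntry {
  name: string;
  columns: string[];
  status: ReviewStatus;
  flaggedColumns: string[]; // columns the scan marked as possibly sensitive
}

// Stand-in for an automated PII scanner. It only inspects column names;
// a real scanner would presumably look at the data as well.
const PII_HINTS: RegExp[] = [/email/i, /phone/i, /ip_?addr/i];

function scanForPii(columns: string[]): string[] {
  return columns.filter((c) => PII_HINTS.some((re) => re.test(c)));
}

class Catalog {
  private tables: Record<string, TableEntry | undefined> = {};

  // New tables always start as "pending", so nothing is readable by default.
  register(name: string, columns: string[]): TableEntry {
    const entry: TableEntry = {
      name,
      columns,
      status: "pending",
      flaggedColumns: scanForPii(columns),
    };
    this.tables[name] = entry;
    return entry;
  }

  // Represents a reviewer signing off on the table.
  approve(name: string): void {
    const entry = this.tables[name];
    if (!entry) throw new Error(`unknown table: ${name}`);
    entry.status = "approved";
  }

  // Columns a caller may read: none before approval, and (by assumption in
  // this sketch) only the unflagged ones afterwards.
  readableColumns(name: string): string[] {
    const entry = this.tables[name];
    if (!entry || entry.status !== "approved") return [];
    return entry.columns.filter((c) => entry.flaggedColumns.indexOf(c) === -1);
  }
}
```

The point of the sketch is the default: access is something a review grants, not something it later takes away. It also shows why a slow scan or review queue translates directly into data that nobody can query.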
Key Points
- Cloudflare has built a unified data analytics platform called Town Lake, offering a single SQL interface to all its data, sourced from dozens of disparate systems.
- Skipper is an AI data agent that sits on top of Town Lake, allowing users to query data using natural language.
- The platform employs a 'governance by construction' model, prioritizing security and privacy by requiring data review before access, with PII scanning automated by Skimmer.
- Town Lake uses Trino as its query engine, R2 for object storage with Apache Iceberg for data cataloging, and Cloudflare's own Workers for compute.
- Skipper leverages multiple layers of context (schema, human annotations, code-derived knowledge, curated models, runtime introspection) to ensure accurate, grounded answers and mitigate LLM hallucinations.
- An innovative 'Code Mode' for Skipper allows the LLM to write JavaScript snippets that programmatically call its toolset, enabling complex multi-step workflows in a single round-trip for efficiency and auditability.
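
The 'Code Mode' point is easier to picture with a small example. The TypeScript sketch below is a rough approximation under stated assumptions: the `Tools` interface, `withAudit`, `runSnippet`, and the fake row counts are all hypothetical, and `new Function` is only a stand-in for an isolated runtime (it is not a security sandbox, and the real system is described as running snippets in sandboxed Workers).

```typescript
// Hypothetical tool surface; Skipper's real toolset is not specified here.
interface Tools {
  listTables(): string[];
  runSql(query: string): Array<Record<string, number>>;
}

interface AuditEntry {
  tool: string;
  args: unknown[];
}

// Wrap every tool so each call is recorded. This log is what makes a
// multi-step run auditable after the fact.
function withAudit(tools: Tools, log: AuditEntry[]): Tools {
  return {
    listTables: () => {
      log.push({ tool: "listTables", args: [] });
      return tools.listTables();
    },
    runSql: (query) => {
      log.push({ tool: "runSql", args: [query] });
      return tools.runSql(query);
    },
  };
}

// Runs a model-written snippet with `tools` as its only argument.
// NOTE: `new Function` does NOT isolate the code; it stands in for a sandbox.
function runSnippet(snippet: string, tools: Tools): { result: unknown; log: AuditEntry[] } {
  const log: AuditEntry[] = [];
  const fn = new Function("tools", snippet) as (t: Tools) => unknown;
  const result = fn(withAudit(tools, log));
  return { result, log };
}

// Fake in-memory tools so the example is self-contained.
const ROW_COUNTS: Record<string, number> = { requests: 120, zones: 8 };
const fakeTools: Tools = {
  listTables: () => Object.keys(ROW_COUNTS),
  runSql: (query) => {
    const table = query.replace(/^SELECT COUNT\(\*\) AS n FROM /, "");
    return [{ n: ROW_COUNTS[table] || 0 }];
  },
};

// What a model-written snippet might look like: several tool calls chained
// in one program instead of one model round-trip per call.
const snippet = `
  const totals = {};
  for (const t of tools.listTables()) {
    totals[t] = tools.runSql("SELECT COUNT(*) AS n FROM " + t)[0].n;
  }
  return totals;
`;

const { result, log } = runSnippet(snippet, fakeTools);
```

In this run the snippet makes three tool calls (one `listTables`, two `runSql`) but would need only a single exchange with the model. The audit log keeps a record of each call and its arguments.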

📖 Source: How we built Cloudflare's data platform and an AI agent on top of it
