Cloudflare's Town Lake Platform Serves 91,760 Billing Queries

blog.cloudflare.com · @systems_wire · 4 days ago · Artificial Intelligence · 14 comments

Cloudflare's unified data platform, Town Lake, and AI agent, Skipper, enable fast and secure data access, with 53% of queries related to billing.

cloudflare · town lake · skipper · data platform · ai agent

Cloudflare processes over a billion events every second, generating massive amounts of data. To make this data accessible, Cloudflare built Town Lake, a unified data analytics platform, and Skipper, an AI data agent. Town Lake provides a single SQL interface to all of Cloudflare's data, while Skipper lets users ask questions in plain English and receive accurate answers. In a recent measurement period, the platform served 91,760 billing queries from 324 distinct employees, and 53% of all queries were related to billing.

Town Lake's architecture is a data lakehouse: a query engine reads from object storage, and a metadata layer makes that storage behave like a database. The platform uses Apache Trino as its query engine and R2 Data Catalog, a managed Apache Iceberg service, for data storage.

Three supporting services handle metadata and governance. DataHub, a metadata catalog, provides information about every table, column, owner, lineage edge, and glossary term. Lifeguard, an access control service, stores access rules and dynamically pulls user and group membership from Cloudflare's internal access management system. Skimmer, a PII detection scanner, runs continuously and samples rows from every column in every table to detect personally identifiable information.

Skipper, the AI agent, draws on multiple layers of grounded context to provide accurate answers: schema and usage metadata, human annotations, code-derived knowledge, curated data models, and runtime introspection. Skipper can also package charts into dashboards that can be shared internally and embedded in other internal applications.

The security model is built on the data model: every action runs as the calling user, and access is checked at query time. Cloudflare plans to expand the agent surface, integrate it with internal chat and ticketing systems, and invest in the Transformer pipeline to enable self-serve data engineering.
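The query-time access model described above (stored rules, group membership pulled dynamically, every action running as the calling user) can be illustrated with a minimal Python sketch. This is not Cloudflare's implementation: `AccessRule`, `QueryGate`, the `groups_for` lookup, and the table and group names are all hypothetical, chosen only to show why resolving membership at query time means a membership change takes effect on the next query.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical rule: members of `group` may read `table`."""
    group: str
    table: str


class AccessDenied(Exception):
    """Raised when the calling user may not read a referenced table."""


class QueryGate:
    """Illustrative gate that checks access at query time, as the caller."""

    def __init__(
        self,
        rules: Iterable[AccessRule],
        groups_for: Callable[[str], FrozenSet[str]],
    ) -> None:
        self._rules = frozenset(rules)
        # Membership is looked up on every check instead of being cached,
        # so changes in the identity system apply to the next query.
        self._groups_for = groups_for

    def check(self, user: str, tables: Iterable[str]) -> None:
        groups = self._groups_for(user)
        for table in tables:
            allowed = any(
                rule.table == table and rule.group in groups
                for rule in self._rules
            )
            if not allowed:
                raise AccessDenied(f"{user} may not read {table}")

    def run(self, user: str, tables: Iterable[str],
            execute: Callable[[], Any]) -> Any:
        # The query only executes if the calling user passes the check.
        self.check(user, list(tables))
        return execute()


# Hypothetical membership data standing in for an identity system.
membership = {
    "alice": frozenset({"finance"}),
    "bob": frozenset({"support"}),
}
gate = QueryGate(
    rules=[AccessRule(group="finance", table="billing.invoices")],
    groups_for=lambda user: membership.get(user, frozenset()),
)
```

In this sketch, `gate.run("alice", ["billing.invoices"], ...)` executes the query, while the same call for `bob` raises `AccessDenied` until `bob` is added to the `finance` group. Because the agent would call the gate with the requesting user's identity rather than a service account, it cannot return data the user could not query directly.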


Source: How we built Cloudflare's data platform and an AI agent on top of it
Domain: blog.cloudflare.com
