Google Rebuilds Data Stack for AI Agents, Not Humans

At Cloud Next, Google announced the Agentic Data Cloud, a rebuilt data architecture designed for AI agents that take autonomous actions rather than for humans who run scheduled queries. The platform has three components: Knowledge Catalog, which uses agents to automate metadata curation; a cross-cloud lakehouse that lets BigQuery query Iceberg tables on AWS S3 over private networks with no egress fees; and a Data Agent Kit that lets engineers describe outcomes instead of writing pipelines. The shift reflects a move from human-scale to agent-scale operations, in which data platforms must evolve from systems of intelligence into systems of action.
TL;DR
- Google's Agentic Data Cloud redesigns enterprise data architecture for autonomous AI agents operating 24/7, not just human analysts running periodic queries
- Knowledge Catalog uses agents to automate semantic metadata curation across the full data estate, removing the manual data-steward bottleneck that previously limited coverage to curated subsets
- A cross-cloud lakehouse built on the Apache Iceberg format lets BigQuery query S3 data with no egress fees and performance comparable to native AWS warehouses, with bidirectional federation to Databricks, Snowflake, and AWS Glue
- Data Agent Kit integrates into VS Code, Claude Code, and Gemini CLI so engineers can describe desired outcomes rather than write data pipelines, shifting data work from imperative to declarative
Why it matters
Enterprise data infrastructure was optimized for human decision-making workflows, but AI agents now operate autonomously at scale. Google's redesign addresses a real architectural mismatch: traditional catalogs require manual stewardship and don't scale to full data estates, federation APIs limit optimization, and pipeline-based workflows don't fit agent-driven action loops. This signals that the industry is moving from reactive intelligence (humans interpret data) to active systems (agents execute decisions).
Business relevance
For operators and founders, this means data governance and activation bottlenecks that previously required large steward teams can now be automated, and multi-cloud data access becomes frictionless without egress penalties. Companies can deploy AI agents that act on real-time data across their entire infrastructure without rebuilding pipelines or maintaining separate data copies, reducing both operational overhead and latency in agent decision-making.
Key implications
- Data catalog vendors face pressure to shift from manual curation to agentic automation, or risk becoming obsolete as enterprises adopt platforms that scale metadata management automatically
- Cross-cloud data access without egress fees undermines the advantage AWS derives from data egress charges and forces competitors to match pricing and performance on interoperability
- The shift from pipeline-writing to outcome-description represents a significant change in data engineering skill requirements, favoring engineers who can work with agents and natural language over those focused on ETL orchestration
What to watch
Monitor whether competitors (Databricks, Snowflake, AWS) announce similar agentic data platforms and how quickly they achieve feature parity on cross-cloud federation and automated governance. Watch adoption rates among enterprises with multi-cloud deployments and whether the Data Agent Kit gains traction as a development pattern, which would signal broader industry acceptance of declarative data work over imperative pipeline design.



