VFF - The signal in the noise

Context Platforms Replace RAG as Agents Overwhelm Legacy Retrieval

Redis has launched Iris, a context and memory platform built for the data retrieval demands of agentic AI systems. Where traditional RAG pipelines were built for human-scale query volumes, Iris combines real-time data ingestion, semantic interfaces that auto-generate agent tools, and a flash-based storage engine to absorb request volumes that are orders of magnitude higher than those of human users. The launch reflects a broader market shift away from off-the-shelf RAG solutions toward custom and hybrid retrieval stacks, as enterprises struggle with the structural mismatch between agent-scale workloads and legacy retrieval infrastructure.

  • Redis Iris addresses a structural problem: AI agents generate orders of magnitude more data requests than human users, overwhelming retrieval pipelines designed for single-query human interaction
  • The platform's components include Data Integration (GA), Context Retriever (preview), Agent Memory (preview), and Redis Flex, a rewritten storage engine that keeps 99% of data on flash at one-tenth the cost of in-memory storage
  • Enterprise adoption data shows hybrid retrieval roughly tripling from 10.3% to 33.3% in Q1 2026, with custom in-house retrieval stacks rising from 24.1% to 35.6% as companies outgrow off-the-shelf options
  • The shift inverts classic RAG logic: instead of pre-fetching and stuffing data into pipelines, agents pull data at runtime through semantic interfaces with row-level access controls
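The pull-based pattern in the last bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Iris API: `Row`, `ROWS`, and `make_search_tool` are hypothetical names, the data is an in-memory list, and "row-level access control" is reduced to a team filter bound to the tool when it is created.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    owner_team: str  # hypothetical access attribute attached to each row
    text: str


# Hypothetical in-memory table standing in for a real data source.
ROWS = [
    Row("sales", "Q1 pipeline grew in EMEA"),
    Row("finance", "Q1 pipeline forecast revised"),
    Row("sales", "Renewal risk flagged for two accounts"),
]


def make_search_tool(agent_teams: frozenset):
    """Build a tool the agent calls at runtime.

    The row filter is fixed when the tool is created, so the agent
    cannot widen its own access by changing the query.
    """

    def search(keyword: str) -> list:
        needle = keyword.lower()
        return [
            row.text
            for row in ROWS
            if row.owner_team in agent_teams and needle in row.text.lower()
        ]

    return search


# The agent pulls data only when it needs it, instead of having
# every row pre-fetched into its prompt.
sales_tool = make_search_tool(frozenset({"sales"}))
results = sales_tool("pipeline")
print(results)
```

Here the sales-scoped tool returns only the sales row that mentions "pipeline"; the finance row matches the keyword but is filtered out before the agent ever sees it.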

The AI infrastructure market is in transition as agentic systems expose the limits of RAG architectures designed for single-query workloads. Enterprise AI deployment is moving beyond chatbots toward autonomous agents that need persistent memory, real-time data access, and throughput orders of magnitude higher than traditional retrieval systems provide. The roughly threefold rise in hybrid retrieval adoption, together with growth in custom stack development, suggests this is a widespread infrastructure gap rather than a niche problem.

For operators and founders, this marks an inflection point: off-the-shelf RAG solutions are becoming inadequate for production agent deployments. Companies building or deploying AI agents will need to invest in custom retrieval infrastructure or adopt platforms like Iris to avoid the performance bottlenecks that keep agents from functioning reliably. The cost advantage of flash-based storage (one-tenth the cost of in-memory) also changes the economics materially and could reshape infrastructure spending priorities.

  • RAG as a standalone architectural pattern is insufficient for agentic AI, forcing a market-wide rearchitecture toward context platforms that handle agent-scale data demands
  • The shift from push-based (pre-fetching) to pull-based (runtime) data retrieval requires semantic data models and auto-generated interfaces, creating new dependencies on data modeling and MCP tool generation
  • Flash-based storage economics (99% on SSD, 1% in RAM) could become the default for agent memory systems, reducing infrastructure costs and enabling petabyte-scale deployments that were previously cost-prohibitive
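A rough back-of-the-envelope calculation shows what the flash split could mean for cost. The article does not say whether "one-tenth the cost" describes flash per unit of capacity or the deployment as a whole; the sketch below assumes the former, and `blended_relative_cost` is a hypothetical helper, not a Redis figure or API.

```python
def blended_relative_cost(flash_fraction: float, flash_cost_ratio: float) -> float:
    """Cost of a mixed RAM/flash deployment relative to keeping everything in RAM.

    flash_fraction:   share of data stored on flash (0.0 to 1.0)
    flash_cost_ratio: assumed per-unit cost of flash relative to RAM
    """
    ram_fraction = 1.0 - flash_fraction
    # RAM is the baseline (cost 1.0 per unit); flash is scaled by its ratio.
    return ram_fraction * 1.0 + flash_fraction * flash_cost_ratio


# Assumption: 99% of data on flash, flash at one-tenth the per-unit cost of RAM.
cost = blended_relative_cost(flash_fraction=0.99, flash_cost_ratio=0.10)
print(f"{cost:.3f}")  # 0.109
```

Under that reading, the blended cost is about 11% of an all-RAM deployment, since the 1% kept in memory still contributes a small share at full price.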

Monitor whether other infrastructure vendors follow Redis in repositioning around agent context layers and how quickly enterprises adopt hybrid retrieval approaches. Watch for consolidation around semantic data modeling standards and MCP tool generation, as these become critical bottlenecks. Track whether custom in-house retrieval stacks continue rising as a percentage of enterprise deployments, which would indicate that platform solutions are still not meeting production requirements.
