Context Platforms Replace RAG as Agents Overwhelm Legacy Retrieval

Redis launched Iris, a context and memory platform designed to handle the data retrieval demands of agentic AI systems. Unlike traditional RAG pipelines built for human-scale queries, Iris combines real-time data ingestion, semantic interfaces that auto-generate agent tools, and a flash-based storage engine to manage the orders of magnitude more data requests that AI agents generate compared to human users. The move reflects a broader market shift away from off-the-shelf RAG solutions toward custom, hybrid retrieval stacks as enterprises struggle with the structural mismatch between agent-scale workloads and legacy retrieval infrastructure.
TL;DR
- →Redis Iris addresses a structural problem: AI agents generate orders of magnitude more data requests than human users, overwhelming retrieval pipelines designed for single-query human interaction
- →The platform includes five components: Data Integration (GA), Context Retriever (preview), Agent Memory (preview), and Redis Flex, a rewritten storage engine running 99% of data on flash at one-tenth the cost of in-memory storage
- →Enterprise adoption data shows hybrid retrieval adoption tripling from 10.3% to 33.3% in Q1 2026, with custom in-house retrieval stacks rising from 24.1% to 35.6% as companies outgrow off-the-shelf options
- →The shift inverts classic RAG logic: instead of pre-fetching and stuffing data into pipelines, agents pull data at runtime through semantic interfaces with row-level access controls
Why it matters
The AI infrastructure market is experiencing a fundamental transition as agentic systems expose the limitations of RAG architectures designed for single-query workloads. This shift signals that enterprise AI deployment is moving beyond chatbots toward autonomous agents that require persistent memory, real-time data access, and orders of magnitude higher throughput than traditional retrieval systems can provide. The market data showing tripling adoption of hybrid retrieval and rising custom stack development indicates this is not a niche problem but a widespread infrastructure gap.
Business relevance
For operators and founders, this reflects a critical inflection point where off-the-shelf RAG solutions are becoming inadequate for production agent deployments. Companies building or deploying AI agents will need to invest in custom retrieval infrastructure or adopt platforms like Iris to avoid performance bottlenecks that prevent agents from functioning reliably. The cost advantage of flash-based storage (one-tenth the cost of in-memory) also creates a material economics shift that could reshape infrastructure spending priorities.
Key implications
- →RAG as a standalone architectural pattern is insufficient for agentic AI, forcing a market-wide rearchitecture toward context platforms that handle agent-scale data demands
- →The shift from push-based (pre-fetching) to pull-based (runtime) data retrieval requires semantic data models and auto-generated interfaces, creating new dependencies on data modeling and MCP tool generation
- →Flash-based storage economics (99% on SSD, 1% in RAM) could become the default for agent memory systems, reducing infrastructure costs and enabling petabyte-scale deployments that were previously cost-prohibitive
What to watch
Monitor whether other infrastructure vendors follow Redis in repositioning around agent context layers and how quickly enterprises adopt hybrid retrieval approaches. Watch for consolidation around semantic data modeling standards and MCP tool generation, as these become critical bottlenecks. Track whether custom in-house retrieval stacks continue rising as a percentage of enterprise deployments, which would indicate that platform solutions are still not meeting production requirements.
vff Briefing
Weekly signal. No noise. Built for founders, operators, and AI-curious professionals.
No spam. Unsubscribe any time.



