Lana Zhang

2 articles on VFF - The signal in the noise

AWS Details Modular Voice Agent Design for Production Scale

Amazon has published a technical guide on building scalable voice agents using Nova Sonic, a speech-to-speech foundation model, combined with Bedrock AgentCore Runtime and the open-source Strands Agents framework. The post outlines three architectural patterns: tool-driven agents, sub-agents acting as tools, and session segmentation strategies that decompose large assistants into specialized, reusable components. The approach addresses common production challenges such as latency, real-time audio management, and multi-agent coordination by using serverless hosting, bidirectional WebSocket streaming, microVM-level isolation, and memory that persists across sessions.
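To make the "sub-agents acting as tools" pattern concrete, here is a minimal sketch in plain Python. It is not the Strands Agents or Bedrock AgentCore API: the `Agent` class, its keyword routing, and the `check_balance` and `book_flight` functions are all hypothetical stand-ins, and a real voice agent would let the model choose the tool rather than match keywords.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical types for illustration only; not a real framework API.
Tool = Callable[[str], str]


@dataclass
class Agent:
    name: str
    # Maps a trigger keyword to the tool that handles matching requests.
    tools: Dict[str, Tool] = field(default_factory=dict)

    def handle(self, request: str) -> str:
        # Toy routing: call the first tool whose keyword appears in the request.
        lowered = request.lower()
        for keyword, tool in self.tools.items():
            if keyword in lowered:
                return tool(request)
        return f"{self.name}: sorry, I cannot help with that."

    def as_tool(self) -> Tool:
        # Expose the whole agent behind the same interface as a plain tool.
        return self.handle


def check_balance(request: str) -> str:
    return "Your balance is 120 dollars."


def book_flight(request: str) -> str:
    return "Your flight is booked."


# Specialized sub-agents, each owning a small set of tools.
banking = Agent("banking", {"balance": check_balance})
travel = Agent("travel", {"flight": book_flight})

# The orchestrator treats each sub-agent as just another tool.
orchestrator = Agent(
    "orchestrator",
    {"bank": banking.as_tool(), "flight": travel.as_tool()},
)

print(orchestrator.handle("What is my bank balance?"))
```

Because a sub-agent and a plain function share one call signature in this sketch, the orchestrator does not need to know which of its tools are themselves agents, which is what lets the components be reused independently.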

by Lana Zhang · 2 months ago · AWS Machine Learning Blog


Voice & Video AI · News

Text Agents and Voice Agents Are Different Problems

AWS published guidance on migrating text-based AI agents to voice assistants using Amazon Nova 2 Sonic, emphasizing that the two require fundamentally different architectural approaches. The post details key differences across user input handling, response style, latency requirements, turn-taking mechanics, and transport protocols, then provides design patterns and a reusable skill for developers to automate the conversion process. Voice agents demand real-time bidirectional streaming, ultra-low latency, natural turn-taking with interruption support, and concise spoken responses, whereas text agents tolerate higher latency and deliver rich formatted content.
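One of the differences listed above, rich formatted text versus concise spoken responses, can be illustrated with a small sketch. The `to_spoken` helper below is a hypothetical example, not the reusable skill from the AWS post: it assumes the text agent emits Markdown, strips the formatting, and keeps only the first couple of sentences so the reply is short enough to be spoken.

```python
import re


def to_spoken(text: str, max_sentences: int = 2) -> str:
    """Reduce a Markdown-formatted reply to a short, speakable string.

    Hypothetical helper for illustration; the two-sentence cap is an assumption.
    """
    # Replace Markdown links [label](url) with just the label.
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
    # Drop heading markers and bullet markers at the start of a line.
    text = re.sub(r"^[ \t]*(#+|[-*])[ \t]+", "", text, flags=re.MULTILINE)
    # Remove leftover emphasis and inline-code characters.
    text = re.sub(r"[*_`]", "", text)
    # Collapse newlines and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Split on sentence-ending punctuation and keep the first few sentences.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(sentences[:max_sentences])


reply = (
    "**Order shipped.** Track it [here](https://example.com).\n\n"
    "- Arrives Friday.\n"
    "- Signature required."
)
print(to_spoken(reply))
```

A production voice agent would more likely prompt the model to answer briefly in the first place; post-processing like this only shows why the same response text rarely serves both channels.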

by Lana Zhang · 3 months ago · AWS Machine Learning Blog
