Alibaba cuts agent token use 99% with smarter tool routing

Alibaba researchers developed SkillWeaver, a framework that reduces token consumption by over 99% when routing AI agents to the correct tools from large libraries. The system uses a three-stage process (decompose, retrieve, compose) combined with Skill-Aware Decomposition to iteratively fetch and evaluate relevant tools rather than exposing agents to entire tool catalogs. This addresses a core challenge in enterprise AI systems where agents must orchestrate multiple tools to complete complex, multi-step workflows.
TL;DR
- SkillWeaver breaks complex user queries into sub-tasks, retrieves candidate tools via embedding comparison, and composes them into executable plans as directed acyclic graphs
- Skill-Aware Decomposition uses a feedback loop to iteratively fetch and vet tool candidates rather than selecting tools in a single pass
- Token consumption drops over 99% compared to exposing agents to entire tool libraries, while accuracy increases
- The framework addresses compositional skill routing, where real-world business requests require sequencing multiple tools rather than selecting one
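The decompose, retrieve, compose flow described above can be sketched in a few lines. This is a minimal illustration, not SkillWeaver's implementation: the tool catalog, the `embed`, `cosine`, and `retrieve` helpers, and the hard-coded sub-tasks are all hypothetical, and a toy bag-of-words vector stands in for a learned embedding model.

```python
import math
from collections import Counter
from graphlib import TopologicalSorter

# Hypothetical tool catalog: name -> description (not taken from SkillWeaver).
TOOLS = {
    "fetch_sales": "fetch sales data from the warehouse",
    "clean_table": "clean and deduplicate a data table",
    "make_chart": "render a chart from a table",
    "send_email": "send an email with an attachment",
}

def embed(text):
    """Toy bag-of-words vector; a real system would use a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(sub_task, k=1):
    """Return the k tool names whose descriptions best match the sub-task."""
    query = embed(sub_task)
    ranked = sorted(TOOLS, key=lambda name: cosine(query, embed(TOOLS[name])), reverse=True)
    return ranked[:k]

# Decompose: hard-coded here as an assumption; the article describes a model doing this.
# Each sub-task maps to the sub-tasks it depends on.
sub_tasks = {
    "fetch sales data": [],
    "clean the data table": ["fetch sales data"],
    "render a chart": ["clean the data table"],
}

# Compose: pick one tool per sub-task, then order the tools along the dependency DAG.
chosen = {task: retrieve(task)[0] for task in sub_tasks}
graph = {chosen[task]: {chosen[dep] for dep in deps} for task, deps in sub_tasks.items()}
plan = list(TopologicalSorter(graph).static_order())
print(plan)
```

Only the retrieved candidates, rather than the whole catalog, would need to be shown to the agent at each step, which is where the token savings come from.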
Why It Matters
Enterprise AI agents increasingly need to coordinate across hundreds of tools and skills to complete multi-step workflows. Exposing an agent to an entire tool library is inefficient: it can consume hundreds of thousands of tokens and overwhelm context limits. SkillWeaver's iterative tool retrieval and composition targets this scaling bottleneck directly.
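To see how a reduction of this size can arise, consider a back-of-the-envelope calculation. Every number below is an illustrative assumption (catalog size, tokens per tool description, candidates retrieved per sub-task), not a figure reported for SkillWeaver.

```python
def prompt_tokens(num_tools, tokens_per_tool):
    """Tokens spent describing tools in the agent's prompt."""
    return num_tools * tokens_per_tool

# Assumed: a 2,000-tool catalog with roughly 150 tokens per tool description.
full = prompt_tokens(2000, 150)

# Assumed: 3 sub-tasks, each exposing only 5 retrieved candidates.
routed = prompt_tokens(3 * 5, 150)

reduction = 1 - routed / full
print(f"full catalog: {full} tokens, routed: {routed} tokens, reduction: {reduction:.2%}")
```

Under these assumptions the prompt shrinks from 300,000 tokens to 2,250. The real ratio depends on catalog size and on how many candidates each retrieval round returns.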
Business Impact
For organizations deploying AI agents in production, token efficiency directly affects operating cost and latency. Cutting token consumption by over 99% while improving accuracy makes multi-tool orchestration far more economical at scale. That could let agents handle complex business operations such as data pipeline management and report generation with less manual intervention.
Key Implications
- Task decomposition granularity emerges as the primary bottleneck in tool routing accuracy, shifting focus from single-tool selection to compositional planning
- Iterative retrieval and vetting of tool candidates outperforms one-shot tool selection approaches, suggesting future frameworks should incorporate feedback loops
- Compatibility checking between tools becomes critical as agents sequence multiple skills, requiring systems to validate inter-skill data flow
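The last two points, vetting candidates iteratively and checking data flow between skills, can be illustrated together. The sketch below is hypothetical: the `Tool` record, its single `input_type`/`output_type` fields, and the exact-match compatibility rule are simplifying assumptions, not SkillWeaver's actual schema or checks.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tool:
    name: str
    input_type: str   # assumed: one input type per tool
    output_type: str  # assumed: one output type per tool

def compatible(upstream, downstream):
    """Data flows only if the upstream output matches the downstream input."""
    return upstream.output_type == downstream.input_type

def pick_compatible(previous, candidates):
    """Vet ranked candidates one at a time; return the first that fits after `previous`."""
    for tool in candidates:
        if previous is None or compatible(previous, tool):
            return tool
    return None  # caller should retrieve more candidates or re-decompose

def build_chain(candidates_per_step):
    """Build a linear plan, rejecting candidates that break inter-tool data flow."""
    chain = []
    for candidates in candidates_per_step:
        previous = chain[-1] if chain else None
        tool = pick_compatible(previous, candidates)
        if tool is None:
            after = previous.name if previous else "start"
            raise ValueError(f"no compatible tool after {after}")
        chain.append(tool)
    return chain

query_db = Tool("query_db", "sql", "table")
plot_json = Tool("plot_json", "json", "image")    # ranked first, but expects JSON
plot_table = Tool("plot_table", "table", "image")

chain = build_chain([[query_db], [plot_json, plot_table]])
print([tool.name for tool in chain])
```

Here the top-ranked plotting tool is rejected because it cannot consume the table produced upstream, so the second candidate is chosen. A one-shot selector that trusted the ranking alone would have produced a plan that fails at run time.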
What to Watch
Monitor adoption of SkillWeaver and similar compositional routing frameworks in enterprise AI deployments. Watch for how organizations implement task decomposition strategies and whether iterative tool retrieval becomes standard practice. Track whether token efficiency gains translate to measurable cost reductions and performance improvements in production AI agent systems.


