NVIDIA Blackwell Leads First Agentic AI Benchmark

Shruti Koparkar · Jun 13, 2026

Artificial Analysis released AgentPerf, the first benchmark designed specifically for agentic AI workloads. It shows NVIDIA's Blackwell Ultra NVL72 platform delivering up to 20x more agents per megawatt than Hopper-based systems. The benchmark reflects the fundamentally different performance characteristics of agentic AI, which chains dozens to hundreds of LLM calls with tool execution instead of producing single-turn completions. Results are based on real coding agent trajectories across more than 12 programming languages, giving infrastructure providers and enterprises direct metrics for deployment decisions.

TL;DR

AgentPerf is the first benchmark built specifically for agentic AI, measuring concurrent agent capacity and responsiveness rather than single LLM call speed
NVIDIA GB300 NVL72 runs up to 20x more agents per megawatt than HGX H200 systems on DeepSeek V4 Pro workloads
Agentic AI differs fundamentally from conversational AI: agents chain dozens to hundreds of LLM calls with tool calls, so complexity compounds multiplicatively rather than additively
Benchmark methodology uses real coding agent trajectories from public repositories, with tool calls simulated to isolate accelerated computing performance

Why It Matters

Existing AI inference benchmarks measure single LLM calls and were not designed for agentic workloads, where chained calls, tool delays, and growing context create fundamentally different performance stresses. AgentPerf fills this gap by measuring what actually matters for production agentic AI: concurrent agent capacity and responsiveness at scale. This enables infrastructure providers and enterprises to make informed deployment decisions based on real-world agentic patterns.
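To make the growing-context point concrete, here is a minimal sketch. It is not AgentPerf's methodology. It assumes a hypothetical agent that resends its entire history on every LLM call with no prefix caching; the function name and all token counts are illustrative assumptions, not benchmark figures.

```python
def total_prompt_tokens(num_calls: int, base_context: int, tokens_added_per_step: int) -> int:
    """Total prompt tokens processed across a trajectory of chained LLM calls.

    Assumption (hypothetical agent): call i (0-indexed) re-reads the base
    context plus everything appended by the i earlier steps, such as model
    output and tool results.
    """
    return sum(base_context + i * tokens_added_per_step for i in range(num_calls))


# Hypothetical numbers chosen for illustration only.
single_turn = total_prompt_tokens(1, base_context=2_000, tokens_added_per_step=1_500)
agent_run = total_prompt_tokens(50, base_context=2_000, tokens_added_per_step=1_500)

print(single_turn)
print(agent_run)
```

Under these assumed numbers, 50 chained calls process about 1.9 million prompt tokens against 2,000 for a single call, far more than 50 times the work. Real serving stacks can reduce this with caching, so the sketch shows only the shape of the problem.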

Business Impact

For enterprises deploying AI agents at scale, infrastructure efficiency directly affects cost per concurrent agent and power consumption. AgentPerf translates benchmark results into actionable metrics: how many concurrent agentic tasks can run per accelerator and per megawatt of power. NVIDIA's advantage of up to 20x on this benchmark could significantly influence infrastructure purchasing decisions for agentic AI deployments.
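The per-accelerator and per-megawatt metrics reduce to simple ratios. The sketch below shows the arithmetic under assumed inputs; the function names, the two example systems, and every number in it are hypothetical and do not come from AgentPerf results.

```python
def agents_per_accelerator(concurrent_agents: float, num_accelerators: int) -> float:
    """Concurrent agents sustained per accelerator."""
    if num_accelerators <= 0:
        raise ValueError("num_accelerators must be positive")
    return concurrent_agents / num_accelerators


def agents_per_megawatt(concurrent_agents: float, power_kw: float) -> float:
    """Concurrent agents sustained per megawatt of power drawn."""
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    # Scale agents by 1000 kW per MW, then divide by the measured draw in kW.
    return concurrent_agents * 1000.0 / power_kw


# Hypothetical systems, for illustration only.
system_a = agents_per_megawatt(concurrent_agents=600, power_kw=120)
system_b = agents_per_megawatt(concurrent_agents=50, power_kw=50)

print(system_a)             # agents per MW for hypothetical system A
print(system_b)             # agents per MW for hypothetical system B
print(system_a / system_b)  # relative efficiency of A over B
```

Normalizing by power rather than by accelerator count lets systems of different sizes be compared on the resource that usually limits a data center.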

Key Implications

Agentic AI performance cannot be accurately assessed using conversational AI benchmarks, creating demand for specialized measurement tools and potentially invalidating prior infrastructure comparisons
NVIDIA's Blackwell architecture appears optimized for agentic workloads through rack-scale GPU coordination, CUDA kernel optimization for expert distribution, and TensorRT LLM efficiency gains
Infrastructure decisions for agentic AI deployments will increasingly be based on concurrent agent capacity and power efficiency rather than raw inference speed metrics

What to Watch

Monitor whether other accelerator providers publish AgentPerf results and how their performance compares to NVIDIA's baseline. Watch for adoption of AgentPerf as an industry standard for agentic AI infrastructure evaluation. Track whether the reported efficiency advantage of up to 20x translates into actual market share gains for Blackwell in agentic AI deployments.

Chinese AI Startup Moonshot Trained K3 on Restricted Nvidia Chips

Beijing-based AI startup Moonshot has trained its Kimi K3 model, the world's largest open-source model with 2.8 trillion parameters, using Nvidia's advanced Blackwell chips despite U.S. export restrictions on such technology to Chinese firms. The company is now seeking additional Blackwell chips to develop Kimi K4, a significantly larger successor model. The situation highlights the tension between U.S. chip export controls and Chinese AI development capabilities.

by The Information Staff · about 3 hours ago · The Information

NVIDIA Jetson Brings Generative AI to Handheld Robotics

NVIDIA is promoting its Jetson platform for edge AI and robotics as a compact, portable solution for developers building AI-powered robots and autonomous systems. The Jetson Orin Nano Super, highlighted by venture capitalist Sarah Guo, delivers 67 trillion operations per second of AI performance in a handbag-sized form factor. The platform targets students, researchers, and developers across classrooms, labs, and makerspaces with tools for computer vision, AI agents, and edge deployment.

by Matthew Leib · about 3 hours ago · NVIDIA Blog (AI)

Chinese RAM Maker CXMT Debuts at $484B Valuation

ChangXin Memory Technologies (CXMT), a China-based RAM manufacturer, debuted on the Shanghai stock exchange with shares surging 466 percent on its first day, reaching a $484 billion valuation and becoming the most valuable Chinese company listed there. The company aims to compete with Samsung, Micron, and SK Hynix in the global memory market, capitalizing on demand from AI companies that is straining the industry and pushing device makers to seek lower-cost alternatives.

by Emma Roth · 1 day ago · The Verge AI

Nvidia Invests $1B in Naver to Expand South Korea AI Data Centers

Nvidia is investing $1 billion in South Korean internet company Naver to expand the country's AI data center infrastructure. The investment will finance Naver's plans to increase its AI data center capacity from 55 megawatts to 200 megawatts. The deal reflects Nvidia's strategy to build AI compute capacity outside the United States and positions South Korea as a regional hub for AI infrastructure.

by Henry Siu · 1 day ago · The Information