Dell and NVIDIA Target Agentic AI Inference Economics

NVIDIA Writers · May 19, 2026

Dell and NVIDIA announced new AI infrastructure at Dell Technologies World aimed at enterprise AI deployments at scale. Dell's updated AI Factory lineup includes the PowerEdge XE9812 with NVIDIA Vera Rubin NVL72 GPUs, which the companies claim delivers 10x lower cost-per-token for agentic AI inference than Blackwell, plus new CPU-based servers with NVIDIA Vera processors optimized for data pipelines and agent workloads. The announcements reflect a shift from AI pilots to production agentic deployments: Dell projects that global AI infrastructure spending could reach $3 trillion to $4 trillion by 2030, with token consumption growing 3,400% over the same period.

TL;DR

Dell PowerEdge XE9812 with NVIDIA Vera Rubin NVL72 delivers 10x lower cost-per-token for agentic AI inference versus Blackwell
New PowerEdge servers with NVIDIA Vera CPUs complete agentic workloads 50% faster than x86 processors, with 3x faster SQL query throughput via Starburst data engine
Dell PowerRack integrates compute, networking, and storage as a unified system with liquid cooling and co-packaged optics for enterprise-scale AI
5,000 enterprises, including Lilly, Samsung, and Honeywell, are already running AI workloads on Dell AI Factories with NVIDIA

Why It Matters

Enterprise AI has moved beyond proof-of-concept into production agentic deployments, creating new infrastructure demands. The focus on cost-per-token efficiency and inference optimization signals that the market is shifting from training-centric to inference-centric workloads, where enterprises need to run agents and autonomous systems continuously at scale. This reflects a maturing AI market where operational efficiency and real-world deployment economics matter more than raw model capability.

Business Impact

For operators and founders building AI products, this infrastructure refresh directly affects the unit economics of agentic AI services. If the claims hold, lower cost-per-token and faster inference mean the same margin can support more complex agent behaviors, while faster data queries reduce latency in agent decision loops. Enterprises evaluating AI infrastructure now have vendor-stated performance and cost figures to plan multi-year deployments against.
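
To make the unit-economics point concrete, here is a minimal Python sketch of how a change in cost-per-token flows into per-task cost, gross margin, and the token budget an agent can spend at a fixed margin. Every input (tokens per task, price per task, baseline cost per million tokens, target margin) is a hypothetical placeholder rather than a figure from Dell or NVIDIA; only the 10x ratio comes from the vendors' claim, and it is treated here as an assumption to test, not a measured result.

```python
def cost_per_task(tokens_per_task: int, cost_per_million_tokens: float) -> float:
    """Inference cost of one agent task, in dollars."""
    return tokens_per_task / 1_000_000 * cost_per_million_tokens


def gross_margin(price_per_task: float, task_cost: float) -> float:
    """Fraction of the task price left after inference cost."""
    return (price_per_task - task_cost) / price_per_task


def max_tokens_at_margin(price_per_task: float, target_margin: float,
                         cost_per_million_tokens: float) -> int:
    """Largest token budget per task that still meets the target margin."""
    budget = price_per_task * (1 - target_margin)
    return round(budget / cost_per_million_tokens * 1_000_000)


# Hypothetical inputs, chosen only for illustration.
TOKENS_PER_TASK = 200_000      # tokens one agent task consumes
PRICE_PER_TASK = 0.50          # dollars charged per task
BASELINE_COST_PER_M = 2.00     # dollars per million tokens today
CLAIMED_IMPROVEMENT = 10.0     # vendor-claimed cost-per-token ratio
TARGET_MARGIN = 0.20           # margin the operator wants to keep

improved_cost_per_m = BASELINE_COST_PER_M / CLAIMED_IMPROVEMENT

baseline_margin = gross_margin(
    PRICE_PER_TASK, cost_per_task(TOKENS_PER_TASK, BASELINE_COST_PER_M))
improved_margin = gross_margin(
    PRICE_PER_TASK, cost_per_task(TOKENS_PER_TASK, improved_cost_per_m))

baseline_budget = max_tokens_at_margin(
    PRICE_PER_TASK, TARGET_MARGIN, BASELINE_COST_PER_M)
improved_budget = max_tokens_at_margin(
    PRICE_PER_TASK, TARGET_MARGIN, improved_cost_per_m)

if __name__ == "__main__":
    print(f"margin at baseline cost: {baseline_margin:.0%}")
    print(f"margin at claimed cost:  {improved_margin:.0%}")
    print(f"token budget per task at {TARGET_MARGIN:.0%} margin: "
          f"{baseline_budget:,} -> {improved_budget:,}")
```

Under these placeholder inputs, an operator can either keep the same task and take the higher margin or hold the margin constant and let each task consume roughly ten times as many tokens. The second option is the "more complex agent behaviors" case described above.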

Key Implications

Agentic AI inference is becoming a distinct workload category with different optimization requirements than training, driving specialized hardware and software stacks
Cost-per-token efficiency is now a primary competitive metric for AI infrastructure, shifting focus from peak performance to sustained operational economics
Integrated systems like PowerRack that bundle compute, networking, and storage may reduce deployment friction for enterprises, lowering barriers to scaling AI factories

What to Watch

Monitor whether the claimed 10x cost-per-token improvement and the 50% speedup on Vera hold up in independent benchmarks and real customer deployments. Track adoption among the 5,000 enterprises cited, and watch for competitive responses on inference optimization from other infrastructure providers. Also observe whether agentic AI workloads actually drive the projected 3,400% growth in token consumption, or whether that estimate proves too conservative or too optimistic.

Nvidia Backs Neocloud Startups as Market Crowds

SoftBank announced a U.S. neocloud venture on Thursday, adding to hundreds of firms now competing in the AI server rental market. Together AI raised $800 million at an $8.3 billion valuation, while Nvidia said it will provide financial backing to younger cloud firms in exchange for a revenue share. The moves highlight intense competition in the sector, though Nvidia's backstop offer raises questions about the actual strength of demand for computing capacity.

by Martin Peers · 1 day ago · The Information

Anthropic Pursues Custom AI Chip With Samsung

Anthropic is in early-stage talks with Samsung Electronics to manufacture a custom AI chip, according to sources with direct knowledge of the project. The move mirrors OpenAI's strategy of developing proprietary chips to reduce dependence on external computing infrastructure and control costs. Google, Amazon Web Services, Meta, and Microsoft have all developed their own chips, while OpenAI unveiled Jalapeno, an inference chip designed for large-language models, last month.

by Qianer Liu · 2 days ago · The Information

NVIDIA Opens Compute Access via Revenue-Share Model

NVIDIA is introducing a revenue-sharing partnership model that allows AI cloud providers to procure its infrastructure and resell services to startups, enterprises, and research organizations. The model addresses capital constraints that have historically limited emerging AI companies' access to large-scale compute. Early partners Sharon AI and Firmus are deploying tens of thousands of NVIDIA GPUs through this arrangement.

by Colette Kress · 3 days ago · NVIDIA Blog (AI)

Tesla and SpaceX Already Operating as One, Org Chart Shows

Tesla and SpaceX are operating with significant organizational overlap, with multiple executives holding senior roles at both companies, including in the $55 billion Terafab semiconductor manufacturing project. The cross-company collaboration suggests the two entities are already functioning as an integrated operation in key areas, even as speculation grows about a potential formal merger. This structural integration spans materials engineering, AI software, and vehicle software divisions.

by Grace Kay · 3 days ago · The Information