VFF - The signal in the noise

Dell and NVIDIA Target Agentic AI Inference Economics


Dell and NVIDIA announced new AI infrastructure at Dell Technologies World aimed at enterprise AI deployments at scale. Dell's updated AI Factory lineup includes the PowerEdge XE9812 with NVIDIA Vera Rubin NVL72 GPUs, with a claimed 10x lower cost-per-token for agentic AI inference compared to Blackwell, plus new CPU-based servers with NVIDIA Vera processors optimized for data pipelines and agent workloads. The announcements reflect a shift from AI pilots to production agentic deployments: Dell projects that global AI infrastructure spending could reach 3-4 trillion dollars by 2030, with token consumption growing 3,400% over the same period.

  • Dell PowerEdge XE9812 with NVIDIA Vera Rubin NVL72 delivers 10x lower cost-per-token for agentic AI inference versus Blackwell
  • New PowerEdge servers with NVIDIA Vera CPUs complete agentic workloads 50% faster than x86 processors, with 3x faster SQL query throughput via Starburst data engine
  • Dell PowerRack integrates compute, networking, and storage as a unified system with liquid cooling and co-packaged optics for enterprise-scale AI
  • 5,000 enterprises including Lilly, Samsung, and Honeywell already running AI workloads on Dell AI Factories with NVIDIA

Enterprise AI has moved beyond proof-of-concept into production agentic deployments, creating new infrastructure demands. The focus on cost-per-token efficiency and inference optimization signals that the market is shifting from training-centric to inference-centric workloads, where enterprises need to run agents and autonomous systems continuously at scale. This reflects a maturing AI market where operational efficiency and real-world deployment economics matter more than raw model capability.

For operators and founders building AI products, this infrastructure refresh directly affects the unit economics of agentic AI services. Lower cost-per-token and faster inference mean the same margin can support more complex agent behaviors, while faster data query performance reduces latency in agent decision loops. Enterprises evaluating AI infrastructure now have clearer performance benchmarks and cost models for planning multi-year deployments.
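
To make the unit-economics point concrete, here is a minimal Python sketch of the arithmetic behind the two headline figures. It treats 3,400% growth as a 35x multiplier on token volume and the claimed 10x improvement as a division of cost-per-token by 10. The baseline token volume and price are hypothetical placeholders, not figures from Dell or NVIDIA, and applying a global consumption projection to a single deployment is an illustrative assumption only.

```python
def growth_multiplier(percent_growth: float) -> float:
    """Convert a percentage increase into a multiplier (3,400% growth -> 35x)."""
    return 1.0 + percent_growth / 100.0


def net_spend_multiplier(percent_token_growth: float, cost_reduction_factor: float) -> float:
    """Relative token spend once volume has grown and cost-per-token has fallen."""
    return growth_multiplier(percent_token_growth) / cost_reduction_factor


def monthly_cost(tokens_per_month: float, price_per_million_tokens: float) -> float:
    """Token spend for one month at a flat price per million tokens."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


# Figures quoted in the article (vendor claims and projections, not measurements).
TOKEN_GROWTH_PERCENT = 3400.0
COST_REDUCTION_FACTOR = 10.0

# Hypothetical baseline, chosen only to make the arithmetic readable.
BASELINE_TOKENS_PER_MONTH = 2_000_000_000
BASELINE_PRICE_PER_MILLION = 1.00

baseline_spend = monthly_cost(BASELINE_TOKENS_PER_MONTH, BASELINE_PRICE_PER_MILLION)
spend_factor = net_spend_multiplier(TOKEN_GROWTH_PERCENT, COST_REDUCTION_FACTOR)
projected_spend = baseline_spend * spend_factor

print(f"Token volume multiplier: {growth_multiplier(TOKEN_GROWTH_PERCENT):.1f}x")
print(f"Net spend multiplier:    {spend_factor:.1f}x")
print(f"Baseline monthly spend:  {baseline_spend:,.2f}")
print(f"Projected monthly spend: {projected_spend:,.2f}")
```

Under these assumptions, a 10x drop in cost-per-token does not fully offset a 35x rise in token volume: total token spend still rises about 3.5x. That is one way to read why cost-per-token has become a headline metric.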

  • Agentic AI inference is becoming a distinct workload category with different optimization requirements than training, driving specialized hardware and software stacks
  • Cost-per-token efficiency is now a primary competitive metric for AI infrastructure, shifting focus from peak performance to sustained operational economics
  • Integrated systems like PowerRack that bundle compute, networking, and storage may reduce deployment friction for enterprises, lowering barriers to scaling AI factories

Monitor whether the claimed 10x cost-per-token improvement and the 50% performance gains on Vera hold up in independent benchmarks and real customer deployments. Track adoption among the 5,000 enterprises mentioned, and watch for competitive responses from other infrastructure providers on inference optimization. Also observe whether agentic AI workloads actually drive the projected 3,400% token consumption growth, or whether that estimate proves too conservative or too optimistic.
