VFF - The signal in the noise

Definity Embeds Agents Inside Spark to Prevent Pipeline Failures

Definity, a Chicago-based data pipeline operations startup, has raised $12 million in Series A funding to embed autonomous agents directly inside Spark and dbt pipelines. Rather than monitoring failures after they occur, Definity's JVM agent runs inline during pipeline execution, detecting and preventing data quality issues, resource bottlenecks, and stale data in real time. Early customers report identifying 33% of optimization opportunities in the first week and resolving complex Spark issues up to 10x faster. The approach targets a critical gap for agentic AI systems that depend on clean, timely data.

  • Definity embeds agents inside the Spark pipeline execution layer via JVM instrumentation, catching failures during runs rather than after completion
  • The agent captures query execution behavior, memory pressure, data skew, and infrastructure utilization in real time, with the ability to modify resource allocation or stop jobs mid-run
  • The $12 million Series A round was led by GreatPoint Ventures, with participation from Dynatrace, StageOne Ventures, and Hyde Park Venture Partners
  • An early customer cut troubleshooting effort by 70% and identified 33% of optimization opportunities in the first week of deployment
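To make the in-execution idea concrete, here is a minimal sketch of a guard that inspects metrics while a job is still running and halts it when memory pressure or data skew crosses a threshold. It is an illustration only, not Definity's agent or any Spark API: `StageMetrics`, `StopJob`, `skew_ratio`, `inline_guard`, and the threshold values are all hypothetical names and numbers chosen for this example.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class StageMetrics:
    """Hypothetical mid-run snapshot; not a Definity or Spark type."""
    partition_rows: list[int]   # rows processed per partition so far
    heap_used_fraction: float   # share of executor heap in use, 0.0 to 1.0


class StopJob(Exception):
    """Raised by the guard to halt a run before it completes."""


def skew_ratio(partition_rows: list[int]) -> float:
    """Return the largest partition size divided by the median size."""
    if not partition_rows:
        return 1.0
    med = median(partition_rows)
    largest = max(partition_rows)
    if med <= 0:
        # All-empty partitions are balanced; otherwise skew is unbounded.
        return float("inf") if largest > 0 else 1.0
    return largest / med


def inline_guard(metrics: StageMetrics,
                 max_skew: float = 5.0,
                 max_heap: float = 0.9) -> None:
    """Check a snapshot during execution and stop the job on a violation.

    The thresholds are assumed defaults for illustration only.
    """
    if metrics.heap_used_fraction > max_heap:
        raise StopJob(f"memory pressure: heap at {metrics.heap_used_fraction:.0%}")
    ratio = skew_ratio(metrics.partition_rows)
    if ratio > max_skew:
        raise StopJob(f"data skew: largest partition is {ratio:.1f}x the median")
```

In this sketch the caller would invoke `inline_guard` between stages or on a timer and treat `StopJob` as the signal to cancel the run. A real agent could instead adjust resource allocation, which the article says Definity's agent is able to do.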

Agentic AI systems are only as reliable as their data pipelines. Silent failures and stale data don't just break dashboards; they break the AI systems that depend on clean, timely inputs. Definity's in-execution approach addresses a fundamental architectural gap: existing monitoring tools detect problems only after pipelines have run and propagated bad data downstream, whereas inline agents can prevent failures before they reach dependent systems.
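The difference between detecting stale data afterward and blocking it before it propagates can be sketched as a freshness gate placed in front of the publish step. This is a simplified illustration under assumed names (`is_stale`, `publish_if_fresh`, and the `publish` callback are hypothetical), not a description of how Definity implements it.

```python
from datetime import datetime, timedelta
from typing import Callable


def is_stale(last_updated: datetime, now: datetime, max_age: timedelta) -> bool:
    """Return True when the input is older than the allowed age."""
    return now - last_updated > max_age


def publish_if_fresh(last_updated: datetime,
                     now: datetime,
                     max_age: timedelta,
                     publish: Callable[[], None]) -> bool:
    """Run the publish step only if the input is fresh enough.

    Returns True when data was published and False when the gate
    blocked it, so stale data never reaches downstream consumers.
    """
    if is_stale(last_updated, now, max_age):
        return False
    publish()
    return True
```

A post-execution monitor would run the same `is_stale` check after `publish` had already fired. The gate simply moves that check ahead of the side effect.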

Data engineering teams currently spend significant effort manually tracing and fixing pipeline failures after the fact. One early customer reportedly cut troubleshooting effort by 70%, and faster issue resolution directly lowers operational costs and reduces downtime for mission-critical data infrastructure. For companies deploying agentic AI, this translates to more reliable autonomous systems and a reduced risk of cascading failures.

  • The shift from post-execution monitoring to in-execution intervention represents a new architectural pattern for data reliability, with potential to reshape how teams approach pipeline observability
  • Existing monitoring vendors like Datadog, Databricks, and Unravel Data may face pressure to move detection and intervention earlier in the execution lifecycle
  • As agentic AI adoption accelerates, data pipeline reliability becomes a critical dependency, creating market opportunity for solutions that prevent rather than just detect failures

Monitor whether other data infrastructure vendors adopt in-execution agent patterns or acquire similar capabilities. Watch adoption rates among companies running mission-critical agentic AI systems, which will signal whether in-execution intervention becomes table stakes for data operations. Also track whether Definity's approach influences how the Databricks, Apache Spark, and dbt communities approach observability and control.
