
Self-Improving Agents: Shanghai Lab Cuts Manual Tuning

bendee983@gmail.com (Ben Dickson) · Jun 22, 2026 · about 2 hours ago

Researchers at Shanghai Artificial Intelligence Laboratory have introduced Self-Harness, a framework that enables LLM-based agents to automatically improve their own operating rules by analyzing execution traces and applying empirically validated edits. The reported performance improvements reach up to 60 percent without requiring manual tuning or stronger external models. This addresses a key bottleneck in agent development: the reliance on ad hoc human debugging rather than systematic feedback loops.

TL;DR

Self-Harness enables agents to autonomously refine their harnesses (system prompts, tools, memory, verification rules, runtime policies) by analyzing their own execution failures
The framework uses a three-stage loop: weakness mining to detect failure patterns, harness proposal to generate targeted modifications, and proposal validation through regression testing
Performance improvements reach up to 60 percent, with the system trading manual, intuition-based engineering for empirical, evidence-driven updates
The approach eliminates dependency on human engineers or stronger external models, making harness engineering more scalable as new LLMs are released rapidly
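The three-stage loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Shanghai lab's implementation: the `Harness` and `Trace` types, the `failure_tag` field, the `check:` rule format, and the `toy_run` agent are all hypothetical names invented for the example. A real system would presumably have an LLM draft each proposal from the traces; here a fixed rule is appended so the control flow stays visible.

```python
from collections import Counter
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Harness:
    # Hypothetical harness: only a system prompt and verification rules are modeled.
    system_prompt: str
    verification_rules: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Trace:
    # Hypothetical execution trace: outcome plus a label for the failure pattern.
    task_id: str
    passed: bool
    failure_tag: Optional[str] = None


RunFn = Callable[[Harness, str], Trace]


def mine_weaknesses(traces: List[Trace]) -> List[str]:
    """Stage 1: rank failure patterns by how often they occur in the traces."""
    counts = Counter(t.failure_tag for t in traces if not t.passed and t.failure_tag)
    return [tag for tag, _ in counts.most_common()]


def propose_edit(harness: Harness, weakness: str) -> Harness:
    """Stage 2: return a harness with one targeted rule added (assumed rule format)."""
    rule = f"check:{weakness}"
    if rule in harness.verification_rules:
        return harness
    return replace(harness, verification_rules=harness.verification_rules + (rule,))


def pass_rate(harness: Harness, tasks: List[str], run: RunFn) -> float:
    return sum(run(harness, t).passed for t in tasks) / len(tasks)


def validate(old: Harness, new: Harness, tasks: List[str], run: RunFn) -> bool:
    """Stage 3: regression test. Reject if any task that passed before now fails."""
    for t in tasks:
        if run(old, t).passed and not run(new, t).passed:
            return False
    return pass_rate(new, tasks, run) >= pass_rate(old, tasks, run)


def improve(harness: Harness, tasks: List[str], run: RunFn, rounds: int = 3) -> Harness:
    """Repeat mine -> propose -> validate, keeping only validated edits."""
    for _ in range(rounds):
        weaknesses = mine_weaknesses([run(harness, t) for t in tasks])
        if not weaknesses:
            break
        candidate = propose_edit(harness, weaknesses[0])
        if candidate == harness or not validate(harness, candidate, tasks, run):
            break
        harness = candidate
    return harness


def toy_run(harness: Harness, task_id: str) -> Trace:
    # Toy stand-in for an agent: a task fails until its required rule is present.
    required = {"t2": "missing_tests", "t3": "missing_tests", "t4": "bad_path"}.get(task_id)
    if required is None or f"check:{required}" in harness.verification_rules:
        return Trace(task_id, True)
    return Trace(task_id, False, required)


TASKS = ["t1", "t2", "t3", "t4"]
base = Harness(system_prompt="You are a coding agent.")
improved = improve(base, TASKS, toy_run)
print(pass_rate(base, TASKS, toy_run), pass_rate(improved, TASKS, toy_run))  # prints 0.25 1.0
```

The regression check in `validate` is the part that makes the edits empirical rather than intuition-driven: a proposal is kept only if it helps on the observed failures and breaks nothing that already worked.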

Why It Matters

Agent harness engineering is a critical but underexplored bottleneck in LLM deployment. Most agent failures stem not from the base model but from the surrounding system that controls context, tools, and execution logic. Current approaches rely on manual, intuition-driven debugging that cannot keep pace with the rapid release cycle of new models, making systematic self-improvement a significant operational advantage.

Business Impact

Most enterprises cannot build their own frontier models, but they can and should customize agent harnesses for specific use cases. Self-Harness reduces the engineering overhead required to maintain and adapt agents as models evolve, enabling teams to deploy robust custom agents that continuously improve without ongoing manual intervention or reliance on expensive external models.

Key Implications

Harness engineering shifts from manual, ad hoc debugging to systematic, empirical optimization, reducing dependency on domain expertise and intuition
Enterprises can maintain agent performance across model updates and versions without proportional increases in engineering resources
The framework may accelerate adoption of LLM-based agents in production environments by lowering the operational burden of customization and maintenance

What to Watch

Monitor whether Self-Harness or similar self-improving frameworks become standard practice in agent deployment platforms and whether performance gains hold across diverse task types and model architectures. Watch for adoption by major agent frameworks like SWE-agent, Claude Code, and OpenHands, and track whether the approach scales to more complex harness configurations and multi-agent systems.

Research · LLMs · AI Agents · AI for Business


Los Alamos Deploys NVIDIA Vera CPUs for Agentic AI Science

Los Alamos National Laboratory is deploying three new supercomputers, Mission, Vision, and Veritas, built with HPE and NVIDIA hardware, including the NVIDIA Vera CPU, to accelerate scientific discovery and agentic AI research. Early testing shows the Vera CPU delivers 7x higher performance on URSA (Universal Research and Scientific Agent) workloads and over 3x higher performance on Monte Carlo simulations compared to the previous Crossroads x86 supercomputer. The systems, expected to be operational in 2027, will support classified national security work, fundamental science research, and testing of AI agents that can autonomously form hypotheses, run simulations, and refine experiments.

by Chris Porter · about 2 hours ago · NVIDIA Blog (AI)

Research · News

NVIDIA Accelerates Scientific Computing with Real-Time AI Tools

NVIDIA introduced new AI software tools at ISC Hamburg designed to accelerate scientific research across chemistry, materials discovery, and astronomy. The tools, including DAQIRI, ALCHEMI NIM microservices, and cuPhoton reference code, deliver GPU-accelerated pipelines that reduce processing times from hours or days to real time. Early results show cuPhoton achieved a 14,900x speedup in loading FITS astronomical data and 8,400x faster signal processing on NVIDIA GB200 NVL72 systems.

by Chris Porter · about 2 hours ago · NVIDIA Blog (AI)

Research · Trending · News

JUPITER Shows Exascale Computing's Real-World Impact

JUPITER, Europe's first exascale supercomputer at Germany's Forschungszentrum Jülich, is running four major science projects that demonstrate the practical capabilities of exascale computing. These projects span brain mapping at cellular resolution, global climate simulation at 1-kilometer resolution, AI for wireless networks, and quantum computing simulation. The work shows that problems previously intractable are now solvable with exascale hardware and software.

by Chris Porter · about 3 hours ago · NVIDIA Blog (AI)

Research · News

Neuromorphic Chip Achieves 5x Energy Efficiency Gain

Researchers led by Pengfei Sun have developed a spiking neural network with dual memory pathways that was co-designed with a custom neuromorphic chip. The system achieves over 4x throughput improvement and 5x energy efficiency gains while reducing parameters by 40-60% compared to existing implementations. The work demonstrates the value of algorithm-hardware co-design in neuromorphic computing.

by Pengfei Sun · about 10 hours ago · Nature Machine Intelligence
