vff — the signal in the noise
Research

MIT's AromaGen Generates Custom Scents from Text Using LLMs

Yunge Wen, Awu Chen, Jianing Yu, Jas Brooks, Hiroshi Ishii, Paul Pu Liang

Researchers at MIT and collaborators have developed AromaGen, an AI-powered wearable that generates custom scents from text or image inputs using a multimodal language model. The system maps semantic descriptions to mixtures of 12 base odorants released through a neck-worn dispenser, and users can refine results through natural language feedback. In a 26-person study, AromaGen matched human-composed aromas in zero-shot generation and significantly outperformed them after iterative refinement, achieving median similarity scores of 8/10 to real food scents while reducing perceived artificiality.

TL;DR

  • AromaGen uses multimodal LLMs to generate custom aromas from free-form text or visual inputs in real time
  • The system combines 12 carefully selected base odorants and allows iterative refinement through natural language feedback
  • User study results show the system matches human-composed mixtures in zero-shot generation and surpasses them after iterative refinement
  • Addresses a major constraint in olfactory AI, the scarcity of large-scale olfactory datasets, by leveraging latent knowledge in LLMs

Why it matters

This work demonstrates a practical application of multimodal LLMs to a domain where AI has been severely limited by data scarcity and hardware constraints. By mapping semantic inputs to structured odorant mixtures rather than attempting to generate novel scents from scratch, AromaGen sidesteps the need for massive olfactory datasets while showing that language models contain sufficient latent knowledge to guide scent composition. The result is a working system that bridges the gap between AI capability and real-world sensory experience.
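To make the idea of a "structured odorant mixture" concrete, here is a minimal sketch of how such a mixture and one refinement step might be represented. It is an illustration under stated assumptions, not AromaGen's actual implementation: the odorant labels, the normalization scheme, the raw weights, and the `normalize` and `refine` helpers are all hypothetical, and the language model call is replaced by hard-coded values. Only the count of 12 base odorants comes from the article.

```python
NUM_ODORANTS = 12  # from the article: 12 base odorants

# Placeholder labels; the article does not name the actual odorants.
ODORANTS = [f"odorant_{i}" for i in range(NUM_ODORANTS)]


def normalize(weights):
    """Clamp negative weights to zero and scale so the weights sum to 1."""
    clamped = [max(0.0, w) for w in weights]
    total = sum(clamped)
    if total == 0:
        raise ValueError("mixture has no active odorants")
    return [w / total for w in clamped]


def refine(mixture, adjustments):
    """Apply per-odorant deltas (assumed to be derived from user feedback)
    and renormalize the result."""
    updated = list(mixture)
    for index, delta in adjustments.items():
        updated[index] += delta
    return normalize(updated)


# Hypothetical stand-in for model output: raw, unnormalized weights.
raw = [0.0] * NUM_ODORANTS
raw[0], raw[3], raw[7] = 2.0, 1.0, 1.0
mixture = normalize(raw)

# Hypothetical feedback such as "less of odorant 0, more of odorant 3".
refined = refine(mixture, {0: -0.25, 3: 0.25})

for name, before, after in zip(ODORANTS, mixture, refined):
    if before or after:
        print(f"{name}: {before:.2f} -> {after:.2f}")
```

Constraining the output to a fixed-length weight vector is what lets a general-purpose model drive a physical dispenser: the model only has to choose proportions over a known palette, and each round of feedback becomes a small adjustment to that vector.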

Business relevance

Olfactory interfaces represent an emerging category in immersive technology and consumer hardware, with applications in food, wellness, entertainment, and remote communication. AromaGen's approach of using LLMs to enable general-purpose aroma generation from text or images could lower barriers to entry for companies building scent-enabled products, reducing dependence on fixed cartridge libraries or manual composition. The iterative refinement loop also suggests a model for personalized scent experiences, relevant to luxury goods, hospitality, and metaverse applications.

Key implications

  • Multimodal LLMs can effectively encode domain-specific knowledge even in data-scarce modalities, opening pathways for AI in other sensory or specialized domains
  • Wearable olfactory interfaces are moving from prototype to functional systems, potentially enabling new interaction paradigms in AR/VR and physical spaces
  • The ability to refine outputs through natural language feedback in real time suggests a template for interactive AI systems in non-visual domains

What to watch

Monitor whether AromaGen or similar systems move beyond research into commercial products, and track adoption in immersive media, food tech, or wellness applications. Watch for expansion of the base odorant palette and whether the approach scales to more complex or novel scent compositions. Also observe whether other sensory modalities (taste, texture) adopt similar LLM-based generation strategies.
