| Speaker | Tatiana Shavrina |
|---|---|
| Affiliation | FAIR |
| Date | Wednesday, 18 March 2026 |
| Time | 12:30-13:30 |
| Location | Ground floor lecture theatre, Sainsbury Wellcome Center, 25 Howland St, W1T 4JG |
| Link | https://ucl.zoom.us/j/99748820264 |
| Event series | Jump Trading/ELLIS CSML Seminar Series |
| Abstract | Large Language Model (LLM) agents are poised to transform the landscape of scientific research by automating complex, multi-stage workflows. To accelerate progress in this domain, AI research agents are being applied to a growing number of modeling sciences, beginning with AI research itself. The talk will cover recent work in the domain, including frontier scaffolds, LLMs, and benchmarks. We will look at how current agentic tasks encompass a broad spectrum of scientific challenges, including language modeling, mathematics, bioinformatics, and time series forecasting. We will rigorously evaluate agentic capabilities across the entire research lifecycle (ideation, experimental analysis, and iterative refinement) without providing baseline code. Our empirical results reveal that while agents surpass human state-of-the-art (SOTA) performance in four tasks, they fall short in sixteen, and even the best-performing agents do not reach the theoretical task ceilings. These findings highlight that AIRS-Bench remains unsaturated, offering significant headroom for future advancements. |
| Biography | My team and I at FAIR focus on AI agents that accelerate AI research itself: ablation automation, neural architecture search, and long-running experimental iterations. |