News

Talks
Ep. 45: John Yang, SWE-Bench Lead Author and Stanford CS PhD Student
2025-11-17 • Delta Institute | Ankit Gupta
AI Evals w/ John Yang: Evaluating and training software engineering agents
2025-10-17 • alphaXiv x Vals AI
SWE-smith: Scaling Data for Software Engineering Agents | John Yang | Stanford University
2025-08-12 • Open AGI Summit
Forward Future Live August 8th, 2025
2025-08-08 • Forward Future | Matthew Berman
Few Shot Code Generation to Autonomous Software Engineering Agents // John Yang
2024-12-02 • MLOps.community
SWE-bench with John Yang and Carlos E. Jimenez - Weaviate Podcast #107!
2024-10-30 • Weaviate Podcast | Connor Shorten
John Yang - SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
2023-11-03 • University of Toronto | Rohan Alexander

Press
Introducing GPT-5.1 for Developers
2025-11-13 • OpenAI
Laude Institute Announces First Batch of Slingshots AI Grants
2025-11-06 • TechCrunch | Russell Brandom
Introducing Claude Sonnet 4.5
2025-09-29 • Anthropic
Meta's newest world model research project
2025-09-26 • DeepLearning.AI | Andrew Ng
Stanford and Alibaba Build Bug-Fixing Dataset and Pipeline to Train AI
2025-08-13 • DeepLearning.AI | Andrew Ng
Claude Opus 4.1
2025-08-05 • Anthropic
A new AI coding challenge just published its first results — and they aren't pretty
2025-07-23 • TechCrunch | Russell Brandom
Warp scores 71% on SWE-bench Verified
2025-06-23 • Warp
Introducing Claude 4
2025-05-22 • Anthropic
How to Build a Better AI Benchmark
2025-05-08 • MIT Technology Review | Russell Brandom
AI Models Still Struggle to Debug Software, Microsoft Study Shows
2025-04-10 • TechCrunch | Kyle Wiggers
#1 open-source agent on SWE-Bench Verified by combining Claude 3.7 and O1
2025-03-31 • Augment Code
Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet
2025-01-06 • Anthropic
Building Effective Agents
2024-12-09 • Anthropic
Introducing SWE-bench Verified
2024-08-13 • OpenAI
The AI-Powered Future of Coding Is Near
2024-07-18 • Wired | Will Knight
Coding Agents Are Evolving From Novelties to Widely Useful Tools
2024-06-19 • DeepLearning.AI | Andrew Ng
Coding Agents Proliferate
2024-04-10 • DeepLearning.AI | Andrew Ng
AI Agent Automatically Codes WITH TOOLS - SWE-Agent Tutorial ("Devin Clone")
2024-04-05 • Matthew Berman
SWE-bench Technical Report
2024-03-15 • Cognition