John Y

Talks

[State of Code Evals] After SWE-bench, Code Clash & SOTA Coding Benchmarks recap — John Yang
2025-12-31 • Latent Space Podcast

Ep. 45: John Yang, SWE-Bench Lead Author and Stanford CS PhD Student
2025-11-17 • Delta Institute | Ankit Gupta

AI Evals w/ John Yang: Evaluating and training software engineering agents
2025-10-17 • alphaXiv x Vals AI

SWE-smith: Scaling Data for Software Engineering Agents | John Yang | Stanford University
2025-08-12 • Open AGI Summit

Forward Future Live August 8th, 2025
2025-08-08 • Forward Future | Matthew Berman

Few Shot Code Generation to Autonomous Software Engineering Agents // John Yang
2024-12-02 • MLOps.community

SWE-bench with John Yang and Carlos E. Jimenez - Weaviate Podcast #107!
2024-10-30 • Weaviate Podcast | Connor Shorten

John Yang - SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
2023-11-03 • University of Toronto | Rohan Alexander

Press

Project Glasswing: Securing critical software for the AI era
2026-04-10 • Anthropic

Introducing GPT-5.1 for Developers
2025-11-13 • OpenAI

Laude Institute Announces First Batch of Slingshots AI Grants
2025-11-06 • TechCrunch | Russell Brandom

Introducing Claude Sonnet 4.5
2025-09-29 • Anthropic

Meta's newest world model research project
2025-09-26 • DeepLearning.AI | Andrew Ng

Stanford and Alibaba Build Bug-Fixing Dataset and Pipeline to Train AI
2025-08-13 • DeepLearning.AI | Andrew Ng

Claude Opus 4.1
2025-08-05 • Anthropic

A new AI coding challenge just published its first results — and they aren't pretty
2025-07-23 • TechCrunch | Russell Brandom

Warp scores 71% on SWE-bench Verified
2025-06-23 • Warp

Introducing Claude 4
2025-05-22 • Anthropic

How to Build a Better AI Benchmark
2025-05-08 • MIT Technology Review | Russell Brandom

AI Models Still Struggle to Debug Software, Microsoft Study Shows
2025-04-10 • TechCrunch | Kyle Wiggers

#1 open-source agent on SWE-Bench Verified by combining Claude 3.7 and O1
2025-03-31 • Augment Code

Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet
2025-01-06 • Anthropic

Building Effective Agents
2024-12-09 • Anthropic

Introducing SWE-bench Verified
2024-08-13 • OpenAI

The AI-Powered Future of Coding Is Near
2024-07-18 • Wired | Will Knight

Coding Agents Are Evolving From Novelties to Widely Useful Tools
2024-06-19 • DeepLearning.AI | Andrew Ng

Coding Agents Proliferate
2024-04-10 • DeepLearning.AI | Andrew Ng

AI Agent Automatically Codes WITH TOOLS - SWE-Agent Tutorial ("Devin Clone")
2024-04-05 • Matthew Berman

SWE-bench Technical Report
2024-03-15 • Cognition