Publications
Preprint
SWE-smith: Scaling Data for Software Engineering Agents
John Yang, Kilian Lieret, Carlos E. Jimenez, Alexander Wettig, Kabir Khandpur, Yanzhe Zhang, Binyuan Hui, Ofir Press, Ludwig Schmidt, Diyi Yang
2025
Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration
Yijia Shao, Vinay Samuel, Yucheng Jiang, John Yang, Diyi Yang
2025
Peer Reviewed
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
John Yang*, Carlos E. Jimenez*, Alex L. Zhang, Kilian Lieret, Joyce Yang, Xindi Wu, Ori Press, Niklas Muennighoff, Gabriel Synnaeve, Karthik Narasimhan, Diyi Yang, Sida I. Wang, Ofir Press
2025 • ICLR
Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study
Bowen Li*, Wenhan Wu*, Ziwei Tang*, Lin Shi*, John Yang, Jinyang Li, Shunyu Yao, Chen Qian, Binyuan Hui, Qicheng Zhang, Zhiyin Yu, He Du, Ping Yang, Dahua Lin, Chao Peng, Kai Chen
2025 • COLING • Oral
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
John Yang*, Carlos E. Jimenez*, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, Ofir Press
2024 • NeurIPS
Referral Augmentation for Zero-Shot Information Retrieval
Michael William Tang, Shunyu Yao, John Yang, Karthik Narasimhan
2024 • ACL (Findings)
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
Carlos E. Jimenez*, John Yang*, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan
2024 • ICLR • Oral
InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback
John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao
2023 • NeurIPS (Datasets & Benchmarks)
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Shunyu Yao*, Howard Chen*, John Yang, Karthik Narasimhan
2022 • NeurIPS
Workshop
Language Agents as Hackers: Evaluating Cybersecurity Skills with Capture the Flag
John Yang, Akshara Prabhakar, Shunyu Yao, Kexin Pei, Karthik Narasimhan
2023 • Multi-Agent Security Workshop @ NeurIPS 2023 • Best Paper Award
Towards an Enhanced, Faithful, and Adaptable Web Interaction Environment
John Yang, Howard Chen, Karthik Narasimhan
2022 • Language & Reinforcement Learning Workshop @ NeurIPS 2022
Miscellaneous
Introducing SWE-bench Verified
Neil Chowdhury*, James Aung*, Chan Jun Shern*, Oliver Jaffe*, Dane Sherburn*, Giulio Starace*, Evan Mays, Rachel Dias, Marwan Aljubeh, Mia Glaese, Carlos E. Jimenez, John Yang, Kevin Liu, Aleksander Madry
2024 • OpenAI Technical Blog
Learning Language through Interactions with the Digital World
John Yang
2022 • M.S.E. Thesis | Princeton University
Quartz: A Framework for Engineering Secure Smart Contracts
John Kolb, John Yang, Randy H. Katz, David E. Culler
2020 • EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2020-178