Monday, March 16 / 12:00pm to 1:00pm Pacific Time
Joel Becker: Reconciling Impressive AI Benchmark Performance with Limited Developer Productivity Impacts
- Hybrid
- Seminar
The DEL Seminar Series is proud to host a diverse roster of bright minds from around the world to discuss various subjects surrounding economics and technology.
On Monday, March 16, Joel Becker, Technical Staff at METR and Founder/CEO at Qally’s, will stop by the Lab for our Seminar Series.
This is a hybrid event, streamed live on Zoom. Members of the Stanford community may register to attend in person.
Abstract
AI coding agents now complete multi-hour coding benchmarks with roughly 50% reliability, yet a randomized trial found that experienced open-source developers took about 19% longer when allowed to use frontier AI tools than when those tools were disallowed.
This talk presents the evidence on the productivity paradox in AI coding, shows the bottlenecks in deployment, and outlines the next steps for understanding AI’s productivity impacts.
Joel Becker
Technical Staff, METR
Joel Becker works on AI evaluation methods at METR, such as time horizon and developer productivity RCTs. Previously, he worked in economics and genomics research, ran a statistics consultancy advising professional soccer teams, and was a very minorly successful play-money prediction markets trader.