
It's behind Opus 4.7 on SWE-Bench Pro, if you care about that kind of thing. It seems on-trend, even though benchmarks are less and less meaningful for the stuff we expect from models now.

Will be interesting to try.


