
GLM 5.1, widely held up as the model at the heels of, perhaps even surpassing, Western models...

Gets 5% on the ARC-AGI-2 private set.

Chinese models are suspiciously good at benchmarks.



I mean, I could say the same about Gemini. 3.1 Pro tops a bunch of benchmarks out there, but in any practical use I've put it to, it underperforms both other proprietary models and open-weight models. Benchmarks are suspicious in general.



