Hacker News

That performance monitor is super easy to game if you cache responses to all the SWE-bench questions.


You dramatically overestimate how much time engineers at hypergrowth startups have on their hands.


There's a direct business incentive to game benchmarks, and it wouldn't even be difficult to do. Besides, they have workforce-replacing AI to do it for them.


Caching some data is time-consuming? They can just ask Claude to do it.



