
It did a better job of explaining that there is ambiguity in the question, but it still went ahead with an arbitrary assumption in order to answer it. I think it is fair to say it got it right, but so did the other attempt; each interpretation is quite valid.

"Most right" would have been to ask questions about what is being asked instead of trying to answer an incomplete question. But rarely is the human even willing to do that as it is bizarrely seen as a show of weakness or something. An LLM is only as good as its training data, unfortunately.



I agree both got it right, in the sense that it wouldn't be a stupid interpretation for a human to make. If someone followed up, I'm sure even the more basic LLM would have been able to adjust.

Regardless, I think it's a good sign that models are increasingly able to handle these "gotcha" questions, even though I don't think it's hugely useful, partly because I think it's a poor complaint and an easy one to shut down.



