
It did a better job of explaining that there is ambiguity in the question, but it still went ahead with an arbitrary assumption in order to answer it. I think it is fair to say it got it right, but so did the other attempt; each interpretation is quite valid.

"Most right" would have been to ask questions about what is being asked instead of trying to answer an incomplete question. But rarely is the human even willing to do that as it is bizarrely seen as a show of weakness or something. An LLM is only as good as its training data, unfortunately.



I agree both got it right, in the sense that it wouldn't be a stupid interpretation for a human to make. If someone followed up, I'm sure even the more basic LLM would have been able to adjust.

Regardless, I think it's a good sign that models are increasingly able to handle these "gotcha" questions, even though I don't think it's hugely useful, partly because I think it's a poor complaint and an easy one to shut down.



