Agreed, 2.5 flash too. I analyze a large json document of metrics for pricing decisions. Typically around 200k, occtionallly up to 1M, Gemini 2.5 significantly outperforms for my task. It isn't 100%, but role playing gets close. I suppose that's a form of inference time compute.