Is it? Or is it the task you're trying to do? Opus 4.6 has been staggeringly good for me this last week, both inside Claude Code and through Antigravity until I used up my quota.
I think some of this comes down to undeclared A/B testing. I've had the worst week of interactions I have ever had using Claude Code. The whole week whenever I have a session that isn't failing miserably I seem to get tapped for a session survey but on any that are out and out shitting the bed it never asks. It has felt a little surreal. I'd love to see a product wide stats graph for swearing, I would 100% believe that it is hitting an all time high but maybe I'm just a victim of a bad A/B round.
Oh I’ve been getting a lot more of those too lately even though I dismiss it every time. Wonder if I should report not satisfied every time so that I get routed to something better…