> But you don't fire a table saw because it doesn't know when to stop cutting, right?
If I purchased a table saw and that table saw irregularly and unpredictably jumped past its safeties -as we've plenty of evidence that LLMs [0] do-, then I would [1] immediately stop using that saw, return it for a refund, alert the store that they're selling wildly unsafe equipment, and the relevant regulators that a manufacturer is producing and selling wildly unsafe equipment.
[0] ...whether "agentic" or not...
[1] ...after discovering that yes, this is not a defective unit, but this model of saw working as designed...
> But that's the thing: the table saw has safeties. Someone put them there.
You noticed that I mentioned that this hypothetical table saw has poorly-designed, entirely inadequate safeties? Things like Opus treating the data it presents to the user as commands that it should execute [0] are definitely [1] a sign of solid, well-designed safety mechanisms.
You might choose to retort "Well, that's because the user isn't running the tool in the mode that makes it wait for confirmation before doing anything of consequence!". In reply, I would point in the general direction of the half-squillion studies indicating that a system whose safety requires an operator to remain vigilant when presented with a large volume of irregularly-presented decision points (nearly all of which can be safely answered with a "Yes, do it.") does not make for a safe system. [2] It -in fact- makes for a system that's designed [3] to be unsafe.
You might also choose to retort "That's never happened to me, or anyone that I know about.". Intermittent failures of built-in safeties that happen under unpredictable circumstances are far, far worse than predictable failures that happen under known ones. I hope you understand why.
[2] I would also -somewhat wryly- note that "An AI Agent that does all of your scutwork, but whose every decision you have to carefully scrutinize, because it will irregularly plan to do something irreversibly destructive to something you care about." is not at all the picture that "AI" boosters paint of these tools.
Just to drive home the "These things have poorly-designed, entirely inadequate safeties" point, here [0] is a report from three weeks ago of the then-latest version of Claude Code being commanded to enter into the "Don't modify anything" mode, reporting to the user that it was in the "Don't modify anything" mode, and then proceeding to modify things as if it were not actually in the "Don't modify anything" mode.
I'm sure if I dug around, I would find hundreds of reports of these tools [1] jumping over their safeties to do things that are unexpected, and not-infrequently hazardous. I expect that such reports will continue, because "building robust, effective, and reliable safeties" has very, very clearly not been a significant priority for the major LLM companies. But, I've more than proven my point, so I'll leave the small pile of evidence at this.