*A well-trained LLM that lacks any malevolent data...* The scale needed to produ...

Dylan16807 · 2025-07-17T05:06:12 1752728772

So that's all true but the same argument works if you say a low percentage of malevolent data, and that's far from impossible.

joe_the_user · 2025-07-17T06:10:00 1752732600

It's remarkable how many people are uncritically talking of "malevolent data" as if it were a well-defined concept that everyone knows is the source of bad things.

A simple Google search reveals ... this very thread as a primary source on the topic of "malevolent data" (ha, ha). But it should be noted that all other sources mentioning the phrase define it as data intentionally modified to produce a bad effect. It seems clear the problems of badly behaved LLMs don't come from this. Sycophancy, notably, doesn't just appear out of "sycophantic data" cleverly inserted by the association of allied sycophants.

Dylan16807 · 2025-07-17T06:22:53 1752733373

I don't find it very remarkable that when one person makes up a term that's pretty easy to understand, other people in the same conversation use the same term.

In the context of this conversation, it was a response to someone who was talking about malevolent human therapists and was worried about AIs being trained to do the same things. So that means it's text where one of the participants is acting malevolently in those same ways.

joe_the_user · 2025-07-17T19:49:25 1752781765

I suppose remarkable or not depends on viewpoint.

For me, hearing this fantastical talk of "malevolent data" is like hearing people who know little about chemistry or engines saying "internal combustion cars are fine as long as we don't run them on 'carbon-filled-fuel'". Otherwise, see my comment above.

Dylan16807 · 2025-07-17T19:53:03 1752781983

I love your example there. Because there is an answer. There are ICE cars that run on hydrogen.

The thing they're talking about is hard but it's not impossible.

joe_the_user · 2025-07-18T01:32:28 1752802348

Sure, it's not literally impossible. There are ICE cars that run on hydrogen. But you can't practically adapt an existing gasoline car to run on hydrogen. My point is that mobilizing terminology gives people with no knowledge of details the illusion they can speak reasonably about the topic.

Dylan16807 · 2025-07-18T04:11:16 1752811876

> But you can't practically adapt an existing gasoline car to run on hydrogen.

You can do it pretty practically. Figuring out a supply is probably worse than the conversion itself.

> My point is that mobilizing terminology gives people with no knowledge of details the illusion they can speak reasonably about the topic.

"mobilizing terminology"? They just stuck two words together so they wouldn't have to say "training data that has the same features as a conversation with a malevolent therapist" or some similar phrase over and over. There's no expertise to be had, and there's no pretense of expertise either.

And the idea of filtering it out is understandable to a normal person: straightforward and a ton of work.