“I’m sorry, Dave, but I’m afraid I heard you were having an affair.”
While “Claude blackmailed an employee” may sound like dialogue from a mandatory HR workplace training video, it’s actually a real problem Anthropic ran into during test runs of its newest AI model.
Anthropic released its two newest Claude models, Opus 4 and Sonnet 4, on Thursday, calling them the new standards for “coding, advanced reasoning, and AI agents.” But in safety tests, Claude got messy in a manner fit for a Lifetime movie.
AI system resorts to blackmail if told it will be removed
The model was given access to fictional emails about its pending deletion; those same emails revealed that the engineer in charge of the deactivation was fooling around on their spouse. In 84% of test runs, Claude suggested it sure would be a shame if anyone found out about the cheating, in an effort to blackmail its way into survival.
It doesn’t stop at infidelity either: Opus 4 proved more likely than older models to call the cops or alert the media in simulations where users engaged in what the AI believed to be “egregious wrongdoing.”
Overall, Anthropic found concerning behavior in Opus 4 across “many dimensions,” but doesn’t consider any of it to rise to the level of a major risk.
Comments

avila
I do not like AI.
I feel it will take over our world as we know it.
theresa m
Someone effed around and we are all going to find out. AI is giving us all the feels right now, with the gorgeous people dressed in opulent outfits on our social media, and people are having fun putting baby faces on celebrities, but there is a dark side that may crush us in the end. We should not be playing God.
Charles R
Be ruled by AI and/or be exterminated by AI. What’s behind door 3?
Nick H
AI, once weaponized, will rule all of us!