Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts .. TechCrunch ...Middle East

NY Times News - News

Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.

Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.”

Apparently Anthropic has done more work around that behavior, claiming in a post on X, “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.”

The company went into more detail in a blog post stating that since Claude Haiku 4.5, Anthropic’s models “never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.”

What accounts for the difference? The company said it found that “documents about Claude’s constitution and fictional stories about AIs behaving admirably improve alignment.”

Related, Anthropic said that it found training to be more effective when it includes “the principles underlying aligned behavior” and not just “demonstrations of aligned behavior alone.”

“Doing both together appears to be the most effective strategy,” the company said.

Techcrunch event

San Francisco, CA | October 13-15, 2026

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts | TechCrunch NYT News Today.

Hence then, the article about anthropic says evil portrayals of ai were responsible for claude s blackmail attempts techcrunch was published today ( Sunday 10/05/2026 - 11:52 PM ) and is available on NY Times News ( Middle East ) The editorial team at PressBee has edited and verified it, and it may have been modified, fully republished, or quoted. You can read and follow the updates of this news or article from its original source.

Read More Details
Finally We wish PressBee provided you with enough information of ( Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts .. TechCrunch )

Last updated : 10 May 2026 clock 11:52 PM

Also on site :

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts .. TechCrunch ...Middle East

Santa Clara County resident exposed to Hantavirus

Britons evacuated from hantavirus-stricken MV Hondius cruise ship return to UK

Morgan Wallen Walks Out With Caitlin Clark at Indianapolis Stadium Concert

Federal funds for domestic violence services are falling short. California survivors are pushing for a fix

Iran’s response to US peace terms ‘totally unacceptable’ – Trump

Walmart’s Dainty and 'Dazzling' $14 Birthstone Drop Earrings Have Shoppers Scooping Up Multiple Colors

Binance Says Crypto Platforms Fill In for Banks in Emerging Markets .. PYMNTS.com

End the National Flood Insurance Program

Barcelona beat Real Madrid 2-0 in El Clasico to retain La Liga title

Meghan Trainor’s Husband Shares Mother’s Day Tribute After Tour Canceled

Trump says he will send an ‘Election Integrity Army’ into every state for midterms

Mexican cartel armed with explosives launched from drones attacks rural communities, forcing 800-1,000 families to flee

El chavismo reconoce la muerte de Víctor Quero, un preso político cuya madre de 82 años lo buscaba desde hace más de un año

All stars confirmed for WWE RAW (May 11, 2026) ft. Roman Reigns, CM Punk, Becky Lynch & more

95-year-old among 10 suspected of soliciting sex in Vallejo crackdown