Anthropic's open-source safety tool found AI models whistleblowing – in all the wrong places

The Petri tool found AI “may be influenced by narrative patterns more than by a coherent drive to minimize harm.” Here’s how the most deceptive models ranked.

Original Source: https://www.zdnet.com/article/anthropics-open-source-safety-tool-found-ai-models-whisteblowing-in-all-the-wrong-places/

Disclaimer: This article is a reblogged/syndicated piece from a third-party news source. Content is provided for informational purposes only. For the most up-to-date and complete information, please visit the original source. Digital Ground Media does not claim ownership of third-party content and is not responsible for its accuracy or completeness.
