Anthropic tests its “next-generation system for AI safety mitigations”

Anthropic is expanding its bug bounty program to test its “next-generation system for AI safety mitigations.” The program focuses on identifying and defending against “universal jailbreak attacks.” Anthropic is prioritizing critical vulnerabilities in high-risk areas like chemical, biological, radiological and nuclear (CBRN) defense and cybersafety. Participants get early access to Anthropic’s latest safety systems before public release. Their task is to find vulnerabilities or ways to bypass safety measures. Anthropic is offering rewards up to $15,000 for discovering new universal jailbreak attacks.

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
