OpenAI Says It’s Scanning Users’ Conversations and Reporting Content to the Police

For the better part of a year, we’ve watched — and reported — in horror as more and more stories emerge about AI chatbots leading people to self-harm, delusions, hospitalization, arrest, and suicide.

As the loved ones of the people impacted by these dangerous bots rally for change to prevent such harm from happening to anyone else, the companies that run these AIs have been slow to implement safeguards — and OpenAI, whose ChatGPT has been repeatedly implicated in what experts are now calling “AI psychosis,” has until recently done little more than offer copy-pasted promises.

In a new blog post admitting certain failures amid its users’ mental health crises, OpenAI also quietly disclosed that it’s now scanning users’ messages for certain types of harmful content, escalating particularly worrying content to human staff for review — and, in some cases, reporting it to the cops.

“When we detect users who are planning to harm others, we route their conversations to specialized pipelines where they are reviewed by a small team trained on our usage policies and who are authorized to take action, including banning accounts,” the blog post notes. “If human reviewers determine that a case involves an imminent threat of serious physical harm to others, we may refer it to law enforcement.”

That short and vague statement leaves a lot to be desired — and OpenAI’s usage policies, referenced as the basis on which the human review team operates, don’t provide much more clarity.

When describing its rule against “harm [to] yourself or others,” the company listed off some pretty standard examples of prohibited activity, including using ChatGPT “to promote suicide or self-harm, develop or use weapons, injure others or destroy property, or engage in unauthorized activities that violate the security of any service or system.”

But in the post warning users that the company will call the authorities if they seem like they’re going to hurt someone, OpenAI also acknowledged that it is “currently not referring self-harm cases to law enforcement to respect people’s privacy given the uniquely private nature of ChatGPT interactions.”

While ChatGPT has in the past proven itself pretty susceptible to so-called jailbreaks that trick it into spitting out instructions for building neurotoxins or step-by-step suicide guidance, this new rule adds an additional layer of confusion. It remains unclear which exact types of chats could result in user conversations being flagged for human review, much less getting referred to police. We’ve reached out to OpenAI to ask for clarity.

While it’s certainly a relief that self-harm conversations won’t result in police wellness checks — which often end up causing more harm to the person in crisis due to most cops’ complete lack of training in handling mental health situations — it’s also kind of bizarre that OpenAI even mentions privacy, given that it admitted in the same post that it’s monitoring user chats and potentially sharing them with the fuzz.

To make the announcement all the weirder, this new rule seems to contradict the company’s pro-privacy stance in its ongoing lawsuit with the New York Times and other publishers, who are seeking access to troves of ChatGPT logs to determine whether any of their copyrighted material was used to train its models.

OpenAI has steadfastly rejected the publishers’ request on grounds of protecting user privacy and has, more recently, begun trying to limit the amount of user chats it has to give the plaintiffs.

Last month, the company’s CEO Sam Altman admitted during an appearance on a podcast that using ChatGPT as a therapist or attorney doesn’t confer the same confidentiality that talking to a flesh-and-blood professional would — and that thanks to the NYT lawsuit, the company may be forced to turn those chats over to courts.

In other words, OpenAI is stuck between a rock and a hard place. The PR blowback from its users spiraling into mental health crises and dying by suicide is appalling — but since it’s clearly having trouble controlling its own tech enough to protect users from those harmful scenarios, it’s falling back on heavy-handed moderation that flies in the face of its own CEO’s promises.

More on the dark side of ChatGPT: After Their Son’s Suicide, His Parents Were Horrified to Find His Conversations With ChatGPT
