OpenAI is not consistently candid about its relationship to the open web

One year after ChatGPT’s launch, OpenAI shared its approach to data and AI, but skirted the most important question: what happens to the open web if it wins?

OpenAI has launched a project to better control the use of web content by AI with a tool called “Media Manager”.

Last year, OpenAI introduced robots.txt directives for its crawlers. Site owners can use these to stop OpenAI from crawling their sites, using the content as training data, or displaying it in ChatGPT.
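For readers who want to see what such an opt-out looks like in practice, here is a minimal sketch. It assumes the user-agent token `GPTBot`, which comes from OpenAI's public crawler documentation rather than from this article; the domain `example.com` and the helper name `may_crawl` are purely illustrative. The snippet uses Python's standard `urllib.robotparser` to check how a well-behaved crawler would interpret the rules.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: block the crawler identifying as "GPTBot"
# (assumed user-agent token), allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/some-story"  # placeholder URL
    print("GPTBot allowed:", may_crawl("GPTBot", page))
    print("Other bot allowed:", may_crawl("SomeOtherBot", page))
```

Note that robots.txt is a voluntary convention: it only tells compliant crawlers what the site owner wants, and it does nothing about copies of the content that already circulate elsewhere.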

But that’s not enough, according to OpenAI. Content is often quoted, changed, remixed, reposted, and used as inspiration in many ways, and content creators can’t always control where their content shows up. Media Manager is meant to let content owners specify exactly how their content can be used.



But opting out of OpenAI’s content ecosystem won’t be the main issue if its chatbot succeeds on the scale of Google Search. The main problem will be opting in and still making money.

OpenAI is picking and choosing who can benefit from this new ecosystem through deals with big publishers like the Financial Times and Axel Springer, creating an uneven playing field.

It’s essentially setting up the next prisoner’s dilemma after Google Search, without even a hint of an answer to the most pressing question: how to ensure a fair distribution of economic benefits among ALL content creators.

Google has already successfully split the publishing industry with similar efforts such as News Showcase. That was nothing compared to what OpenAI is doing now. For the AI company to cite its publisher deals as a positive example of its own involvement in the content sector is either incredibly brazen or simply naive.

Today’s ad system, which largely pays publishers and creators, is relatively transparent and open – with all the drawbacks that brings, especially the focus on reach. But publishers own their reach and can sell it to the highest bidder.


If chatbots get all the attention in the future, companies like OpenAI will own all the reach – and have publishers in their grip.

The fact that publishers can technically opt out of OpenAI’s content ecosystem will mainly help OpenAI: it strengthens the company’s position in court, makes it look a little better in public, and will probably be mandatory anyway. It might even help OpenAI clean up its datasets by making duplicates easier to remove.

Today, few publishers can afford to exclude Google Search, even though they technically could. If AI platforms come to dominate the Internet, this dependency will only grow.

While OpenAI says it wants to “empower creators and publishers and enhance the user experience” by getting rid of the “attention economy built for advertisers,” its actions so far make it clear that OpenAI is the company that decides who gets empowered.

It’s not even just about money. It’s also about which publishers OpenAI chooses, and by what standards. The deals so far are completely murky.

OpenAI is just getting started and is already worse than Google.

Several major publishers, including the New York Times, are currently suing OpenAI for copyright infringement.
