OpenAI Faces Renewed Pressure From News Publishers Over Training Data

A fresh wave of complaints in early April underscores how unresolved licensing questions continue to complicate the company’s push into media and entertainment


OpenAI’s expansion into media and entertainment is colliding again with a familiar obstacle: the question of how its models are trained. In early April, a group of major news publishers renewed pressure on the company, arguing that their content had been used without sufficient licensing or compensation, according to reporting in the Financial Times. While the dispute centers on journalism, its implications extend directly into Hollywood.

The issue is not new, but it has gained urgency as OpenAI pushes further into video, audio, and multimodal content. Systems like Sora rely on large datasets to generate realistic outputs, and the origins of that data remain a point of contention across multiple industries. For publishers, the concern is economic. For Hollywood, it is structural.

Studios and agencies are watching closely because the same principles apply. If text-based content can be incorporated into training data without clear licensing frameworks, the question becomes whether visual and performance-based material could follow a similar path. That possibility is what has driven much of the resistance from actors, writers, and rights holders over the past year.

OpenAI has attempted to address these concerns through partnerships and licensing deals with selected publishers. However, those agreements have not resolved broader industry skepticism. Many companies remain uncertain about how training data is sourced, how it is attributed, and how value flows back to original creators.

The renewed push from publishers highlights the gap between private agreements and public expectations. Even as some companies strike deals, others are calling for more comprehensive frameworks that would apply across the industry. Without those standards, each negotiation becomes a standalone case, which makes licensing difficult to scale.

For Hollywood, the outcome of this dispute could set an important precedent. If publishers succeed in establishing stronger licensing requirements, similar models could be applied to film, television, and performance data. That would reinforce the role of studios and agencies as gatekeepers of valuable content.

If publishers do not succeed, the industry may face a more fragmented landscape, where rights are negotiated inconsistently and enforcement varies by jurisdiction. That uncertainty complicates long-term planning, particularly for companies investing in AI-driven production tools.

The broader tension is between innovation and control. OpenAI’s technology depends on access to large volumes of data, while content industries depend on maintaining ownership over their work. Reconciling those priorities will require more than individual deals.

The conversation is no longer about whether AI will be used in media.

It is about the terms under which it will be allowed to operate.
