OpenAI Says Creating A.I. Would Be 'Impossible' Without Copyrighted Material Amid NYT Lawsuit

OpenAI claims that a copyright lawsuit brought against it and Microsoft Corp. by The New York Times is “without merit.”

OpenAI, the organization behind ChatGPT and the text-to-image generator DALL-E, has made a controversial request to be given legal access to copyrighted content for A.I. development purposes.

A photo taken on November 23, 2023 shows the logo of the ChatGPT application developed by US artificial intelligence research organization OpenAI on a laptop screen (R) and the letters AI on a smartphone screen in Frankfurt am Main, western Germany. Sam Altman's shock return as chief executive of OpenAI late on November 22 -- days after being sacked -- caps a chaotic period that highlighted deep tensions at the heart of the Artificial Intelligence community. The board that fired Altman from his role as CEO of the ChatGPT creator has been almost entirely replaced following a rebellion by employees, cementing his position at the helm of the firm. KIRILL KUDRYAVTSEV/AFP via Getty Images

According to an article published by Artnet, OpenAI has recently been hit with copyright infringement lawsuits from The New York Times as well as authors like George R.R. Martin, Jodi Picoult, and Jonathan Franzen.

Artnet went on to report that in order to "train" A.I. systems, companies like OpenAI have scraped vast amounts of data, including copyrighted material, from the internet in a process known as "text and data mining." Towards the end of last year, a document circulated online that outlined how the company Midjourney had built a database of artists to train its text-to-image generator. A lengthy list of creators whose work had been used for this purpose included major contemporary artists like Anish Kapoor, Gerhard Richter, Damien Hirst, Banksy, and Yayoi Kusama, among many others.

Last month, The New York Times sued OpenAI and Microsoft, a leading investor in OpenAI, accusing them of "unlawful use" of its work to create their products. In a submission to the House of Lords communications and digital select committee, obtained by The Guardian, OpenAI said it could not train large language models such as its GPT-4 model -- the technology behind ChatGPT -- without access to copyrighted work.

What Does OpenAI Have To Say?

"Because copyright today covers virtually every sort of human expression - including blog posts, photographs, forum posts, scraps of software code, and government documents - it would be impossible to train today's leading AI models without using copyrighted materials," said OpenAI in its submission, first reported by the Telegraph.

OpenAI said it complies with all copyright laws when training its models and that "we believe that legally copyright law does not forbid training." Nonetheless, OpenAI and rival companies have been accused of illegally free-riding on authors' and artists' work. The New York Times has demanded that the company destroy any systems that were trained using its work.

However, OpenAI has signed deals with publishers such as the Associated Press and Axel Springer, the German media giant that also owns Politico and Business Insider, to gain access to their content.

Despite this, the company reportedly appears unwilling to fundamentally alter its data collection and training processes, given the "impossible" constraints that self-imposed copyright limits would bring. Instead, it hopes to rely on broad interpretations of fair use to legally draw on vast swathes of copyrighted data.

For now, OpenAI is betting against copyright fanatics in favor of limitless copying to drive ongoing AI development.
