OpenAI, the organization behind ChatGPT and the text-to-image generator DALL-E, has made a controversial request to be given legal access to copyrighted content for A.I. development purposes.
An article published by Artnet reports that OpenAI has recently been hit with copyright infringement lawsuits from the New York Times as well as from authors including George R.R. Martin, Jodi Picoult, and Jonathan Franzen.
Artnet went on to report that in order to "train" A.I. systems, companies like OpenAI have scraped vast amounts of data, including copyrighted material, from the internet in a process known as "text and data mining." Towards the end of last year, a document circulated online that outlined how the company Midjourney had built a database of artists to train its text-to-image generator. A lengthy list of creators whose work had been used for this purpose included major contemporary artists like Anish Kapoor, Gerhard Richter, Damien Hirst, Banksy, and Yayoi Kusama, among many others.
Last month, the New York Times sued OpenAI and Microsoft, a leading investor in OpenAI, accusing them of "unlawful use" of its work to create their products. In a submission to the House of Lords communications and digital select committee, obtained by The Guardian, OpenAI said it could not train large language models such as GPT-4, the technology behind ChatGPT, without access to copyrighted work.
What Does OpenAI Have To Say?
"Because copyright today covers virtually every sort of human expression - including blog posts, photographs, forum posts, scraps of software code, and government documents - it would be impossible to train today's leading AI models without using copyrighted materials," said OpenAI in its submission, first reported by the Telegraph.
OpenAI said it complies with all copyright laws when training its models and that "we believe that legally copyright law does not forbid training." Nonetheless, OpenAI and its rivals have been accused of illegally free-riding on authors' and artists' work. The New York Times has demanded that the company destroy any systems that were trained using its work.
However, OpenAI has signed deals with publishers such as the Associated Press and Axel Springer, the German media giant that also owns Politico and Business Insider, to gain access to their content.
Despite this, Artificial Intelligence claims the company appears unwilling to fundamentally alter its data collection and training processes, given the "impossible" constraints that self-imposed copyright limits would bring. Instead, it hopes to rely on broad interpretations of fair use to legally draw on vast swathes of copyrighted data.
For now, OpenAI is betting against copyright fanatics in favor of limitless copying to drive ongoing AI development.