The New York Times alleges that OpenAI and Microsoft have used millions of articles published by the newspaper without authorization to train their artificial intelligence models. Jussi Ilvonen, an IP lawyer at Kolster, expects that copyright disputes related to artificial intelligence will continue to be resolved in courts around the world for a long time to come.
In late 2023, The New York Times (NYT) filed a lawsuit against OpenAI and Microsoft for copyright infringement. According to NYT, the companies have copied copyrighted content from the newspaper without permission and used it to train their AI models, such as ChatGPT. The newspaper sees AI applications as potential competitors and believes that its substantial investments in journalism are now being used to create substitute products.
In a blog post on 8 January 2024, OpenAI argues that NYT's lawsuit is without merit. The company claims to have acted in accordance with the fair use principle, a concept in U.S. copyright law. The lawsuit followed negotiations between the parties that broke down at the end of 2023.
The dispute was to be expected. It is a consequence of the limited regulation of AI use in the U.S. and elsewhere, with virtually no established legal precedent in the field. As of this writing, other copyright lawsuits related to generative AI are also pending in the United States and the United Kingdom. For instance, in the UK, the image bank service Getty Images accuses Stability AI of using copyrighted material without permission in the training data of its Stable Diffusion AI application.
In NYT's case, the pivotal question seems to be whether the court deems the actions of the companies developing AI to be in line with the fair use principle. The fair use doctrine allows, in certain situations, the use of copyrighted material without permission. Four factors are considered on a case-by-case basis: the purpose and character of the use, the nature of the copyrighted work, the amount used in relation to the work as a whole, and the effect on the market for and value of the work.
In the U.S., fair use has served as a legal basis for the unauthorized use of works, notably in the Google Books case in 2015. The court ruled that digitizing works, making them searchable, and creating excerpts from them were not copyright-infringing uses but fell under fair use. One argument was that the activities did not compete directly with the copyright holders' previous use.
It is possible that the fair use defense will apply in the NYT vs. OpenAI case. It is also possible that the parties will reach a settlement during the legal process.
As of now, there are no known lawsuits related to AI and copyrights in the European Union. However, legal disputes are likely to arise in the future.
At the EU level and in Finland, legislation does not include a principle comparable to fair use. No AI-specific copyright rules or other AI regulations are yet in effect in the EU.
In 2019, the EU adopted the Copyright Directive (2019/790), which includes the so-called Text and Data Mining (TDM) exception. The TDM exception may justify, under certain conditions, the use of works for AI training purposes without the copyright holders' permission.
However, the application of the TDM exception to various generative AI training situations may bring surprises when assessed in EU courts. Yet it is clear that the exception does not apply if the works used for training were not legally accessible or if the copyright holder has expressly and appropriately – particularly in machine-readable form – prohibited the use of their works for text and data mining purposes.
The EU is setting an example for the world with its AI Act, aiming to restrict the rapidly evolving use of artificial intelligence. This would be the first legislation of its kind.
In December 2023, EU decision-makers agreed on a new comprehensive law regulating the use of AI. The final content of the regulation has not been disclosed, so it is not yet known whether it will address copyright issues. According to a parliamentary statement, however, the regulation includes an obligation to respect copyright. The regulation will next be subject to approval by the European Parliament and EU member states and is expected to be applicable in the EU in 2026.
The legal disputes involving The New York Times and Getty Images are significant pioneering cases from a copyright perspective. However, predicting their impact on the formation of legal practices related to generative AI and copyrights is challenging.
For now, disputes related to generative AI will likely be resolved through the interpretation of existing legislation, primarily copyright laws. This may persist for a considerable time. High courts in various countries, in particular, will have the opportunity to shape the rules in this field over an extended period. I firmly believe that many negotiated solutions will emerge, easing legislators' difficult task of managing the changes as AI develops.