Clash of the titans
There is uncertainty among copyright holders as to how they can create a legally effective reservation of rights within the rapidly evolving age of artificial intelligence. Mark Hyland is not a robot
The EU’s much anticipated AI Act came into force on 1 August 2024. As the world’s first comprehensive law on AI, it would be difficult to overstate its importance. Not only is it a pioneering piece of legislation, but it is also very quickly becoming a compelling legal template for other jurisdictions.
From the perspective of the intertwined themes of the race to regulate AI and AI geopolitics, the EU is now very much in the vanguard. In addition, the fact that the AI Act has extraterritorial effect further underscores its truly global legal reach.
While the act is very much concerned with fostering trustworthy AI in Europe and beyond, this article will focus on the copyright provisions in this landmark law.
Maze of the Minotaur
Like all EU secondary legislation, the AI Act aims to harmonise laws across the EU 27, in this case, the laws concerning the use and supply of AI systems in the EU.
Despite its common-law sounding title, the AI Act is, in fact, an EU regulation. This means that it will have general application, be binding in its entirety, and be directly applicable in all 27 EU member states.
The formal title of the AI Act is Regulation (EU) 2024/1689 of the European Parliament and of the Council laying down harmonised rules on artificial intelligence.
Article 1 of the AI Act sets out the purpose of the new law. It is “to improve the functioning of the internal market and promote the uptake of human-centric and trustworthy AI, while ensuring a high level of protection of health, safety, fundamental rights enshrined in the Charter, including democracy, the rule of law and environmental protection, against the harmful effects of AI systems in the union and supporting innovation”.
It is worth noting that, in the context of the EU Charter of Fundamental Rights, intellectual property is protected by article 17(2) as a ‘subset’ of property.
Between them, recitals 4 and 5 of the act provide a balanced ‘assessment’ of AI. Recital 4 states that AI contributes to “a wide array of economic, environmental, and societal benefits across the entire spectrum of industries and social activities”.
The recital then sets out a large number of industries/sectors that benefit tangibly from AI – for example, healthcare, transport and logistics, and environmental monitoring, to name but a few.
Recital 5 takes a more circumspect approach and asserts that “AI may generate risks and cause harm to public interests and fundamental rights that are protected by union law”.
The recital states that the harm might be material or immaterial, including physical, psychological, societal, or economic in nature.
Reading these two compelling recitals together, it is clear that AI is a double-edged sword.
The Odyssey
As part of promoting trustworthy AI, the AI Act takes a risk-based approach, categorising AI systems according to four levels of risk: unacceptable risk, high risk, limited risk, and minimal risk.
Where an AI system falls into the unacceptable risk category, it will be prohibited. A good example would be real-time biometric identification in publicly accessible places (subject to certain exceptions). Such an AI system would pose an unacceptable risk to individuals’ safety, rights, or fundamental values.
At the other end of the spectrum, a basic email filter classifying messages as spam would fall into the ‘minimal risk’ category.
Article 3 of the act is the definitions provision. Reflecting its comprehensiveness and complexity, 68 different terms are defined.
For our current purposes, the following definitions are important: ‘AI system’, ‘general-purpose AI model’, and ‘general-purpose AI system’.
Drawing inspiration from the OECD, the term ‘AI system’ is defined as “a machine-based system that is designed to operate with varying levels of autonomy and that may exhibit adaptiveness after deployment, and that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments”.
Of relevance to the IP component of the AI Act is the term ‘general-purpose AI model’. This is defined as “an AI model, including where such an AI model is trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks regardless of the way the model is placed on the market, and that can be integrated into a variety of downstream systems or applications, except AI models that are used for research, development, or prototyping activities before they are placed on the market”.
A practical example is ChatGPT – a general-purpose AI system built on the GPT family of general-purpose AI models – which exploded into public life on 30 November 2022 and has had an impact in many different industries and professions.
The golden fleece
While it is clear that the AI Act is not a copyright-specific law, it is important to note that it contains provisions that protect the interests of copyright owners in an AI world.
Significantly, article 53 lays down obligations for providers of general-purpose AI models that relate to technical documentation, information sharing, compliance with copyright law, and public disclosure.
Interestingly, article 53(1)(b) is broader than just copyright law, in that it refers to the need to observe and protect intellectual-property rights and confidential business information or trade secrets in accordance with union and national law.
Article 53(1)(c) obliges providers of general-purpose AI models to put in place a policy to comply with EU law on copyright and related rights and, in particular, to identify and comply with (including through state-of-the-art technologies) a reservation of rights expressed pursuant to article 4(3) of Directive (EU) 2019/790 on copyright and related rights in the digital single market (the CDSM Directive).
Where a rightsholder reserves their rights under article 4(3), the reservation takes that rightsholder’s copyright content outside the text-and-data-mining (TDM) copyright exception. By opting out of the TDM exception, the rightsholder prevents the mining of their reserved works; thereafter, a third party may mine such content only with the rightsholder’s authorisation.
Article 4(3) states that the reservation should be effected “in an appropriate manner, such as machine-readable means in the case of content made publicly available online”.
Recital 18 of the CDSM Directive illuminates this requirement by stating that metadata should be incorporated into the machine-readable means, and that the reservation of rights could be effected by way of “terms and conditions of a website or a service”.
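By way of illustration only, one emerging approach to such machine-readable reservations is the TDM Reservation Protocol (TDMRep), drafted by a W3C community group, under which a rightsholder can signal an article 4(3) opt-out via an HTTP response header, an HTML meta tag, or a site-wide JSON file. The sketch below (using a hypothetical example.com domain) shows the general shape of these signals; whether any given AI provider will honour them remains, as discussed later, uncertain:

```
# 1. HTTP response header served with each protected resource
tdm-reservation: 1
tdm-policy: https://example.com/tdm-policy.json

# 2. Equivalent meta tag within an individual HTML page
<meta name="tdm-reservation" content="1">

# 3. Or a site-wide file at /.well-known/tdmrep.json
[
  { "location": "/", "tdm-reservation": 1,
    "tdm-policy": "https://example.com/tdm-policy.json" }
]
```

A separate, cruder approach seen in practice is a robots.txt directive blocking a named AI crawler (for example, `User-agent: GPTBot` followed by `Disallow: /`), though this addresses crawling rather than expressing a legal reservation as such.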
Article 53(1)(d) of the AI Act obliges the provider of general-purpose AI models to draw up and make publicly available a sufficiently detailed summary about the content used for training of the general-purpose AI model, according to a template provided by the AI Office.
Besides the all-important article 53, there are several recitals in the act that are relevant to copyright. They are recitals 104 to 109 (inclusive).
While recitals may not be legally binding, they have persuasive effect and are often used by the European Commission and EU courts to interpret or illuminate an ambiguous substantive provision.
Recital 105 is compelling, as it underscores the importance of rightsholder authorisation in the context of TDM where a copyright exception/limitation does not apply.
The recital relates to general-purpose AI models, and it acknowledges that large generative AI (‘GenAI’) models present unique innovation opportunities, but they also present challenges to artists, authors, and other creators in terms of the way their creative content is created, distributed, used, and consumed.
The recital states that the development and training of such models requires access to vast amounts of text, images, videos, and other data. TDM techniques may be used extensively for the retrieval/analysis of such content, which may be protected by copyright and related rights.
Recital 105 reiterates the requirement of rightsholder authorisation where a third party wishes to use copyright-protected content unless, of course, a copyright exception or limitation applies.
Referring to the CDSM Directive, the recital speaks of the TDM copyright exception and the fact that rightsholders may choose to reserve their rights over their works to prevent TDM.
The recital then sets out the logical conclusion that, where a rightsholder has opted out “in an appropriate manner”, then providers of general-purpose AI models need to obtain an authorisation from that particular rightsholder if they wish to mine the reserved content.
Daedalus and Icarus
Given the clear nexus between TDM and GenAI, it is unsurprising that there is a significant copyright crossover between the AI Act and the CDSM Directive.
From a rightsholder’s perspective, the all-important reservation of rights (article 4(3), CDSM Directive) is copper-fastened by way of article 53 of the AI Act.
This is important as GenAI continues to rapidly evolve, driven by fierce competition among the tech companies, for example, Microsoft/OpenAI, Meta (Llama 3), Apple (Apple Intelligence), and Google (DeepMind).
From a purely practical viewpoint, there is some uncertainty among rightsholders as to how they can create a legally effective reservation of rights.
Currently, there are no generally accepted protocols or standards concerning reservation of rights. While certain approaches are beginning to emerge, it remains unclear whether they will be acceptable to the major AI model providers. This uncertainty is complicating the important reservation-of-rights issue for rightsholders.
To dispel this legal uncertainty, the European Commission should intervene with clear guidance on the issue of reservation of rights. This guidance should focus on the practicalities of creating a reservation of rights/works under the CDSM Directive that will pass muster from both a legal perspective and a technological perspective (‘machine-readable means’).
Intriguingly, in parallel with the important legislative reiteration of the reservation of rights at EU level, there is high-profile litigation concerning GenAI and copyright-protected content in a number of different jurisdictions.
Unsurprisingly, the litigation spans several different industries. In the music sector, there is the Tennessee case (recently transferred to the courts of California) Concord Music Group et al v Anthropic PBC, as well as the very recently instituted proceedings in RIAA v Suno AI and Udio AI.
These latter cases concern AI-generated audio, with the Suno-related litigation taking place in Massachusetts and the Udio-related case being heard in New York.
In the publishing sector, there is the ongoing case between The New York Times and OpenAI/Microsoft being heard in the federal courts of New York.
Lastly, there are also legal proceedings in the visual-media sector. Getty Images is suing Stability AI in the English High Court, alleging illegal use of its images by the defendant in the training of its Stable Diffusion AI system.
Indications are that the case may proceed to trial in summer 2025. Should these cases go to trial, they will generate important judicial precedent for the common-law world.
As there is no TDM copyright exception in the US, American courts will have to examine and assess the alleged illegal activity through the prism of the fair-use doctrine.
Dr Mark Hyland is lecturer at the Faculty of Business, Technological University Dublin.