Which Country’s Copyright Law Governs the Training and Development of Generative AI for Commercial Purposes? A Stress Test for Copyright Territoriality

The author of this post is Michiel Poesen, co-director of the Centre for Private International Law and Transnational Governance and lecturer of private international law at the University of Aberdeen.

This post considers which country’s copyright law governs the training and development (“T&D”) of generative AI (“GenAI”) for commercial purposes. It does so from a European Union perspective.

It first sets out the starting position under Article 8(1) of the Rome II Regulation: the T&D of GenAI is governed by the copyright laws of the country where the relevant training and development activities took place. It argues that this default position allows developers of AI models that underpin GenAI to offshore T&D activities to jurisdictions with lenient copyright protection laws. Then, this post critically engages with the potential extraterritorial application of EU copyright law to T&D activities that took place outside of the EU.

The post concludes that some new ideas are needed should policy makers want to extend EU copyright law to T&D activities that take place outwith the EU.

One of GenAI’s Many Copyright Problems

GenAI relies on complex AI models (usually LLMs – Large Language Models) that are able to generate various types of output, such as images, text, and/or video. To be able to generate that output, AI models are trained on massive datasets, which comprise data scraped off the internet by developers and/or included in existing databases. Often those datasets contain materials that are protected by copyright. If no relevant copyright exception applies, the developer of an AI model intended for commercial use would ideally obtain a license allowing them to use protected materials to train and develop their model. In reality, though, training datasets may include copyrighted materials that are downloaded, stored, and used for T&D purposes without the copyright holder’s permission.

Arguably, the use of copyrighted work to train and develop an AI model might constitute a copyright infringement. There is a plethora of copyright infringement lawsuits around the world that centre on exactly this issue (for instance Getty Images v Stability AI in the UK, discussed here). Whether there is any merit to an infringement case is not the focus of this blogpost: answering this question is left to substantive copyright law (for examples where a court accepted that a copyright exception applied, see for instance LAION v Kneschke, where the use of copyrighted work was justified for research purposes). This being said, an arguably more fundamental issue arises before we can decide whether the use of copyrighted work for T&D purposes constitutes a copyright infringement or not: which country’s copyright law should we apply to make that decision?

Copyright Territoriality: Applying the Law of the Country Where Training & Development Took Place

In the EU (and in most other legal systems, too), the starting position is relatively straightforward. T&D is governed by the copyright law of the country where it took place. This approach relies on two consecutive logical steps. As detailed in the next section, however, this fairly clear two-step approach has been muddled in the EU by the Artificial Intelligence Act (“AI Act”).

First, the basic principle to determine the law applicable to non-contractual copyright infringement is lex loci protectionis. If a copyright holder relies on copyright protection under the law of country X, then the copyright law of country X will apply to determine whether an infringement took place. This principle can be found in Article 8(1) of the Rome II Regulation. But this does not mean that copyright protection is automatically granted by the law of X.

In the second step, the law of the country thus identified decides whether copyright protection is afforded against the alleged act(s) of infringement. Crucially, this assessment has a territorial component. Because copyright laws are territorial in nature, protection is only granted against infringements that occur on the territory of the country on whose copyright law the copyright holder relies. Applied to the T&D of an AI model, this means that protection is only afforded if the T&D activities (i.e. the relevant act(s) of infringement) took place in the country under whose copyright law the copyright holder claims protection. To illustrate, if copyrighted materials were downloaded and stored for T&D purposes on a server in country Y yet the copyright holder claims protection under the law of country X, the copyright law of X will not afford protection. The copyright holder might be able to rely on copyright protection in country Y, though, if they can show that their work is granted copyright protection there.

It should be noted that T&D can comprise multiple acts of infringement. An excellent overview of the various steps involved in the T&D of an AI model is given here by the U.S. Copyright Office. In general terms, the law governing each individual act of infringement should be determined according to the approach outlined above. However, some jurisdictions may accept that secondary infringements are subject to the law applicable to the primary infringement. Whether certain infringing acts occurring during T&D ought to be qualified as secondary infringements is subject to debate and ultimately is a matter of how the claimant chooses to frame their claim under substantive copyright law.

The two-step approach discussed here has a major flaw: it facilitates “forum shopping”. AI developers can exploit it to offshore servers and/or physical devices used during the T&D process to jurisdictions with weak copyright protection. Once the model is trained, it can then be introduced in one or more countries that have stronger copyright protection. To illustrate, a developer in country Z may decide to train and develop a model in country X to avoid strict copyright laws in Z which put stringent conditions on the use of copyrighted work for T&D purposes. Once T&D is completed, the model can then be placed onto the market in Z.

The forum shopping problem is not merely hypothetical. Forum shopping may be attractive given that copyright law is characterised by a high degree of national and regional diversity. To illustrate, EU law gives copyright holders the right to object to the use of their work by developers to train and develop an AI model, through a machine-readable opt-out declaration (see Article 4 of the CDSM Directive). The approach described above would allow a developer to circumvent the EU’s opt-out mechanism by offshoring T&D activities to a third country that does not allow copyright holders to opt out in this way (for example the US, where a more general “fair use” test is applied).

A Better Solution? Copyright Extraterritoriality

By now, it has become clear that the forum shopping problem is important, for it allows developers of AI models to avoid copyright protection by strategically relocating their T&D activities to a jurisdiction that has more favourable copyright laws. A growing number of scholars argue that the AI Act helps to address the forum shopping problem through the “market entry requirement” (a term used by Abbamonte (2024) 46 European Intellectual Property Review 479, 485). According to this approach, EU copyright law (including the CDSM Directive’s opt-out mechanism) applies to models that were trained and developed outside of the EU yet placed onto the Union market. The suggested approach is by no means new. It is also adopted in similar guises in other EU instruments, such as the General Data Protection Regulation (GDPR), Article 3(2)(a) or the Digital Services Act, Article 2(1). However, the “market entry requirement” rests on a somewhat intricate argument when applied to conflicts of copyright laws.

Various scholars (e.g. Peukert, 2024; Stieper and Denga, 2024; Rosati, 2024) have argued that the “market entry requirement” follows from Article 53(1)(c) of the AI Act, according to which providers of “general-purpose AI models” or “GPAI” (which includes LLMs used in GenAI systems) should “put in place a policy to comply with Union law on copyright and related rights”. This obligation applies if a GPAI model is placed onto the Union market per Article 2(1)(a) of the AI Act. Therefore, if a US developer seeks to place an LLM onto the EU market, then they should draft a policy that explains how they will comply with EU copyright law. Scholars argue that the obligation to put in place such a policy would be meaningless unless providers of AI models also must comply with EU copyright law if they place a model onto the EU market. In other words, models that are placed on the EU market yet trained outside of the EU should comply with EU copyright law.

While I agree that the forum shopping problem is highly relevant, I for one am not sure that the AI Act helps to address it. I am not convinced that Article 53(1)(c) of the AI Act extends EU copyright law to T&D that took place outside of the EU. The obligation to put in place a copyright policy is just that. It does not touch upon the territorial scope of application of EU copyright law. Admittedly, Recital 106 of the AI Act lends some support to the “market entry requirement”, yet the text of the AI Act is perfectly clear and does not leave room for interpretation as to its meaning. It is my view therefore (which I share with Quintais, 2025 and 2024) that GenAI model training and development is subject to the copyright laws of the country where the relevant T&D activities took place. Which reading is the right one will need to be clarified sooner rather than later, since the issue is of great importance to the AI industry and creators whose work is used for T&D.

There may be alternatives to the “market entry requirement” to address the forum shopping problem. For example, one could argue that T&D should be governed not by the law of the country or countries where it took place, but by the law of the country where the server storing the protected material used for T&D is located. For instance, if a copyright holder’s work is stored on a server in country X, then the copyright laws of country X will determine whether the work can legally be used to train and develop an AI model in country Y. Peukert called this the lex scraping approach, as it subjects T&D to the law of the country where the server on which the scraped material was stored is located. Another alternative would be the country of origin approach, according to which T&D is governed by the copyright law of the country where the infringer is resident (in the EU, habitual residence could be used).

The drawback of both the lex scraping approach and the country of origin approach is that they too merely recreate forum shopping in a different form. The copyright holder might place their work on a server in a jurisdiction that has favourable copyright protection laws; the developer may relocate its habitual residence to a jurisdiction with lenient copyright laws.

New Ideas Are Needed

To conclude, the current territorial application of copyright law allows developers of GenAI systems to forum shop by offshoring the T&D of their AI models to jurisdictions with lenient copyright protection laws. The AI Act does not seem to alleviate this forum shopping problem, although several scholars have argued the opposite. Other approaches, such as the lex scraping approach or the country of origin approach, merely reproduce forum shopping in a different form.

Some new ideas seem to be needed should policy makers aim to address the forum shopping problem. Territoriality as currently understood seems to be inadequate. It is conceived as geographical territoriality, since it is built on the assumption that the law of the country where one or more relevant acts of infringement take place has a legitimate claim to regulate an infringement. However, T&D activities lack any real geographical touchdown. The servers and/or physical devices used can be relocated essentially anywhere and often are located in more than one country. Instead, it might be more fruitful to conceive of territoriality in terms of the effects of an infringement in a country. For instance, if a model is trained outside the EU with work that is protected by copyright in the EU and subsequently marketed in the EU, the effects of the infringement are likely to be felt in the EU. Alternatively, one could also allow courts to determine the applicable law by putting relevant connecting factors in the balance, such as the infringer’s habitual residence and place of business, the location of infringing activities, or the place where damage was caused. For an example of such a balancing approach, see Article 3:603 of the Principles for Conflict of Laws in Intellectual Property (“CLIP Principles”).

If policymakers in the EU intend to address the forum shopping problem that is raised by the T&D of AI models, some new ideas are needed about the connecting factors used to determine which country’s copyright laws are applicable.
