To Mine Or Not To Mine - Training of (Generative) AI Models under the TDM Exception in EU Copyright Laws

To train generative AI models, developers usually have to gather large amounts of data (texts, images, code, music, etc.), most of it protected by copyright. Datasets are either provided online for download or assembled individually via crawling & scraping.

In both cases, preparing the data for AI training requires copying it and sometimes adapting it. These acts are relevant under copyright law and usually require a license. Since June 2021, the "text and data mining exception", introduced via the DSM Directive, has been part of national copyright law in the EU member states, permitting copies to be made without a license for the purposes of text and data mining (TDM). Since then, it has been debated whether the TDM exception applies to the training of (generative) AI models, and how rightholders can declare a reservation of their rights in a "machine-readable format".

The talk will present the first German court decision (September 2024) on the application of the TDM exception to AI analysis and give an overview of the debate in Germany on whether the TDM exception applies to AI training. Participants are also invited to join a constructive discussion on the feasibility of a machine-readable reservation and the importance of international or EU-wide (copyright) rules on AI training.