Tristan Radtke

Das Urheberrecht als (KI‑)Innovationsbremse in der Rechtswissenschaft?

Section: Articles
Volume 17 (2025) / Issue 1, pp. 1-52 (52)
Published 5 March 2025
DOI 10.1628/zge-2025-0002
Description
The Article explores the interplay between copyright law and the use of large language models (LLMs) in legal research. Both the training of LLMs and their use for researching and evaluating linguistic works involve reproductions, which brings the subsequent data analysis indirectly within the purview of copyright law. This dynamic raises concerns that copyright may inhibit innovation in legal research involving LLMs, especially because rights ownership in Germany, and potentially in other countries, is concentrated among a few legal database providers. Access to data compatible with the use of LLMs in legal research is thus contingent not only on legal database subscriptions but also on the (limited) functionality offered by these databases, such as search tools and interfaces. The Article identifies three main strategies to mitigate these dependencies and foster innovation within the bounds of copyright law. First, the use of public domain judicial decisions and open access publications, including works licensed under Creative Commons, can facilitate scraping and text and data mining (TDM) strategies for training LLMs or as input for LLM-assisted research. Second, strategies could enable LLM training without copyright-relevant reproductions by relying on derivative components of works that fall outside the scope of copyright protection. Such approaches highlight the LLM's reliance on linguistic syntax and general cultural knowledge rather than specific copyrighted works, raising questions about whether these elements require independent legal protection. Third, German copyright provisions implementing Art. 3 DSM-Directive and Art. 5 and 6 InfoSoc-Directive, particularly §§ 44a and 60d of the Copyright Act (UrhG), combined with § 95b UrhG on the circumvention of technological protection measures, provide legal pathways for detaching from the limited functionality of the legal databases and for permitting crawling and scraping as a prerequisite for TDM in connection with LLMs. These provisions allow researchers, on the basis of the text and data mining exception, to use LLMs for innovative purposes, such as automated analysis of legal texts or extraction of legal arguments, provided the research pertains to specific projects and avoids data hoarding. While the legal framework protects the interests of rights holders, it imposes significant restrictions on the use of copyrighted works in LLM-related research. Lawful access remains a prerequisite, and further TDM-based use is potentially subject to licensing fees unless the materials are open access. Given the market dominance of a few database providers, adjustments in pricing models are likely. The advantages of § 60d UrhG thus lie in reducing transaction costs and enabling effective crawling of legal databases under § 95b UrhG. These advantages can contribute to unlocking innovative LLM applications in legal scholarship. Future legislative responses to the evolving tensions between copyright and LLMs will likely shape this emerging field and warrant continued academic scrutiny.