Navigating Copyright Challenges in AI Training: The Case of DeepSeek AI and More

Understanding AI, Copyright, and Legal Implications
As artificial intelligence (AI) technology advances rapidly, significant legal and ethical questions are emerging, particularly about how AI models are trained. Recently, allegations have surfaced that DeepSeek AI, a newer entrant in the AI landscape, may have used outputs from existing AI models, such as those from OpenAI, in its training process. If a lawsuit arises from these claims, it could be the first of its kind, as there is no established precedent for a dispute between two AI companies over these specific issues.
Key Legal Questions About AI-Generated Content
A central question in these allegations revolves around copyright: Can AI-generated outputs be copyrighted? If they can, does using such outputs to train another AI model infringe on copyright laws?
The answer is complex and varies across different countries. In some places, there is hesitation at the policy level to grant copyright protection to outputs produced solely by AI. In jurisdictions where copyright necessitates human authorship, works created entirely by computers are usually not protected. The rationale is that the control exercised by the person prompting the AI isn’t enough for copyright to apply. Nevertheless, AI-assisted creations can still fall under copyright protection. For instance, the U.S. Copyright Office recently confirmed that an artist who modified parts of an AI-generated image could claim copyright for their work.
On the other hand, the UK does acknowledge copyright for computer-generated works. However, discussions are ongoing about whether purely AI-generated content should maintain this protection in the future. Even in regions where such protection is available, AI model creators may face challenges asserting their rights unless they establish ownership over the outputs via licensing agreements or other legal measures.
A Closer Look at Licensing Issues
Besides copyright concerns, there is also the possibility that using OpenAI’s ChatGPT to train another AI model could violate its user license agreement. Many companies, including OpenAI, have terms of service that prohibit using their AI services to train competing AI models.
However, proving a breach of these licensing terms is notably challenging. Because AI models draw on vast datasets to generate outputs, they are unlikely to reproduce any single source exactly. In addition, a secondary model might be trained on a parent model’s outputs through repeated queries, which puts its own results at a further remove from the original content. Unless one model generates outputs identical to another’s, demonstrating that it was trained on that model’s content would be difficult.
Showing that outputs are similar may also not be enough. Unless an AI model’s results consistently show identifiable traits that distinctly correspond to a competing model, it can be argued that models trained on similar datasets would naturally produce similar responses. The sheer volume of AI-generated content makes it increasingly difficult for a claimant to prove substantial copying.
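To make the evidentiary point concrete, here is a minimal Python sketch of one simple way to quantify how similar two model outputs are: the Jaccard overlap of their word trigrams. The function names and the choice of trigrams are illustrative assumptions, not a method drawn from the allegations or known to be used by any company or court.

```python
def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_overlap(a, b, n=3):
    """Jaccard similarity of the word n-gram sets of two texts (0.0 to 1.0)."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a and not grams_b:
        # Both texts are shorter than n words: nothing to compare.
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical outputs from two different models answering the same prompt.
output_a = "the court held that the work lacked human authorship"
output_b = "the court held that the image lacked human authorship"
print(round(jaccard_overlap(output_a, output_b), 2))
```

A high score shows only that two texts share phrasing. By itself it cannot distinguish a model trained on a competitor’s outputs from two models independently trained on similar data, which is exactly the difficulty a claimant would face.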
Navigating the Risks of Litigation
The difficulty of proving copyright infringement involving AI, coupled with the risks of prolonged litigation, such as adverse publicity and heightened regulatory scrutiny, means that many disputes may be settled confidentially rather than reaching the courtroom. Nevertheless, a pioneering case in this field, where one AI company is accused of copying another’s outputs, could establish a crucial legal precedent. If a court provides a definitive ruling on these matters, it could significantly influence how AI companies approach training and copyright compliance in the future.
The situation surrounding DeepSeek AI exemplifies the evolving legal hurdles in AI training and copyright law, indicating that current laws are struggling to adapt to the swift changes in technology. Beyond copyright concerns, contractual obligations and claims from third parties introduce even more complexity to these discussions.
As AI continues to transform various industries, legal and regulatory frameworks will need to evolve to reconcile innovation with the need to protect intellectual property rights. The outcome of any dispute between OpenAI and DeepSeek AI, or of similar disputes, will likely play a pivotal role in shaping the future of AI development and intellectual property legislation.