AI Litigation: Copyright Theft, Deepfakes, and Who Can Be Sued

1. What AI Litigation over Copyright Theft Involves and Why Training Data Cases Are Accelerating

Copyright theft in AI litigation refers to scraping copyrighted text, images, code, and other creative works from the internet, without license or compensation, to train commercial AI models. The plaintiffs bringing these cases argue that the resulting models were built on stolen property, regardless of how the AI companies characterize the process.

The New York Times filed suit against Microsoft and OpenAI in December 2023 alleging that millions of Times articles were used to train AI models that now reproduce and compete with Times journalism, directly substituting for the publication's own paid content. Getty Images has sued Stability AI in Delaware alleging that over 12 million Getty photographs were scraped without license to train Stable Diffusion, with the model producing outputs that reproduce recognizable Getty watermarks as artifacts of the unlicensed training. A separate class action, Andersen v. Stability AI, was filed by visual artists alleging the same training pipeline violated their copyrights in thousands of individual works. These cases share a common factual core: AI companies used other people's creative work to build valuable commercial products without asking, paying, or crediting the creators.

The AI companies' primary defense is fair use under 17 U.S.C. § 107, asserting that training an AI model is transformative use because the model learns statistical patterns rather than reproducing the original works. The plaintiffs counter that the models demonstrably reproduce training content when prompted, that active licensing markets for AI training data already exist and were bypassed, and that the effect on the market for original creative work is severe because the AI outputs directly compete with human creators. An attorney who handles copyright infringement lawsuits and AI training data matters can evaluate where a company's specific training practices fall on the infringement-to-fair-use spectrum before litigation forces that evaluation publicly.

How the Fair Use Defense Works in Training Data Cases and Where Courts Are Finding It Insufficient

Fair use under 17 U.S.C. § 107 requires courts to balance four factors: the purpose and character of the use including its transformativeness, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect on the potential market for the original. AI training data cases are testing each factor in ways no prior copyright case contemplated.

The transformativeness argument is the AI companies' strongest card and their most contested one. The Second Circuit's 2015 decision in Authors Guild v. Google, 804 F.3d 202, found that scanning books to create a searchable index was transformative because it did not substitute for reading the books. AI developers argue that training a model is similarly transformative because the model learns patterns rather than storing copies. Rights holders argue that the analogy fails because AI outputs, unlike a search index, can reproduce the original work's substance, style, and in some cases verbatim passages when prompted, making the use more substitutive than transformative.

The market harm factor has emerged as the decisive battleground because licensing markets for AI training data have developed rapidly, creating a market the AI companies bypassed by scraping. Now that the Associated Press, stock photo agencies, and publishers have entered paid AI training data agreements, the argument that unlicensed scraping caused no market harm is substantially weaker: the plaintiff can demonstrate that a market existed and that the AI company simply chose not to pay for access to it. An attorney who handles copyright laws and AI fair use analysis matters can evaluate the specific training data sources used, the model's output characteristics, and the existing licensing market to assess whether the fair use defense is viable.

AI Copyright Claim	What Was Taken	Example Defendants	Market Harm Theory
Text and journalism	Published articles and books	OpenAI, Anthropic	AI outputs substitute for paid journalism and books
Visual art and photography	Images, illustrations, photographs	Stability AI, Midjourney	AI image generators replace commissioned art
Source code	Open source and proprietary code	GitHub Copilot (Microsoft)	AI code generators reduce demand for programmers
Music and audio	Recordings, lyrics, compositions	Suno, Udio	AI music generation competes with licensed music

2. How Deepfakes Create AI Litigation Claims Across Multiple Legal Theories Simultaneously

Deepfake AI litigation arises when generative AI creates realistic fabricated images, audio, or video depicting a real person in situations they never experienced, and the victim's legal options span defamation, right of publicity, false endorsement under the Lanham Act, and biometric privacy claims depending on how the deepfake was created and how it was used.

The right of publicity is recognized under state law in most states and gives individuals the exclusive right to control commercial use of their name, image, and likeness. It is the most direct legal theory for deepfakes that appropriate a person's appearance for commercial purposes without consent. A deepfake advertisement that places a celebrity's face on a product endorsement they never agreed to, a deepfake video that uses a performer's likeness in content they would never have appeared in, and a deepfake audio clip that uses a musician's voice to generate new songs the musician never recorded can each violate the right of publicity as recognized in states including California, New York, and Tennessee. An attorney who handles rights of publicity and AI deepfake matters can evaluate which state's right of publicity law applies to the specific deepfake use and what remedies, including injunctive relief and damages, are available.

Defamation claims are available when a deepfake falsely depicts the plaintiff engaging in conduct that harms their reputation, such as fabricated video of a public figure committing a crime, making statements they never made, or participating in activity inconsistent with their character. The defamation analysis requires identifying the publisher of the false depiction for purposes of fault and damages, which in AI deepfake cases implicates the platform that distributed the content, the tool that generated it, and in some cases the person who prompted its creation. An attorney who handles AI deepfake and defamation matters can evaluate which party in the creation and distribution chain carries primary liability under the applicable state's defamation standard.

How Deepfake-Specific Legislation Is Expanding Victim Rights Beyond Common Law Claims

Common law defamation, right of publicity, and privacy tort claims existed before deepfakes, but their elements were designed for situations involving human creators making deliberate choices, not AI systems generating synthetic content at scale, which is why state legislatures have enacted deepfake-specific statutes that create new rights and remedies the common law did not provide.

California, Texas, Georgia, Virginia, and more than a dozen other states have enacted laws specifically addressing deepfakes, with the most significant focusing on three categories: non-consensual intimate deepfakes depicting real people in sexual content without consent, political deepfakes that falsely depict candidates or public officials making statements they never made during election periods, and commercial deepfakes that use a person's likeness for advertising or product promotion without authorization. These statutes create civil causes of action with specific damages provisions, some including statutory damages that do not require proof of actual harm, and in several states include criminal penalties for the most serious violations.

The Tennessee ELVIS Act, signed in March 2024 and effective July 1, 2024, specifically addresses AI voice cloning, prohibiting the use of AI to replicate a person's voice for commercial purposes without consent and creating civil remedies for voice cloning violations. The law's passage reflects the music industry's particular vulnerability to AI voice cloning, which allows AI systems to generate new recordings in the style and voice of specific artists without license or compensation. An attorney who handles AI deepfake and digital rights matters can evaluate which state's deepfake statute applies, whether the specific content triggers statutory damages, and how the statutory claim interacts with underlying common law theories.

Biometric privacy violations represent a significant AI litigation category that intersects directly with deepfake cases. Facial recognition AI systems that collect and process facial geometry, whether to identify individuals or to generate deepfakes from existing images, trigger biometric privacy obligations under the Illinois Biometric Information Privacy Act (BIPA) and similar statutes. BIPA provides liquidated damages of $1,000 per negligent violation and $5,000 per intentional or reckless violation without requiring proof of actual harm, and class actions under BIPA against AI companies that processed faces from large image datasets without consent have produced billion-dollar potential liability calculations. An attorney who handles biometric privacy violations and AI facial recognition matters can evaluate whether the AI system's image processing triggers BIPA compliance obligations and what consent and data handling practices reduce the exposure.
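The arithmetic behind those billion-dollar figures can be sketched in a few lines. The per-violation amounts below are the statutory figures quoted above; the class size and the assumption of one violation per class member are hypothetical inputs chosen for illustration, not data from any real case, and actual recoveries depend on how courts count violations and on settlement.

```python
# Illustrative sketch only. Statutory amounts come from the text above;
# the class size and one-violation-per-person assumption are hypothetical.
NEGLIGENT_PER_VIOLATION = 1_000    # dollars per negligent violation
INTENTIONAL_PER_VIOLATION = 5_000  # dollars per intentional or reckless violation


def bipa_exposure(violations: int, intentional: bool = False) -> int:
    """Return theoretical liquidated damages in dollars for a violation count."""
    if violations < 0:
        raise ValueError("violations must be non-negative")
    rate = INTENTIONAL_PER_VIOLATION if intentional else NEGLIGENT_PER_VIOLATION
    return violations * rate


if __name__ == "__main__":
    class_size = 1_000_000  # hypothetical: one violation per class member
    print(f"negligent:   ${bipa_exposure(class_size):,}")
    print(f"intentional: ${bipa_exposure(class_size, intentional=True):,}")
```

Under these assumed inputs, a class of one million people yields $1 billion in theoretical exposure at the negligent rate and $5 billion at the intentional rate, which is why dataset size drives the liability estimate.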

3. Who Can Be Sued in AI Litigation and How the Liability Chain Determines the Right Defendant

The question of who can be sued in AI litigation is often as contested as the underlying legal theory, because the AI supply chain involves multiple actors, including the company that built the foundation model, the company that fine-tuned or deployed it in a product, the platform that distributed the output, and in some cases the user who prompted the creation of the harmful content.

In training data copyright cases, the foundation model developer is the primary defendant because the infringement occurs at the training stage, and the companies that built the models on unlicensed data bear the primary copyright exposure regardless of who deployed the resulting model. The deployer who used a third-party model in its product may also face claims when it had reason to know the model was trained on unlicensed data or when the specific deployment amplified the infringement, such as by marketing the AI's ability to reproduce the style or content of specific creators. Contractual indemnification provisions in AI API agreements typically run from the deployer to the developer, not the other way, leaving deployers exposed to downstream claims from creators without contractual protection from their AI vendor.

In deepfake cases, the defendant analysis shifts based on the specific use: the platform that hosts and distributes deepfake content faces potential liability where Section 230's immunity does not clearly apply to AI-generated content the platform generated or substantially directed; the AI tool provider that sold access to a deepfake generation system faces right of publicity and potentially contributory copyright liability when the tool was used to create unauthorized content; and the individual who prompted and distributed the deepfake bears direct liability for defamation, harassment, and in states with deepfake-specific statutes, statutory violations. An attorney who handles AI-related fields and technology liability matters can evaluate which defendant in the specific AI harm's creation chain carries primary liability and what evidence establishes each party's role.

How Companies That Deploy AI Tools Inherit Liability They Did Not Create

Companies that integrate third-party AI into their products face litigation exposure they did not create but cannot entirely avoid, because deployment decisions, safeguard choices, and user disclosures each create independent legal obligations that the AI vendor's terms of service do not satisfy.

A company that deploys an image-generation AI in a consumer product has made specific decisions about which prompts to permit, what content filtering to implement, whether to include copyright attribution, and how to warn users about the AI's limitations. Each decision point creates potential liability: permitting prompts that generate deepfakes creates right of publicity and defamation exposure; failing to filter outputs that reproduce copyrighted training content creates secondary copyright infringement exposure; and failing to warn users that AI-generated content may be inaccurate creates consumer protection exposure when users rely on that content.
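As a rough illustration of how those decision points might surface in a product, the sketch below screens a prompt before generation and attaches a user-facing disclosure to every decision. Everything in it is hypothetical: the names in `BLOCKED_NAMES`, the phrases in `RISKY_TERMS`, the `screen_prompt` function, and the disclosure wording are placeholders, and simple substring matching is far weaker than the filtering a real deployment would need.

```python
from dataclasses import dataclass

# Hypothetical policy data: placeholder names and phrases, not a real rule set.
BLOCKED_NAMES = {"jane example", "john sample"}    # real people with no consent on file
RISKY_TERMS = {"exact copy of", "with watermark"}  # phrases suggesting reproduction
DISCLOSURE = "AI-generated content may be inaccurate."


@dataclass
class Decision:
    allowed: bool
    reason: str
    disclosure: str = DISCLOSURE  # shown to the user with every result


def screen_prompt(prompt: str) -> Decision:
    """Apply the placeholder rules to a prompt before any image is generated."""
    # Lowercase and collapse whitespace so spacing tricks do not evade matching.
    text = " ".join(prompt.lower().split())
    for name in sorted(BLOCKED_NAMES):
        if name in text:
            return Decision(False, f"depicts a real person without consent: {name}")
    for term in sorted(RISKY_TERMS):
        if term in text:
            return Decision(False, f"may reproduce protected work: {term}")
    return Decision(True, "no rule matched")
```

A call such as `screen_prompt("A photo of Jane Example at a rally")` would be refused under these placeholder rules, while a prompt that matches nothing is allowed and still carries the disclosure. The design point is that each rule, and each gap in the rules, is a choice the deployer made rather than the AI vendor.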

The deployer's exposure is not eliminated by the fact that the underlying AI was built by someone else, because the deployer is the entity that put the AI in front of users, collected revenue from its use, and made the specific product choices that determined what harm was possible. Terms of service between the AI developer and the deployer that disclaim liability for third-party claims do not bind those third parties, and the deployer faces direct claims from injured creators, depicted individuals, and deceived users regardless of contractual language between it and its AI vendor. An attorney who handles data privacy litigation and AI deployment liability matters can evaluate the specific deployment decisions that create the greatest exposure and identify what product changes and disclosures most effectively reduce the risk.

4. Frequently Asked Questions about AI Litigation

AI litigation questions arrive from creators who discovered their artwork appearing in AI-generated outputs without credit or compensation, from individuals whose face or voice was used in deepfake content they never authorized, and from companies evaluating whether the AI tools they have already deployed expose them to copyright or biometric privacy claims from the people whose data those tools processed. Those situations generate the following questions.

What Is AI Litigation and What Are Its Two Most Common Starting Points?

AI litigation encompasses lawsuits arising from artificial intelligence systems' development, training, deployment, and outputs. Its two most active categories are copyright theft claims by creators whose work was used without license to train AI models, and deepfake claims by individuals whose face, voice, or likeness was used in AI-generated content without consent. Copyright theft cases are brought primarily against the companies that built foundation models on scraped training data. Deepfake cases are brought against the platforms, tools, and individuals responsible for creating and distributing AI-generated fabrications of real people.

Can I Sue an AI Company for Training Its Model on My Copyrighted Work?

Yes, and many creators already have. The legal theory is infringement of the exclusive rights granted by 17 U.S.C. § 106, and class action cases filed by visual artists, authors, and news organizations against Stability AI, OpenAI, and other AI developers are currently pending in federal courts. The AI companies assert fair use as their primary defense, arguing that training a model is transformative use. Plaintiffs counter that the models reproduce training content in their outputs, that licensing markets for training data exist and were deliberately bypassed, and that the AI outputs directly substitute for and compete with the original creative work. The outcome of these cases will determine whether the training practices underlying most commercial AI products were lawful.

What Legal Options Exist for Deepfake Victims?

Deepfake victims have multiple legal options depending on the content and how it was used. Right of publicity claims are available when the deepfake uses the person's likeness for commercial purposes without consent. Defamation claims are available when the deepfake falsely depicts the person engaging in conduct that harms their reputation. Intentional infliction of emotional distress is available when the deepfake was created to harass or harm. State deepfake-specific statutes in California, Texas, Tennessee, Virginia, and more than a dozen other states create additional civil remedies, including statutory damages that do not require proof of actual harm, and some states have enacted criminal penalties for the most serious deepfake violations. An attorney who handles AI deepfake and digital rights matters can evaluate which theories apply and which defendants are reachable.

Is Section 230 a Complete Defense for Platforms That Host AI-Generated Deepfakes?

Section 230 immunity, which protects platforms from liability for third-party content, does not clearly apply to AI-generated content for several reasons. When the platform's AI system generated the deepfake rather than a human user submitting content, the platform is arguably the author rather than a neutral conduit, eliminating the third-party content protection that § 230 provides. When the platform has been notified of specific deepfake content and fails to remove it, some courts have found that continued hosting transforms the platform from a passive conduit into an active participant. State deepfake-specific statutes cannot override § 230, which preempts inconsistent state law, but they can supply claims in situations the immunity does not reach, such as content the platform itself generated. The § 230 protection that reliably protected platforms from user-generated content liability is being actively tested and narrowed in the AI-generated content context.

Who Is Liable When an AI Tool Is Used to Create a Deepfake: the Tool Provider or the User?

Both may face liability under different theories. The user who prompted and distributed the deepfake bears direct liability for defamation, right of publicity violations, and applicable state deepfake statute violations. The AI tool provider faces potential contributory liability when it designed the tool in a way that facilitated or encouraged unauthorized deepfake creation, failed to implement reasonable safeguards against misuse, or marketed the tool in ways that invited the harmful use. The platform that hosted and distributed the deepfake faces liability to the extent Section 230 does not apply. An attorney who handles rights of publicity and AI deepfake matters can evaluate which party in the chain of creation and distribution carries the primary damages exposure for the specific content.

How Does Biometric Privacy Law Connect to AI Deepfake and Facial Recognition Cases?

AI systems that process facial images to identify individuals, to generate deepfakes from existing photographs, or to train models on datasets containing real people's faces trigger biometric privacy compliance obligations in states with biometric privacy laws. Illinois BIPA requires written consent before collecting facial geometry and imposes liquidated damages of $1,000 to $5,000 per violation without requiring proof of harm. Class actions under BIPA against AI companies that processed millions of faces without consent have produced massive theoretical liability exposure. Companies using any AI system that processes human faces, whether for security, advertising analytics, content generation, or model training, need to evaluate BIPA and similar state statutes before deployment. An attorney who handles biometric privacy violations and AI compliance matters can audit whether the specific AI application triggers consent and data handling obligations and what remediation steps most effectively reduce existing exposure.

01 Jun, 2026
