ARTICLE
26 December 2025

AI Intellectual Property And Transactions Digital Advertising Playbook

Lowenstein Sandler

Contributor

Lowenstein Sandler is a national law firm with over 350 lawyers working from five offices in New York, Palo Alto, New Jersey, Utah, and Washington, D.C. We represent clients in virtually every sector of the global economy, with particular strength in the areas of technology, life sciences, and investment funds.

Introduction

Intellectual property concerns are central to the evolving legal framework surrounding generative AI and machine learning. Training large language models involves ingesting significant volumes of copyrighted text, images, audio, and video data, which raises unresolved questions about direct and vicarious copyright infringement, as well as the potential reproduction of expressive elements in outputs.

Moreover, to provide accurate and current answers to user prompts, many AI systems rely on up-to-date information retrieved from the web through automated, bot-based processes known as retrieval-augmented generation ("RAG") and grounding. RAG and grounding work together: the system scrapes the web for current information and anchors its output to that retrieved material, so responses remain accurate, verifiable, and contextually relevant rather than relying solely on pre-trained model data.

In addition to concerns about copyrighted materials, the use of confidential, proprietary, healthcare, personal, and other sensitive information introduces further challenges related to data provenance, lawful sourcing, and the extent to which businesses can control how their information is ingested, transformed, or reused within AI systems. As AI tools increasingly rely on both licensed and unlicensed datasets to produce accurate and up-to-date outputs, these combined IP and confidentiality risks shape the contractual protections, governance practices, and compliance obligations necessary for responsible deployment.

Courts are currently assessing whether the ingestion of copyrighted material constitutes fair use and whether AI-generated outputs should be treated as derivative works, especially when they reflect stylistic or substantive similarities to protected source material. Some early U.S. federal court decisions have blessed AI model training on copyrighted content as fair use in creating "spectacularly transformative" technology, while cases regarding infringing AI outputs remain in their early stages. These disputes reflect a broader concern among creators, publishers, and rights holders that unlicensed training diminishes the value of their work and undermines established licensing markets.

INTENDED AUDIENCE

This playbook is intended primarily for legal and procurement professionals – both in-house and external – who evaluate contracts and agreements related to the use of data and intellectual property by AI-powered products. In the digital advertising context, this often appears in the form of content licensing agreements between publishers and AI developers, as well as terms of service and related contracts for tools, platforms, and services that rely on or enable AI within the advertising ecosystem.

HOW TO USE THIS PLAYBOOK

The aim of this playbook is to provide readers with practical guidance for navigating the contractual, technical, and intellectual property issues that arise from the use of AI. Although it is impossible to cover every use case, the agreements that are explored below may serve as a useful guide to drafting, amending, or negotiating agreements and terms of service that concern AI. More specifically, this playbook can be used in the following ways:

  • As a drafting reference when drafting or amending content licenses, data use or data processing agreements, or terms and conditions. In general, this is a useful reference for any agreement governing an engagement that involves, or could involve, AI ingestion, scraping restrictions, model training, or grounding/RAG access.
  • As a negotiating guide to understand, evaluate, and negotiate various provisions implicated by the unique issues concerning AI (e.g., system improvement rights and output ownership disclaimers) and determine when additional protections are needed.
  • As a benchmark for legal, privacy, product, and commercial teams, who need a shared framework for understanding inputs, outputs, derivative works, and reuse rights in the AI context.

DISCLAIMER

Interactive Advertising Bureau, Inc. ("IAB") provides this playbook as a practical guide and resource for general information. Please be aware that this playbook does not constitute legal advice, and if you have any legal questions, please consult your attorney. Although IAB has made efforts to assure the accuracy and currentness of the material in this playbook, it should not be treated as a basis for formulating business and legal decisions without individualized legal advice.

IAB makes no representations or warranties, express or implied, as to the completeness, correctness, currentness, or utility of the information contained in this playbook and assumes no liability of any kind whatsoever resulting from the use or reliance upon its contents.

Legal and Technical Overview

AI's wide availability and the manner in which AI models are trained have generated considerable legal discussion around how existing IP protections can be applied to a technology that mirrors, and in many contexts outpaces, human intelligence. The principal legal issue concerning generative AI is the fair use doctrine, under which the unlicensed use of a copyrighted work is permitted if, among other criteria, the use is sufficiently transformative. AI developers contend (and some courts have agreed in limited circumstances) that the ingestion of copyrighted works by an AI system is akin to humans learning how to read for purposes of writing their own original books.

Since 2020, numerous publishers, writers, and other content creators have filed lawsuits against AI developers alleging extensive, unauthorized use of their copyrighted works. Related litigation over web scraping and RAG has exposed the risks scraping poses to the publishing and digital advertising industries. Notably, in hiQ Labs v. LinkedIn, the Ninth Circuit found that scraping publicly available information did not violate the Computer Fraud and Abuse Act, which criminalizes unauthorized access to "protected computers."

RELEVANT CASE LAW

Bartz v. Anthropic. Multiple authors and publishers sued Anthropic for the use of their work in training Anthropic's LLMs. In an important partial ruling on the merits, the court found that Anthropic's use of lawfully obtained copyrighted materials was fair use, but that its use of pirated material was not. The parties ultimately settled the matter prior to resolution of the remaining issues, so the fair use decision was never appealed. The major takeaways are:

  • The finding was limited to the AI system's training materials as opposed to any outputs.
  • The court found that Anthropic's use of copyrighted works to train its AI system was "spectacularly transformative."
  • The court reasoned that such training on copyrighted material was akin to humans learning how to read, internalizing the content, and creating new works of authorship later.
  • The court found that the purpose of the training was not to reproduce the works, but to generate new, novel text.

Eschewing litigation, many publishers have elected to enter into headline-grabbing licensing deals with AI developers. The New York Times entered into a multi-year licensing agreement with Amazon; News Corp and Associated Press both entered into licensing agreements with OpenAI; and Reddit struck a $60 million per year licensing deal with Google. Although the agreements have not been made public, some important provisions have been reported. For example, the News Corp agreement with OpenAI has a five-year term and provides the media company with cash and credits to use OpenAI's products.

These cases – together with a growing number of pending cases – demonstrate a legal environment that is unpredictable and ambiguous with respect to IP issues. The federal government's interest in fostering AI innovation and economic growth may also shape the direction these legal issues take and how businesses approach AI-related agreements. Further, although unlikely, Congress could amend the Copyright Act or enact a new federal AI law that resolves these uncertain IP issues. Foreign jurisdictions have taken steps to modify their own copyright laws to create a competitive technical edge and encourage innovation. Notably, South Korea has announced plans to ease copyright rules to support AI development, and the EU's AI Office has issued guidance on how AI system developers can innovate while complying with EU copyright law. In the meantime, multiple courts in the United States are navigating legal issues related to the use of AI, with publishers and content creators spearheading this litigation.

ADDITIONAL RESOURCES

At the time of this writing, courts are actively deciding, among many other related issues, whether training on copyrighted material by AI model developers constitutes fair use and whether those developers can be held liable for infringement carried out with their tools. How courts decide these issues will fundamentally alter how AI is developed, augmented, and used.

IAB has published several playbooks and white papers, which, taken together, can be used to create a common framework and language to discuss issues related to AI use in digital advertising. These resources include:

  • Legal Issues and Business Considerations When Using Generative AI in Digital Advertising
  • AI Governance and Risk Management Playbook
  • AI in Advertising Primer
  • AI in Advertising Use Case Map
  • AI Personalization Playbook

MANAGING CRAWLERS AND SCRAPERS

AI agents and AI-driven search are cutting publisher traffic and ad revenue, while fueling a surge in bot scraping. These practices are placing significant strain on the economic sustainability of the open web. Due, in part, to the absence of any defined legal frameworks around AI, many organizations are evaluating technical solutions to help mitigate some of the financial impact caused by generative AI's massive ingestion of website content. To protect their bottom line, publishers are increasingly turning to multiple solutions to manage AI access, monetize content use for training purposes, and ensure accurate brand representation through standardized APIs and integration frameworks.

For decades, publishers (and website owners generally) have relied on the robots.txt protocol to control the access of bots and web crawlers. However, robots.txt is a voluntary protocol and works only if bots choose to honor it. As of the second quarter of 2025, research shows that many bots simply ignore publishers' crawler preferences. In addition, robots.txt operates on an all-or-nothing basis (i.e., bots are either blocked or allowed), so publishers cannot set conditions on access, such as licensing terms, usage restrictions, or monetization requirements.
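To illustrate, a robots.txt file names specific crawlers by their published user-agent tokens. GPTBot, ClaudeBot, and CCBot are real AI crawler tokens; the rules below are a hypothetical policy, not any particular publisher's file:

```
# Illustrative robots.txt: disallow known AI training crawlers site-wide
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

# All other crawlers (e.g., ordinary search engines) remain allowed
User-agent: *
Allow: /
```

Note that nothing in the file is self-enforcing: a crawler that ignores these directives faces no technical barrier, which is why publishers increasingly layer server-side controls on top of robots.txt.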

At present, there are a variety of technical solutions that publishers can add to their tech stack to protect their IP from being used for unauthorized training and scraping. These solutions empower the shift away from passive blocking toward active licensing and monetization by offering scrapers and crawlers an approved process for licensing the publisher's content for AI training or RAG.
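As a minimal, hypothetical sketch of the kind of server-side enforcement these tools perform (the policy table and function below are illustrative, not any vendor's actual product), a publisher might map crawler user-agents to access decisions rather than rely on voluntary robots.txt compliance:

```python
# Hypothetical policy table: AI crawler user-agent token -> status.
# Unlike robots.txt, this check runs on the publisher's server, so a
# non-compliant bot cannot simply ignore it.
BOT_POLICIES = {
    "GPTBot": "licensed",    # allowed under a negotiated content license
    "ClaudeBot": "blocked",  # no license in place; deny at the server
    "CCBot": "blocked",
}

def decide_access(user_agent: str) -> str:
    """Return 'allow' or 'deny' for a given User-Agent string."""
    ua = user_agent.lower()
    for bot, policy in BOT_POLICIES.items():
        if bot.lower() in ua:
            return "allow" if policy == "licensed" else "deny"
    # Requests that match no known bot token (e.g., ordinary browsers)
    # are unaffected.
    return "allow"
```

In practice, commercial bot-management products use far more robust signals than the User-Agent header (IP verification, behavioral fingerprinting), but the underlying shift is the same: access becomes conditional on the crawler's licensing status.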

REALLY SIMPLE LICENSING ("RSL") STANDARD

Rather than relying solely on the robots.txt protocol, the RSL standard enables publishers to embed machine-readable licensing terms directly into their web infrastructure. RSL terms specify when attribution is required, whether fees are associated with the content being scraped, and whether payment is on a pay-per-crawl or pay-per-inference basis. The RSL standard grants publishers and website owners the ability to:

  1. Define terms around usage and compensation;
  2. Automate licensing workflows via an Open Licensing Protocol; and
  3. Set a license fee for content that is otherwise ingested at scale without compensation.
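As a loose conceptual illustration only – the element names below are invented for this sketch and do not reproduce the official RSL schema – machine-readable licensing terms of this kind might express attribution, permitted uses, and pricing along these lines:

```
<!-- Hypothetical sketch; NOT the actual RSL schema -->
<contentLicense>
  <content url="https://example.com/articles/" />
  <permits use="ai-training" attributionRequired="true" />
  <payment model="per-crawl" amount="0.01" currency="USD" />
  <contact>licensing@example.com</contact>
</contentLicense>
```

The point of such a format is that a compliant crawler can read the terms programmatically and either accept them (triggering an automated licensing workflow) or decline and stay out, replacing robots.txt's binary allow/block with negotiable conditions.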

DATA MARKETPLACES

Other providers have created data marketplaces that can be used to make publisher content available for licensing in the context of AI training. Notably, some companies (such as Cloudflare) have rolled out tools that more effectively monitor and block bots and crawlers, which in turn allows them to enforce content licensing agreements and set pay policies for AI bots.

The IAB Tech Lab recently launched its LLM Content Ingest API Initiative (aka the AI Content Monetization Protocols / CoMP) to provide a technical framework that can help publishers control access to their content, enforce attribution, manage bot access, and negotiate monetization or licensing terms with AI systems.

Deciding on the right tool for your business will depend on many factors, but many large publishers have turned to a balanced set of technical solutions and licensing agreements. As bots and other web crawlers ingest greater amounts of web content for model training and RAG purposes, reclaiming, or at least redefining, how content is protected and valued becomes paramount. Technical controls place limits on who can access protected content, and how. License agreements do the same while providing additional clarity on compensation and ownership. Taken together, these two approaches can help balance the need for innovation with the economic realities of the internet.

Pre-Transaction Considerations

Conducting due diligence on your vendors is a critical first step that precedes any contract negotiation. Many industry vendors have already implemented processes and procedures to protect confidential business information, intellectual property, and personal information, which often serve as a foundation for AI-related due diligence.

Pre-transaction due diligence begins when business or marketing teams request to onboard or develop a new product, tool, or service. AI-specific gating questions should be built into existing requests for information, requests for proposal, privacy impact assessments (PIAs), and any other evaluations of organizational risk.

Building out an organization's information-gathering checklists for vendor due diligence is a critical step: it helps avoid wasted effort negotiating with vendors that do not meet the organization's standards, and it strengthens the organization's ability to negotiate for protections the vendor should be able to agree to based on its responses.

Enterprise vs. Public Models. A central question during pre-transaction due diligence concerns whether you will use a public or enterprise model. Below is a chart depicting some of the pros and cons of each model type.

Enterprise Model
  Pros:
    • Enterprise-grade security
    • Enhanced scalability
    • Custom admin controls
    • Greater IP protection and control over inputs
  Cons:
    • Higher cost
    • Greater governance burden

Public Model
  Pros:
    • Often cheaper than enterprise models or free of charge
    • Greater training inputs from a higher volume of users
  Cons:
    • Potential data and IP leakage
    • Little to no ability to control or audit data use
    • One-sided terms of service


The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
