ARTICLE
11 November 2025

Getty Images v Stability AI – The Most Important AI Legal Decision To Date

William Fry

Contributor

William Fry is a leading corporate law firm in Ireland, with over 350 legal and tax professionals and more than 500 staff. The firm's client-focused service combines technical excellence with commercial awareness and a practical, constructive approach to business issues. The firm advises leading domestic and international corporations, financial institutions and government organisations. It regularly acts on complex, multi-jurisdictional transactions and commercial disputes.

In a judgment that will reverberate through courtrooms and boardrooms for years to come, Justice Joanna Smith DBE has delivered the most impactful legal decision yet on the nature of artificial intelligence and copyright law.

The case, Getty Images v Stability AI, began with sweeping allegations that an AI company had committed wholesale copyright infringement on a staggering scale. It ended with a ruling that may fundamentally reshape our understanding of what an AI model actually is under the law.

Getty Images, one of the world's largest visual content libraries, alleged that Stability AI had trained its Stable Diffusion image-generation model on millions of Getty's copyrighted photographs. When Stability released Stable Diffusion publicly, it published a model card stating that the model was a latent diffusion model using a fixed, pre-trained text encoder, and that it had been trained on a large-scale dataset of nearly six billion image URLs known as LAION-5B.

By the time the judgment was handed down, Getty had abandoned most of its claims. What remained was a narrow dispute over trade mark infringement and a novel legal theory that sought to apply decades-old copyright concepts, designed initially to combat bootleg VHS tapes sold at flea markets, to the cutting edge of machine learning technology.

The court's conclusion was unequivocal. Model weights, the mathematical parameters that constitute the essence of an AI system, are not copies of training data. They store nothing. They reproduce nothing. They are, in the judge's words, "purely the product of the patterns and features which they have learnt over time during the training process."

What follows is an analysis of the judgment and its implications for the collision between nineteenth-century legal concepts and the cutting edge of twenty-first-century technology.

Initial Observations and Technical Background

At the outset of the judgment, Smith J expressed frustration that shortly before closing submissions, the claimants, Getty Images, had abandoned various aspects of their claim. While this narrowed the issues to be determined by the court, the judge commented on how it rendered large parts of the opening submissions and evidence irrelevant.

The judge stated that AI models use an artificial neural network architecture designed to approximate the structure of synaptic connections in the brain. However, this is something of an oversimplification that is potentially unhelpful. While artificial neural networks are inspired by biological neural networks, they do not approximate their structure in any meaningful biological sense.

The Collapse of Getty's Main Claims

Training and Development Claims Withdrawn

The judge stated that, notwithstanding the pleaded case, Getty Images now acknowledged that there was no evidence to support the claim that the training and development of Stable Diffusion took place in the UK (the Training and Development Claim). It was also acknowledged that the type of prompts alleged to have been used to generate the examples of infringing output relied on in evidence had been blocked by Stability, meaning that the relief to which Getty Images would have been entitled in respect of its allegations of primary infringement had, in effect, already been achieved. The so-called Output Claim was therefore also abandoned. Given its inherent link to the Training and Development Claim and the Output Claim, the claim for database right infringement could no longer be advanced either.

The primary reason for dropping these claims was that Getty Images was unable to prove that the training and development of Stable Diffusion took place in the UK. That training and development, it was alleged, would have involved unlawful copying, and it would have formed the basis of the claim.

The critical jurisdictional point was that Stable Diffusion was originally released following an agreement between certain parties under which Stability AI provided access to cloud hosting and processing services made available to it by Amazon Web Services (AWS). Critically, this AWS cluster was located outside the UK. Stability stated that it used the AWS cluster to promote the development and growth of open-source machine learning models.

These dropped claims were, in essence, the crux of the case, and once they were abandoned many believed that the resulting judgment would be inconsequential. That may no longer be the case.

What Remained

All that was left, then, after Getty Images withdrew those claims, was that the normal use of Stable Diffusion by users in the UK would, in some cases, generate synthetic images bearing Getty Images' own trade marks, contrary to trade marks law, and that this could constitute misrepresentation or passing off in UK and Irish law. The other remaining claim was that the actual Stable Diffusion model was an infringing copy of the copyright works. Getty Images contended that Stability AI had imported into the UK an article, namely the model Stable Diffusion, which was an infringing copy of the copyright works.

Getty Images did not actually say that Stable Diffusion itself was a copy or that it stores any of the copyright works. But they contended that the model was an infringing copy under the UK copyright legislation because the making of its model weights would have constituted infringement of the copyright works had it been carried out in the UK.

Getty Images asserted copyright infringement during the training and development of the model by Stability AI, in respect of millions of copyrighted works owned by Getty Images or held by it as an exclusive licensee. However, the judge did not address this, as the training did not occur in the UK and the claim was subsequently withdrawn.

Rapid Technological Evolution

Something particularly interesting was highlighted in the judge's discussion of the speed at which this technology is developing. The academic paper on latent diffusion that led to the image-generating AI was first published in December 2021; it was, for image generation, the equivalent of the "Attention Is All You Need" paper (the research paper that introduced transformer architectures, which underlie today's large language models). The concept of latent diffusion, first conceived in late 2021, was commercially rolled out as a text-to-image diffusion model in a remarkably short period. What is striking from reviewing the generated images submitted as evidence in the case is how poor those early images were, and how extraordinarily quickly the technology has improved in a few short years.

Model Versions and Training Data

The difference between the model versions was important. In November 2022, Stability launched Stable Diffusion v2.0. The judge noted that the Stability GitHub page explained that v2.0 had been trained from scratch. In other words, it did not utilise any of the model weights obtained from training the v1 models. It was a fresh start using a fresh data set and a different text encoder model.

Interestingly, v2.1 was trained on a subset of the LAION-5B dataset, which contains adult, violent, and sexual content. In contrast to the v1 model cards, however, the v2 model card went on to say that, to mitigate the problems with the LAION dataset, the dataset was filtered using LAION's own NSFW (not safe for work) detector, applied with a conservative "p-unsafe" score of 0.1.
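To make concrete what that kind of dataset filtering involves, the sketch below shows the general idea. It is a minimal sketch only: the field names ("url", "caption", "punsafe") and the threshold handling are illustrative assumptions rather than LAION's or Stability's actual tooling.

```python
# A minimal sketch only: filtering dataset records by an NSFW-detector score.
# Field names and threshold handling are assumptions for illustration.

P_UNSAFE_THRESHOLD = 0.1  # the conservative figure referred to in the v2 model card

def filter_records(records, threshold=P_UNSAFE_THRESHOLD):
    """Keep only records whose predicted probability of being unsafe is below the threshold."""
    return [r for r in records if r.get("punsafe", 1.0) < threshold]

sample = [
    {"url": "https://example.com/a.jpg", "caption": "a red apple on a table", "punsafe": 0.02},
    {"url": "https://example.com/b.jpg", "caption": "flagged content", "punsafe": 0.63},
]
print(filter_records(sample))  # only the first record survives the filter
```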

However, from the Stable Diffusion XL model card onwards (SDXL v1.0), the model cards did not record how the models were trained.

The LAION Dataset and Copyright Litigation

The judge stated that it was common ground that all versions of Stable Diffusion were trained using various subsets of the LAION-5B dataset. This dataset was assembled by LAION, a not-for-profit organisation registered in Hamburg. It comprises nearly six billion pairings of image URLs with text captions (much like teaching a child words using flashcards: the word "apple" paired with a picture of an apple).

Interestingly, LAION was the subject of entirely separate proceedings before a lower court in Hamburg, where a German photographer sued it for copyright infringement over its assembly of this dataset of image URLs scraped from the internet. LAION won its case by successfully arguing that it could rely on Article 3 of the Copyright Directive, which permits text and data mining for research purposes (the research exception).

However, the limitation of the research exception is that material mined under it cannot be used for commercial purposes. Stability AI nevertheless used the LAION dataset to train its model, which was a commercial model. An interesting element of the case was the judge's observation that Stability admitted that, when the LAION-5B dataset was published, Stability AI had donated support to LAION in the form of hosting services comprising access to the AWS cluster. Information contained in the model cards for various iterations of Stable Diffusion indicates that the models were trained on the LAION datasets.

The Critical Question of Copying

Paragraph 48: The Materialisation Process

Paragraph 48 of the judgment is absolutely critical. The judge stated that it was common ground that, for training purposes, it would have been necessary for Stability AI to download the images from the URLs in the LAION subsets, a process known as materialisation. Stability AI itself stated that the training process involves downloading and storing copies of each image obtained from the URLs in the relevant dataset on Amazon's AWS cluster, then retrieving those images and making temporary copies of them in the VRAM of the GPUs performing training on the AWS cluster. This is why Paragraph 48 is critical, as the judge here sets out the actual infringing act that occurs in AI training. This is where the copies of copyrighted material are made. This is where the reproduction requiring a licence takes place. However, as previously stated, Getty Images withdrew its training infringement claim, and (unfortunately for AI/copyright enthusiasts) the judge did not have to make a decision on that point.
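To make the "materialisation" step concrete, the following is a minimal sketch of what downloading and storing an image from a dataset URL involves. It is illustrative only and is not Stability's actual training pipeline; the URL handling and file naming are assumptions.

```python
# A minimal sketch of "materialisation": downloading an image from a dataset URL so that
# a copy exists on local storage before training. Illustrative only.
import urllib.request
from pathlib import Path

def materialise(url: str, out_dir: Path) -> Path:
    """Download the image at `url` and store a copy on disk; return the stored path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / url.rstrip("/").split("/")[-1]
    with urllib.request.urlopen(url) as response:
        out_path.write_bytes(response.read())  # a reproduction of the work now exists locally
    return out_path

# During training, each stored file is then decoded and copied again, temporarily, into the
# VRAM of the GPUs as part of a training batch, a further transient reproduction.
```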

Stability's Admissions

Interestingly, Stability did acknowledge that at least some of the LAION datasets contained URLs referencing images on Getty websites, and that these were likely used during the training of Stable Diffusion. Stability also accepted that Stable Diffusion may be used to generate synthetic images which include marks in the form of Getty Images watermarks. However, it argued that where a user generated such images, this was the result of third-party use of Stable Diffusion, that any generation of watermarks did not amount to use of any sign or trade mark in the course of trade, and that watermarked synthetic outputs could only really be obtained with the wilful contrivance of the user.

The Sample Works

Even though Getty Images claimed that Stable Diffusion was trained on millions of Getty Images' works, they chose to rely on 11 sample works, identified as works A-K, for the purpose of establishing subsistence and ownership of copyright.

In the recitals to an order made in January 2025, Stability confirmed that, for the purposes of the proceedings, it would not seek to challenge the case advanced by Getty that certain sample works were used in the training of some of the Stable Diffusion models.

The CSAM Dispute

An interesting element of the trial was that, on its first day, Stability AI sought clarification as to whether an allegation in the claim, to the effect that Stable Diffusion could be used to create images containing pornography, violent imagery, and propaganda, and that any association with that type of content would tarnish the reputation of Getty Images' trade marks, could properly be read as referencing child sexual abuse materials (CSAM). There was considerable legal argument surrounding this issue, but the judge ultimately determined that the pleadings did not include a reference to CSAM. This decision was appealed, but the Court of Appeal agreed with the High Court. Getty Images then attempted to amend their claim to include a specific reference to CSAM, but this was dismissed and not appealed.

The Trade Mark Infringement Case

The Core of Getty's Remaining Claim

Regarding the trade mark infringement aspect, this was all that remained in the claim after Getty Images withdrew the copyright infringement and database infringement elements. Getty Images alleged that Stability committed acts of trade mark infringement because the outputs from Stable Diffusion contained watermarks that were essentially the same as Getty Images' watermarks. Stability argued that it was not correct in a trade mark infringement context to rely on the use of the models per se rather than the use of an infringing sign. However, the judge felt that this was the only way to address the trade mark issue.

Watermark Generation Experiments

Stability admitted in its defence that it may be possible to generate synthetic image outputs featuring Getty Images watermarks. In its closing, it accepted that the Getty watermark experiments (experiments that both sides carried out to see if Stable Diffusion could produce watermarks) showed that it was possible to push the models to generate watermarks, but argued that these experiments were contrived and that they failed on certain models. Importantly, and this is very relevant for the ultimate judgment, Stability stated that Getty Images had not been able to prove at trial that any UK user of the various models in issue had ever been presented with a watermark on a Stable Diffusion synthetic image within the UK.

Understanding Memorisation

The Technical Concept

In the agreed technical primer for the case, the experts discussed what is called memorisation, which is a major feature of the New York Times' claim against OpenAI. As the judge explained in simple terms, this involves reproducing an existing work through AI output. In the New York Times case, it concerned the reproduction of New York Times articles; however, in this instance, the analysis pertained to the potential reproduction of images used in training.

The technical primer noted the following:

"The network's weights are optimised on the training data, but its goal is to perform well on previously unseen data. In the context of Stable Diffusion, unseen data means new random noise patterns and/or new text inputs. To work reasonably on such new data, the network must be able to 'generalise': to recognise and understand the general patterns and rules in the training data and be able to apply them in a different context.

If a network has been trained for too long on the same training data or an insufficiently diverse training data, it can be prone to 'overfitting'. Overfitting occurs when the network uses its weights or part of its weights to memorise the individual training images rather than representing a large set of training images jointly with these weights. Overfitting is characterised by small errors on the training data, but a high error rate on new, unseen data. Overfitting is an undesired feature in machine learning, which engineers try to avoid.

Deep networks can both generalise and memorise at the same time. In such case, the network uses most of its weights to represent general patterns in the data, but uses some part of its weights to memorise individual patterns. The presumed primary cause for memorisation is duplication of training data, either by explicit duplication or by training the network for too many epochs, in conjunction with patterns that cannot be easily represented together with other patterns in the dataset – so-called 'outliers'".

The Watermark Experiments

Experimental Methodology

The judgment contains a long and detailed analysis of the various experiments both sides carried out to see if they could produce Getty Images watermarks in the model outputs. One of the experts described the Getty Watermark experiments as an adversarial attack and likened them to academic research, in which researchers had attempted to determine whether a model is capable of being manipulated into generating specific outputs. The experts agreed that the Getty prompts likely overestimated the general prevalence of watermarks.

Using Caption-Based Prompts

One of the interesting ways in which the Getty watermark was generated was by using prompts that were the captions for the images Getty Images claimed were used by Stability AI. One of them, for example, was a lengthy and complicated prompt about Barack Obama meeting the French President on the White House lawn, and it produced what, by 2025 standards, was a comically poor AI-generated image, but it did have the Getty Images watermark.

Statistical Analysis of Prompts

An interesting feature of the case was the analysis of the prompts used in the Getty Images experiments. Both sides were trying to generate images identical to training images, or images reproducing the Getty Images watermark. Success came when verbatim copies of the captions associated with Getty Images photographs were used as prompts. However, as the judge noted, of all the prompts analysed (which was only a tiny fraction of the total prompts available), just 690, or 0.15% of the total, met the specific criteria: English-language prompts entered by users since the beginning of 2024 in which five or more words matched the words in an image caption associated with Getty Images.
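By way of illustration, the sketch below shows the kind of word-overlap criterion described above (five or more shared words between a user prompt and a Getty-associated caption). It is a simplified, assumption-based example, not the experts' actual methodology, and the caption shown is a hypothetical paraphrase rather than a real Getty caption.

```python
# Illustrative sketch of a word-overlap criterion: flag a prompt if five or more of its
# distinct words also appear in an image caption associated with Getty Images.

def matching_words(prompt: str, caption: str) -> int:
    """Return the number of distinct words shared by the prompt and the caption."""
    return len(set(prompt.lower().split()) & set(caption.lower().split()))

caption = "Barack Obama meets the French President on the White House lawn"  # hypothetical
prompt = "photo of Barack Obama and the French President at the White House"
print(matching_words(prompt, caption) >= 5)  # True: this prompt would meet the criterion
```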

There was also discussion of whether users were entering verbatim Getty Images captions as prompts, but this did not appear to be happening. The one verbatim prompt identified was "a flower in the middle of a desert," which the judge and the parties accepted was a trite phrase not unique to Getty Images.

Limited Value of Experiments

Ultimately, both experts stated that the watermark experiments were of limited value in determining the likely probabilities and did not indicate that real-world prompts would generate watermarks. The judge noted that they were of little assistance in deciding whether real-world users in the UK had actually generated watermarks from any of the models at issue in the case. The judge made no determinations regarding the statistical probability of watermarks being generated and ultimately found that there was no evidence of any users in the UK generating these watermarks. Although Stability AI acknowledged that the models can generate watermarks from various types of prompts, the primary value of that experiment was to highlight the absence of more detailed experiments.

The judge also stated that she considered there to be no real evidence supporting Getty Images' claim that Stable Diffusion users would copy Getty Images' captions and paste them into the model to generate watermarks or near-identical images.

Which Models Infringed?

Scope of Trade Mark Infringement

The judge found that, based on the discussion of the reproduction of watermarks and the evidence given to the court, the question of trade mark infringement arose only in respect of the v1 models, insofar as they were accessed via the Dream Studio platform (which applied to v1.4 only) and/or the developer platform, and in relation to v2 of the model. The trade mark claims applied only to those versions.

The Stochastic Challenge

The judge commented on how the situation was complicated by the stochastic nature of the models: since no two images would be exactly the same, when watermarks did appear, they would appear in different forms. They might be extremely distorted or blurred watermarks bearing little real resemblance to the actual trade marks, or they might merely look similar or bear some resemblance, which is different from a traditional trade mark case. An example was given of a picture of mountains with a Getty Images watermark. While it clearly resembled a Getty Images watermark, even Getty Images acknowledged that there came a point where a watermark became too distorted or blurred to support a claim of trade mark infringement under the UK Trade Marks Act.

The Average Consumer Test

The Legal Framework

As the judge pointed out, many questions in trade mark law must be assessed from the perspective of the average consumer of the relevant goods and services. The judge conducted an analysis of the relevant case law in this regard, applying the average consumer test to strike the right balance between various competing interests, such as the need to protect consumers on the one hand and the promotion of free trade in an openly competitive market on the other. The judge stated that Stability admitted that Getty Images has a substantial reputation and goodwill as a creator of visual content in the UK and as a licensor of that content, and it could be safely assumed that potentially relevant consumers were familiar with it.

Three Types of Average Consumer

However, it is the class of average consumers who are exposed to, and likely to rely on, the sign that must be considered: not just the average person, but those actually likely to rely on the Getty Images logo. Both parties accepted that this class may itself be split into classes of average consumers with different attributes, depending on the different ways in which each class will encounter the signs.

There were three types of average consumers for that purpose:

  • The first was the average consumer who downloads the code base and model weights from GitHub and Hugging Face and runs the inference offline, which is a very esoteric type of average consumer.
  • The second type was the average consumer who runs the inference on Stability's computing infrastructure using an API, in other words, the developer platform on Stability AI.
  • The third type of average consumer was the people who use Stability AI's Dream Studio service and run the inference on Stability's computing infrastructure using the web interface. To my mind, that would be the actual average consumer.

Getty's Broader Argument

Getty Images later contended that ordinary members of the public with little to no technical skill would be exposed to Getty Images signs on synthetic images. Getty Images suggested that designers might screen-grab images with watermarks to mock up a design which others could see. The judge asked whether it was realistic to suppose that users of Stable Diffusion would forward images bearing watermarks to others, or share screenshots showing such images, in the post-sale context, and stated that, on the evidence, she did not consider it realistic or representative to suppose that there was a class of unsophisticated average consumers who would do so.

Context Matters

The judge stated that it was common ground that the court must consider the actual use of the sign complained of in the context in which it has been used. Context is important, and there was a discussion of the applicable legal precedent on this point. The judge then turned to infringement with a discussion of the UK Trade Marks Act, identifying that a person infringes a registered trade mark if they use in the course of trade a sign which is identical with the trade mark in relation to goods or services which are identical with those for which it is registered. She analysed some case law regarding infringement, specifically examining the use of a sign, the identity of the mark and sign, and the identity of goods and services.

Who Bears Responsibility?

Rejecting the Passive Tool Argument

An enormously important point, in the context of AI liability, IP infringement liability, and AI and IP transactions and contracts, was that the judge rejected Stability's submission that it was the user, and not Stability, who was responsible for conditioning the circumstances in which outputs are generated. This is a really important practical point for AI users and providers to consider. The evidence added to the overall picture, showing that while users have some degree of control, they do not have complete control, which is a significant point.

The evidence presented was that the model itself is just a subcomponent of a much wider pipeline, which involves post-processing and filters. While the model itself can produce outputs without post-filtering, in that case it can produce "not safe for work" (NSFW) outputs. There is also a filtering functionality, applied to models, that prevents the rendering of photoreal likenesses of well-known public figures.

The judge rejected Stability's submission that it was the user, not Stability, who was wholly responsible for conditioning the circumstances in which the outputs were generated, given the substantial pipeline of filters and post-processing. While the judge stated that the evidence showed that Stability exercised a greater degree of control over the outputs generated by users accessing via Dream Studio or the developer platform than it did in respect of users who downloaded just the model and ran the model locally, she stated that Stability made v1 of the model available via the API, the developer platform, and v1.4 available via Dream Studio, thereby taking responsibility for those releases and employing a post-processing filtering service. In terms of v2 onwards, Stability made deliberate choices regarding the content and makeup of the data set on which the models were trained, as well as the filters to be applied at that stage.

A Critical Point – Training Data Responsibility

AI developers will need to pay attention to the important point addressed by the court, that where an AI developer is making deliberate choices as to the content and the makeup of the data set that the model is trained on, that carries with it a certain amount of responsibility that cannot be foisted onto users. The judge also stated that Stability was responsible for the model weights for the versions referred to in the judgment. The judge found that, to her mind, it went beyond merely creating the technical conditions necessary for the use of the sign. The judge pointed out that while an AI model such as Stable Diffusion may be described in simple terms as a tool to enable users to generate images, she stated that this was not a complete description. This is also important: a model on its own is not enough. One has to consider the entire pipeline, as the term is used in the judgment. One must consider all the other models that work in concert with the main model, performing filtering, post-processing, and pre-processing. The model alone is insufficient.

Control Over Watermark Generation

One of the main issues identified by the judge from the evidence was the suggestion that one of the reasons the model produced the Getty Images watermark was that Stability failed to remove images bearing that watermark from the dataset. The judge questioned how a user could have any control over whether a watermark is produced. The judge found that the only entity with any control over the generation of watermarks on synthetic images was Stability. It was not as passive as Stability submitted.

The CSAM Distinction

There was an interesting conversation surrounding CSAM, where Stability's contention was that the criminal responsibility for anyone using the model to produce CSAM would lie with the user, not Stability, and that users who choose their prompts thereby generate the images. The judge acknowledged that, yes, that was true from a criminal perspective, but that the obvious distinction is that, unlike pornography or violent images, watermarks appear where the user does not even seek to generate them. The inadvertent appearance of the watermarks was important.

User Understanding

Stability contended that the average consumer had a sufficient technical understanding to appreciate why the watermarks appear. This was arguably a substantial stretch. Few people have any remote concept of why the watermarks appear. Only the most sophisticated users would have this idea. The judge more or less took the same approach. There was also the issue of branding and the potential for Getty Images logos to appear on images that would not be acceptable and could negatively impact the Getty Images brand.

The Trade Mark Judgment

Affixing Signs to the Market

Ultimately, in relation to the trade mark point, the judge held that there was evidence of output images generated by v1 and v2 of Stable Diffusion, which included the Getty Images logo. The judge found that the average consumer would perceive this as a commercial communication by Stability. The judge noted that Stability was running a business in the UK and providing Stable Diffusion to consumers as part of that business, and that the signs were affixed to synthetic images generated by customers owing to the functionality of the model, which itself was dependent on the training data over which, as the judge noted, Stability had absolute control and/or responsibility. This is, again, a critical point. It was in this way that Stability offered and put synthetic images bearing the signs on the market. The judge found that it created the impression that there was a material link in the course of trade between the goods concerned and the trade mark proprietor in relation to the average consumer using Dream Studio, who would have a less sophisticated understanding of AI.

Assessing Similarity and Identity

The judge then had to consider the similarity of the trade marks, as they were often distorted and presented out of context, and none of them was identical to the actual trade mark. The judge actually found that there was an identity of the mark and sign in relation to the iStock marks, but she found that there was no evidence of any real-world use of a sign that was identical.

Specific Findings

Ultimately, then, the judge found double identity infringement by Stability in respect of the iStock watermarks generated by users of v1. She dismissed the claim of double identity infringement in relation to v1 and v2 in respect of Getty Images watermarks, dismissed the claim of double identity in relation to v2 in respect of the iStock watermarks, and found that there was no evidence of a user in the real world generating an image bearing a watermark from SDXL or v1.6 and dismissed that claim.

The judge also found that Getty Images had made out its case under section 10(2) of the Trade Marks Act in respect of the iStock mark and, in relation to v2, in respect of the Getty Images mark.

Dilution and Tarnishment Claims Fail

The judge did not think that the marks would be diluted by the proliferation of these images bearing the mark so as to result in a change in economic behaviour; she did not believe this was in any sense realistic. Getty argued that the marks are inherently distinctive, such that they would be brought to mind strongly when the average consumer encounters them, and that the use of the sign would inevitably weaken the marks' ability to identify the goods for which they are registered as coming from Getty Images, because there had been, and would continue to be, a proliferation of synthetic output images bearing the marks. That was an important point: that there would continue to be a proliferation of those images.

However, as the judge pointed out, it had been three years since the model was first created, and Getty Images had not sought to run any case based on the probable number of watermarks that may have been produced. There was at least one infringing watermark produced by v1 and v2, and Stability did not run a de minimis case. However, the judge essentially stated that, considering all the circumstances, there was no evidence from which she could infer that watermarks would continue to be generated so as to dilute the distinctiveness of the marks.

The evidence also showed that Stability had taken steps to filter not-safe-for-work content out of the training datasets. Getty Images' concern was that not-safe-for-work content could bear the Getty Images logo, which would tarnish the Getty reputation. But as the judge stated, there was not a scrap of evidence that watermarks had in fact appeared on pornographic images in the real world at any time in relation to the model. There were no examples of watermarks appearing on not-safe-for-work images other than those of Miley Cyrus, and there was no evidence to explain why this might have happened in relation to the Miley Cyrus image. Indeed, in cross-examination, it emerged that there were not-safe-for-work images of Miley Cyrus available on Getty Images' own websites, albeit accompanied by appropriate warnings. For all of these reasons, the judge found that Getty Images' claim of section 10(3) infringement failed. The judge then turned to the tort of passing off but ultimately did not address it.

Secondary Infringement: The Most Significant Element of the Decision

The Legal Framework

A critically important aspect of this judgment dealt with Getty's claim of secondary infringement. As the judge pointed out, the English Copyright, Designs and Patents Act 1988 (Copyright Act) differentiates between primary and secondary acts of infringement. Secondary acts of infringement broadly concern downstream dealings or involvement, as opposed to acts which originate from reproductions of copyright works. Getty Images claimed that Stability committed two acts of secondary copyright infringement, but since it had abandoned its Training and Development Claim, it had to focus solely on the claim that the actual model, Stable Diffusion, was an infringing copy.

Applying Bootleg Concepts to AI

Something to consider in light of the judgment is that when copyright legislation in Ireland and the UK was developed, the concept of an infringing article was designed to address bootlegs, such as bootleg cassette tapes or bootleg VHS tapes, which would have been available in flea markets and similar venues. However, Getty Images was trying to apply that concept, which is still enshrined in our copyright legislation, to the cutting edge of human technology, namely a large AI model. Getty Images wanted the Stable Diffusion model to be considered an infringing copy in the same way that a bootlegged version of "Independence Day" would have been in the 1990s, because it was imported into the UK, and its creation in the UK would have constituted an infringement of the copyright.

Getty's Argument

Getty Images contended that the infringement occurred due to the importation of the Stable Diffusion model, which was downloaded in the UK, and its distribution in the course of business by Hugging Face. Section 22 of the Copyright Act, which deals with secondary infringement by importing an infringing copy, provides that the copyright in a work is infringed by a person who, without the licence of the copyright owner, imports into the United Kingdom, otherwise than for private and domestic use, an article which is, and which the person knows or has reason to believe is, an infringing copy of the work. Secondary infringement under section 23 covers possessing or dealing with an article which is an infringing copy of the work.

The definition in English Copyright Law of an infringing copy is in Section 27. An article is an infringing copy if its making constitutes an infringement of the copyright of the work in question. The issues of law that were relevant were whether Stable Diffusion was capable of (i) being an article for the purposes of Sections 22 and 23, and (ii) being an infringing copy for the purposes of Section 27.

The Model Weights Argument

Getty Images was not saying that the model itself comprised a reproduction of any of the Getty copyright works. Their submission was that the definition of "infringing copy" in the Copyright Act was sufficiently broad to encompass an article, including an intangible article, whose creation or making involved copyright infringement. They pointed out that it was common ground that the training of the model involved reproducing, through storage, the Getty copyright works, both locally and in cloud computing resources. As to what the "article" comprised, it was the model weights, and the making of that article would have involved the optimisation of those weights, which required repeated exposure to training data through a process known as gradient descent. Gradient descent fundamentally alters the composition of the weights in the model. Getty's argument was that, had that making been done in the UK, it would have constituted an infringement.

Stability's Counter-Argument

But Stability was of the view that the term "article" in the Copyright Act was limited to tangible objects. It stated that intangible or abstract information, such as the Stable Diffusion models in issue, did not constitute articles; in Stability's view, this construction was clear from the broader context of the Copyright Act. Stability also stated that "infringing copy" could not apply to Stable Diffusion in circumstances where it was trained on copyright works in the US: copies of those works were never present within the model, and the model had never had the slightest acquaintance with a UK copyright work.

Issues to be Determined

The issues of fact to be determined were (i) whether Stable Diffusion is an infringing copy as a matter of fact, (ii) whether Stability AI had knowledge or reason to believe that Stable Diffusion was an infringing copy, and (iii) whether Stability AI had committed secondary acts of copyright infringement.

What AI Models Actually Are – the Single Most Important Part of the Judgment

Models Do Not Store Training Data

To consider the secondary infringement issue, the judge needed to determine what an AI model actually is.

Importantly, and this is critical to the wider context in terms of data protection, copyright and so on, the judge noted at paragraph 552 that, as the experts agreed and as she had recorded at the outset of the judgment, Stable Diffusion does not itself store the data on which it was trained. Irrespective of any potential for memorisation, the model did not store its training data. The judge stated that it was not possible to determine whether Stable Diffusion was capable of being an infringing copy without a clear understanding of what Stable Diffusion actually was.

Expert Evidence on Model Function

The evidence accepted by the judge is repeated verbatim as follows:

"8.36...in order for a diffusion model to successfully generate new images, that model must learn patterns in the existing training data so that it can generate entirely new content without reference to that training data.

8.37 Rather than storing their training data, diffusion models learn the statistics of patterns which are associated with certain concepts found in the text labels applied to their training data, i.e. they learn a probability distribution associated with certain concepts. This process of learning the statistics of the data is a desired characteristic of the model and allows the model to generate new images by sampling from the distribution.

8.40 ...For models such as Stable Diffusion, trained on very large datasets, it is simply not possible for the models to encode and store their training data as a formula.... It is impossible to store all training images in the weights. This can be seen by way of a simple (example) calculation. As I explained in paragraph 6.28 above, the LAION-5B dataset is around 220TB when downloaded. In contrast, the model weights for Stable Diffusion 1.1-1.4 can be downloaded as a 3.44GB binary file. The model weights are therefore around five orders of magnitude smaller than a dataset which was used in training those weights".
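The order-of-magnitude point in that passage can be checked with simple arithmetic, using only the figures quoted in the evidence:

```python
# Checking the expert's order-of-magnitude point using the figures quoted above:
# the LAION-5B download (~220 TB) versus the ~3.44 GB Stable Diffusion v1 weights file.
import math

dataset_bytes = 220e12   # ~220 TB, as stated in the quoted evidence
weights_bytes = 3.44e9   # ~3.44 GB weights file

ratio = dataset_bytes / weights_bytes
print(round(ratio))                 # ~63953: the dataset is roughly 64,000 times larger
print(round(math.log10(ratio), 1))  # ~4.8: i.e. about five orders of magnitude
```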

Latent Space and Model Architecture

Again, critical to an understanding of the issues is the question: what is an AI model? At paragraph 555, the judge stated that the evidence provided was consistent with the agreement of the joint experts. In the experts' joint statement, it was stated that the model weights do not directly store the pixel values associated with billions of training images, i.e., digital images, each consisting of what the experts described in their report as an array of pixels. During training, images are converted from pixel space into what is called latent space using an autoencoder. Latent space is a compressed, representative form of the pixel-space image that is both more memory-efficient and more computationally efficient.
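As a rough illustration of what that compression means in practice, the arithmetic below uses shapes commonly cited in public descriptions of Stable Diffusion v1-class models (a 512x512 RGB image encoded into a 4-channel 64x64 latent); these figures are assumptions for illustration and are not taken from the judgment.

```python
# Illustrative arithmetic only: assumed shapes for a Stable Diffusion v1-class autoencoder,
# showing what "a compressed representative form of the pixel space image" means in practice.
pixel_values = 512 * 512 * 3   # values describing the image in pixel space (512x512 RGB)
latent_values = 64 * 64 * 4    # values in its assumed latent-space representation (4 x 64 x 64)

print(pixel_values, latent_values, pixel_values / latent_values)
# 786432 16384 48.0 -> the latent holds roughly 48 times fewer values than the pixel image
```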

The Training Process

The agreed-upon Technical Primer outlines how text-to-image models, such as Stable Diffusion, are trained. Each image is converted into a latent representation and deliberately degraded by adding random noise. The model learns to predict and remove this noise, guided by the accompanying text descriptions, through a process called stochastic gradient descent. Training occurs in small data batches, with each batch updating the model's weights slightly based on the average direction of improvement (the gradient). Every complete pass through the dataset is an epoch, and the process is repeated many times. Because each image influences the model only marginally, progress depends on patterns consistently reinforced across many samples. The goal is gradient descent – descending the loss function, which essentially means reducing inaccuracies and errors.
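The loop below is a minimal, heavily simplified sketch of that training process. It uses a toy denoising network and random stand-in latents rather than Stable Diffusion's actual architecture, and it omits the noise schedule, timestep conditioning and text conditioning entirely; it is intended only to show the mechanics of adding noise, predicting it, and updating the weights by gradient descent over batches and epochs.

```python
# A minimal, heavily simplified sketch of diffusion-style training (not Stable Diffusion's
# actual architecture): add noise to a latent, train a toy network to predict that noise,
# and update the weights slightly for each batch via stochastic gradient descent.
import torch
import torch.nn as nn

denoiser = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))  # stand-in for the U-Net
optimiser = torch.optim.SGD(denoiser.parameters(), lr=1e-2)
latents = torch.randn(256, 16)  # stand-in for training images already encoded into latent space

for epoch in range(5):                      # one epoch = one complete pass through the dataset
    for batch in latents.split(32):         # training proceeds in small batches
        noise = torch.randn_like(batch)
        noisy = batch + noise               # deliberately degrade the latent with random noise
        loss = nn.functional.mse_loss(denoiser(noisy), noise)  # error in predicting that noise
        optimiser.zero_grad()
        loss.backward()                     # compute the gradient of the loss
        optimiser.step()                    # nudge each weight slightly in the improving direction
```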

The Memorisation Caveat

However, a point which must be borne in mind, and again this is crucial in the broader AI context, is that at paragraph 558, in relation to the question of whether models contain or store their training data, the experts agreed that while Stable Diffusion can and does produce images that are distinct from the training examples, the model can also produce nearly identical images, i.e. a memorised image, or what is called in the New York Times and OpenAI case a regurgitation, and that it can produce images that are derived from a training image, either in part or in whole (a derivative). This raises the question of why there was no claim based on substantial similarity. Substantial similarity is a United States doctrine; in the UK, the consideration would have been whether what was allegedly copied constitutes a substantial part of the original work. However, that was not tested here. Substantial similarity claims are a common occurrence in the music industry: Ed Sheeran recently successfully defended such a claim, and another famous substantial similarity claim involved the Marvin Gaye estate's dispute over the song "Blurred Lines."

Generalisation versus Memorisation

As set out above, the technical primer states that while a neural network's weights are trained on existing data, its real objective is to perform accurately on new, unseen inputs by learning general patterns rather than memorising specifics. Overfitting occurs when the network learns the training data too precisely, resulting in poor performance on new data. This occurs when the dataset lacks diversity or training continues for an excessive number of epochs. Although deep networks can simultaneously generalise and memorise, excessive memorisation, often caused by duplicated data or outliers, undermines their ability to generalise effectively.
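Since the primer identifies duplicated training data as a presumed primary cause of memorisation, the sketch below gives a very crude illustration of the kind of duplication check a developer might run over a training manifest. The field names are assumptions for illustration only.

```python
# Illustrative sketch only: counting how often the same image URL recurs in a training
# manifest, as duplication is identified above as a presumed primary cause of memorisation.
from collections import Counter

def duplication_report(records, key="url", min_count=2):
    """Return entries appearing `min_count` or more times in the training records."""
    counts = Counter(r[key] for r in records)
    return {value: count for value, count in counts.items() if count >= min_count}

manifest = [
    {"url": "https://example.com/a.jpg"},
    {"url": "https://example.com/a.jpg"},   # duplicated entry: a memorisation risk factor
    {"url": "https://example.com/b.jpg"},
]
print(duplication_report(manifest))  # {'https://example.com/a.jpg': 2}
```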

Can an Article Be Intangible?

In her analysis, the judge considered whether an "article" had to be tangible or whether it was capable of being an electronic copy stored in an intangible form. She stated that she agreed with Getty Images that if the word "article" were construed as covering only tangible articles, this would deprive authors of protection in circumstances where the copy is electronic and dealt with electronically. The judge stated that the dispute really turned on whether an article whose making involves the use of infringing copies, but which never contains or stores those copies, is in itself an infringing copy such that its making in the UK would have constituted infringement. This was important in determining whether Stable Diffusion could be an infringing article.

The Central Question

As the judge stated, it turned on whether an article whose making involves the use of infringing copies, but which never contains or stores those copies, is itself an infringing copy, such that its making in the UK would constitute infringement. The judge found that an AI model which derives or results from a training process involving the exposure of model weights to infringing copies is not an infringing copy itself. She stated that it was not enough that the timing of the making of copies of Getty's copyrighted works coincided with the making of the model. The model weights were not themselves an infringing copy, and they did not store an infringing copy.

Paragraph 600: The Key Finding

At paragraph 600 of the judgment, the judge stated that the model weights are not themselves an infringing copy, and they do not store an infringing copy. They are purely the product of the patterns and features which they have learnt over time during the training process through the process of gradient descent.

The judge held that Getty Images' central submission that "as soon as it was made, the AI model is an infringing copy," was "entirely misconceived."

Secondary Infringement Dismissed

The judge ruled that Stable Diffusion was not an infringing copy for the purposes of the Copyright Act, and accordingly the secondary infringement claim was dismissed. There was then an analysis of some other legal aspects, such as the concept of first ownership of copyright in the employment context and other minor considerations. These were not determinative, but the judge considered them in case she was found to be wrong in her application of the law. The judge went on to say that a key issue in connection with the secondary copyright infringement claim was whether the licensed copyright works relied on by Getty Images were subject to an exclusive licence, and she analysed the concept of exclusive licences under English law and New York law.

A Narrow Victory for Getty

Getty Images only succeeded in a very small part of their claim, which pertained to a limited element of trade mark infringement, and the secondary infringement claim failed entirely. The judge's key findings were that Stability bears no direct liability for any tortious acts alleged in the proceedings arising from the release of the v1 models via GitHub and Hugging Face. The question of trade mark infringement only arose in relation to specific models. There was no evidence of a single user in the UK generating either Getty Images or iStock watermarks using the SDXL and v1.6 models.

Specific Trade Mark Findings

Getty Images succeeded in respect of the iStock watermarks generated by users of v1 of Stable Diffusion. However, they failed in respect of the Getty Images watermarks, as there was no evidence of infringement of the Getty Images marks under section 10(1) of the Trade Marks Act; that claim was dismissed.

In terms of the Getty Images claim under section 10(2) of the Trade Marks Act, Getty Images succeeded in respect of the iStock watermarks generated by users of v1 of Stable Diffusion. They also succeeded in respect of the Getty Images watermarks generated by users of v2, in that infringement of the Getty Images marks pursuant to section 10(2) of the Trade Marks Act had been established. This was explicitly based on the example watermark on the first Japanese temple garden image, which was generated by model v2.1 of Stable Diffusion. Getty Images' claim under section 10(3) of the Trade Marks Act was dismissed. The court did not address the passing off allegation.

The Copyright Finding

The claim of secondary infringement was dismissed, but the judge did find that an article can be considered an intangible object for the purposes of the Copyright Act. Furthermore, an AI model, such as Stable Diffusion, which does not store or reproduce any copyright works and has never done so, is not an infringing copy.

Final Thoughts

The significance of this judgment lies in the framework that Smith J established for understanding AI models. The finding that model weights are not infringing copies because they do not store or reproduce training data will be cited for years, establishing that learning patterns from copyrighted material does not automatically create a legal copy. The infringing act in a copyright context is the copying of training data for the purpose of training, not the act of training itself, nor the existence of the model.

The judgment's most consequential contribution may be its treatment of responsibility. Stability AI's attempt to position itself as a mere provider of neutral tools was rejected. The judge found that AI developers exercise "absolute control" over training data and cannot fully disclaim responsibility for outputs when they choose not to filter problematic images from datasets.

The model is not a standalone tool but part of a "pipeline" of interconnected systems and choices.

What Smith J has given us is a step towards clarity: AI models are not copies, but neither are they consequence-free tools deployed by passive intermediaries. They are the product of deliberate choices for which their creators bear responsibility.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.

