Thomson Reuters Wins a Victory Against Both AI and Public Access to Law

Recently, Stephanos Bibas, a Third Circuit judge sitting by designation in the District of Delaware, published an opinion in Thomson Reuters v. Ross Intelligence, a case filed in 2020 about AI training.

For most of us who have been following the spate of ongoing cases about generative AI, this case was not too high on the radar, because it’s about regular AI: substantive number crunching. Moreover, the case was filed in 2020, long before all the hoo-hah about ChatGPT started. Apparently, in the depths of a societal shutdown from COVID-19, Thomson Reuters decided it needed to shore up its putative rights in public domain legal texts.

I have to confess that it took me quite a while to sit down and read this opinion, because I just didn’t want to. I think the legal reasoning in this opinion is not correct, but that’s not why it took me so long. The effort of the legal publishers to restrict access to primary law sources causes me great dismay. So, if you want objective legal analysis, you should probably read some of the other commentators who have written about this opinion.

Legal Opinions are Freely Available

In the US, legal opinions written by courts are in the public domain–they enjoy no copyright protection. That is as it should be. Everyone should be able to read the law and decide for themselves what it means. And while it’s true that access to primary information is usually not enough, and those with the money to hire the best-dressed lawyers often win legal battles–more about that below–free access to primary legal sources, like court opinions and statutes, is table stakes for political liberty.

I graduated law school in 1994, which was just on the cusp of widespread access to the Internet. So, I learned to research the law in books–tree-killing, heavy, expensive hardback books called case reporters. While the online legal research tools, Westlaw and LEXIS, were available to us then, they were viewed with skepticism. And they were rudimentary, expensive, and often more confusing than the books. So, a library, like the one at UC Berkeley where I attended, merely had to buy the case reporters, and then anyone was free to read them.

Because legal opinions are usually long and tedious to read, case reporters include micro-summaries printed at the front of the case text, called headnotes. And with all due deference to whatever poor underemployed lawyers create the headnotes, they are not what you would call poetry. They are summaries of individual nuggets of law, intended to point a researcher to actual language in the case. They are like a very wordy index. No competent lawyer would use a headnote standing alone. Headnotes are merely entry points to the official legal text. And for the most part, they either quote or slightly rephrase the source material.

Here is a (made-up) example of a headnote and the opinion text it summarizes:

Headnote:

Originality, for copyright purposes, means that the work was independently created and has some minimal degree of creativity.

Opinion:

Original, as the term is used in copyright, means only that the work was independently created by the author (as opposed to copied from other works), and that it possesses at least some minimal degree of creativity.

Although this is a hypothetical example, it is fairly typical of what headnotes contain: a verbatim copy or slight rephrasing of what the court says in the opinion.

A Doomed Business

Since the mid-1990s, the legal publishers have struggled. After all, their business model is almost completely obsolete. Most legal text is available online, free of charge. Since that time, legal publishers have been on a campaign to squeeze the last remaining value out of their businesses, in part by suing people for infringing their so-called copyrights in what they create: headnotes and page numbers.

But the truth is this: no one needs headnotes anymore. With sophisticated pattern-matching search, not to mention AI-enhanced search, most research can be done on the web. If you want to find legal opinions on the web, for example, there is the Free Law Project, which even allows access via REST APIs. (I am a member for the princely sum of $10 a month.) And there is Google Scholar, which is free, and even recommended by the Library of Congress. Not to mention FindLaw and Justia. And today, search technology is good enough that it’s pretty easy to find the law online free of charge.

The opinion in this case says, “The law is no longer a brooding omnipresence in the sky; it now dwells in legal research platforms.” This portends the opinion’s ill-advised conclusions. In truth, the law is indeed an omnipresence in the sky–it’s on the world wide web. And it is not, and never should be, confined to “legal research platforms.” The advent of the web should have freed us from the firewalls of legal research and democratized access to the law.

So, the legal publishing business, which once carved out a living by filling a practical gap–the difficulty of the body politic getting access to paper legal documents–is now simply unnecessary. And the entire business of paywalling access to primary legal sources should have died a graceful death.

If you are of a mind to lament businesses toppling due to the advance of technology, I refer you to this blacksmithing video. Blacksmithing is fascinating to watch, but I don’t want to fire up a forge every time I want a new kitchen utensil, and I don’t want to pay a thousand bucks for a hand-made skillet. Also, just imagine what an outcry there would be if a big, bad tech business like Google tried to firewall access to public domain legal documents. Sacrilege! Techno-oligarchy! But when a dying business like Westlaw tries to do this, it’s perfectly fine, because…authors’ rights!

…Nevermind

The opinion begins, “A smart man knows when he is right; a wise man knows when he is wrong.” The judge proceeds to overturn his own opinion on the exact same topics from 2023.

For those of you who are not lawyers, I want to convey how bizarre it is for a federal judge to reverse an existing opinion without one of the parties appealing the decision. It just doesn’t happen–and for good reason. The job of judges is to arbitrate disputes. Judges don’t just decide what they think we ought to hear. They answer motions brought by litigants. But in this case, the judge just literally changed his mind of his own accord, between 2023 and 2025. Otherwise known as the ChatGPT era.

This is the first signal that something is rotten in Denmark. But then, we get to the substance of the opinion.

Headnotes are Copyrightable Material?

If you have your doubts about headnotes being protectable under copyright when reading the above example, you are not alone.

This court, in an ominous foreshadowing of its misguided conclusion, says, “A headnote is a short, key point of law chiseled out of a lengthy judicial opinion.” It proceeds to analogize the creation of headnotes to the work of a sculptor–like Michelangelo, who famously (and probably apocryphally) said that to carve an elephant, you just chip away the part of the marble that doesn’t look like an elephant.

More than that, each headnote is an individual, copyrightable work. That became clear to me once I analogized the lawyer’s editorial judgment to that of a sculptor. A block of raw marble, like a judicial opinion, is not copyrightable. Yet a sculptor creates a sculpture by choosing what to cut away and what to leave in place. That sculpture is copyrightable. 17 U.S.C. §102(a)(5). So too, even a headnote taken verbatim from an opinion is a carefully chosen fraction of the whole. Identifying which words matter and chiseling away the surrounding mass expresses the editor’s idea about what the important point of law from the opinion is….So all headnotes, even any that quote judicial opinions verbatim, have original value as individual works. That belated insight explains my change of heart. In my 2023 opinion, I wrongly viewed the degree of overlap between the headnote text and the case opinion text as dispositive of originality…. I no longer think that is so. [Emphasis added]

But creating a headnote is not like carving a work of art from a block of stone. It’s more like taking a core sample from stone to see if it has gold in it. Sure, there is skill in deciding where to drill. But pointing to the gold is not expression. And neither is the sample that results.

The 2023 opinion in this case said this issue should go to a jury, “to decide whether the headnotes … were original enough” for copyright protection. That was the right result. This new result takes this conclusion away from the fact-finder, and hands the plaintiff a potentially huge victory.

What Will Happen?

One thing that seems clear here is that AI may die an ignominious death if the decision is used as precedent for other cases. Don’t get me wrong–I am not a supporter of this particular defendant. I hope that machine learning models to analyze the law are developed and offered free of charge to everyone–not by a defendant that couldn’t figure out how to train an AI without using headnotes. I care more about the next defendant in the next case, who tries to train AI to analyze public domain material, and gets slapped down by an aging business sector that makes its living aggregating that material and is only trying to forestall its inevitable death at the hands of innovation.

Do you think the plaintiff has the resources to create great AI for legal research? That’s unlikely. They are an aging business, and AI training is expensive. But this opinion says that “it does not matter whether Thomson Reuters has used the data to train its own legal search tools; the effect on a potential market for AI training data is enough” to deny the defendant the right to bring a fair use defense to a jury.

Unfortunately, this particular defendant probably will not have the resources to keep defending this claim. Ross Intelligence closed down its product in 2021 because of this lawsuit. By contrast, Thomson Reuters used Kirkland & Ellis, one of the top IP firms in the country–and one of the most expensive–on this case. So the outcome does not even benefit the plaintiff much here; it just threatens AI development everywhere.

Perhaps some other AI developers with deep pockets will step in to help fund a legal war chest to overturn the decision. This decision suggests not only that AI training is not fair use, but also that the question of fair use is a matter for a judge and would not even go to a jury. Fair use is likely to be a significant defense in many of the cases currently pending about AI. In Oracle v. Google, a fair use case that went on for over a decade and cost many millions, we saw how vulnerable fair use is to rights-holders with big war chests. In that case, similarly, the Court of Appeals for the Federal Circuit tried to take the decision making on fair use away from the jury. Luckily, the Supreme Court thwarted that attempt, and we were all lucky that the defendant, Google, had the money and the gumption to fight that case for over ten years. Otherwise, most APIs would be paywalled now.

Do not celebrate this as a victory for authors’ rights. Headnotes are not fine art. This opinion is a mistake, and I hope it is corrected.

And All the Rest…

There is a lot more to say about this opinion, and I have not attempted to analyze it all here. For example, copyright protection of the Key Number System is even more absurd than protection for headnotes, and yet this opinion seems to allow for it. The opinion glosses over countervailing law in the Oracle v. Google case, saying dismissively that “those cases are all about copying computer code. This case is not.” It also comments, with little analysis, that AI training is not transformative–which it probably is. (The name of the current method for machine learning training is in fact the transformer method, and the name is apt. Machine learning models look nothing like their inputs.) The transformation test usually wins the fair use question. Also, the opinion comments that each headnote is an individual copyrightable work, a scary piece of dictum that could result in precedent for eye-watering statutory damages, which a court can award at up to $150,000 per work for willful infringement.

Also, those trumpeting this opinion as a victory for authors will no doubt ignore that this case only involves one side of the question: the input side. For generative AI, the main issue is probably the output side. The opinion itself says, “Because the AI landscape is changing rapidly, I note for readers that only non-generative AI is before me today.” But that statement will be roundly ignored. By addressing that note “for readers,” the judge showed he knew very well that this decision would create an avalanche of conclusions that AI training is infringing. And the media wants a win, and so that’s what it will report.

Author: heatherjmeeker

Technology licensing lawyer, drummer

