It was an idea that probably never stood a chance in the long run. Since 2005, the Internet Archive, a nonprofit entity devoted to “provid[ing] universal access to all knowledge,” has been digitizing physical books and posting the copies to its website, where users may read them for free. In 2018, the Internet Archive began partnering with various libraries around the country to offer online access to their respective physical holdings as well.
“Not everyone has access to a public or academic library with a good collection,” the Internet Archive argued, “so to provide universal access we need to provide digital versions of books.” Such thinking led to the creation of the Internet Archive’s “Free Digital Library” and “Open Library Project,” in which “one reader at a time can read a digitized copy of a legally owned library book.”
Sounds like a safe and noncontroversial idea, right? Not exactly.
In September 2024, the U.S. Court of Appeals for the 2nd Circuit found the Internet Archive to be in violation of federal law over its “large scale copying and distribution of copyrighted books without permission from or payment to the Publishers or authors.” To allow such behavior to continue, the 2nd Circuit declared in Hachette Book Group v. Internet Archive, “would allow for widescale copying that deprives creators of compensation and diminishes the incentive to produce new works.”
Copyright law is a jagged rock on which many seemingly promising ideas have been wrecked. In this particular case, the perilous legal provision turned out to be Section 107 of the Copyright Act, which governs “the fair use of a copyrighted work.” According to that provision, a copyrighted work may be fairly used without permission from the copyright holder “for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research.”
According to the Internet Archive, that language should be read to fully protect both the collecting and lending practices of its free digital library. “The record is replete,” the group pointed out, “with examples of IA facilitating access to books needed for classroom use and academic research that would not have been possible otherwise.”
That the Internet Archive has been a great boon to students, teachers, and scholars is undoubtedly true. Indeed, I can personally testify to the fact. When I was researching my recent book about Frederick Douglass and the Constitution, I was grateful for the many dusty old volumes that the Internet Archive effectively placed at my fingertips. Innumerable other writers and researchers would surely say something similar.
But the problem for the Internet Archive is that Section 107 also states that certain other factors must be considered “in determining whether the use made of a work in any particular case is a fair use.” And one of those factors is “the effect of the use upon the potential market for or value of the copyrighted work.” In other words, fair use does not include undercutting the commercial viability of the copyrighted work.
That is where the Internet Archive’s legal troubles really originated. In 2020, a coalition of four book publishing giants (Hachette Book Group, HarperCollins Publishers, John Wiley & Sons, and Penguin Random House) filed suit, alleging that the Internet Archive’s “unauthorized copying and distribution of Plaintiffs’ works include titles that the Publishers are currently selling commercially and currently providing to libraries in ebook form, making Defendant’s business a direct substitute for established markets.”
Unlike traditional public libraries, which “buy print books and license ebooks (or agree to terms of sale for ebooks) from publishers,” the four publishers stated in their lawsuit, the Internet Archive commits “willful mass copyright infringement” and then distributes “digital bootleg versions online.”
Unfortunately for the Internet Archive, the 2nd Circuit basically shared that negative assessment of the situation. “Is it ‘fair use’ for a nonprofit organization to scan copyright-protected print books in their entirety and distribute those digital copies online, in full, for free, subject to a one-to-one owned-to-loaned ratio between its print copies and the digital copies it makes available at any given time, all without authorization from the copyright-holding publishers or author,” asked the 2nd Circuit. “We conclude the answer is no.”
In the 2nd Circuit’s view, the Internet Archive “does not perform the traditional functions of a library.” Instead, it “prepares derivatives of Publishers’ Works and delivers those derivatives to its users in full.” In other words, the court said, the Internet Archive offers illegal free versions of what publishers and authors alone have the exclusive legal right to sell or otherwise distribute.
Furthermore, the appellate court declared, “were we to approve IA’s use of the Works, there would be little reason for consumers or libraries to pay Publishers for content they could access for free on IA’s website.” All of which made it “self-evident” to the 2nd Circuit, “that if IA’s use were to become widespread, it would adversely affect Publishers’ markets for the Works.”
It was for these reasons that the Internet Archive failed the fair use test, as interpreted by the 2nd Circuit. Of course, the Internet Archive is likely to appeal its loss, but that is no guarantee of a better outcome. In fact, given the thoroughness of the 2nd Circuit’s judgment, combined with the unforgiving language of the Copyright Act, the Supreme Court might not even bother to hear an appeal in this matter.
That outcome may be good news for publishers who want to sell more ebooks. But is it good news for authors and readers in general?
The writer Virginia Postrel, who specializes in the intersection of culture and technology, once observed that the rise of the web has taken us from “a world in which reading material was relatively scarce and expensive to one in which it’s overabundant and nearly free.”
For many authors, that has made it much more difficult to even get their books noticed in the first place. Furthermore, Postrel noted, “for increasing numbers of readers, a book that doesn’t show up in a Google search or can’t be linked to in some way online might as well not exist.”
The Internet Archive has stood as a sort of bulwark against that particular trend. Its free digital library grabbed hold of such titles and helped to ensure that future readers might someday discover them online. As a result of the judgment in this case, however, that bulwark has been weakened, if not fatally undermined.
“The knowledge embodied in books deserves preservation, not destruction, and particularly not destruction at the behest of authors and publishers,” Postrel told me about the 2nd Circuit’s decision. “The ruling could hardly be worse. Forcing the Internet Archive to destroy digital copies—to burn digital books—goes far beyond protecting authors’ rights to profit from their copyrighted work. By destroying the ability to search those books, it potentially depresses sales. When you discover that a book discusses a topic you’re interested in you’re more likely to buy it.”
And yet, Postrel added, “if I’d been on the Internet Archive’s board, however, I would have argued against the Free Digital Library.” Why? “Not because the archive’s case was wrong,” she says, “but because its definition of fair use was aggressive and extremely risky for the institution.”
The damage arising from that risk has certainly been done now. Whether the Internet Archive ever fully recovers from the legal loss remains to be seen.