October 15, 2012

Is scanning books for search purposes “transformative” and fair use?


A federal judge handed down summary judgment in yet another Authors Guild lawsuit last week, ruling that book scanning by libraries is “transformative” and thus fair use under copyright law. As quoted by Siva Vaidhyanathan in a post for the Chronicle of Higher Education, district court judge Harold Baer wrote, “I cannot imagine a definition of fair use that would not encompass the transformative uses made” by the project. The decision is a blow against the Guild’s claims that scanning practices are infringement, but may be most important for its influence on another ongoing lawsuit against Google.

The suit’s defendant, the Hathi Trust, is closely linked with Google’s library scanning practices, though the Authors Guild had filed separate suits against each. The Trust is an organization set up as a repository for many of the scans of collections—most often undertaken by Google—from university libraries across the country. After Google has scanned the books, libraries work with the Hathi Trust to ensure that the resulting files are more readily searchable.

As reported by Ars Technica, the decision of whether something is fair use under copyright law hinges on four individual guidelines, and the judge ruled in favor of the Hathi Trust on each. The most important, however, is that the work must be transformative. Parody, for instance, is often transformative and thus not infringing. Likewise, using copyrighted work for research can often be considered fair use, because added commentary is transformative, because the amount of the work used is often limited, and because scholarly uses of an original work are very rarely in danger of competing with potential markets for the work. In the case of Hathi Trust, Judge Baer found that using scans of the work to make the books accessible to search was in fact transformative. He was also much taken with the benefits to accessibility granted by scans of books, in text-to-speech programs for the blind, for instance.

One major plank of the Authors Guild’s argument has also been that Google’s pre-emptive scans of authors’ work infringe on potential future sales of the right to scan their books. From Ars Technica:

While a book search engine obviously doesn’t undermine the market for paper books, the authors had argued that a finding of fair use would hamper their ability to earn revenue by selling the right to scan their books. But Judge Baer rejected this argument as fundamentally circular. He quoted a previous court decision that made the point: “Were a court automatically to conclude in every case that potential licensing revenues were impermissibly impaired simply because the secondary user did not pay a fee for the right to engage in the use, the fourth factor would always favor the copyright owner.”

While Google is likely to benefit from the precedent of search indexing as a transformative use, its victory in its own suit is not yet a sure thing. As Matthew Ingram writes for Gigaom:

Those kinds of arguments likely wouldn’t hold as much weight for Google itself, except as they apply to scholarly works that are provided to universities and projects like the Hathi Trust. A big part of the Authors Guild case rests on the fact that Google is a corporation with a profit motive, and therefore shouldn’t be allowed to scan copyrighted books without permission, even if its index makes them easier for buyers to find and purchase (Google also shows excerpts or “snippets” for all of the books that it scans, while the Hathi library only shows excerpts for public-domain books).

This decision has been a cause of some contention even within our offices, but I find the argument that these scans are fair use to be much less persuasive than Judge Baer did. It’s easy to see that preserving some of the books in question by scanning them is in the public interest, but that’s not where the decision lies. The confusion here is between the scanned files themselves and how they are being accessed. If, for example, a person were to photocopy an entire book and print that book with no additions or alterations on a different medium (something other than paper, let’s say vellum made from the hides of cloned sheep; this is the twenty-first century we’re talking about here, after all, when high science and the artisanally quixotic came out to play), and if the purpose of that copy remained the same, to be read, then the work would be infringing. Licensing issues aside, an author’s rights do not inhere in the medium on which their work is printed, but rather in the body of words. So too, a scan of a book, not used as art but meant to be read, is an infringement under current law. That is why digital books are subject to copyright.

Whether or not the Hathi Trust is a boon to research or to the blind community, and whether search itself is transformative, the files at the foot of the search, those scans of the book, are, I believe, themselves infringements on the rights of the authors. Most likely the fault for that infringement lies with those who created the copies: Google. I also believe authors would be wise to allow the Trust, and perhaps even Google, use of their work, if only for search purposes. It surely is a benefit to the public. But it should be a question of just that: allowance, permission.



Dustin Kurtz is the marketing manager of Melville House, and a former bookseller.