Cover Story

Washington Lawyer

March 2006

Google, Books, and Fair Use

By Joan Indiana Rigdon

Since its modest beginnings in Silicon Valley in the late 1990s, Google Inc. has outlasted or overtaken its rivals to become the Internet search service of choice for millions of people around the world. Such is its popularity today that when people use the service to look up information, they are said to “Google” it.

Google’s reputation, however, came into question in late fall of 2004 when it announced a program that threatened to upset the notion of copyright in the digital age. Google aims to scan and index all or part of the collections of five major libraries (Harvard University, Oxford University, Stanford University, the University of Michigan, and the New York Public Library) so that people can use Google to search the full text of books the way they search web pages.

The scope of the scanning is massive. According to estimates by the Online Computer Library Center, the “Google 5” libraries collectively hold about 10.5 million unique books, written in more than 400 languages (though about half are in English).

Once the index includes millions of volumes, a search term like Hitler could yield tens of thousands of hits in everything from obscure German-language histories to novellas in which the name is mentioned only in passing. Google swears it won’t let searchers see the whole book, or even a chapter or a page. Instead its engines will serve up results in the form of “snippets” that contain the term—a line or two at a time. The snippets will be presented alongside links to stores that sell the book, if any do, and the nearest library that holds the book.

Almost everyone who is aware of Google’s undertaking, even those who are unsure of its legality, agrees that it would be a fantastic research tool, like a card catalog on steroids. “I’m pretty excited about being able to log on to the British Library and get some work from 1750 that, absent a trip to the British Library, I couldn’t have,” says Kurt Wimmer, a Covington & Burling partner who specializes in intellectual property law. “This is an important societal benefit. The only question is how it gets done.”

Even the Authors Guild, which otherwise objects to Google’s library scanning program, allows that Google’s project “would make human knowledge available on an unprecedented scale.” So said the guild’s president, Nick Taylor, in an October 22 Op-Ed in the Washington Post.

The problem is, to compile its index, Google says it must make full digital copies of each book, even books protected by copyright. (Of the five libraries, the New York Public Library and Oxford University say they won’t allow Google to scan copyrighted works, only those in the public domain.) One copy will go to the originating library. Google believes it has the right to keep another copy on its servers so it can serve up its snippets to researchers.

Because Google believes this copying is protected by fair use, it isn’t asking for the copyright holders’ permission. Instead it says it is extending a courtesy by giving them the opportunity to opt out.

If Google had to segregate out copyrighted books, find the rights holders, and obtain their permission before scanning, “as opposed to just vacuum cleaning the shelves,” says Stanford University law professor and intellectual property expert Lawrence Lessig, the cost would be burdensome. In that case Google couldn’t afford to make its book index, he argues, and a lot of knowledge would be left offline. He says most of the works to be scanned are “orphans,” works whose copyright holders would be difficult or impossible to find.

Publishers and authors are outraged, partly because they had been in talks with Google about the project before the scanning began; then negotiations stalled and Google decided to proceed without their blessings. They are also angry because the U.S. Copyright Office is currently investigating whether to amend copyright law so orphaned works will be more accessible to the public. They wish Google had waited for the outcome before proceeding.

The Association of American Publishers and the Authors Guild have filed separate suits against Google alleging copyright infringement. The cases are before the U.S. District Court for the Southern District of New York.

Allan Adler, vice president of legal and government affairs for the Association of American Publishers, sums up the association’s objections: Google will “make digital copies of these works in their entirety. They will include them in their own database. They’re going to regard this material as proprietary to them in a way that directly serves to benefit their primary source of revenue, which is sales of advertising. And they’re claiming it’s fair use, which frankly we see as untenable.” He adds, “If Google can do this, so can anyone else.”

Believing Google’s actions are not fair use, Adler is incensed at the idea that publishers must go out of their way to opt out if they don’t want to be included. “Google is insisting that Google is allowed to go ahead and scan unless the rights holder learns about it and tells Google it can’t. That turns the basic exercise of rights of copyright on its head,” he says.

The Authors Guild wonders aloud what Google will do with its own digital copy of scanned books. “For all we know they’re making 20 copies internally to use for their own purposes,” says Jan Constantine, general counsel for the guild. Constantine also worries that libraries may give out copies of their copy (to regional libraries, for instance), and that the duplications will continue from there.

Google is “trying to say that we’re Luddites and we’re not in the 21st century,” says Constantine. “We’re not saying the program should be stopped. We’re saying the program as it stands should be stopped because we deserve to be paid for our intellectual property.”

“When did we decide that socialism was the way to run the Internet?” asks Taylor.

Of course, the issue is not whether Google is greedy, but whether Google’s actions are protected by fair use. If so, then it would be legal for Google to make full copies and profit from its use of those copies without asking permission from or paying fees to the copyright holders.

In the near term, the issue of fair use will be decided by the U.S. District Court. If appealed, the cases would go before the U.S. Court of Appeals for the Second Circuit, where one of the judges, Pierre Leval, is noted for his successful effort, in 1990, to sharpen the definition of fair use, which he complained was too vague.

As the federal court system considers the lawsuits against Google, it may also choose to face a larger question: one hundred sixty-five years after Folsom v. Marsh, which first articulated the factors behind today’s fair use doctrine, is it time to redefine that doctrine for the digital age?

The Other Program

Google actually has two book-scanning programs. The one that most people don’t hear about is the first, introduced last November, which is aimed at publishers. In this program Google asks for permission to scan copyrighted books. If the copyright holders agree, Google scans the books and presents up to five pages of text surrounding a single search term. If a researcher types in another search term, the researcher will get another five pages of text.

Although several pages of the work can be viewed at a time, Google and its partner publishers (many of whom oppose the library project) expect that people who want to read substantial portions of the book will click on links either to buy it or to find it in the nearest library. (It’s also feasible that Google will one day offer researchers a way to buy a portion of a book—a service that is already available through Amazon.com.) Publishers benefit by getting free advertising for their titles, increased sales through the online orders, and if they allow Google to place ads on the pages, a portion of the corresponding ad revenue.

Publishers lauded the program. But it made it that much harder for them to accept the library project, which was announced shortly thereafter.

“Part of the consternation on the part of the publishing and author community is that [Google did] the initial [program for publishers] in the correct way, by going to the rights holder and saying, ‘Let’s work out a deal for a license,’ ” says Adler. “Then, two months later, they announce, ‘We’re also going to have a deal with these libraries, who are going to let us take the books right off their shelves.’”

The Point of Fair Use

In 1990 Leval, then a U.S. District Court judge, published an essay in the Harvard Law Review titled “Toward a Standard for Fair Use.” Leval had judged several fair use cases, and complained that he and other judges were unsure how to apply fair use because the doctrine was too vague. In the 300-year history of fair use, Leval wrote, no judicial decisions or laws had tried to define or explain it.

Leval proposed a new summary of fair use, which several courts now use: “[T]he use must be of a character that serves the copyright objective of stimulating productive thought and public instruction without excessively diminishing the incentives for creativity.”

To decide this, judges consider four factors: the purpose and character of the use, including whether it is for nonprofit or commercial uses and whether it is transformative; the nature of the copyrighted work, including whether it is creative and whether it has been published before; the amount and importance of the material used; and the effect of the use on the value of or potential market for the copyrighted work.

On the first factor, Google admits that although it is not posting advertising next to snippets of copyrighted material, the overall purpose of its book search is commercial insofar as Google itself is commercial. But it points out that in 1994, in Campbell v. Acuff-Rose Music, Inc., the U.S. Supreme Court found that commercial uses of copyrighted works did not preclude a finding of fair use. (In that case, the owner of the copyright for Roy Orbison’s song, “Oh, Pretty Woman,” sued rap group 2 Live Crew for copyright infringement after the group used part of the song in a parody of that song.)

Adler says Campbell is distinguishable from Google’s situation. He believes satires and parodies have more protection than indices do under fair use.

In his 1990 essay Leval proposed that when considering the “nature” of the use, judges should focus on whether the use is transformative. Google claims that even though it is making full digital copies of books as an intermediary step, its use of the copies is transformative because it is putting the books in a search engine, which then serves up the books in snippets.

On the second factor, although the books are creative, Google argues that they have already been published (physically, if not online). On the third factor, Google concedes that it is using the entirety of copyrighted works. On the fourth factor, Google says it is not detracting from the value or potential value of the books because its search results don’t replace the book, and, in fact, the results help people buy books they might not otherwise find.

Kelly v. Arriba

To support its argument, Google is relying heavily on the Ninth Circuit Court of Appeals’ 2003 decision in Kelly v. Arriba Soft Corp. In that case Leslie Kelly, a professional photographer who had posted full-resolution copies of his original photos of the California gold country on his web site, objected when Arriba Soft Corporation copied those photos in order to create low-resolution thumbnail images. Arriba added those thumbnails to its index of pictures that could be found on the Internet.

The court found that Arriba’s actions were protected by fair use largely because its use of Kelly’s work was transformative. Arriba’s thumbnails were tools, in this case a searchable index of photos, whereas the purpose of the original work was aesthetic. The court found that this transformation, along with the fact that Arriba’s index “benefited the public by enhancing Internet information gathering techniques,” outweighed the fact that Arriba’s index was “operated for commercial purposes.”

On the second factor, the court found that although Kelly’s work was creative, it had been published online, and was therefore less protected from fair use than it would have been otherwise. On the third factor, the court found that it was reasonable for Arriba to copy whole pictures in order to produce its thumbnails. On the fourth factor, the court found that Arriba wasn’t hurting the market for Kelly’s images because the thumbnails couldn’t be used as replacements for the originals, and because the search engine “would guide users to [Kelly’s] web site rather than away from it.”

“What the Kelly court said was ‘Are you using the thing for the same purpose?’ ” says Alexander Macgillivray, senior product and intellectual property counsel for Google. “Making use of information, including making a complete copy, for the purpose of an information location tool is not an infringement.”

Some who oppose Google have suggested that the Ninth Circuit is not nearly as experienced in copyright cases as the Second Circuit.

Jonathan Band, a former Morrison & Foerster intellectual property and Internet regulation lawyer who went solo last May, dismisses this suggestion. “If Kelly is good law, then Google presents a stronger case than Kelly and Google wins. That’s why opponents are saying it’s just the Ninth Circuit and it’s not good law.”

Lessig also agrees with Kelly. “Kelly was right. And I think it’s the strongest authority Google has. What Kelly says is they take a complete copy; they remake it; they transform it into a version which is not really usable but which is sufficient to index the original copy; they make that index available; and they drive people back to the original copy. That’s exactly what [the library portion of Google Books] is doing.”

The Authors Guild believes Google’s project is substantially different, because in Kelly the photographer posted his work on the Web, knowing that the Web is a public space and that companies index web pages and their contents without asking. “His picture was already on the Internet,” says Constantine. “I think it’s distinguishable because our books are not.”

Macgillivray questions the assumption behind this argument. “Are you saying that people [whose work is] online, that their copyright is less valid than yours as a publisher? We think that people who publish online can be just as creative and just as deserving of the protections of copyright law as anyone else.”

Snippets Versus Full Copies

Charles Ossola, the partner at Arnold & Porter LLP who represented Kelly on appeal, agrees with Lessig in one respect. Kelly “is an important precedent, and the closest one that would be relevant” for the Google case, he says. “But there are a number of differences that are going to be quite significant. They’re trying to analogize snippets to the thumbnails. . . . The analogy is not nearly so simple as Google would have it.”

In Kelly, on the subject of the third factor, a lower court found that Arriba could not have produced a thumbnail image for its index without first making a copy of the entire picture. (The court of appeals did not address this, except to say it was reasonable to copy the whole picture to produce the thumbnail.) Ossola says this may be true for pictures, but it is not necessarily true for books. In order to form an index of a book that serves up snippets, Google doesn’t need to scan the whole book, Ossola argues.

“I don’t think you have the same necessary act of intermediate copying that you arguably had with Kelly,” he says. “Why don’t they just scan the snippets?”

Ossola suggests that a book about the Bolshevik revolution, for instance, could include three search terms: Bolshevik, Lenin, and Trotsky. To index the book with these terms, “All you have to copy are three or four snippets,” he says.

When Google argues that it must create and keep a full digital copy of a book so it can be included in the index, that’s “because of the way they designed the program,” Ossola says. “I don’t think copying the whole book is necessary to do what I understand they’re trying to do, which is alert readers to resources that are available.”

Google counters that its book search is meant to be comprehensive. When every word in a book is indexed, even tangential mentions of a subject can be found. Google says what Ossola is suggesting is a very limited service: a category search, like the one already offered by card catalogs. “There’s certainly the argument that we shouldn’t have innovation, we should have the old card catalog,” says Macgillivray.

He adds, “The question is, at what point do you stop [indexing a work]? Our view in terms of the Web is the best way to index the Web is to index the Web, rather than to make an a priori decision about what words are going to be useful.” Google holds the same view for its book-scanning project.

Ossola also questions the amount of copyrighted material that Google’s search engine will return in response to search terms. “If what they’re saying is in a book of 20,000 words, we’re showing 100, if that’s all they showed, they probably have a fair use argument. . . . Again, it comes back to, are they just trying to make a substitute of the book or just give the reader a glimpse?”

Google argues that its snippets only provide a glimpse. Each search term is rewarded with a maximum of three snippets from the same book, unless the book is in the public domain. If that is true, reading a whole book through the library portion of Google Books would require the same effort as reading a book that has been through a strip shredder.

Ossola says he agrees that three snippets at a time would not be a good substitute for reading a book. But he says a court may find Google is giving away too much copyrighted content if it considers the amount of work that Google is serving up across the board to several searchers using several search terms. Ossola adds that in some cases the snippets may not be considered fair use if a court finds that the snippet goes to the heart of a work. One example: “Call me Ishmael,” the opening line of Moby Dick.

Tale of Two Copies

Google’s supporters believe that too much emphasis is being put on the making of the intermediate copy, which Google uses to feed its index. “The intermediate copies are only a technicality. The only material that is being made available is the snippet. For fair use purposes the court is going to look at material that is made available to the public,” says Representative Rick Boucher (D-Va.), a fair use advocate.

Google’s critics are “fetishizing this idea of copying,” says Lessig. In 1909, when Congress gave copyright holders the exclusive right to copy, “what they were thinking of was printing presses. They weren’t thinking about a schoolgirl writing out a poem 50 times in order to memorize it.” Now that “copying is as common as breathing,” says Lessig, “what any sensible policy maker should do in light of that is say, ‘Does it really make sense for us to be triggering this regulation on this event called copying, as opposed to all of the other uses of a work?’ ”

Some people are concerned that Google may eventually try to sell the full digital copies, or lose control of them so that someone else can. Band says these worries shouldn’t play into the court’s decision on whether Google’s current activities constitute fair use. If the copies fall into other hands and “those people do any infringement, they themselves would be directly liable,” he says.

Band doesn’t understand why people are so worried that someone might manage to hack into Google and steal the entire digital copy of a best seller. “It’s a whole lot easier to walk down to Barnes & Noble, buy a copy, and scan it. Yes, there’s a security issue [with Google’s copy]. But why aren’t you worried about the copies in the stores?”

Ironically, the libraries’ copies may be at higher risk of becoming publicly available. The University of Michigan, for one, has promised to use its copies of copyrighted works only for backup in case the corresponding hard copy is lost or destroyed. But other libraries may make different choices.

Many university libraries now have an “e-reserve” system. With it, reference materials that were once held at the reserve desk are now also available online. Currently, these materials include articles, chapters of books, and videos. Once a library has millions of volumes in digital format, it could feasibly begin offering them through e-reserve.

Copying whole books “under the rubric of fair use allows you to do practically anything you want on e-reserve,” says Sanford Thatcher, director of Penn State University Press. Thatcher is an outspoken fan of the publisher portion of Google Books, but ardently opposed to the library project.

However, if the libraries keep their copies as a backup, lawyers believe the courts will find these copies to be fair use.

The Profit Motive

Google argues that if anything, Google Books will lead to increased sales of books that could not otherwise be found. But the publishers and authors are focusing on a different source of money. They believe that they are losing out on a potential market, since Google could be paying them to put their books in its library, but isn’t.

Ossola believes they have a valid argument. With Google offering its searchable book index for free, the publishers and authors will find it difficult to sell their content to other index makers. “They’ll essentially be preempted. If something is really ‘free,’ then it can be hard to take advantage of a market for it later,” he says.

During a webcast debate last December at the New York Public Library, Adler emphasized Google’s commercial motives and the publishers’ loss of potential revenue. “What [Google is] chiefly doing is directly promoting their search engine. They are a for-profit company. If they are going to directly promote it through use of valuable content, intellectual property created by others, those others at least should have the right to have permission asked, if not to be able to also share [in the profits],” he said. He added, “Why couldn’t we license this to Google?”

Band points out that in fact publishers and authors can license their works to Google through the publisher portion of its program, the one that presents several pages of text at a time. He says the existence of that program undermines any argument that publishers and authors are losing money as a result of the library project.

Band notes that, as of press time, neither the authors nor the publishers had moved for a temporary restraining order. “Usually, if you’re saying, ‘Oh, what they’re doing is terrible and irreparable injury,’ you file for a TRO. Neither has done that, and I find that very interesting. Either they don’t think it’s a big deal or they don’t think they can win.”

Adler says the publishers have not filed for a restraining order because their argument is “about lost opportunity. We’re not arguing there is a hemorrhaging loss because books are being released to the public.”

Constantine says the Authors Guild considered moving for an injunction, but decided not to because Google has publicly said that its initial scanning concentrates on works in the public domain. If the guild learns that scanning of copyrighted works has begun, it will reconsider bringing a motion for an injunction.

The Greater Good?

As to why Google isn’t licensing content for its library project, Google maintains it doesn’t need to because its scanning is fair use. “What’s amazing here is that no one would have suggested that it is illegal to create a library card catalog in the analog world. To do it well, and to do it in the digital world . . . requires making a copy,” Google’s general counsel, David Drummond, said during the December webcast.

Drummond argued that Google Books is effectively a card catalog because it will be used the same way. “It just helps you go find things. It just seems like it would be a tragedy if you wanted to come into a library and actually look up a book and try to find it, [that you would] have to pay a toll to somebody just because we’re in the digital age.”

Lessig predicts that when the court weighs the benefit of the library portion of Google Books against the publishers’ and authors’ claims of lost revenues, it will decide for Google. “People are racing. They are so excited about filing a lawsuit against one of the most successful, richest companies in the world. If [the plaintiffs are] successful and this is not allowed, three quarters of our printed knowledge might not be accessible.”

Lessig is basing his estimate on the three quarters of books that are orphans, that is, works in copyright whose owners are unknown or dead and therefore difficult or even impossible to locate.

He adds, “It’s hard to say that any payment that authors and publishers might get of a few cents for a search is worth the loss to our culture of three quarters of our printed knowledge.”

Ossola doesn’t think the court will find that the public benefit outweighs the potential loss to the rights holders. “The argument that all infringers make is ‘What I’m doing is going to be good for you. . . .’ The Second Circuit, of all places, is pretty good at cutting through the Gee-this-is-good-for-copyright-owners argument. Because the truth is, it generally isn’t.”

If the courts hold that Google’s library book-scanning project is protected by fair use, “then I think there will be lots of other players including Google that try to apply that ruling to all different types of media,” including audio and video, says Joy Butler, a solo practitioner who specializes in entertainment and copyright law.

Several companies including Google already offer some form of video or audio search, but none have yet announced any plans to digitize video or audio without permission.

Band says that Google could feasibly argue that snippets of lengthy video or audio recordings are fair use, but that argument will be limited in cases in which the snippets themselves are marketable. For instance, there is a market for ring tones for cell phones. “It turns out that you do have markets for relatively short pieces of music,” he says. In these markets, presenting snippets “is more likely to harm a market,” he says.

Butler points out that in 2003 a federal court had already ruled that two-minute video snippets used as trailers are not protected by fair use because there is an established market for trailers. In Video Pipeline Inc. v. Buena Vista Home Entertainment Inc. the U.S. Court of Appeals for the Third Circuit decided that a video distribution company, which originally had permission to include movie trailers in physical video rentals, did not also have the right to market those trailers over the Internet. Even though the trailers were short (two minutes long), the court found that trailers have become “valuable entertainment content in their own right.”

Another problem: courts are less likely to find fair use when the material used goes to the heart of the work. A service that fully searches a video will necessarily include all of the video, including hugely important highlights, like the game-winning home run of a World Series, or the moment where Harry Potter’s enemies draw his blood.

Time for an Update?

The lawyers interviewed for this article disagreed on whether fair use needs to be overhauled for the digital age.

“Any technological innovation pushes copyright law, and copyright law has to come on afterwards and be interpreted to apply or adapt. Initially it wasn’t clear whether a song on radio was a public performance and Congress clarified that. You can see something similar happening here,” says Wimmer. “But it’s hard to imagine Congress trying to step up to the plate and coming up with some royalty scheme for book publishers.”

Band thinks no change is necessary. “Fair use doctrine is doing just fine. It’s as flexible as it needs to be. It is able to adapt the law to new situations that are unforeseen. It would be a mistake to start tinkering with section 107 itself.”

Lessig says that although “the framework is good” and allows for evolution, the doctrine should be clearer. He recommends a specific list of items that would be protected by fair use so that all the people who publish on the Web will be clear on what is protected and what is not. Without more specifics, “fair use is the right to hire a lawyer, and for very few people is that a valuable right,” he says. “If copyright law is really going to regulate everybody that uses technology, then you’ve got to make the law so anyone can understand the law.”

Lessig is one of many critics who believe that the 1998 Digital Millennium Copyright Act threatens to extinguish fair use. Among other things, the act makes it a crime to circumvent any device that is designed to prevent the theft of digital content. The goal of the law was to fight piracy, especially the theft of software and movies.

To protect fair use, the act contains a provision that no part of it “shall affect rights, remedies, limitations, or defenses to copyright infringement, including fair use. . . .” But in practice the law allows content owners to digitally lock down their content against all uses, including fair uses.

For instance, some audio compact discs now contain a mechanism that prevents users from making a backup copy, an act that is protected under fair use. Representative Boucher is sponsoring a bill that would require such CDs to be clearly labeled to warn consumers that they can’t be copied.

The Way of Progress

The fate of Google Books is unclear. One possibility is that business events will overtake the facts of the cases against Google.

For instance, last December, as the plaintiffs were lobbing accusations of thievery while Google countered with complaints about those who stand in the way of progress, HarperCollins announced its own solution. The publisher will pay to scan its own titles into its own digital warehouse. It will allow any search engine to search that warehouse, but it will not allow them to make or keep copies. In this way HarperCollins maintains control over its titles.

Although the solution poses some technical problems—it’s uncertain how the digital copy would be formatted in order to interface with different brands of search engines—assuming that HarperCollins owns the electronic reproduction rights for each title, its arrangement doesn’t present the legal questions that Google’s does.

Adler says that HarperCollins’s project shows that there really is a way for Google to build a comprehensive book search database without making full digital copies of copyrighted works for anyone else besides the copyright holders. “The whole reason this came to a head,” Adler says, referring to the lawsuits, “is Google’s insistence that [the Google Books setup] is the only way to make the project work.”

Adler hopes to avoid court. “In some cases, [filing a lawsuit] makes people more willing to sit and try to reach a mutual agreement,” he says. “If Google is willing to pick up that discussion again, I think it will find the publishers are willing to pick up that discussion as well.”

Freelance writer Joan Indiana Rigdon is a frequent contributor to Washington Lawyer.
