Books 'R' Google

By Robert Darnton
The New York Review of Books
Volume 56, Number 2, February 12, 2009

Edited by Andy Ross

Google has digitized millions of books and made the texts searchable online.

When fields of knowledge turned into professions and university departments, professional journals sprouted throughout the fields. Commercial publishers made a fortune by selling subscriptions to the journals. They could ratchet up prices without causing cancellations, because the libraries paid for the subscriptions and the professors did not. And the professors provided free labor: they wrote the articles, refereed submissions, and served on editorial boards.

When businesses like Google look at libraries, they see potential assets, content, ready to be mined. Built up over centuries at an enormous expenditure of money and labor, library collections can be digitized en masse at relatively little cost. To digitize collections and sell the product in ways that fail to guarantee wide access would be to repeat the mistake that was made when publishers exploited the market for scholarly journals, but on a much greater scale.

Four years ago, Google began digitizing books from research libraries, providing full-text searching and making books in the public domain available on the Internet at no cost to the viewer. Google collected revenue from some discreet advertising attached to the service. Google also digitized an ever-increasing number of library books that were protected by copyright in order to provide search services that displayed small snippets of the text. In September and October 2005, a group of authors and publishers brought a class action suit against Google. In October 2008, the opposing parties announced agreement on a settlement.

The settlement creates a registry to represent the interests of the copyright holders. Google will sell access to a gigantic data bank composed primarily of copyrighted, out-of-print books. Organizations will be able to subscribe via an institutional license for access to the data bank. A public access license will make this material available to public libraries. Individuals will be able to access and print out digitized versions of the books by purchasing a consumer license. Google will retain 37 percent of the revenue, and the registry will distribute 63 percent among the copyright holders.

Of the seven million books digitized by November 2008, one million are works in the public domain, one million are in copyright and in print, and five million are in copyright but out of print. Google will continue to make books in the public domain available for users to read, download, and print, free of charge. Many of the books in copyright and in print will not be available in the data bank unless the copyright owners opt to include them. They will be sold as printed books and perhaps also as digitized copies via the consumer license. Most of the books covered by the institutional license are in copyright but out of print.

The proposal could result in the world's largest digital library. Google could also become the world's largest book business. Virtually all books will be brought within the reach of anyone with access to the Internet. Not only will Google bring books to readers, it will also open up extraordinary opportunities for research.

Google did not set out to create a monopoly. But the class action character of the settlement makes Google invulnerable to competition. Most book authors and publishers who own US copyrights are automatically covered by the settlement. No new digitizing enterprise can get off the ground without winning their assent.

This outcome was not anticipated at the outset. We missed a great opportunity. We could have created a National Digital Library. It is too late now. Not only have we failed to realize that possibility, but we are allowing the control of access to information to be determined by a private lawsuit.

Google will enjoy a monopoly of access to information. Google has no serious competitors. Google alone has the wealth to digitize on a massive scale. And having settled with the authors and publishers, it can exploit its financial power from within a protective legal barrier. No new entrepreneurs will be able to digitize books within that fenced-off territory. Only Google will be protected from copyright liability.

This is a tipping point in the development of the information society.
 

Google Book Search

By Robert Darnton
The New York Review of Books
Volume 56, Number 20, December 17, 2009

Edited by Andy Ross

Google has by now digitized some ten million books. On what terms will it make those texts available to readers? The terms of the settlement will have a profound effect on the book industry for the foreseeable future.

Google plans to enable consumers to purchase access to millions of copyrighted books currently in print, with payment going to authors and publishers as well as Google. Books covered by copyright but out of print, at least seven million in all, will be available through subscriptions paid for by institutions such as universities. The database, along with books in the public domain that Google has already digitized, will constitute a gigantic digital library.

But Google's dominance of access to books will reinforce its power over access to other kinds of information, raising concerns about privacy, competition, and commitment to the public good. As a commercial enterprise, Google's first duty is to provide a profit for its shareholders, and the settlement leaves no room for representation of the public.

Google Book Search (GBS) will certainly be challenged by groups and individuals who claim they were not fairly represented in the classes of authors and publishers. The case may take years to work its way through the courts. As the first step toward a resolution, the filing on November 13 suggested just how far Google is willing to go in modifying the original settlement.

The governments of France and Germany urged the court to reject the settlement. Far from seeing any potential public good in it, they condemned it for creating an "unchecked, concentrated power" over the digitization of a vast amount of literature and for doing so by a "commercially driven" agreement negotiated "in secrecy." In contrast to the commercial character of Google's enterprise, both governments stressed the higher values represented by their national literatures.

The French emphasized the unique character of the book, which, they claimed, would be compromised by Google's commitment to commercialization. The Germans spoke in the name of "the land of poets and thinkers," but they laid most stress on the right of privacy, which, they argued, Google could threaten. Both governments then listed a series of subsidiary arguments:

1. The settlement gives Google a virtual monopoly over orphan works, even though it has no claim to their copyrights.

2. Its opt-out provision, which means that authors will be deemed to have accepted the settlement unless they notify Google to the contrary, violates the rights inherent in authorship.

3. It contains a provision that prevents a potential competitor from obtaining better terms than Google in any new commercial uses of the digitized books. The terms of such future enterprises will be determined by a Book Rights Registry composed of representatives of the authors and publishers.

4. It gives Google the power to censor its database by excluding up to 15 percent of the digitized works.

5. Its guidelines for pricing will promote Google's commercial interests, not the good of the public, through the use of algorithms created by Google according to Google's secret methods.

6. It favors secrecy in general, hiding audit procedures, preventing the public from attending meetings in which Google and the Registry will discuss library matters, and even requiring Google, the authors, and publishers to destroy all documents relevant to their agreement on the settlement.

Above all, the French and Germans condemned the settlement for sanctioning the "uncontrolled, autocratic concentration of power in a single corporate entity," which threatened the "free exchange of ideas through literature."

The same points were made in a hearing before the European Commission in September by the International Federation of Library Associations (IFLA), the European Bureau of Library, Information and Documentation Associations (EBLIDA), and the Ligue des Bibliothèques Européennes de Recherche (LIBER).

All three stressed the danger that "a large proportion of the world's heritage of books in digital format will be under the control of a single corporate entity." They summoned up the prospect of a digital library of 30 million books and concluded that Google would exercise something close to hegemony in the book world. They appealed to the European Commission to defend the interests of the public.

The U.S. Department of Justice pointed to serious difficulties with the settlement and suggested the following changes:

1. Require rights-holders of out-of-print books to participate in the settlement by opting in instead of operating from the assumption that they had agreed to participate unless they opted out.

2. Do not distribute the profits from the sale of orphan books to the parties of the settlement but rather use the money to fund a thorough search for the unknown rights-holders.

3. Appoint guardians to protect the interests of orphan rights-holders by serving on the registry.

4. Find some mechanism by which potential competitors to Google could gain access to orphan works without exposure to suits for infringement of copyright.

5. Prevent Google from using out-of-print works in new commercial products without the owner's permission.

The revised settlement, or GBS 2.0, released on November 13, reads as if Google and the plaintiffs took most of their cues from the DOJ recommendations. GBS 2.0 provides that the Registry will include a court-appointed guardian to represent the rights-holders of unclaimed books. But Google alone would enjoy immunity from prosecution by any rights-holders.

As to revenue from the sale of orphan books, GBS 2.0 accepts that the money will not go to Google and the plaintiffs but will be spent on efforts to search for the unidentified rights-holders. GBS 2.0 also allows Google's competitors to license out-of-print books in retail enterprises, although Google would maintain exclusive control of the institutional subscriptions to its gigantic database.

How the prices will be set remains unclear. GBS 2.0 contains no effective mechanism to prevent price gouging, no provision for a public authority to monitor prices, and no way to protect the public from excessive pricing should Google be taken over in the future by rapacious speculators.

GBS 2.0 does not therefore differ in essentials from GBS 1.0. It largely ignores the objections of foreign governments, except by narrowing the scope of GBS to books published in the United States, the United Kingdom, Canada, and Australia. GBS will not cover books published in countries like France and Germany.

One can imagine two general solutions to the problems posed by GBS, one maximal, one minimal.

The most ambitious solution would transform Google's digital database into a truly public library. An act of Congress would clear up a messy legal landscape and give the American people a national digital library equal to the needs of the twenty-first century.

A minimal solution could be devised for the private sector. Congress would legislate to protect the digitization of orphan works from lawsuits, but it would not appropriate funds. To avoid conflict with market interests, the database would include only books in the public domain and orphan works. At the rate of a million books a year, we would have a great library, free and accessible to everyone, within a decade.
 

The Future Of Publishing

By Jason Epstein
The New York Review of Books
Volume 57, Number 4, March 11, 2010

The digitization of the book publishing industry is now irreversible. The publishing industry's capital stock faces dissolution within a vast cloud in which all the world's books will eventually reside as digital files to be downloaded instantly title by title wherever on earth connectivity exists.

Digitization makes possible a world in which anyone can be a publisher and anyone can be an author. In this world, the traditional filters will have melted into air and only the human inability to read what is unreadable will remain to winnow what is worth keeping. Amid the chaos, readers will be guided by the imprints of reputable publishers. The more adaptable of today's general publishers will survive.

The difficult, solitary work of literary creation demands rare individual talent and in fiction is almost never collaborative. Until it is ready to be shown to a trusted friend or editor, a writer's work in progress is intensely private. Informed critical writing of high quality on general subjects will be as rare and as necessary as ever and will survive as it always has in print and online for discriminating readers.

The cost of entry for future publishers will be minimal, requiring only the upkeep of the editorial group and its immediate support services but without the expense of traditional distribution facilities and multilayered management. Traditional territorial rights will become superfluous and a worldwide, uniform copyright convention will be essential. Protecting content from unauthorized file sharers will remain a vexing problem. If I were a publisher today, I would consider a renewable rental model for all e-book downloads.

Literary form has been remarkably conservative throughout its long history. Actual books, printed and bound, will continue to be the irreplaceable repository of our collective wisdom. My rooms are piled from floor to ceiling with books. I mention this so that you will know the prejudice with which I celebrate the inevitability of digitization.
 

Googled

By John Lanchester
The Observer, February 21, 2010

Googled: The End of the World as We Know It
By Ken Auletta
Virgin Books, 400 pages

No company in history has grown as fast as Google. Within 400 weeks of its founding, it was earning revenues of $20 billion a year. The 1998 start-up has reached deep into the everyday experience of millions, put itself in the centre of the internet culture that is defining the new century, and had a disruptive impact on some industries and a potentially terminal one on others. Google is one of the wonders of the world.

Since Google's motto is "Don't be evil", people hold it to a high standard. Sergey Brin and Larry Page don't ask for permission: they do what they want to do, and rely on the fact that people will understand the point of it afterwards. The basic move in Google's rise to dominance was copying stuff without asking. Don't ask for permission, and rely on the fact that people will love the results when they see them. This model has stood the company in very good stead, but it plainly involves an attitude in which innocence and arrogance are emulsified together.

Auletta looks at the company in its pomp, and sees problems and threats everywhere. At one point in 2008, Google was offering 150 products. Only targeted advertising made real money. YouTube lost $500m in 2009. Google's programme to digitise books has caused a bitter backlash. That was an example of the no-permission policy going badly wrong, because as Brin told Auletta, if they had asked authors and publishers, "we might not have done the project".

Google's mission is "to organise the world's information and make it universally accessible and useful", but that doesn't extend to its own intellectual property, which it guards with ferocity. As its share prospectus says: "Our patents, trademarks, trade secrets, copyrights and all of our other intellectual property rights are important assets for us ... any significant impairment to our intellectual property rights could harm our business or our ability to compete." It's hypocritical to pretend that the same isn't true for everybody else.
 

AR  February 2009: I guess Google will work in the perceived public interest, either so as not to be evil or because the public authorities demand it. In the latter case, the public interest will be American. We won't have a globally effective legal framework for such issues for a while yet.

November 2009: The issue is big enough to take very seriously. We cannot merely hope that Google will always do the right thing. I guess Darnton's "ambitious" solution is the best — perhaps then we can hope that the European Union will get on board and make the result a truly global repository.

February 2010: Publishers will need to do deals with Google and Amazon. That's no problem — publishers have always done deals to secure their business. And Google will have to grow up. That may be a bigger problem.