Paris

Text appeal: Jacques Chirac urges the public digitizing of books in response to Google's project. Credit: M. HAYHOW/AP

French president Jacques Chirac instructed his government last week to come up with proposals for digitizing the collections of libraries in France and other European countries.

His statement, issued on 16 March, asked Renaud Donnedieu de Vabres, France's minister of culture, and Jean-Noël Jeanneney, the president of the Bibliothèque nationale de France, to come up with proposals to accelerate the dissemination of French and other European works on the Internet. He called on France and Europe to take “a major role” in a “vast digitization of knowledge”.

Chirac's move is widely interpreted as a response to Google's announcement late last year that it intends to scan millions of library books over the next ten years, primarily from collections at the universities of Harvard, Stanford, Michigan and Oxford, and from the New York Public Library. But Google's plan is being viewed with trepidation by backers of existing public-domain projects that aim to do the same kind of thing.

One backer of the public-domain approach is Brewster Kahle, founder of the Internet Archive project, based in San Francisco. In December, the Internet Archive teamed up with Carnegie Mellon University, the Library of Congress American Memory Project and universities in Canada, Egypt, India, China and Europe to digitize 9 million books over the next four years. More than 50,000 of them will be digitized by the end of this month.

Kahle says that the Google project could have three possible outcomes. The first is that funding for public-domain projects could dry up, with library collections effectively being privatized by Google. Alternatively, the Google move might result in healthy competition and an increased demand for a public-domain service, Kahle says. He cites as a precedent the human genome project, where the private company Celera's plans to sequence the genome galvanized the public consortium's determination to deliver its own version. The third possibility, Kahle says, is that Google might collaborate successfully with the public-domain efforts.

The Internet Archive's annual administrative costs of about $2 million are met by grants from the US National Science Foundation, the Library of Congress, national archives such as those in Britain and France, and philanthropic bodies such as the William and Flora Hewlett Foundation. But its scanning costs, which could amount to $230 million over four years, are due to be paid by participating libraries. There is now “fear, uncertainty and doubt” over whether they will pay, says Kahle, with some libraries “waiting to see if they can get a handout from Google” instead.

Michael Hart, the founder of Project Gutenberg — the first ambitious attempt to digitize libraries, launched in 1971 and based in Urbana, Illinois — expresses concern about the proprietary nature of the Google project. He fears that his and other public projects could be hurt if funders think Google can do the job alone.

A public effort is essential, argues Hart, because it should provide users with access to the full text of books and high-quality images that they can use in whatever way they wish, without restriction. In contrast, Google's current system allows users to search texts online and to browse images, but provides access to only a small portion of the texts.

However, Raj Reddy, a computer scientist who is the founder and director of the Universal Digital Library at Carnegie Mellon University, welcomes the competition from Google, and says that, if anything, it should increase support for public-domain projects. “The more the merrier,” says Reddy, whose project scans a million pages a day and has already indexed some 100,000 volumes as part of a plan to scan one million books. “It is going to take us a long time to digitize all these things.”