28 July 2008
Open Library: Online Books Resource
While I have a lot of use for Google Books, and utilize the full-view content available from that site quite frequently (enough to begin to create an index of the genealogically significant ones), I am certainly open to other sites and projects that aim to bring the paper and digital worlds together. In this post I look at the Open Library project, and the potential it has to be a huge research timesaver.
What It Is.
Open Library's tagline is "One web page for every book." As in, "every book ever published", a self-described lofty goal. The site ultimately seeks to populate each of these pages with publication information, links to purchase and borrowing options, as well as links to online versions where available.
If you're thinking this sounds an awful lot like WorldCat, you're not alone. The difference, Open Library stresses, is that this project is open-source, and will be worked upon by the public through a Wiki interface. Users can edit book pages by doing things like adding TOCs, descriptions, publication information, etc. The site is still in Beta, so content is limited somewhat, although they have stub pages for, according to the site, over 13 million books (over 200,000 of which are scanned and readable online). User participation, predictably, is low at this point, so it is difficult to tell what a fully fleshed-out page on Open Library would look like.
What It Could Be.
The benefit and promise of Open Library (as I see it) is its potential to become a quality aggregator site for available online materials. I personally don't see much use for the borrowing or purchasing information as that is readily available through sites like WorldCat or even online book sellers. Of course, if Open Library were to integrate some data from OCLC, it could make the site even more valuable as a one-stop portal for finding any and every book you may ever need.
As for the search capabilities for scanned books, a sample search I ran on "Jones genealogy" returned 13 books, drawn from both Internet Archive and Google Books.
A "Scanned books only" search on "genealogy" returned over 1,000 books. (It is unclear which field a simple search from the main page is actually searching; a similar search for full-view books on Google Books yields over 7,000 entries, but of course those searches default to full-text searches. Full-text search is currently unavailable on Open Library, though it already appears as an option under Advanced Search, so it should be coming.)
One runs into problems with the site at times due to its overwhelming thoroughness. Harnessing the public's work ethic for a labor of love is an ingenious idea, but I worry about the dilution of effort for certain books when every edition of a book gets its own page. Douglas Adams' Hitchhiker's Guide to the Galaxy has 60 separate entries; Shakespeare's Hamlet has 2,781. Which page do you choose to work upon? Which do you ignore? If one page has a robust entry, and 59 others are empty, what is the value to the user in having to wade through 59 unelaborated pages in order to reach the one with the information he or she is seeking? Will that user even bother, or just give up?
While most genealogy books aren't going to see 60 entries (3, more or less, is closer to the mark), there are multiple identical editions of some books, so the issue still stands. A case in point is Elizabeth Shown Mills' Professional Genealogy.
Duplicates like this are plentiful on Open Library at the moment, although I would imagine that with time and a dedicated user base, these things will eventually get cleaned up. Of course, separate pages for multiple editions of a single book can be useful if user-supplied notes and descriptions manage to distinguish the editions, note errata, etc.
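For the technically inclined, here is a rough sketch of how duplicate pages like these might be spotted automatically. It is only an illustration under my own assumptions: the record layout, the page ids, and the function names are all made up, and nothing here reflects how Open Library actually stores or merges its data. The idea is simply to reduce each title and author to a normalized form and group the pages that end up matching.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def author_key(name):
    """Sort the name parts so 'Mills, Elizabeth Shown' and
    'Elizabeth Shown Mills' produce the same key."""
    return " ".join(sorted(normalize(name).split()))


def group_editions(records):
    """Group page ids whose normalized title and author match."""
    groups = defaultdict(list)
    for record in records:
        key = (normalize(record["title"]), author_key(record["author"]))
        groups[key].append(record["id"])
    return dict(groups)


# Hypothetical records for illustration only; the ids are invented.
records = [
    {"id": "page-1", "title": "Professional Genealogy",
     "author": "Elizabeth Shown Mills"},
    {"id": "page-2", "title": "Professional genealogy.",
     "author": "Mills, Elizabeth Shown"},
    {"id": "page-3", "title": "Hamlet",
     "author": "William Shakespeare"},
]

for key, ids in group_editions(records).items():
    if len(ids) > 1:
        print("Possible duplicates:", key, ids)
```

A match on title and author alone would also lump together genuinely different editions, so a real cleanup would still need a person (or more fields, such as publisher and year) to decide which pages are true duplicates.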
Open Library is still in its infancy, possibly even prenatal. I would label this site one to keep an eye on. If they move effectively toward their goal, Open Library could be an enormous boon to online research. I am a huge fan of any project that makes books more accessible online. I believe that is the future of media (whether or not the publishing companies kick and scream the entire way there), and a website such as this one that operates outside of the corporate environs could be a huge benefit to everyone, genealogists included.