Roundup

[Image from the Million Book Project]

Two baby librarians were born to my wonderful friends from the Cornell library. Laih and Anika are both nine and a half pounds, super-smart, and will be entering the University of Michigan’s Information program in 2028. Congratulations Kim and Brian, and Clay and Mike!

I attended a seminar on the Million Book Project last night. There were a buncha computer science folks there, and the usual gang of library folks. I was horrified by the scale of the project, which involves sending hundreds of shipping crates of books to India and China for scanning, and wrangling terabytes of information that takes a week to even copy. Yow. One thing I found curious was their decision to do bitonal scanning instead of grayscale. I’m told that this results in “no real information loss” but many of their pages are really hard to read because of the lost grayscale info. In many cases the entire nature of the page is lost, and the resulting image looks like a rubber stamp of the Mona Lisa. Does anyone with more experience in these matters have thoughts on the subject? They scan at 600 DPI, so it can’t just be a matter of storage space.
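For anyone who wants to put rough numbers on the storage question, here is a back-of-the-envelope sketch. The 6 x 9 inch page size is purely my assumption, not a project figure, and the sizes are uncompressed; real scans are compressed, so actual ratios will differ.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8

DPI = 600                    # scanning resolution mentioned in the post
WIDTH_IN, HEIGHT_IN = 6, 9   # assumed page size, not a project figure

bitonal = raw_page_bytes(WIDTH_IN, HEIGHT_IN, DPI, 1)  # 1 bit per pixel
gray8 = raw_page_bytes(WIDTH_IN, HEIGHT_IN, DPI, 8)    # 8 bits per pixel

print(f"bitonal page:      {bitonal / 1e6:.2f} MB")
print(f"8-bit gray page:   {gray8 / 1e6:.2f} MB")
print(f"gray / bitonal:    {gray8 // bitonal}x")
```

Under those assumptions, 8-bit grayscale is eight times the raw size of bitonal per page, before any compression.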

In other news, there was a duck in one of our trees this morning, a female mallard who was driven to extremes by an ardent male duck. She was up there for about five minutes, wobbling around until she finally fell into the pond.

2 Replies to “Roundup”

  1. Grayscale is, mm, eight or sixteen times as much info per dot as is bitone? I’m pinging pulledoutofmyhat.com here, but if it’s already a problematic quantity of data, multiplying it eightfold does count.

    OTOH, better to have one-eighth as many readable pages.

    How much do they expect to be able to change the process as they learn what works?

    clew
