June 16, 2013

Scanning the Middle Ages

Fascinating letter to Dr. Jerry Pournelle from one of his readers. The technology is paying some really wonderful dividends:
THE END OF OBSCURITY
Dear Jerry :

As we have both experienced the often-frustrating reality of "original archival research" in the great libraries of the world, I want to report that change is in the dusty air. It used to be the case that the more distant events were in time, the less the likelihood of retrieving novel information about them. The problem was not the lack of ancient records, but their sheer abundance.

There is nothing novel about the latest NSA privacy scandal: the tendency of state bureaucracies and courts to gather and hoard information about citizens is as old as time, and it is from the court's own realization of the horrors of information retrieval in bottomless archival pits that modern statutes of limitation have arisen.

The consequence of manuscript hoarding was to sink most of the historical record in oceans of trivia deep enough to drown all but the most persevering scholars. You could easily spend a month in the archives or the stacks retrieving just one new kilobyte to add to the sum of history, and far more of that time would be spent flipping through thousands of cards in a paper catalogue than reading the few documents you elected to retrieve.

Nowhere was this problem more evident than in the dozens of Staatsbibliotheken holding the gathered sum of paper once held in the archives of the 300-odd principalities and city-states that preceded the unification of Germany under Bismarck. This archival opacity did not pass unnoticed, and a few decades ago many foundations, like Volkswagen, committed future cash flows to synoptic efforts to map both archives and archaeology with equally Teutonic thoroughness. In short, they decided to upload the Middle Ages.

But as the foundation-subsidized scanning began, something unexpected happened. Computer search software got smarter at a pace eclipsing Moore's Law, and the project began to run ahead of schedule, as software fixes reduced the redundancy of uploading the same documents from many different archives, creating a positive feedback that eliminated multiple record entries that wasted scholarly reading time. So while a generation ago it could take a lifetime of scholarly stack time to find enough new material to extend history by a handful of pages, the intellectual productivity of the paper chase has soared.

Today anybody can go online and find material that holds new meaning in a matter of hours rather than months.
Much more at the site -- fascinating meld of history and geekdom. Two subjects near and dear to my heart.

Posted by DaveH at June 16, 2013 3:48 PM