
There was a fascinating article in the Boston Globe (8/17/08, Ideas section) on how ordinary people are now deciphering words from decaying old books and helping to transform historic texts into new digital files. Libraries worldwide are scanning millions of pages of old books with digital cameras, then using OCR (optical character recognition) software to "read" the page images and turn them into digital text.
The trouble is that OCR software gets hung up on smudged or faded words that humans can easily decipher. The system developed by Luis von Ahn of Carnegie Mellon University takes those messy bits of text and places them as mystery words on websites. As people solve those logon puzzles, they also decode a real word. Dubbed "reCAPTCHA," the system is used on some 40,000 websites and has solved more than 44 million words in one year. You can even add it to your site or blog if you want to be part of the solution!
The results are used to correct the text and build clean copies of the books. It's more complex than that, of course, with a system in place to verify the accuracy of the human helpers. But it's nice to know that the next time you have to squint and tilt your head to figure out what those characters are, you might just be saving an old book for future readers.
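For the curious, here is a rough sketch in Python of how that kind of accuracy check might work. It is my own illustration, not reCAPTCHA's actual code. It assumes each puzzle pairs a "control" word the computer already knows with a word it couldn't read, and that a guess at the mystery word only counts if the control word was typed correctly. The function names and the thresholds (`min_votes`, `min_share`) are made up for the example.

```python
from collections import Counter


def record_answer(votes, control_truth, control_answer, unknown_answer):
    """Count a vote for the mystery word only if the control word was right.

    Returns True if the vote was counted, False if it was thrown out.
    """
    if control_answer.strip().lower() != control_truth.strip().lower():
        return False  # failed the known word, so the guess is not trusted
    votes[unknown_answer.strip().lower()] += 1
    return True


def accepted_reading(votes, min_votes=3, min_share=0.75):
    """Return the agreed-upon reading, or None if agreement is too weak.

    min_votes and min_share are hypothetical thresholds for illustration.
    """
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None


# Example: several readers see the same smudged word.
votes = Counter()
record_answer(votes, "morning", "morning", "upon")
record_answer(votes, "morning", "Morning", "Upon")
record_answer(votes, "morning", "mourning", "open")  # control wrong, ignored
record_answer(votes, "morning", "morning", "upon")
print(accepted_reading(votes))
```

The idea is simply that no single reader is trusted on their own: a word is accepted only after enough people who passed the control check agree on it.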
2 comments:
Interesting, Peg. But I could do without captcha on some blogs. Sometimes Blogger gets goofy with word verifs and we turn them off for a while.
That's the problem with technology, isn't it, Mary? Sometimes it gets too clever for its own good! No matter how "smart" they get, the machines won't ever replace good old human intuition and creativity.