Thursday, September 11, 2008

Are You Helping Decipher Vintage Texts?

You know those funky skewed letters you have to type sometimes when you want to leave a comment on a blog or confirm an online order? Well, there’s a name for that: CAPTCHA. Sounds kind of like “gotcha,” doesn’t it? But it’s actually an acronym that means Completely Automated Public Turing test to tell Computers and Humans Apart. Thank Luis von Ahn, of Carnegie Mellon University, who helped develop the security technique that is intended to foil the intrusion of bots.

There was a fascinating article in the Boston Globe (8/17/08, Ideas Section) on how people are now deciphering words from decaying old books and helping to transform historic texts into new digital files. Libraries worldwide are photographing millions of pages of old books with digital cameras, then using OCR (optical character recognition) to "read" the texts and turn them into digital files.

The trouble comes when age takes its toll on pages, and the old type smudges or flakes off the page. Computer software gets hung up on words that humans can easily decipher. The system developed by von Ahn takes those messy bits of text and places them as mystery words on websites. As people solve those logon puzzles, they also decode a real word. Dubbed "reCAPTCHA," the system is used on some 40,000 websites and has solved more than 44 million words in one year. You can even add it to your site or blog if you want to be part of the solution!

The results are used to correct the text and build clean copies of the books. It's more complex than that, of course, with a system in place to verify the accuracy of the human helpers. But it's nice to know that the next time you have to squint and tilt your head to figure out what those characters are, you might just be saving an old book for future readers.
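
For the curious, here is a rough Python sketch of how that kind of accuracy check could work. It assumes each puzzle pairs the mystery word with a second word the system already knows, and that a reading is accepted only after several people agree on it. The function names, the sample words, and the three-vote threshold are all made up for illustration; this is not the actual reCAPTCHA code.

```python
from collections import Counter


def record_answer(votes, control_word, typed_control, typed_unknown):
    """Count a guess for the mystery word only if the known word was typed correctly."""
    if typed_control.strip().lower() != control_word.lower():
        return False  # probably a bot or a typo, so the guess is ignored
    votes[typed_unknown.strip().lower()] += 1
    return True


def accepted_reading(votes, min_votes=3):
    """Return the most common guess once enough people agree, otherwise None."""
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Hypothetical answers: (what the person typed for the known word, their guess at the mystery word)
votes = Counter()
answers = [("river", "morning"), ("rivr", "mourning"), ("river", "morning"), ("River", "Morning")]
for typed_control, typed_unknown in answers:
    record_answer(votes, "river", typed_control, typed_unknown)

print(accepted_reading(votes))
```

In this made-up run, the second person mistyped the known word, so their guess is thrown out, and the other three agree on the same reading.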


Mary said...

Interesting, Peg. But I could do without captcha on some blogs. Sometimes blogger gets goofy with word verifs and we turn them off for a while.

Peg Silloway said...

That's the problem with technology, isn't it Mary? Sometimes it gets too clever for its own good! No matter how "smart" they get, the machines won't ever replace good old human intuition and creativity.