Where Old, Unreadable Documents Go to Be Understood (atlasobscura.com)
From a report: On any given day, from her home on the Isle of Man, Linda Watson might be reading a handwritten letter from one Confederate soldier to another, or a list of convicts transported to Australia. Or perhaps she is reading a will, a brief from a long-forgotten legal case, an original Jane Austen manuscript. Whatever is in them, these documents made their way to her because they have one thing in common: They're close to impossible to read. Watson's company, Transcription Services, has a rare specialty -- transcribing historical documents that stump average readers. Once, while talking to a client, she found the perfect way to sum up her skills.
[...] Since she first started specializing in old documents, Watson has expanded beyond things written in English. She now has a stable of collaborators who can tackle manuscripts in Latin, German, Spanish, and more. She can only remember two instances that left her and her colleagues stumped. One was a Tibetan manuscript, and she couldn't find anyone who knew the alphabet. The other was in such bad shape that she had to admit defeat. In the business of reading old documents, Watson has few competitors. There is one transcription company on the other side of the world, in Australia, that offers a similar service. Libraries and archives, when they have a giant batch of handwritten documents to deal with, might recruit volunteers.
I'd want to see this lady decipher the scribbling of a doctor I visited with foot pain recently. There's the Voynich Manuscript, then there's this.
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
The reCAPTCHA service does two things. Verifying a user is a human by offering something that's really hard to automate is the one everybody knows about. The other is an effort to crowdsource understanding of images. This started with decoding the words in scanned books that OCR was having difficulty with.
There's your competition (though it's admittedly restricted to modern texts, so historical context and historical characters are beyond its scope ... and reCAPTCHA has recently moved on to other forms of image recognition.)
Use my userscript to add story images to Slashdot. There's no going back.
Try reading some "intellectual property" in the future!
Hidden away in some corporate basement. Encrypted, with the key servers shut down long ago...
Researchers complain that we already have the second dark ages[1], starting with the invention of "copyright"[2].
THIS is unreadable.
There was a time when Germany started to be called "the land of poets and thinkers". It was the time when Germany didn't have such laws but the UK already did. Art thrived and flourished in Germany, and starved in the UK.[3]
(Let's just hope our systems become powerful enough, the corporations don't live on forever, and they don't use one-time pads.)
___
Note 1: Which is a term referring to the lack of information from that era.
Note 2: Which should really be called "imaginary distribution monopoly privilege, for the purpose of leeching off of artists and fans without working for it in return".
Note 3: And Germany still doesn't really have it. They have something that is often confused with copyright, but differs in all key points: It is not a distributor's privilege, but that of the actual creator of the work. It is implicit and not explicit, depending only on the threshold of originality, making (c) marks unnecessary. And it can never be signed away to anyone else. (You can license it, of course. But you can never lose control.) So all the things that copyright states it would do ... i.e. grant a privilege to the actual creators ... but deliberately doesn't.
I would assume it's on /. because it's interesting "stuff that matters"....
Stephan
to be devoured by some ancient evil or long dead civilization.
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/
It must have been something you assimilated. . . .
Yeah, and the answer to "Who's a convict in Australia?" is "All of them".
I have noticed a lot of tech/computer nerds have a significant interest in language nerdery. I've seen /. threads devolve into arguments over correct Latin grammar. This certainly piques the interest of people who have a bit of language nerd in them, because it's as much about knowledge of old writing systems and abbreviations as it is ability to look at squiggly lines and pattern-match.
For MS Works files, just use Libre Office. Heaven knows Microsoft Office is too incompetently made to handle them.
The US tax code documents would seriously challenge Ms. Linda Watson.
Not because you cannot read the actual words, more because you cannot understand their meaning.
Same with most government documents from just about any government.
There are two handwriting styles in German that are pretty much illegible to modern readers. Sütterlin was taught in the '30s and '40s to people who are alive today, but in 20 years, very few people will be able to read it. I can kinda-sorta read it because my grandmother (b. 1898) wrote letters in it, and my father's (b. 1930) handwriting was this weird combination of Sütterlin and American-style Palmer. Kurrent is even older and was taught to German school children up through the early 20th century. Kurrent's letter forms are however closer to Roman-style alphabet than Sütterlin.
My first thought on seeing the headline was about using technology to read ancient manuscripts which may be too fragile to open or may have even been written on recycled even older manuscripts. They use x-rays and computer imaging to read that which cannot be read by the human eye.
I've seen a few stories about this over the years.
Scientists read ancient sealed documents without opening them
MIT and Georgia Tech develop technology to read books without opening them
Scientists Read Ancient Hebrew Scroll Without Opening It
Scanning an Ancient Biblical Text That Humans Fear to Open
There's lots more out there and note those aren't just 4 different links to the same story.
But this story is still interesting to me too. I'm sure that the people doing the work in the linked article might be tasked with transcribing or translating the images of pages they can't actually touch.
The last paragraph of the article.
I can't decide whether AI would be an improvement on the Slashdot editors or if it's already replaced them.
"Once, while talking to a client, she found the perfect way to sum up her skills."
What's that then? Not going to tell us? Have to go to the article to find out? ps. It's not worth it.
Like I give a fuck about some shopping list for a dude two thousand years ago
Some 2,000-year old documents can still be informative reading, e.g. the System 7 Unix source code.
There's also a lot of practiced physical craft. My wife studied at West Dean College in England, a college dedicated to historical preservation and reconstruction. It includes clock making, tapestry weaving, ceramics, books, and metals conservation. The building is *littered* with amazing historical artifacts, with a wall of ancient weapons that made me drool on the carpet, whimpering "want to play!!!" with some of the lovingly restored specimens.
Sadly, the craft is rapidly disappearing. There's a glut of lightly trained people in it, but a dearth of funding to keep people employed long enough to get the 20 years of hands-on skills needed for the most delicate knowledge. A lot of it is hard-won, hard-learned skill from working with hundreds or thousands of less valuable documents over a career, and the senior people refuse to die off. There's going to be a massive purge as they hit forced retirement ages, because they haven't been able to train newer experts. There's been no funding to keep them on staff. If you value books as artistic objects in their own right, as I do, it's enough to make you weep.
Now all we need is someone to decipher Word documents we wrote 2 weeks ago but no longer render properly.
The X-Ray stuff is cool. Multi-spectral photography is also great for stuff damaged by fire. But give me a high-quality digital photo and a Curves tool, and I'll show you things you normally wouldn't see.
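For anyone wondering what a Curves adjustment actually does to faded ink, here is a rough sketch in plain Python. It is not how any particular image editor implements its tool; the function name `apply_curve`, the black/white points, and the sample pixel values are all made up for illustration. The idea is just to remap a narrow band of grey levels onto the full 0-255 range so that small differences become visible.

```python
def apply_curve(pixels, black, white, gamma=1.0):
    """Remap grayscale values so `black` maps to 0 and `white` maps to 255.

    Values outside [black, white] are clipped. `gamma` bends the curve:
    below 1.0 lifts the midtones, above 1.0 darkens them.
    """
    out = []
    for p in pixels:
        t = (p - black) / (white - black)   # normalise to 0..1
        t = min(1.0, max(0.0, t))           # clip out-of-range values
        out.append(round(255 * t ** gamma))
    return out


# Hypothetical scan line: faded ink barely darker than the parchment.
faded = [118, 120, 122, 125, 130]
stretched = apply_curve(faded, black=115, white=135)

print("before:", faded, "range:", max(faded) - min(faded))
print("after: ", stretched, "range:", max(stretched) - min(stretched))
```

The original values span only a dozen grey levels, which the eye reads as a uniform smudge; after the remap the same pixels are spread across most of the tonal range.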
The big problem is not fragility. Parchment, the treated animal skins that make up the pages of almost all European mss between the sixth and thirteenth centuries (and most books from the fourteenth), is so durable that, when people could no longer make sense of the handwriting style, they took the pages and used them to bind paper books.
The problem is that we have hundreds of thousands of books that haven't been properly catalogued, let alone digitized. And all over the world are boxes of fragments taken from book bindings. There are great discoveries to be made.
(This was back in the 1970s and 1980s, when schoolkids were still being taught cursive.) After considerable thought, I concluded that written text was a WORM operation (write-once read-many).
Cursive saved time at the write stage (easier to write), at the cost of additional time at the read stage (harder to read). Since the write operation happened only once while the read operation could happen multiple times, I decided saving time at the write stage was usually not worth it - the cumulative extra time wasted at the read stage could easily exceed the time saved at the write stage. And I began writing exclusively in print letters in the 6th grade.
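The write-once read-many argument comes down to simple arithmetic, sketched below in Python. The timings are invented purely for illustration (nobody measured them), and the helper names `total_time` and `break_even_reads` are hypothetical.

```python
def total_time(write_s, read_s, reads):
    """Total seconds spent on a text that is written once and read `reads` times."""
    return write_s + read_s * reads


def break_even_reads(write_saved_s, read_penalty_s):
    """Number of reads at which the per-read penalty has used up the one-time writing saving."""
    return write_saved_s / read_penalty_s


# Assumed timings for one page, in seconds (illustrative only).
cursive = {"write_s": 60.0, "read_s": 25.0}
printed = {"write_s": 80.0, "read_s": 20.0}

saved = printed["write_s"] - cursive["write_s"]    # time cursive saves when writing
penalty = cursive["read_s"] - printed["read_s"]    # extra time cursive costs per read

print("break-even after", break_even_reads(saved, penalty), "reads")
for n in (1, 4, 10):
    print(n, "reads:",
          "cursive", total_time(reads=n, **cursive),
          "print", total_time(reads=n, **printed))
```

With these made-up numbers cursive wins only if the page is read fewer than four times; past that point every additional read makes print the cheaper choice.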
I have noticed a lot of tech/computer nerds have a significant interest in language nerdery. I've seen /. threads devolve into arguments over correct Latin grammar. This certainly piques the interest of people who have a bit of language nerd in them, because it's as much about knowledge of old writing systems and abbreviations as it is ability to look at squiggly lines and pattern-match.
I wouldn't say so, but nerds do tend to be grammar nazis, at least up until the point in their lives that they stop caring about what others think (usually mid 30's, about the same time you unashamedly start listening to the greatest hits of the 60's, 70's and 80's in your car). However, we have nothing on the kind of pendants that come from old universities like Cambridge and Oxford. If you would like to see a truly vicious argument over a minor point of Latin grammar, computer nerds with an interest in language are severely outclassed (and outnumbered).
Calling someone a "hater" only means you can not rationally rebut their argument.
But how do you know if it's a shopping list or an ancient cure for blue ball?
Wanna buy a shirt?
https://www.redbubble.com/people/stealthfinger/shop?asc=u
Those swinging Oxford pendants.