Slashdot Mirror


Human and Machine Readable Handwritten Language?

darrint writes "In some obscure corner of the Earth, has someone developed a human handwritten language which can be easily read by a machine? Why is the visual divide between what can be written by a human and what can be read by a machine so wide? At one extreme is the bar code, which I certainly cannot write by hand. Machines can read it easily. Bank checks have human-readable account and routing numbers printed in special ink running along their bottom margins. These numbers can be read by a machine and are clearly legible to a human, but I doubt I could write them for input to a machine. My old Palm handheld could read something like handwriting in its little box. OCR exists but I've never thought of it as reliable. I would like to dash off little notes on stickies or in a tiny spiral notebook and be able to suck them into vim, a browser text-input box, and so forth. Perhaps I'd have to learn some kind of machine-readable 'shorthand.' Has it been done?"

13 of 119 comments

  1. I believe it has been done by the_other_one · · Score: 4, Funny

    I'm old enough to have filled in punch cards with a pencil. Does that count?

    --
    134340: I am not a number. I am a free planet!
  2. Recognition by reldruH · · Score: 3, Insightful

    I don't think that a lot of effort has been made to develop a different language for people to communicate with machines. I think most of the research time in that area is spent on improving handwriting recognition, i.e. changing what machines do rather than changing what we do.

    --
    I've always pictured the color of OS zealotry as a sort of bright flamingo pinkish hue
  3. Sure, it's... by Aladrin · · Score: 3, Insightful

    Sure, it's called... THE ALPHABET.

    Learn to write it neatly and the computer will have no problem reading it. Or humans either, for that matter. Write it poorly and both will have a hard time.

    --
    "If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
  4. Ideal handwriting style by philgross · · Score: 4, Informative

    Most of the responses seem to be missing the point of the post.

    OCR/handwriting recognition folks: what would the ideal handwriting for machine readability look like? Could simple variations on standard English cursive or printing approach 100% recognizability, or would the ideal have to be synthesized, like shorthand, and if so, what characteristics would such a script have?

    1. Re:Ideal handwriting style by gameforge · · Score: 4, Insightful

      Most of the responses seem to be missing the point of the post.

      Okay. I'm attacking the point of the post.

      There's no reason to reinvent the alphabet any more than there is to reinvent the wheel.

      If we change the alphabet so machines can read it, other people stop being able to read it. It's the wrong solution for the problem.

      If my handwriting is good enough that I can read it two weeks later, and my peers and friends and family can read it perfectly (I've been told I have particularly good handwriting), then why should I have to change it so that my PC can understand it, but nobody else can?

      I could memorize a second alphabet, having one for me and one for my PC... but why?

      If I could tell the software "This is how I write a 'k' and this is how I write an 'R'", that would improve things a lot IMO. My 'k' might look like someone else's 'R'; but my 'k' and 'R' look absolutely nothing alike. My ampersand kind of looks like a plus sign; but it's totally distinguishable from my plus sign. If I could get that across to the software...
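
      One way to picture "telling the software how I write a 'k'" is a per-writer table of samples with nearest-sample matching. The sketch below is only an illustration under stated assumptions: the `WriterProfile` class, its `enroll` and `classify` methods, and the three-number feature vectors are all hypothetical, not how any particular recognizer works.

```python
import math

# Hypothetical per-writer recognizer: each writer enrolls their own samples,
# so one person's 'k' never competes with another person's 'R'.
class WriterProfile:
    def __init__(self):
        self.templates = {}  # label -> list of feature vectors

    def enroll(self, label, features):
        """Store one sample of how this writer draws `label`."""
        self.templates.setdefault(label, []).append(tuple(features))

    def classify(self, features):
        """Return the enrolled label whose closest sample is nearest."""
        if not self.templates:
            raise ValueError("no samples enrolled")
        best_label, best_dist = None, math.inf
        for label, samples in self.templates.items():
            for sample in samples:
                dist = math.dist(sample, features)  # Euclidean distance
                if dist < best_dist:
                    best_label, best_dist = label, dist
        return best_label

# Made-up features (say: stroke count, aspect ratio, loop count).
me = WriterProfile()
me.enroll("k", (3, 1.6, 0))
me.enroll("R", (2, 1.4, 1))
me.enroll("&", (1, 1.2, 2))
me.enroll("+", (2, 1.0, 0))
guess = me.classify((3, 1.5, 0))
```

      Because the table belongs to one writer, an ampersand that resembles somebody else's plus sign is only ever compared against this writer's own plus sign.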

  5. this page intentionally left blank by blhack · · Score: 3, Interesting

    The problem with a machine readable, human writable language is that humans aren't neat enough. When I write the letter R it looks one way, which is different from how my sister writes it, or my friend, or my butler (okay, I don't have a butler... but a kid can dream!).

    If someone were to develop a language that was machine readable, human writable, it would probably consist of a series of straight lines. Letters would have to be larger, but lines are probably the way to go.

    |_|__|-__-__-_||_|__

    ^like that.
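
    As a rough illustration of how a straight-line script could carry text: three stroke shapes taken three at a time give 3**3 = 27 codes, which is enough for 26 letters plus a space. The mapping below is an arbitrary assumption made for the sketch, not an existing writing system, and the `encode` and `decode` names are hypothetical.

```python
from itertools import product
import string

STROKES = "|_-"  # the three stroke shapes from the example line above
SYMBOLS = string.ascii_lowercase + " "  # 27 symbols, one per 3-stroke code

# Give every symbol a fixed-length code of three strokes.
CODES = ["".join(p) for p in product(STROKES, repeat=3)]
ENCODE = dict(zip(SYMBOLS, CODES))
DECODE = {code: sym for sym, code in ENCODE.items()}

def encode(text):
    """Turn letters and spaces into strokes (other characters raise KeyError)."""
    return "".join(ENCODE[ch] for ch in text.lower())

def decode(strokes):
    """Read strokes back three at a time."""
    if len(strokes) % 3:
        raise ValueError("stroke count must be a multiple of 3")
    return "".join(DECODE[strokes[i:i + 3]] for i in range(0, len(strokes), 3))

written = encode("hello world")
recovered = decode(written)
```

    A fixed length keeps letter boundaries unambiguous without any separator, at the cost of three strokes for every letter, which fits the remark that such letters would have to be larger.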

    --
    NewslilySocial News. No lolcats allowed.
  6. PDAs cheat by r00t · · Score: 4, Insightful

    They don't read from paper. They can get extra info:

    * pressure
    * speed
    * stroke order
    * stroke direction
    * pen-up and pen-down events
    * timing
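
    A minimal sketch of what that extra information might look like as data, assuming a digitizer that reports position, time and pressure for each sample. The `PenSample` type and `stroke_features` function are hypothetical names for illustration, not any device's actual API.

```python
import math
from dataclasses import dataclass

@dataclass
class PenSample:
    x: float
    y: float
    t: float         # seconds since the pen first touched down
    pressure: float  # 0.0 to 1.0, as a digitizer might report it

def stroke_features(samples):
    """Summarize one stroke, i.e. the samples between pen-down and pen-up."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    first, last = samples[0], samples[-1]
    # Path length along the stroke, not just start-to-end distance.
    length = sum(
        math.hypot(b.x - a.x, b.y - a.y)
        for a, b in zip(samples, samples[1:])
    )
    duration = last.t - first.t
    return {
        "direction_deg": math.degrees(
            math.atan2(last.y - first.y, last.x - first.x)
        ),
        "speed": length / duration if duration > 0 else 0.0,
        "mean_pressure": sum(s.pressure for s in samples) / len(samples),
        "duration": duration,
    }

# Two strokes listed in the order they were drawn: a downstroke, then a crossbar.
downstroke = [
    PenSample(0, 10, 0.00, 0.4),
    PenSample(0, 5, 0.05, 0.7),
    PenSample(0, 0, 0.10, 0.5),
]
crossbar = [PenSample(-2, 5, 0.30, 0.3), PenSample(2, 5, 0.36, 0.3)]
features = [stroke_features(s) for s in (downstroke, crossbar)]
```

    A scanned page keeps only the final ink, so the list order (stroke order), the gap between strokes (pen-up time) and the per-stroke direction are exactly the parts that paper OCR has to do without.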

    1. Re:PDAs cheat by DingerX · · Score: 3, Informative

      That's how humans have read handwriting for most of the papyrus/parchment/paper era.

      The problem now is that we're used to reading print. One of the main principles of palaeography is that you read the motions of the pen (or other writing tool) in the medium. Ink in particular is great for this sort of expression, because you can (especially with a flat nib) express all sorts of motions; and using a variety of analytical tools, you can reconstruct missed strokes, damage to the medium, overlapping words and the rest. Some of those analytical tools are, of course, analysis of the linguistic context. And that same context lets us get really fancy with our handwriting. For example, if something logically follows, I don't need to waste my time writing it out clearly.

      To muddy the waters further, no two people use the same handwriting. Even in contexts where the formation of letters is strictly determined, everybody has their individual variations, especially in pressure, speed, stroke order, stroke direction, and lifting the pen. They also vary in how they form the letters.

      So yeah, you can probably get decent success using handwriting OCR on things like addresses and bank account numbers -- because you've got a known context, and are basically looking for key numbers.

      And I'm sure there's decent software recognition out there. But to get something that reads human script -- even a forced "machine-friendly" hand -- takes a lot of work, and a lot of training in areas that machines are not good at. You'd need a pretty big neural net.

  7. OCR Reliability by N3Bruce · · Score: 4, Interesting

    The account information line printed at the bottom of a typical credit card statement or utility bill is printed in a font known as OCR-A. Equipment for machine reading this type of font has been around for over 25 years, such as some of the old Banctec 4300 series workstations used for processing bill payments and checks. Even these 1970s era machines had better than a 95 percent read rate of the entire account information line, provided that the printing was clear and properly placed. Later machines, such as the NCR 7780 or the OPEX Eagle, can have better than a 99 percent read rate of a full line of characters. Again, the usual limitations on reliability of OCR characters are a result of poor or mislocated printing, or stray marks in the OCR field. Here is the obligatory Wikipedia link if you are interested in finding out a bit more about the history of Optical Character Recognition.

    MICR fonts, which are those funny looking numbers printed in magnetic ink at the bottom of most checks, are designed to be human recognizable but machine readable, and have been around since the '60s. OCR-A typically beats MICR today, but a good MICR line is still readable over 95 percent of the time.

    Handwritten fonts are the most difficult to read. The technology to read handwritten numbers and letters has been available for over 10 years, but typical read rates for something like a handwritten zip code or the numerical amount written on a check range from 60 to 80 percent, and are slowly getting better. Again, a lot depends on how much care is taken when writing out the text, and what kind of background clutter is present.
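
    To put field-level read rates in perspective, here is a back-of-the-envelope sketch. It assumes that each character is misread independently, which is a simplification, and the 20-character length for a printed account line is an assumed figure, not one taken from any machine's specification. The function names are hypothetical.

```python
def per_char_accuracy(field_rate, length):
    """Per-character accuracy implied by a whole-field read rate.

    Simplifying assumption: each character is misread independently.
    """
    if not 0 < field_rate <= 1 or length < 1:
        raise ValueError("field_rate must be in (0, 1] and length >= 1")
    return field_rate ** (1 / length)

def field_read_rate(char_accuracy, length):
    """Chance that every character in a field of `length` is read correctly."""
    return char_accuracy ** length

# A 5-digit zip code read correctly 60 to 80 percent of the time.
zip_low = per_char_accuracy(0.60, 5)
zip_high = per_char_accuracy(0.80, 5)

# A printed account line; the 20-character length is an assumed figure.
ocr_line = per_char_accuracy(0.95, 20)

print(f"handwritten digit accuracy: {zip_low:.3f} to {zip_high:.3f}")
print(f"printed character accuracy: {ocr_line:.4f}")
```

    Under these assumptions a 60 percent read rate on five digits corresponds to roughly 90 percent per digit: one bad character fails the whole field, so field-level rates look much worse than the per-character accuracy behind them.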

    As for me, I typed out school reports in 8th grade in 1973, when our family's word processing hardware consisted of a 1940's vintage Underwood typewriter. Even humans had difficulty decoding my handwriting!

    1. Re:OCR Reliability by Inda · · Score: 3, Informative

      Being an ex-postman (survived one month!), I've seen the automatic sorting machines that read hand-written postcodes (zip codes in the US). I forget how many letters the machine sorted a minute (it was between 500 and 1000), but I do remember the 90% accuracy number that was boasted. The machine 'cheated' in some respects because it only had to read a 6 or 7 character postcode, of which there are only a small number of combinations. The machine also checked the county and city if it needed clarification.
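
      A small sketch of why a limited set of valid codes helps: rather than trusting the top guess for each character, choose the valid code that best explains all the guesses together. The four-entry `VALID_POSTCODES` list, the confidence numbers and the `best_postcode` function are illustrative assumptions, not a description of the real sorting machine.

```python
import math

# Illustrative sample only; a real sorter would hold the full directory.
VALID_POSTCODES = ["SW1A1AA", "SW1A2AA", "EC1A1BB", "W1A0AX"]

def best_postcode(candidates, valid=VALID_POSTCODES, floor=1e-3):
    """Pick the valid code that best explains per-character guesses.

    `candidates` holds one dict per written character, mapping each plausible
    character to the recognizer's confidence in it.
    """
    best, best_score = None, -math.inf
    for code in valid:
        if len(code) != len(candidates):
            continue
        # Sum of log-confidences; characters never proposed get a small floor.
        score = sum(
            math.log(pos.get(ch, floor)) for pos, ch in zip(candidates, code)
        )
        if score > best_score:
            best, best_score = code, score
    return best

# The first character looks more like '5' than 'S', and the fifth is unclear.
candidates = [
    {"5": 0.6, "S": 0.4},
    {"W": 0.9},
    {"1": 0.8, "I": 0.2},
    {"A": 0.9},
    {"1": 0.55, "2": 0.45},
    {"A": 0.9},
    {"A": 0.9},
]
raw = "".join(max(pos, key=pos.get) for pos in candidates)
fixed = best_postcode(candidates)
```

      Here the character-by-character reading in `raw` starts with a '5' and is not a valid code at all, while the constrained choice in `fixed` is. Checking the city or county, as described above, would shrink the list of valid codes further.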

      Any postcodes that could not be read (dark paper, red ink, etc.) were scanned and transmitted to a postal worker drone in another part of the country who would type in the postcode from their terminal. The machine would receive the code back a few seconds later and the letter would carry on its journey.

      I was impressed.

      --
      This post contains benzene, nitrosamines, formaldehyde and hydrogen cyanide.
  8. Re:Uh.... by MobileTatsu-NJG · · Score: 3, Informative

    "Most of them work by guessing what you wrote based on a dictionary (similar to cellphone texting). Give it anything it can't look up and it'll be close, but more often than not, not quite."

    Depends on what you have it set to. My TabletPC is set to read one character at a time. It provides little spaces to write each character in, so you don't have to worry about spacing or anything. That's been my favorite, honestly.
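
    For contrast, the dictionary-guessing mode quoted above can be sketched with Python's standard `difflib` module. The toy word list, the `snap_to_dictionary` name and the 0.7 cutoff are assumptions chosen for illustration; actual recognizers are likely more elaborate.

```python
import difflib

# Toy dictionary; a real one would hold many thousands of words.
WORDS = ["handwriting", "machine", "readable", "shorthand", "notebook"]

def snap_to_dictionary(raw, words=WORDS, cutoff=0.7):
    """Return the closest dictionary word, or the raw text if none is close."""
    matches = difflib.get_close_matches(raw.lower(), words, n=1, cutoff=cutoff)
    return matches[0] if matches else raw

for raw in ("shorthnd", "rnachine", "vim"):
    print(raw, "->", snap_to_dictionary(raw))
```

    A word that is not in the list, such as "vim", gets no help from the dictionary, which is where a character-at-a-time mode has the advantage.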

    --

    "I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)

  9. Re:I believe it has been done by shadow+demon · · Score: 4, Interesting

    Somewhat off topic, but there was a certain language that functioned like what you described, just not with numbers. It is called aUI (with that capitalization) and was created in the 50s by Prof. John Weilgart, a (bored) psychologist. The language is composed of 42 very simple ideographic "letters" that each have both a meaning and a set pronunciation. The letters combine to form concepts that can be as simple or as complex as you want to make them, and the latest edition of his book (1979) has a dictionary of over 4000 words. It was made so that only the most general concepts (plus the numbers 0-10) would be classified as single letters, and I think this system works very well. I really suggest you check it out if you have any interest in languages or communication, but the information available online is somewhat limited. I was able to get his book, aUI, the Language of Space, through an interlibrary loan, but I am pretty sure it is long out of print. I really think this language has a much greater chance of being useful than anything based on numbers, and since it only uses very basic shapes (e.g. number shapes, a spiral, circle, oval, etc.) it could probably be recognized pretty easily by OCR systems, probably as well as or better than current print-letter recognition.

  10. Obfuscated handwriting system by jafuser · · Score: 3, Interesting

    I made a handwriting system a long time ago with the following goals in mind in designing it:

    1. It should NOT be easily readable by a casual observer (for notes I didn't want other people to read).
    2. The most commonly used letters should be the simplest to draw, so it should be fairly fast to write, like cursive.
    3. Letters should be as unambiguous as possible, so even the most scribbled/hurried writing would be distinctly recognizable.
    4. Each letter should try to hint to the original Latin letter to some degree, whenever possible. Although goal #2 usually would take priority over this one when in conflict.
    5. A clear mid-height horizontal line should mark the beginning/end of each letter.
    6. (just for fun) It should look kinda weird and cool in a sci-fi sort of way, so if someone came across my notes they would be kind of baffled =)
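
    Goal #2 can be sketched in a few lines: count how often each letter occurs in some sample text, then hand the cheapest glyphs to the most frequent letters. The glyph costs below (four one-stroke shapes, ten two-stroke, twelve three-stroke) and both function names are made-up assumptions, not the actual system described here.

```python
from collections import Counter

def assign_stroke_counts(sample_text, glyph_costs):
    """Give the cheapest glyphs to the most frequent letters in sample_text."""
    counts = Counter(ch for ch in sample_text.lower() if ch.isalpha())
    ranked = [letter for letter, _ in counts.most_common()]
    costs = sorted(glyph_costs)
    if len(costs) < len(ranked):
        raise ValueError("not enough glyphs for every letter")
    return dict(zip(ranked, costs))

def total_strokes(text, assignment):
    """Strokes needed to write `text`, skipping characters without a glyph."""
    return sum(assignment[ch] for ch in text.lower() if ch in assignment)

sample = "the quick brown fox jumps over the lazy dog"
glyph_costs = [1] * 4 + [2] * 10 + [3] * 12  # assumed stroke counts per glyph

by_frequency = assign_stroke_counts(sample, glyph_costs)

# Worst case for comparison: the same glyphs handed out in the opposite order.
letters = list(by_frequency)
worst_case = dict(zip(letters, sorted(glyph_costs, reverse=True)))

saved = total_strokes(sample, worst_case) - total_strokes(sample, by_frequency)
```

    A pangram is a poor stand-in for real letter frequencies, so the saving in `saved` is small here; counting letters in a large body of ordinary text would give a more meaningful ranking.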

    While #2 and #3 might work towards making this an easy-to-OCR handwriting system, #1 and #6 probably make it moot, at least for the system I made. However, I imagine it wouldn't be too hard to make a less-obfuscated, more-practical writing system that tries to accomplish goals similar to #2-4 above.

    I made a font out of my handwriting system a few years ago. If anyone is curious, here is an image chart of the font. =)

    I'm curious what other more "efficient" writing systems may exist out there (other than standard and cursive). Does anyone know of any others?

    --
    Please consider making an automatic monthly recurring donation to the EFF