Slashdot Mirror


Statistics For Data Entry: The Brave New Step

A reader writes:"First there was Dasher, a novel application of statistical theory that lets free texts be written using only a pointing device. Dasher works by predicting the continuations of the text being written, based on what has been written so far; there is a probability associated with each offered continuation and the presentation is designed to make it easier to choose more probable continuations. A big advantage of statistics-based interfaces is that they automatically enforce correctness, because correct strings are more probable than incorrect ones. Now the same approach has been extended to writing maths. Apropos is a Javascript application (it supports IE6 & Firefox) to create mathematical expressions. It represents the math using MathML, the official XML spec for mathematics. It is definitely clunky when compared to Dasher, but better than MS Equation Editor etc. It is interesting to consider if this approach can be extended to other XML vocabularies (for example, a model for HTML that suggests the markup as you go along - a properly trained one will make it harder to create pages with blinking text, loads of images etc.), or to formal languages other than XML (e.g. programming languages). Stochastic modeling can also be used as a basis for speech recognition, with the recognizer using the model to choose a continuation when the speech signal is ambiguous or indistinct."

1 of 121 comments (clear)

  1. Why this isn't the same as T9. by pkhuong · · Score: 4, Informative

    Having used both dasher and T9, it seems to me that t9 only takes into account the keystrokes entered for each word. It then correlates them to a dictionary. Dasher, on the other hand, is based on markov chains (yes, like those word/text generators), and thus takes into account the last [n] characters. That makes it much more accurate, and, interestingly enough, should make it particularly well-suited to editing programs in most mainstream languages, since they have a lot of noise words and frequently used sequences.

    --
    Try Corewar @ www.koth.org - rec.games.corewar