Slashdot Mirror


Statistics For Data Entry: The Brave New Step

A reader writes: "First there was Dasher, a novel application of statistical theory that lets free text be written using only a pointing device. Dasher works by predicting the continuations of the text being written, based on what has been written so far; there is a probability associated with each offered continuation, and the presentation is designed to make it easier to choose more probable continuations. A big advantage of statistics-based interfaces is that they automatically enforce correctness, because correct strings are more probable than incorrect ones. Now the same approach has been extended to writing maths. Apropos is a JavaScript application (it supports IE6 & Firefox) for creating mathematical expressions. It represents the math using MathML, the official XML spec for mathematics. It is definitely clunky when compared to Dasher, but better than MS Equation Editor etc. It is interesting to consider whether this approach can be extended to other XML vocabularies (for example, a model for HTML that suggests the markup as you go along - a properly trained one will make it harder to create pages with blinking text, loads of images etc.), or to formal languages other than XML (e.g. programming languages). Stochastic modeling can also be used as a basis for speech recognition, with the recognizer using the model to choose a continuation when the speech signal is ambiguous or indistinct."
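
The prediction step described in the summary can be illustrated with a toy character model. This is only a minimal sketch, not Dasher's or Apropos's actual implementation: the function names `train` and `continuations`, the two-character context, and the tiny corpus are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(text, order=2):
    """Count which character follows each context of `order` characters."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        counts[context][text[i + order]] += 1
    return counts


def continuations(counts, written, order=2):
    """Return (character, probability) pairs for the next character,
    most probable first, given the text written so far."""
    context = written[-order:]
    seen = counts.get(context)
    if not seen:
        return []  # context never seen in training: nothing to offer
    total = sum(seen.values())
    return [(ch, n / total) for ch, n in seen.most_common()]


# Hypothetical training text; a real system would use a large corpus.
corpus = "the cat sat on the mat. the dog sat on the log."
model = train(corpus)

for ch, p in continuations(model, "the cat"):
    print(repr(ch), round(p, 2))
```

An interface in this style would give each offered continuation screen space in proportion to its probability, so that likely continuations are the easiest targets to hit with the pointer.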

9 of 121 comments

  1. Re:Like t9 by xabi · · Score: 3, Insightful

    More info on the same Dasher web site here

    --
    Check populicio.us
  2. Re:correctness? by Anonymous Coward · · Score: 2, Insightful

    Or the poster uses the British form of English where, I believe, this is correct usage. Not everyone is a 'Merican, you know.

  3. I'm not hopeful by Anonymous Coward · · Score: 2, Insightful

    Dasher works because there is a small number of words that are likely to follow on from where you are. The same does not apply to MathML or HTML. The most useful you are likely to get is tab-completion for tag names, attribute names, etc.

    1. Re:I'm not hopeful by weierstrass · · Score: 2, Insightful

      What we write is only predictable to the extent that it is redundant: i.e. when I type "tomor" into my mobile phone, if it's obvious to the phone that I'm going to write "tomorrow", I could just send a msg saying "C U tomor".

      It doesn't seem to me that there's anything like as much redundancy in mathematical formulae as there is in written language. When the professor writes "X=..." on the board, it's very hard to predict the next symbol unless you know what X is in fact equal to.

      --
      my password really is 'stinkypants'
  4. Quick test by potifar · · Score: 4, Insightful
    MathML was never really intended for writing by hand, and even if Apropos makes it easier, I can't see myself switching from (La-)TeX anytime soon. I can enter extremely complex mathematical expressions at least 20-30 times faster by typing them in TeX than I ever could do clicking around an interface like Apropos.

    MathML is a good idea in theory, but until there are good tools for writing and editing MathML, there will be very few people using it (either for publishing or for archival purposes.)

    1. Re:Quick test by potifar · · Score: 1, Insightful
      Do you really think
      <apply>
        <int/>
        <bvar>
          <ci> x </ci>
        </bvar>
        <interval>
          <ci> a </ci>
          <ci> b </ci>
        </interval>
        <apply>
          <cos/>
          <ci> x </ci>
        </apply>
      </apply>
      is easier to read than
      $\int_a^b \cos x dx$
      ?
  5. GIGO would be proud by 10am-bedtime · · Score: 2, Insightful

    data integrity starts w/ data entry. when data entry is reduced to "no" vs "yes-for-now-we-can-fix-it-later", the game is lost; GIGO prevails, then.

  6. Ahem by kahei · · Score: 3, Insightful


    "A big advantage of statistics-based interfaces is that they automatically enforce correctness, because correct strings are more probable than incorrect ones."

    In a rigorous, technical environment, being _usually_ correct is not enough and a statistics-based approach to ensuring correctness is not very useful.

    In an informal environment, correctness is not nearly as common as you might hope, so again a statistics-based approach may well not be as good as actually enforcing definite correctness.

    --
    Whence? Hence. Whither? Thither.
  7. Why? by Quixote · · Score: 1, Insightful
    "a properly trained one will make it harder to create pages with blinking text, loads of images etc."

    Why should it? What if I want to create such a page? Why should someone (or something) tell me what to say, or how to say it? And who will "train" such a thing? The Government??