Slashdot Mirror


The Future of SiLo's Language Library

i4u writes "Early this morning I had a chance to speak with Ase (pronounced 'Ace') Deliri, curator of SiLo, the world's first digital language library. At its core, SiLo is a mash of Wikipedia and Babelfish, an open database focused on facilitating real conversations with real people. 'If you have 800-1200 words in your vocabulary, you can carry on a daily conversation. That is what we are looking at. How do you get a conversation going?'"

4 of 38 comments

  1. phrase book by currently_awake · Score: 4, Interesting

    I think the secret to a universal translator is to have a single, perfectly defined artificial language and then work out how to convert each natural language into it. Because all the translation work aims at one fixed target, you only have to handle each language once, instead of English to Spanish, English to French, English to Arabic, French to English, French to Spanish, French to etc. When converting out of the common language into your chosen language, you also need to track what you know and what you don't. Some languages mark words for gender, some indicate the marital status of women, and others do neither, so that information will be missing when the source language doesn't carry it. You'll have to mark the translations to declare when some aspect the target language requires isn't known. The common language would be incredibly complex (a superset of all languages), but since nobody would use it directly that wouldn't matter.
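    A rough sketch of the arithmetic behind this argument, plus the "mark what isn't known" step. The function names and the "UNKNOWN" marker are made up for illustration; nothing here comes from SiLo or any real translation system.

```python
def direct_translators(n_languages: int) -> int:
    """Translators needed when every ordered language pair is handled separately."""
    return n_languages * (n_languages - 1)


def pivot_translators(n_languages: int) -> int:
    """Converters needed with a shared pivot language: one into it and one out of it per language."""
    return 2 * n_languages


def mark_unknown(features: dict, required: list) -> dict:
    """Fill every feature the target language requires, flagging the ones the source never supplied.

    "UNKNOWN" is a hypothetical marker chosen for this sketch.
    """
    return {name: features.get(name, "UNKNOWN") for name in required}


# The four languages named in the comment: English, Spanish, French, Arabic.
print(direct_translators(4), pivot_translators(4))

# A source phrase that carries grammatical number but no gender,
# headed for a target language that needs both.
print(mark_unknown({"number": "singular"}, ["number", "gender"]))
```

    With only three languages the two counts are equal (6 each); the pivot approach only pulls ahead as more languages are added, e.g. 20 converters instead of 90 for ten languages.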

    1. Re:phrase book by FatLittleMonkey · Score: 3, Informative

      This has been tried. I believe that many early (crap) machine translation systems were based on that. Apparently it doesn't work. The super-language devolves into a database of one-to-one exceptions so quickly that you might as well treat each language pair separately.

      (In the same way that human-readable programming languages always end up as just plain programming languages.)

      --
      Science is all about firing a drunk pig out of a cannon just to see what happens.
    2. Re:phrase book by penguinchris · Score: 2

      It seems they're sort of attempting that. They give the example of a contributed translated phrase (not word, a whole phrase, so grammar gets parsed and translated correctly) from Zulu to English. Then, someone translates the same phrase from English to Chinese. A Zulu-speaking user could then look up the Zulu phrase, and would find translations for both English and Chinese.

      Will there be things lost in translation? Of course. But I think it's not a bad idea, because it's relying on human translations, not machine translations. If you run a phrase through Babelfish or Google Translate multiple times, it turns into gibberish (and for many languages just one pass is enough). But if you pass it along a line of human translators, you'll get something usable. Like a game of telephone or Chinese whispers, it will surely get modified along the way, ultimately to a point where much of the meaning is distorted or lost. But if there were checks in place to limit how far the chain can go (it would be a web, anyway, not a chain), it would be quite good.
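      A minimal sketch of that kind of check, assuming the contributed translations form an undirected web of phrases and that "how far the chain can go" means a cap on the number of human translation steps. The data layout, function names, and sample phrases are assumptions for illustration, not how SiLo actually stores anything.

```python
from collections import deque


def add_translation(web: dict, a: tuple, b: tuple) -> None:
    """Record that phrase a and phrase b were contributed as translations of each other."""
    web.setdefault(a, set()).add(b)
    web.setdefault(b, set()).add(a)


def reachable(web: dict, start: tuple, max_hops: int) -> dict:
    """Breadth-first search: map each phrase within max_hops translation steps of start to its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not follow the chain past the limit
        for neighbour in web.get(current, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[current] + 1
                queue.append(neighbour)
    del hops[start]  # the phrase itself is not a translation
    return hops


# Phrases are (language, text) pairs; the entries are illustrative.
zulu = ("Zulu", "Sawubona")
english = ("English", "Hello")
chinese = ("Chinese", "你好")

web = {}
add_translation(web, zulu, english)     # one contributor: Zulu <-> English
add_translation(web, english, chinese)  # another contributor: English <-> Chinese

print(reachable(web, zulu, max_hops=2))  # English at 1 hop, Chinese at 2
print(reachable(web, zulu, max_hops=1))  # only the direct translation
```

      The hop count doubles as a crude trust signal: a lookup could show one-hop results first and flag anything that went through an intermediate language.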

      The problem I see with it is that the interface is awful: it's entirely Flash, and not a particularly good interface even for Flash. If you're relying on volunteer contributors, you've got to make it easy for them, and Flash does not do that, especially if your volunteers for some languages are sitting in a third-world country with a decade-old computer and a dial-up connection. It doesn't even have to be that extreme an example for a badly done Flash website to be a big impediment: plenty of people still have problems with Flash on Linux, and lots of people around the world use Linux.

      Also, there are several options for English, along with different dialects. The top-level options are English (Asian), English (British), English (Caribbean), and English (Controlled), and then each one has different dialects to choose from. Controlled English? There's no explanation of these options. What should I list my contributed translation as for what I'd consider "standard" western international English? I think it goes under one of the "dialects" of Controlled English, but I'm not sure.

      Also, there hardly seems to be anything available. Most languages are blank databases at the moment, apparently.

  2. Re: controlled English by neonsignal · Score: 2

    Controlled English probably refers to the subset of English, designed as a formal language, that was developed at the University of Zurich.