Slashdot Mirror


The Future of SiLo's Language Library

i4u writes "Early this morning I had a chance to speak with Ase (pronounced 'Ace') Deliri, curator of SiLo, the world's first digital language library. At its core, SiLo is a mash of Wikipedia and Babelfish, an open database focused on facilitating real conversations with real people. 'If you have 800-1200 words in your vocabulary, you can carry on a daily conversation. That is what we are looking at. How do you get a conversation going?'"

38 comments

  1. Grammar? by Anonymous Coward · · Score: 1

    If you have 800-1200 words in your vocabulary, you can carry on a daily conversation.

    Vocabulary's great, but it's not enough. You also need to know something about how to put those words together. You need to know morphology and syntax.

    1. Re:Grammar? by toastar · · Score: 1
    2. Re:Grammar? by blair1q · · Score: 1

      Did you understand Yoda?

      Grammar is not essential for communication, unless you need to get logical about something.

    3. Re:Grammar? by Anonymous Coward · · Score: 0

      Did you understand Yoda?

      Yoda's lines were written to be understood. They don't "lack" grammar.

    4. Re:Grammar? by digitig · · Score: 1

      Did you understand Yoda?

      Grammar is not essential for communication, unless you need to get logical about something.

      Good luck trying to explain that you've already been to the shops in Chinese (which doesn't strictly have a past tense) or explaining who hit who in German (where it's not the word order that matters but the grammatical way the words change).

      --
      Quidnam Latine loqui modo coepi?
    5. Re:Grammar? by Gibbs-Duhem · · Score: 1

      I dunno, I was able to communicate reasonably effectively with nothing more than some nouns, verbs, and prepositions. Of course, word order doesn't matter much in Japanese either, but just making gestures while saying a noun is often enough to get across a simple concept. Granted, I couldn't hold a conversation about why someone hates their boss at work, but I could definitely ask for directions, purchase things, ask what things were, and managed to muddle through ordering a pretty complicated train ticket with someone who knew zero English...

      Yeah, you'll sound like an idiot, but knowing the word for "rice" and making eating gestures goes a long way.

    6. Re:Grammar? by Anonymous Coward · · Score: 0

      "Bob", "Tony", and "stole". Who should the police arrest? Grammar contains meaning just as vocabulary does.

    7. Re:Grammar? by FatLittleMonkey · · Score: 1

      "Bob", "Tony", and "stole". Who should the police arrest? Grammar contains meaning just as vocabulary does.

      Compound words can replace most grammar. (Bob-acting, Tony-from, stole. Bob-acting, stole, Tony-from. Tony-from, Bob-acting, stole.)

      --
      Science is all about firing a drunk pig out of a cannon just to see what happens.
    8. Re:Grammar? by Anonymous Coward · · Score: 0

      Using proper grammar, Yoda is. Quite odd however, it seems. Accustomed to the subject first, you are.

    9. Re:Grammar? by blair1q · · Score: 1

      They have an alternate grammar. Might as well be random grammar. You still understood it.

  2. Re:First Trout by Anonymous Coward · · Score: 0

    ANUX -- A full Linux distribution... Up your ass!

    See, this is how ignorant you are! ANUX is an IBM version of Unix!

    Loser!

  3. That's simple... by arf_barf · · Score: 1

    Alcohol!

  4. Re:First Trout by Anonymous Coward · · Score: 0

    Linux powers my anal vibrator you insensitive clod.

  5. wow by SquirrelDeth · · Score: 1

    The porpoise of trolling is to waste other people's time, not your time; you are doing something wrong. A good troll can do it in one phrase, a sentence tops. You, sir, suck.

    1. Re:wow by Anonymous Coward · · Score: 0

      Apparently you've never heard of copy and paste.

    2. Re:wow by Anonymous Coward · · Score: 0

      I love abstracts personified as animals. The dove of peace, the bluebird of happiness, the porpoise of trolling...

    3. Re:wow by Anonymous Coward · · Score: 0

      I only use copy/paste for apt-get or zypper

  6. Last time I used SILO by Anonymous Coward · · Score: 0

    ... I didn't remember having that many words. It was just enough to boot the Linux kernel on my SPARC systems.

  7. You only need about 120 words by tepples · · Score: 1

    Toki Pona is a constructed language with only 120-odd words plus a ton of idiomatic compounds.

    1. Re:You only need about 120 words by Anonymous Coward · · Score: 1

      How can you offer a conlang as evidence that you only need 120 words? Nobody uses a conlang for day-to-day communication. A more intelligent example would be a creole such as Tok Pisin.

  8. How do you get a conversation going? by Anonymous Coward · · Score: 0

    How do you get a conversation going?

    That's easy: "asl?"

  9. five words by Anonymous Coward · · Score: 0

    You only need five words to start a conversation:
    "So, you come here often?"
    Everything after that is all smiles and nods.

    1. Re:five words by Gibbs-Duhem · · Score: 1

      Also, asking the bartender how to pick up the ladies is usually an entertaining conversation starter.

  10. phrase book by currently_awake · · Score: 4, Interesting

    I think the secret to a universal translator is to have a single perfectly defined artificial language and then to work out how to convert your desired language into that. Because all the translation work targets a single fixed language, you only have to translate each language once instead of English to Spanish, English to French, English to Arabic, French to English, French to Spanish, French to etc. When converting into your chosen language you also need to track what you know. Some languages give words a gender, some mark the marital status of women, etc., but others don't, so that information will be missing. You'll have to alter/mark the translations to declare when some aspect of the target language isn't known. The common language would be incredibly complex (a superset of all languages), but since nobody would use it directly that wouldn't matter.
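
The saving described above can be put in numbers. This is only a rough sketch: it assumes one converter per direction, so a direct scheme needs one for every ordered pair of languages, while the common-language scheme needs one into and one out of the common language per language. The function names are made up for illustration.

```python
def direct_translators(n_languages: int) -> int:
    """Converters needed when every ordered pair (e.g. English->Spanish
    and Spanish->English) gets its own converter."""
    return n_languages * (n_languages - 1)


def pivot_translators(n_languages: int) -> int:
    """Converters needed with a common language: one into it and one
    out of it for each natural language (an assumption of this sketch)."""
    return 2 * n_languages


if __name__ == "__main__":
    for n in (4, 10, 100):
        print(n, direct_translators(n), pivot_translators(n))
```

With English, Spanish, French and Arabic that is 12 direct converters against 8 through the common language, and the gap grows quadratically as languages are added.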

    1. Re:phrase book by FatLittleMonkey · · Score: 3, Informative

      This has been tried. I believe that many early (crap) machine translation systems were based on that. Apparently it doesn't work. The super-language devolves into a database of one-to-one exceptions so quickly that you might as well treat each language pair separately.

      (In the same way that human-readable programming languages always end up as just plain programming languages.)

      --
      Science is all about firing a drunk pig out of a cannon just to see what happens.
    2. Re:phrase book by penguinchris · · Score: 2

      It seems they're sort of attempting that. They give the example of a contributed translated phrase (not word, a whole phrase, so grammar gets parsed and translated correctly) from Zulu to English. Then, someone translates the same phrase from English to Chinese. A Zulu-speaking user could then look up the Zulu phrase, and would find translations for both English and Chinese.

      Will there be things lost in translation... of course. But I think it's not a bad idea, because it's relying on human translations, not machine translations. If you run a phrase through Babelfish or Google Translate multiple times, it turns into gibberish (and for many languages just one pass is enough to turn it into gibberish). But if you pass it along a line of human translators, you'll get something usable. Like a game of telephone or Chinese whispers, it will surely get modified along the way, ultimately to a point where much of the meaning is distorted or lost. But if there were checks in place to try to limit how far the chain can go (it would be a web, anyway, not a chain) it would be quite good.
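
One way to picture the "limit how far the chain can go" check is a breadth-first walk over the web of contributed translations that stops after a set number of hops. This is a guess at the idea, not how SiLo actually works: the `links` structure, the function name, and the sample phrases are all made up for illustration.

```python
from collections import deque


def reachable_translations(links, start, max_hops):
    """Walk the web of human translations breadth-first.

    links maps a (language, phrase) pair to the pairs it was translated
    into. Returns {(language, phrase): hops} for everything reachable
    from start within max_hops translation steps.
    """
    hops_to = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        hops = hops_to[node]
        if hops == max_hops:
            continue  # chain is already as long as we allow
        for neighbour in links.get(node, []):
            if neighbour not in hops_to:
                hops_to[neighbour] = hops + 1
                queue.append(neighbour)
    del hops_to[start]  # the starting phrase is not a translation of itself
    return hops_to


# Hypothetical contributions: Zulu -> English, then English -> Chinese.
links = {
    ("zu", "Sawubona"): [("en", "Hello")],
    ("en", "Hello"): [("zh", "你好")],
}

if __name__ == "__main__":
    print(reachable_translations(links, ("zu", "Sawubona"), max_hops=2))
```

With a limit of two hops the Zulu speaker sees both the English and the Chinese phrase; with a limit of one, only the directly contributed English one.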

      The problem I see with it is that the interface is awful - it's entirely Flash, and not a particularly good interface even for Flash - and if you're relying on volunteer contributors, you've got to make it easy for them. And Flash does not do that. Especially if your volunteers for some languages are sitting in a third-world country with a decade-old computer and a dialup connection - and it doesn't even have to be that extreme an example for a badly done Flash website to be a big impediment - plenty of people still have problems with Flash on Linux, and lots of people around the world use Linux.

      Also, there are several options for English, along with different dialects. The top-level options are English (Asian), English (British), English (Caribbean), and English (Controlled), and then each one has different dialects to choose from. Controlled English? There's no explanation of these options. What should I list my contributed translation as for what I'd consider "standard" western international English? I think it goes under one of the "dialects" of Controlled English, but I'm not sure.

      Also, there seems to hardly be anything available. Most languages are blank databases at the moment, apparently.

    3. Re:phrase book by penguinchris · · Score: 1

      I have to correct myself, it's not Flash. It's really, really awful JavaScript. It felt like a badly done Flash interface. I think in its current state it's probably worse than Flash would be, actually.

  11. Mon aéroglisseur est plein des anguilles. by RogueWarrior65 · · Score: 1

    That's always a good conversation starter.

    1. Re:Mon aéroglisseur est plein des anguilles. by Anonymous Coward · · Score: 0

      Really, why not snakes?

    2. Re:Mon aéroglisseur est plein des anguilles. by FatLittleMonkey · · Score: 1

      Mon avion est plein de la mère putain de serpents?

      --
      Science is all about firing a drunk pig out of a cannon just to see what happens.
  12. OMG by Anonymous Coward · · Score: 0

    Are you the fabled...

    JUNIS?

  13. Re: Missing by chazchaz101 · · Score: 1
  14. Like Japanese by tepples · · Score: 1

    Bob-acting, Tony-from, stole

    In other words, case clitics like Japanese uses. But most languages' case clitics aren't as invariant as those of Japanese, where for example "acting" is -ga and polite past tense is always -mash'ta. One ordinarily has to memorize the different forms of "acting" for each different kind (plural, gender, declension class) of subject and the forms of "from" for each different kind of object, and the different forms of "stole" for each subject (at least plural) and conjugation class. For example, in English, "stole" has "strong" conjugation, which is Germanic-speak for changing the vowel to 'a' or 'o' to form the past tense instead of adding -ed.

    1. Re:Like Japanese by blair1q · · Score: 1

      I said, "unless you need to get logical about something." When the cops show up, that's time for all logic and no ambiguity.

  15. Re: controlled English by neonsignal · · Score: 2

    Controlled English is probably referring to the subset of English that is a formal language developed at the University of Zurich.

  16. language list by neonsignal · · Score: 1

    Seems a bit naive to think that there is a single language called 'aborigine'.