Distributed Translation Project

I like it! by IronTek · 2002-04-05 07:35 · Score: 1

I like it!

Think of it as a Rosetta Stone of the internet age!

Pretty cool stuff!

Re:I like it! by longnow · 2002-04-05 08:21 · Score: 1

http://www.rosettaproject.org is also using volunteer contributors from around the world and already has documents in over 1200 languages. Hopefully this type of collection will be useful to universal translation efforts.

Universal Translator by lxmeister · 2002-04-05 07:36 · Score: 3, Funny

The Universal Translator is finally here! But will they ever release it in fish form?

Re:Universal Translator by lxmeister · 2002-04-05 07:58 · Score: 1

I tried going outside once. I got sunburnt, kicked out of a pub for underage drinking and missed the last train home. I now live in my room with the curtains closed.
Re:Universal Translator by Anonymous Cow herd · 2002-04-05 08:39 · Score: 0

But the real question is, did you feed the sandwitch to the dog before you got kicked out?

--
Ita erat quando hic adveni.
Re:Universal Translator by Proaxiom · 2002-04-05 09:12 · Score: 2

I know this is a joke, but I can't help but think that this could eventually be built to that level.
If it were to build a sufficient amount of understanding of a sufficiently large number of languages (dead languages included), it could start doing real linguistic analysis.
Linguists have a relatively good understanding of how languages develop, evolve, and diverge over time. This helps to chart large parts of human history by analyzing relationships between distant language cousins (Sanskrit and Latin are cousins, for example, and by comparing them we can draw inferences about certain unknown cultures who lived up to 5 thousand years ago).
If they were to add a phonological component to a system like this, and then utilize the massive amount of computational power distributed computing can provide, the system could start to do advanced analysis of languages.
What you could conceivably end up with is very much a Universal Translator. Imagine being able to enter in a few dozen pieces of script from some long dead language (say, Linear A), and in a few days have it translated and placed in its appropriate place in the tree of languages.
That said, as good as this idea is, I have serious reservations. The resources required to build such a system would be huge. You would need tremendous linguistic skills and great computer expertise to design the algorithms. I have to put this one in the category of "I'll believe it when I see it."
Re:Universal Translator by Com2Kid · 2002-04-05 09:18 · Score: 1

Dude, first they got to get something that can understand ONE spoken language.

I have yet to find a speech to text translation system that will even let me complete the orientation, never mind actually WORK for me.

Damnable screw up systems. If I say library is pronounced Lye'bear'ee then when I say Lye'bear'ee it had sure as hell better write down library!

--
Need help treating your acne? Come here!
Re:Universal Translator by Proaxiom · 2002-04-05 09:36 · Score: 1

That's a different issue altogether. You can do phonetic analysis without any need for generic speech recognition.
Just think: Does translation software necessarily have to be able to read your handwriting?
Re:Universal Translator by blibbleblobble · 2002-04-05 10:38 · Score: 1

A great idea indeed. And if it encourages research into obscure or dying languages/dialects, that's even better.

People are making jokes about their only language being English, but ask a linguist how useful it would be to have a -detailed- map of dialects. Put "type a euphamism for drugs" as one of the questions, and create maps of dialects.

As the rosetta stone idea goes, one place to start would be webpages with a translation in another language. "Report bilingual pages to study@linguistics.org" or adverts to that effect.

Good luck to them all, and keep us posted when they need help testing it.
Re:Universal Translator by DEBEDb · 2002-04-05 13:07 · Score: 1

Put "type a euphamism for drugs" as one of the questions

And it will suggest proper spelling for
the word "euphemism".

--

Considered harmful.
Re:Universal Translator by Com2Kid · 2002-04-05 15:11 · Score: 2

Does translation software necessarily have to be able to read your handwriting?

All translated to the same internal code. If I say "dog" the software has to REALIZE that I said the word "dog" and not the word "bog".

Hmm.

Pour water on the dog, pour water in the bog. . . . BIG difference. :)

Internally right before a dictionary lookup is being done, the words are going to have the same (type of) storage in memory no matter WHAT use that they are going to have. Be it looking up the proper English spelling of the word or looking up how to pronounce the word in Spanish, if the RIGHT word cannot be looked up. . . . well what good is it?

--
Need help treating your acne? Come here!
Re:Universal Translator by Carnivorous Carrot · 2002-04-05 18:43 · Score: 1

His eyes open. His sails unfurl.

Tamalak in the garden. Tamalak under the tree.

--
"Has [being a kidnapped teenage girl, raped repeatedly for months] changed you?" - Katie Couric to Elizabeth Smart

Let's get started right now by PD · 2002-04-05 07:36 · Score: 2, Funny

Everyone translate the word "fuck" into your native language.

--
If tits were wings it'd be flying around.

Re:Let's get started right now by Anonymous Coward · 2002-04-05 07:41 · Score: 1, Funny

my jab on this...in my native language it's called "embrace and extend"..of course i speak the native language called 'redmondish'
Re:Let's get started right now by Anonymous Coward · 2002-04-05 07:41 · Score: 0

It translates to follar in Spanish (Spain) and cojer in Spanish (Argentina).
Re:Let's get started right now by Have Blue · 2002-04-05 07:58 · Score: 2, Funny

"Fuck" in my native language of English is "Fuck".
Re:Let's get started right now by susano_otter · 2002-04-05 08:03 · Score: 3, Informative

Do you mean the verb "to fuck", or the multipurpose expletive "fuck"?
In Portuguese, the translation of the first would be "foder", while the second might be "c'os pariu" (but I'm not up on current slang, so that may be outdated).

NOTE: The multipurpose expletive in Portuguese would be a totally different cognate from the English version.

--
Any sufficiently well-organized community is indistinguishable from Government.
Re:Let's get started right now by Permission Denied · 2002-04-05 08:11 · Score: 1

French: foutre
Romanian: a fute

The similarities with the other Romance languages are surprising. Certainly puts down the 'Fornication under consent of king' idea. Anyone have the real etymology for this word?

Of course this tells you nothing about usage or conjugation (they're both regular verbs thankfully).

I once read a review about this book that listed cursewords and phrases in all sorts of languages. Like how to say 'You incompetent fucking idiot!' to a Georgian waiter who spills coffee on you. Sounds like an interesting read - anyone have a link for this?
Re:Let's get started right now by Tribbin · 2002-04-05 08:16 · Score: 1

In Dutch 'to fuck' is called 'neuken'. We just pronounce 'FUCK!' as 'FUCK!'.

--
If you mod this up, your slashdot background will turn into a beautiful sunset!
Re:Let's get started right now by t0qer · 2002-04-05 08:22 · Score: 2

1n my L1ng0 (l337 sp33k) 1t \/\/00d 83 f00k.
Re:Let's get started right now by l810c · 2002-04-05 08:24 · Score: 1

For Unlawful Carnal Knowledge?
Re:Let's get started right now by Reality Master 101 · 2002-04-05 08:28 · Score: 2

Anyone have the real etymology for this word?

Snopes has a page about this.

--
Sometimes it's best to just let stupid people be stupid.
Re:Let's get started right now by Reality Master 101 · 2002-04-05 08:31 · Score: 2

Unfortunately, the link they gave on their page doesn't work. How about this one instead.

--
Sometimes it's best to just let stupid people be stupid.
Re:Let's get started right now by Dr Caleb · 2002-04-05 08:39 · Score: 2

In ancient England a person could not produce offspring/have sex unless you had consent of the King (unless you were in the Royal Family). When anyone wanted to have a baby, they got consent of the King, the King gave them a placard that they hung on their door while they were having sex. The placard had "Fornication Under Consent of the King" on it.
So FUCK is an English word.

--
"History doesn't repeat itself, but it does rhyme." Mark Twain
Re:Let's get started right now by Anonymous Coward · 2002-04-05 08:44 · Score: 0

No, everyone knows it stands for
For Unlawful Carnal Knowledge

Or maybe I just like VanHalen
Re:Let's get started right now by Anonymous Coward · 2002-04-05 08:46 · Score: 0

Swedish: knulla.
Re:Let's get started right now by xtremex · 2002-04-05 09:16 · Score: 2

Having been a Linguistics major, If I remember correctly, fuck comes from the Germanic languages..."ficken" which means to fornicate

--
If you're not a Liberal in your 20's, then you have no heart.If you're still a Liberal in your 30's you have no brain.
Re:Let's get started right now by Anonymous Coward · 2002-04-05 09:19 · Score: 0

In what context?

How about translating:

That dumb fucking fuck fucked me over and then fucked a donkey.
Re:Let's get started right now by Joel Ironstone · 2002-04-05 09:20 · Score: 1

Flamebait? I responded to the question, that's ridiculous.
Re:Let's get started right now by Anonymous Coward · 2002-04-05 09:53 · Score: 0

>> We just pronounce 'FUCK!' as 'FUCK!'.

+5, funny for being creative

The way I read it: (the credit is yours, for it is the idea that counts...)

"Hell-ow, my name's Tinus Lorvalds and I pronounce FUCKS as FOOCKS".
Re:Let's get started right now by yesthatguy · 2002-04-05 10:23 · Score: 2

Even though you're offtopic, you really need to be set straight. Yes, this is a popular myth, and somebody as innocent and reliable as your high school English teacher may have told it to you (mine tried to), but you're wrong. Even just thinking for a second about 'fuck' coming from an acronym, you can see that married couples would not 'fornicate,' nor would the King really have any interest in giving out fucking licenses. The other popular myth, "For Unlawful Carnal Knowledge" can be ruled out because it's a poorly-formed acronym, and also the word 'fuck' predates the time of the popular story by a few centuries.

The word most likely comes down from a Germanic tongue, but finding a precise lineage is difficult - there are many possible options. For more information, do a google search for something like "fuck etymology," or go here.

--
Yes! That guy!
Re:Let's get started right now by jo42 · 2002-04-06 03:16 · Score: 1

For all Linux wankers, it's "fork()".
Re:Let's get started right now by PD · 2002-04-08 03:06 · Score: 2

I know it's ridiculous. I posted the original query for translations of the word fuck and it's been modded up and down all over.

Some people simply do not understand the moderations, hence I will give some examples:

flamebait: fuck you and the donkey you rode in on.

offtopic: (an article about telephones on a story about rockets)

troll: A controversial article designed to elicit lots of responses - these articles are usually from the "devil's advocate" position, but not always.

--
If tits were wings it'd be flying around.
Re:Let's get started right now by Joel Ironstone · 2002-04-08 03:32 · Score: 1

I think it's sort of funny that you've been modded up for this post!
Re:Let's get started right now by Anonymous Coward · 2002-04-08 08:57 · Score: 0

moron - he just has enough karma to post that high... I should put you on my foe list just for that comment alone... idiot

Browsing translation by ZaneMcAuley · 2002-04-05 07:37 · Score: 2

So, I can use a plugin that would automatically use this super dooper distributed brain to get all my french pages into english etc?

Currently my favorite web translator is this one :D http://www.pornolize.com/

--
----- Whats wrong with this picture? http://www.revoh.org:1234/whatswrong

i wonder by runtimeerror7 · 2002-04-05 07:38 · Score: 3, Insightful

"This will automatically detect when the computer user is less busy and ask them to translate a word or phrase."

i wonder how it's gonna detect when the user is not busy. this software can never be installed on something like my home computer where i leave my DSL on to make it work on SETI.
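
One plausible reading of "detect when the computer user is less busy" is a plain input-idle timer. The article does not say how the client does it, so the sketch below is only an assumption: the class name, the 120-second threshold, and the injectable clock are all hypothetical. It also shows the weakness raised here, since a timer cannot tell a user who has left the room from one who is still at the desk.

```python
import time


class IdleDetector:
    """Hypothetical helper: treats the user as 'less busy' after a quiet period."""

    def __init__(self, threshold_seconds=120.0, clock=time.monotonic):
        self.threshold_seconds = threshold_seconds
        self._clock = clock              # injectable so the logic can be tested
        self._last_input = clock()       # time of the most recent key/mouse event

    def record_input(self):
        """Call this from whatever hook reports keyboard or mouse activity."""
        self._last_input = self._clock()

    def is_idle(self):
        """True once no input has been seen for at least the threshold."""
        return self._clock() - self._last_input >= self.threshold_seconds
```

A machine that is left on unattended would report idle forever under this scheme, which is fine for donating CPU cycles but useless for a client that needs a person to answer.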

Re:i wonder by ZiZ · 2002-04-05 07:43 · Score: 1

It will check to see if you're currently reading /., and if you are, it assumes that you're busy. Otherwise, anything you do can be interrupted to do some translation...

--
This flies in the face of science.
Re:i wonder by the_real_tigga · 2002-04-05 11:07 · Score: 1

"my home computer where i leave my DSL on to make it work on SETI"

Actually the SETI team is against such behaviour.
As they clearly state here. They think it important to do the SETI search "in an environmentally conscious way".

--
my .sig is better than yours.

How is this sustainable? by food-n-bev · 2002-04-05 07:38 · Score: 3, Insightful

...believes it could provide a free way to translate the many languages not included in existing online translators...

What's in it for the volunteers? Seems that novelty might bring experts in to volunteer short term, but when businesses, academics, etc. begin using the service in volume, it really will cry out for commercialization. The volunteers won't stick around performing translations gratis forever. At some point you have to pay them per translation or provide some other compensation (perhaps a /. like karma system?)

The related bigger question will be whether this model ultimately proves to deliver quality translations at a lower cost than a traditional translation service. I don't see how this could happen if you have to still have a language expert look at the full translation as a whole to ensure that contextual subtleties are not lost.

Re:How is this sustainable? by Anonymous Coward · 2002-04-05 08:01 · Score: 0

Careful what you say, my friend. To suggest that developers get paid for their work and expertise is considered blasphemy on Slashdot.

It's a great idea as long as it's free. Once you start charging for it, however, it becomes part of "the system" repressing free and open ideas.
Re:How is this sustainable? by morgajel · 2002-04-05 08:37 · Score: 1

they have 3 ways of going about it- the seti way, where they say, "hey, if you find whatever we're looking for, you can name it!" or distributed.net's "find it and we give you cash!"

however that won't work because they're not looking for one specific thing.

what might work is something along the lines of the google toolbar, "here, enjoy this free coolass software, and in return, let us borrow a few cycles."

I think if they make a cool enough screensaver to go with it, it'll sell itself. people like walking in to their computer room with a friend and have their computer look like it's doing something super cool.

that's what I like about seti, however I run dnet, because it offers free money. if they can get some sort of financial backing, they might have something to entice people- such as, "for every 200,000 words you translate, we put you in a monthly drawing to win $1000"

--
Looking for Book Reviews? Check out Literary Escapism.

Been there done that... by southpolesammy · 2002-04-05 07:39 · Score: 1

Babel Fish kinds of translators have already been out for quite some time. The distributed nature of this makes it more interesting, but there will have to be a concerted effort for it to supplant what has already been started elsewhere on Altavista and such.

--
Rule #1 -- Politics always trumps technology.

Re:Been there done that... by Anonymous Coward · 2002-04-05 07:42 · Score: 0

Is Altavista still in fucking business?? Good riddance!

And on the sixth day, there was Google..

And ye sinners shall beg for forgiveness at the shrine of Michael Eisner, cuz it ain't gonna be long before them fuckers sell out to Disney!!
Re:Been there done that... by d5w · 2002-04-05 07:43 · Score: 2

Babel Fish kinds of translators have already been out for quite some time.
According to the article, the point of the system is to provide some level of translation for those languages that don't have an available translation system. There are a lot of languages that aren't likely to get the attention of translation system developers any time soon.
Re:Been there done that... by Liora · 2002-04-05 07:51 · Score: 1

Exactly. At my company we have often needed to somehow translate email that someone sends us in some obscure language. Romanian, for example, was hard to find an online source for a few years ago... of course that is pretty common now. Although the quality of the translation is of some import, the only real purpose is for me to understand what the person is trying to say; that can be done with any old site. The versatility of incorporating little-publicized languages is rather important to me here.

--
Liora
Re:Been there done that... by southpolesammy · 2002-04-05 07:52 · Score: 1

There are a lot of languages that aren't likely to get the attention of translation system developers any time soon.

Right, which is why I mentioned that it will take a dedicated effort for it to become more functional than what is already available. I can see how this would be immensely popular for international trade, or for more mundane things like being able to travel to countries or lands that don't use your language. This kind of product would be a great help to the people of India for example, where there are literally hundreds of languages used within the country.

My concern is that while others may be able to devote time, money, and resources to their translation projects, on a small scale I wonder whether this one would ever reach enough critical mass to stay alive. I think it's a great idea, but it's going to take a lot of effort and dedication for it to really make a difference.

--
Rule #1 -- Politics always trumps technology.

Deterioration of the whole language by Liora · 2002-04-05 07:40 · Score: 3, Insightful

Great! Now we'll have Engrish resulting not just from terrible Japanese->English translation, but from all kinds of other languages too. Eventually the web will be so filled with bad grammar that the next generation will have no idea how to string a simple sentence together. Looks like we will have to start compiling our correspondence after all... for coherence.

--
Liora

Re:Deterioration of the whole language by IAgreeWithThisPost · 2002-04-05 07:42 · Score: 0

yes but language by nature morphs throughout the years. So what is "proper english" now won't be the same in 200 years. The internet has already changed a lot of our vocabulary. After all, you don't hear or read a lot of Old English anymore do you?

--
security through obscurity = modding down anti-linux posts so maybe noone will see them
Re:Deterioration of the whole language by RetroGeek · 2002-04-05 07:49 · Score: 1

Eventually the web will be so filled with bad grammar that the next generation will have no idea how to string a simple sentence together.

That day is here.

Ever "listen in" on an IRC or chat? The shortcuts and grammar mangling are beyond belief. The excuse is that it is faster to type in, but if you are not in the know, then it looks like gibberish (Hey, ANOTHER language for the project!).

And as for the mis-use of the word "like" ....

--

- - - - - - - - - - -
I am a programmer. I am paid to produce syntax not grammar. Deal with it.

very cool.. but only for hobby use by soap.xml · 2002-04-05 07:41 · Score: 5, Insightful

[snip]"One of the main problems is quality assurance," says Ramesh Krishnamurthy, a linguistics expert at the University of Wolverhampton, in the UK. "Translation is a highly developed skill." [snip] But Paul Rayson, a research fellow at Lancaster University, adds that unskilled translators may confuse the meaning of individual words. "The problem is you generally need the context to get a good translation," he says.[snip]

This looks like it will be a very cool project, but for corporate/business use I don't think it would ever fly.

If you have ever played in the area of i18n then you will quickly understand why this probably won't work perfectly. There are so many caveats to each language: tone, context, etc... This might be a useful starting point for translation services, but for the final cut, it would still need to be checked and double checked by a translation service.

I still think its very cool though ;)

-ryan

Re:very cool.. but only for hobby use by Hooya · 2002-04-05 08:29 · Score: 2

I agree. I have worked on some 'i18n' of a 'weblication' that's been used in virtually all countries of the world (about 40+ languages in all, including Arabic, Chinese, Japanese etc.). One of the major problems is that the context and tone of translations make it impossible to make a one to one relation between phrases/words from one language to another. We ended up using a ranking system for the translated phrases, where the higher ranked version (higher usage meaning higher acceptance of that particular version of the translation) is suggested, but it is left up to a local administrator to pick the translation variant. All of which amounts to a bank of plausible translations that eventually needs human intervention to actually make the translation. Essentially, the bank just suggests translations; the administrator has to eventually 'translate' the word/phrase. Unless the software doing the translation goes beyond just the 'natural language' processing (which in itself is a monumental task) and gets into local connotations, context and tone, you'd run into the 'chevy nova' situation at the overall level, while at the more subtle levels you would end up with offensive and 'rude' translations of otherwise innocent original phrases.
All in all, I think a good exercise in 'grid' computing if you can call it that (at least utilizing the unused CPU cycles) but futile as far as end-all-be-all translating effort. Call me a luddite but I wouldn't get my hopes very high. I'll have to admit that this is probably a good start.
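
The usage-ranked "bank of plausible translations" described above could look roughly like the sketch below. It is not the poster's actual system: the class name, method names, and sample phrases are assumptions made up for illustration. Each time an administrator picks a variant, its count goes up, and suggestions come back ordered by that count.

```python
from collections import Counter


class TranslationBank:
    """Hypothetical bank that ranks candidate translations by how often they were chosen."""

    def __init__(self):
        # (source phrase, target language) -> Counter mapping candidate -> times chosen
        self._usage = {}

    def record_choice(self, phrase, lang, translation):
        """Register that an administrator picked this variant for the phrase."""
        self._usage.setdefault((phrase, lang), Counter())[translation] += 1

    def suggestions(self, phrase, lang):
        """Return known variants, most frequently chosen first (empty if unknown)."""
        counts = self._usage.get((phrase, lang), Counter())
        return [candidate for candidate, _ in counts.most_common()]


bank = TranslationBank()
bank.record_choice("Submit", "fr", "Envoyer")
bank.record_choice("Submit", "fr", "Soumettre")
bank.record_choice("Submit", "fr", "Envoyer")
ranked = bank.suggestions("Submit", "fr")  # the administrator still makes the final pick
```

The bank only orders the options. Tone and context still have to be judged by whoever makes the final choice, which is the point of the post.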
Re:very cool.. but only for hobby use by JohnBE · 2002-04-05 09:52 · Score: 2

I am ambivalent about this. I did a paper on something similar last year and a few of the bigger dictionary makers were really interested. My idea was to use a Thesaurus like system to weight words and sentences, so that sentences could be broken into smaller metric products (a la Descartes). In theory it works quite well but I haven't had time for anything other than the scratch pad paper. But I think that the language comprehension problem can and will be solved. I don't see that as the problem.

Now IMHO the real problem: Dictionary companies, publishers and Universities are the big players in this area. If Oxford University were to give away their dictionary a project would instantly have a massive base of words to work with, but would they? More to the point if they did could this be repeated internationally? I'm loath to rely on the descriptions given by the unwashed masses ;-), but seriously a strong linguistic and academic base is essential and that is where the Wolverhampton system may do well.

--
e4 e5

Thank god! by PhysicsGenius · 2002-04-05 07:42 · Score: 2, Informative

What machine translation has been missing is big dictionaries. We already have the grammar problem cracked--English can be expressed as a regexp. The trouble was that we were missing translations for all those masses of ordinary words that people use like "daisy" and "pencil". This project looks like the end of that issue once and for all.

I'd also like to applaud them finally including the lost language of Ur in their translation project. For too long the ancient Sumerians have been excluded from contributing to the global society due to their lack of knowledge of English, French, Spanish, Swahili or Chinese.

Where can I download the screensaver so that I can contribute?

Re:Thank god! by Fizgig · 2002-04-05 07:47 · Score: 1

We already have the grammar problem cracked--English can be expressed as a regexp.

You're joking, right? Mathematically, a regexp is less powerful than a CFG. A CFG is used to describe a language like HTML or C. English is much more complicated and can't be parsed correctly using a CFG.
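
A small illustration of the regexp-versus-CFG gap: balanced parentheses are generated by the context-free grammar S -> ( S ) S | empty, but no regular expression in the formal sense can match them at arbitrary depth, because doing so needs an unbounded counter. The function below is a sketch of that counter; the name is made up for this example. (Some practical regex engines add recursion extensions that go beyond formal regular expressions, which is a separate matter.)

```python
def is_balanced(text):
    """Check that every '(' is closed by a later ')' and nothing closes early."""
    depth = 0  # the unbounded counter a finite automaton does not have
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared with nothing open
                return False
    return depth == 0
```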
Re:Thank god! by ZiZ · 2002-04-05 07:49 · Score: 1

Regexp? Damn. If (assuming (blatantly) such regexps can can English) such regexps can contain (parsable in P) fully English phrasing with (contrived (parseable (sort of (LISPy) (regexpy)))) complete syntax - vital to maintain accuracy - we now can despair of ever understanding politicians without the aid of a computer.
Where can I find this regexp? :)

--
This flies in the face of science.
Re:Thank god! by Anonymous Coward · 2002-04-05 07:53 · Score: 0

Ask the guy who rated this informative.
Re:Thank god! by dvdeug · 2002-04-05 07:54 · Score: 2

English can be expressed as a regexp.

If you count [A-Za-z.?"'!;-]*. I'm not sure how much that helps.

Actually, English can't even be expressed through a context-free grammar (a superset of regexps), in part because it is inherently ambiguous. "The girl touches the boy with the flower" has two possible meanings.
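
The two readings of "The girl touches the boy with the flower" can be made concrete by counting parse trees under a toy grammar. Everything in the sketch below is an assumption for illustration: the six rules and the tiny lexicon are invented and cover only this sentence, and the counting is a CYK-style dynamic program over a grammar in Chomsky normal form. It illustrates the ambiguity only; it does not settle whether English as a whole is context-free.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # "with the flower" modifies the touching
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # "with the flower" modifies the boy
    ("PP", "P", "NP"),
]
LEXICON = {
    "the": ["Det"], "girl": ["N"], "boy": ["N"], "flower": ["N"],
    "touches": ["V"], "with": ["P"],
}


def count_parses(words, start="S"):
    """Count distinct parse trees for the word list (CYK-style counting)."""
    n = len(words)
    chart = defaultdict(int)  # (i, j, symbol) -> number of trees over words[i:j]
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word, []):
            chart[(i, i + 1, symbol)] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between left and right child
                for parent, left, right in BINARY_RULES:
                    chart[(i, j, parent)] += chart[(i, k, left)] * chart[(k, j, right)]
    return chart[(0, n, start)]


readings = count_parses("the girl touches the boy with the flower".split())
```

Under this grammar `readings` comes out as 2: one tree attaches the prepositional phrase to the verb phrase and the other attaches it to "the boy".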
Re:Thank god! by Alzheimers · 2002-04-05 08:46 · Score: 1

Ah yes, the wonderful language of Ur.
Feel free to download my contribution here:

attached file: "SNOWCRASH.EXE"
Re:Thank god! by Control Group · 2002-04-05 08:54 · Score: 2

I agree: Thank God!

With this post, you've finally reassured me that you're consciously full of crap in various prior posts--as opposed to massively challenged in some fashion.

--

Reality has a conservative bias: it conserves mass, energy, momentum...
Re:Thank god! by ptrourke · 2002-04-05 10:24 · Score: 2

Express the following as a regexp:

If English were a computer language, then perhaps it would be possible to represent it by means of a regular expression; however, English is a natural language, with all the ambiguity and complexity which natural languages entail, and so cannot be properly represented by means of any logical construct.

Troll he may be, but since it's modded up "informative", it seemed necessary to make the point lest others fall into the same trap.
Re:Thank god! by blibbleblobble · 2002-04-05 10:44 · Score: 1

Fruit flies like a banana
Time flies like an arrow
Re:Thank god! by Proaxiom · 2002-04-05 11:06 · Score: 2

Kill flies like a maniac.
Time flies like an arrow has three valid syntactic parsings. Only one makes sense semantically, though.
Re:Thank god! by MarkusQ · 2002-04-06 03:42 · Score: 2

"Time flies like an arrow" has three valid syntactic parsings. Only one makes sense semantically, though.
Depends on how deep you push the boundary of syntax; some grammars distinguish article-ambiguities, e.g.
Fat Tony likes a feast.
Fat Tony likes a girl.
which raises the number of parsings to at least four. And for any a priori "the line between syntax and semantics should be drawn here" you can come up with, someone can doubtlessly construct an "easy to please/eager to please" counter example.
-- MarkusQ
Re:Thank god! by tgv · 2002-04-06 19:47 · Score: 1

Guys, you are missing the point. He's not a troll, he's being sarcastic. Daisy and pencil are quite easy to translate (in normal contexts), and he knows it. Didn't any of you see the reference to the Sumerians? As in extinct people?

Nifty by TheRealFixer · 2002-04-05 07:43 · Score: 1

"The new scientist has this history on a new plant to construct to a database of the translation of the multi-language called the wide lexicon the world, using a distributed community of the volunteers. The designer compares it it a distributed design computing and believes it that could more easy making translate languages obscurer."

Can't wait.

but will it translate into Klingon? by JeanBaptiste · 2002-04-05 07:43 · Score: 2, Funny

More people speak Klingon than Navaho...

Re:but will it translate into Klingon? by d5w · 2002-04-05 08:11 · Score: 2

More people speak Klingon than ...
But finding a native speaker of Klingon is a royal pain. And yes, I'm speaking from experience, here, having been at a company that came out with a Klingon speech recognition system once upon a time. The usual practice of collecting speech samples from native speakers had to be ... modified slightly.
Re:but will it translate into Klingon? by xtremex · 2002-04-05 09:26 · Score: 1

But, did you know Klingon is BASED on Navajo? Navajo is a very interesting language. I believe that ANY American Indian language is probably among the most difficult languages in the world. I was a linguistics major, and I remember banging my head against the wall trying to understand some of the concepts of the Athapascan family. Did you know there are over 20 words for "this"? This here, this to the left, this here in the time and place, this thing that I once had, this thing that I dislike, etc.

--
If you're not a Liberal in your 20's, then you have no heart.If you're still a Liberal in your 30's you have no brain.
Re:but will it translate into Klingon? by JeanBaptiste · 2002-04-05 10:04 · Score: 1

Im an ojibway... we call it ojibberish... I had heard that ojibway is the most difficult language in the world...
Re:but will it translate into Klingon? by xtremex · 2002-04-05 10:41 · Score: 1

ANY Amerindian language is so hard that you'll pull your hair out. I used to joke and say that when Satan was cast to the earth and punished, he was forced to learn an American Indian language!

--
If you're not a Liberal in your 20's, then you have no heart.If you're still a Liberal in your 30's you have no brain.

Distributed computing by Anonymous Coward · 2002-04-05 07:44 · Score: 0

using a distributed community of volunteers...

Hm.. I've never heard of this 'community of volunteers' computing platform.. Who makes it? What are the specs? Can you make a Beowulf cluster of them?

Quality by delta407 · 2002-04-05 07:45 · Score: 1

"However, some experts warn that the system may lack the quality of conventional dictionaries." ... "McConnell concedes that this could be a problem and hopes to develop an automatic system for peer review, to ensure that translations are accurate."

Duh.

Think about all the 12-year-olds -- script kiddies or not -- who will pretend to know a language and just type in a random collection of letters. What a great way to provide efficient translation!
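
The "automatic system for peer review" quoted above is not described in the article, so the sketch below is only one guess at how random keyboard mashing might be filtered: accept a translation only when several independent volunteers submit the same answer. The function name and both thresholds are hypothetical.

```python
from collections import Counter


def accepted_translation(submissions, min_votes=3, min_share=0.6):
    """Return the consensus translation, or None if agreement is too weak.

    submissions: raw strings from different volunteers for the same source word.
    min_votes:   how many volunteers must agree (assumed value).
    min_share:   fraction of all submissions the winner must hold (assumed value).
    """
    if not submissions:
        return None
    # Normalise trivially so "Hund" and "hund " count as the same answer.
    counts = Counter(s.strip().lower() for s in submissions)
    best, votes = counts.most_common(1)[0]
    if votes >= min_votes and votes / len(submissions) >= min_share:
        return best
    return None
```

Random letters typed by different people are unlikely to match each other, so they rarely reach the vote threshold. Coordinated or automated abuse is a harder problem that this does not address.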

Re:Quality by delta407 · 2002-04-05 07:48 · Score: 1

Great -- inserting random words can be automated, easily.

The WWL has been designed using the Simple Object Access Protocol (SOAP). McConnell says this should make it possible to integrate the client software into other computer applications.

Excellent... give the abusers an easy way in. And yes, I can pretty much guarantee that it will be abused.
Re:Quality by SirSlud · 2002-04-05 08:09 · Score: 2

> Think about all the 12-year-olds -- script kiddies or not -- who will pretend to know a language and just type in a random collection of letters.

I don't know if you remember what it was like to be 12, but while I might have done what you'd proposed once or twice, I can't imagine the amount of 'noise' in this translation service coming from 12 year olds who finally find their life long mischievous passion of offering 'bogus' translation services.

I mean, really, do you see 12 year olds downloading a distributed translation app, translating 'bogus'ly, and getting their jollies from this in any quantity that diminishes the value or effectiveness of this project? 12 year olds have much more important things to do, like learn how great masturbation is, and play videogames, and frequent other forums where 'abuse' is fairly indistinguishable from proper use.

--
"Old man yells at systemd"
Re:Quality by t · 2002-04-05 08:50 · Score: 2

Besides the inherently short attention span of most 12 year olds this will be a non-issue as long as you ensure that there is no direct feedback loop.
Trolling on /. most likely results from the very short amount of time it takes to see people responding to your crap. Most script-kiddie-like behaviour is similar: when you start a DoS attack, the results of your mischief are immediate. This translation service, on the other hand, will probably prove to be quite boring, and thus only those with dedication will commit to doing a translation instead of watching The Simpsons.
t.

It's not going to work... by carm$y$ · 2002-04-05 07:46 · Score: 3, Insightful

It's a matter of days until someone requests a log of people connecting to the server during work hours... Here is the beauty of the seti@home client: computers can have spare cycles, people don't.

--
-- No sig today

This must be the smartest software ever by Control+Group · 2002-04-05 07:46 · Score: 4, Interesting

If it's going to detect when I'm "less busy." Is this going to pop up a window in my face every time I spend more than a couple minutes mentally composing prose or code? The potential for user annoyance here seems incredibly high to me...

Distributed computing is an elegant and efficient use of otherwise untapped resources--cycles that are literally "going to waste" (in one sense). By hitting up the users, though, you're attempting to use a resource that is anything but untapped: that user's time. It might work, but let's not bill this as anything other than what it is--asking for volunteer work from people.

Which isn't really that new an idea.

--

Reality has a conservative bias: it conserves mass, energy, momentum...

Re:This must be the smartest software ever by t · 2002-04-05 08:53 · Score: 2

So why don't we make a dockapp button with the label "I'm bored." When a user hits it, a menu of the current distributed projects that could use his neurons would be listed.
/. needs to change the two minute between posts thing to an exponentially decreasing time span. That way you can spit out a couple of posts but not ten really quick.
t.
Re:This must be the smartest software ever by Control+Group · 2002-04-05 09:47 · Score: 2

That would certainly solve the first problem, but not the second. Although "problem" isn't a fair term, really. It's not a problem so much as a misstatement: there's simply no comparison between this project and distributed computing. The latter is making use of otherwise unused potential; this is making use of the ultimate limited resource in modern society (American society, at least--and, from what I've heard, most so-called "first world" societies as well): time.

*shrug*

Not that it can't work, but it's no more nor less elegant/revolutionary/brilliant/etc. than any other plan that depends on volunteerism.

--

Reality has a conservative bias: it conserves mass, energy, momentum...

Could work, but.... by ThinkingGuy · 2002-04-05 07:46 · Score: 4, Insightful

One of the big issues with translating between human languages is context. While many words have more or less direct equivalents in other languages ("dog"(en) "perro"(es)), you're always going to run into slang, cultural references, and especially, jargon, where the particular usage will not be in a standard dictionary, and only by the context can the actual meaning be inferred (Example: the word "anchor" in the context of sailing versus the context of webpage design).
Not that this can't be overcome with the distributed model the article discusses, but I still think it will be a while before we see computer translation that doesn't require at least some degree of human assistance.
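
To make the "anchor" example concrete, here is a minimal Python sketch of choosing a sense by counting overlapping context words. This is a plain keyword-overlap heuristic (loosely in the spirit of the Lesk approach), not anything described in the article; the SENSES table, its cue words, and the function name are all made up for illustration.

```python
# Hypothetical sense inventory: each sense of a word lists cue words
# that tend to appear near it. All entries are illustrative only.
SENSES = {
    "anchor": {
        "sailing": {"ship", "boat", "harbor", "chain", "drop", "sea"},
        "web": {"html", "link", "page", "tag", "href", "browser"},
    }
}


def pick_sense(word, sentence):
    """Return the sense whose cue words overlap most with the sentence.

    Returns None if the word is unknown or no cue word appears at all,
    which is the case where a human would have to step in.
    """
    senses = SENSES.get(word)
    if not senses:
        return None
    # Crude tokenization: lowercase, drop periods and commas, split on spaces.
    cleaned = sentence.lower().replace(".", " ").replace(",", " ")
    context = set(cleaned.split())
    best_sense, best_score = None, 0
    for sense, cues in senses.items():
        score = len(context & cues)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense
```

With this toy table, a sentence mentioning "ship" lands on the sailing sense and one mentioning "HTML" lands on the web sense; a sentence with no cue words returns None, which is exactly where some human assistance would still be needed.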

Re:Could work, but.... by t · 2002-04-05 09:02 · Score: 2

There are such things as Kohonen Self-Organizing Maps that can help out in the context department.
Take a look at a websom example. Here you can differentiate pruning from the garden variety fairly easily.
This would allow you to easily make the choice between obviously different usages of the word anchor.
t.
Re:Could work, but.... by Anonymous Coward · 2002-04-05 09:09 · Score: 0

While many words have more or less direct equivalents in other languages ("dog"(en) "perro"(es)), you're always going to run into slang, cultural references, and especially, jargon, ...
It's worse than that. Languages like Swedish and Chinese use combinations of words. Exempli gratia: Swedish uses the phrase ``son of the soil'' to mean strawberry. Chinese uses the ideograms ``early'' and ``green'' together to mean spring. That's how they leverage a relatively tiny core vocabulary. So, even if you can be sure that there is no slang, jargon, obscure cultural references, et cetera, you can't go word-by-word through the original.

Though my knowledge of Chinese is very limited, I think that it is FAR more context dependent than English. Machine translation there is going to be problematic for a long time. Even experienced human translators can have a tough time moving meanings from one to the other.

Too late for sega by s4ltyd0g · 2002-04-05 07:47 · Score: 1

I guess they could have used this on their download page :-)

Universal "intermediary" language? by MadCow42 · 2002-04-05 07:47 · Score: 2

Is there some way to translate into a common universal "intermediary" language, then translate to the destination language?

I'm just thinking that most languages could relate more closely with an "iconographic" type language than with the idiosyncrasies of other languages. For concrete ideas this may work well, but for more conceptual ideas this may fall apart...
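
A rough Python sketch of the idea, assuming the "intermediary" is just a set of concept IDs rather than a real language. The tables, the CONCEPT_* names, and both function names are invented for illustration; nothing here comes from the article.

```python
# Toy dictionaries into a hypothetical pivot vocabulary of concept IDs.
# Every entry here is invented for illustration.
TO_PIVOT = {
    "en": {"dog": "CONCEPT_DOG", "house": "CONCEPT_HOUSE"},
    "es": {"perro": "CONCEPT_DOG", "casa": "CONCEPT_HOUSE"},
    "de": {"Hund": "CONCEPT_DOG", "Haus": "CONCEPT_HOUSE"},
}

# Invert each table to get pivot -> language mappings.
FROM_PIVOT = {
    lang: {concept: word for word, concept in table.items()}
    for lang, table in TO_PIVOT.items()
}


def translate_via_pivot(word, source, target):
    """Translate one word by going source -> concept -> target.

    Returns None when either half of the trip has no entry.
    """
    concept = TO_PIVOT.get(source, {}).get(word)
    if concept is None:
        return None
    return FROM_PIVOT.get(target, {}).get(concept)


def tables_needed(n_languages):
    """Compare table counts: direct ordered pairs versus pivot tables."""
    direct = n_languages * (n_languages - 1)  # one per ordered pair
    pivot = 2 * n_languages                   # one in, one out, per language
    return direct, pivot
```

The appeal is in the second function: for 10 languages you would need 90 direct tables but only 20 through the pivot. The catch is the same one as above, since a one-word-one-concept table has no room for the conceptual stuff.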

Just my $0.02, being uneducated in linguistics...

MadCow.

--
I used to have a sig, but I set it free and it never came back.

Re:Universal "intermediary" language? by querist · 2002-04-05 09:23 · Score: 1

Research has been done on using lojban as an intermediate language because its structure actually makes it more difficult to be ambiguous than to be precise.

Esperanto has also been suggested as an intermediate language for such projects and I suspect that it could be used fairly easily.

The main advantage to this is that we could rely on translation into the neutral intermediate language and then it may facilitate a greater probability of getting a translation into the target language.

The disadvantage is that neither of these languages is ubiquitous. lojban has a few hundred speakers (best guess) while Esperanto has thousands.

For strictly linguistic and technical reasons I would recommend lojban for this task even though I am personally more proficient in Esperanto.
Re:Universal "intermediary" language? by xtremex · 2002-04-05 09:33 · Score: 2

Esperanto is such a language. Esperanto was invented in 1887, I believe, to serve as a bridge for people who speak different languages. It's quick and easy to learn, with no irregularities. I speak Esperanto, and I think it's a beautiful language. One can converse in Esperanto after a few days. For example, if you don't know the word for airplane, you could say "the thing that flies"--flugilo (flug--fly, ilo, thing that does something)...which IS the word for airplane! There's even funny slang in Esperanto..bluharulino (Old woman --"female person who has blue hair!")

--
If you're not a Liberal in your 20's, then you have no heart.If you're still a Liberal in your 30's you have no brain.
Re:Universal "intermediary" language? by MadCow42 · 2002-04-05 09:33 · Score: 2

Interesting...

I was thinking more of translation into "concepts" rather than an actual language... it doesn't have to be a real spoken language. This of course is well suited to machine translation, not human translation.

Wouldn't all languages be possible to translate into concepts? I guess it would be highly contextual though, making the process difficult...

Just-brainfarting-the-Friday-away-ly-yours...

MadCow.

--
I used to have a sig, but I set it free and it never came back.
Re:Universal "intermediary" language? by Anonymous Coward · 2002-04-05 10:02 · Score: 0

I'm surprised no one seems to have mentioned UNL, or the Universal Networking Language sponsored by the United Nations. It seems as though this has all of the beginnings of a universal language, since they seem to have a system of removing ambiguities at the time that the text of the language is entered...
Re:Universal "intermediary" language? by Anonymous Coward · 2002-04-05 17:46 · Score: 0

That's English, okay? Why bother with Esperanto and some foolish stuff? Matter of fact, there should be no translation at all - everyone should switch to English and the need for translation will be gone!

Hi! How are you? by spruce · 2002-04-05 07:47 · Score: 2, Funny

I send you this words in order to have your translation

Why this will never work by Anonymous Coward · 2002-04-05 07:47 · Score: 2, Insightful

I'm not a translator, but during college I worked with a comparative lit professor who translated novels from Spanish into English. The problem with translation is wrestling with the subtle shades of meaning that every single word has and finding its perfect pair in the language you're translating into. Then you have to address the context in which the word was written: the larger sentence (what information is it trying to convey?), what mood it is trying to imply (much trickier), and finally whether this matches the author's style and the novel's tone (this is what truly makes translation an art).

This is a bad example but just so you get the idea, it's hard even English to English:

original:

John hurried to the shopping mall.

variants:

John made great haste to get to the shopping centre.

John ran to his destination, the shopping mall.

John rushed to the store.

John spared not the whip in perambulating to the suburban commercial district.

John ran off to waste time at the corporate copyright paradise.

blah blah blah...

Re:Why this will never work by t · 2002-04-05 09:14 · Score: 2

My, how narrow minded. Which would you rather have, any one of the various variants you listed or shopi-ngu he ikimasu? My Japanese is really bad but you get the point. The point, which I will explicitly state for all you ACs, is that the goal is not to translate poetry, but first to be able to translate well enough that you can understand what is trying to be conveyed.
You've heard the joke, haven't you, about the golfer that goes to [insert some foreign country here] and gets a hooker the first night he is there. This guy is so excited about having his first taste of [insert appropriate ethnic reference here] that he jumps on the hooker and starts giving it his all. The hooker starts screaming [insert foreign sounding gibberish here]. This only encourages the guy, he's thinking that she's saying something that means he's great. So anyway, the next day he goes golfing with his business partner that he flew over to meet. During the game his foreign business partner makes a hole in one. So he decides to use the new word he learned last night from the hooker. His business partner turns to him and says "what do you mean, wrong hole?"
t.

What is most likely? by pjkacmar · 2002-04-05 07:48 · Score: 1

Is distributed computing more likely to:
a) Find intelligent life on other planets?
b) Find a cure for cancer?
c) Translate "All your base are belong to us" to Sanskrit?

Nice idea, but I'm not sure how well it'd really work.

Re:What is most likely? by JonWan · 2002-04-05 09:24 · Score: 1

I would be happy if they could translate "All your base are belong to us" to English

in Hungarian... by dukethug · 2002-04-05 07:48 · Score: 1

the roughly equivalent phrase is "basz meg"- although the usage differs. It's more like the sort of thing your grandma would say if she dropped her fork at the dinner table.

On the other hand, maybe I just have a foul-mouthed grandma.

it'll never work. by banks · 2002-04-05 07:48 · Score: 2, Interesting

From the article:

"The problem is you generally need the context to get a good translation,"

This is very, very true. Any competent translator can tell you that it's almost impossible to get a fully accurate translation from just a few lines or words... context is absolutely imperative. This looks a lot like vaporware to me.

And then what about when the smart-ass teenaged kid signs up, gets bored and starts translating to obscene or nonsensical results? They'll need some sort of moderation system, if this is to work at all.

Thanks, newscientist, for bringing us another well researched and peer-reviewed story, maintaining the image that a "new scientist" is one who has forgotten about the scientific method.

--
--Use this space for notes--

Brilliant! by sniggly · 2002-04-05 07:49 · Score: 1

Who cares if it's accurate now or soon? Used often enough, and with plenty of user feedback about what's the right and wrong way to translate things, this could become a very nifty database, and hopefully better at what it does than babelfish, which is handy but more than that very amusing :)

--
Of those to whom much is given, much is required.

Some basic information omitted in NS article by brianmsf · 2002-04-05 07:49 · Score: 5, Informative

Hello,

I am the lead developer working on the WWL project. Overall, the NS article did a good job of explaining it, but it was based on a phone interview so some material got lost in translation, no pun intended.

There are two components to the project.

1. One is a simple SOAP based protocol (WWLP) that will be published soon, in early May. This protocol creates a standard set of methods for discovering and communicating with existing dictionary and semantic network servers (of which there are many).

Think of this as Gnutella for dictionaries. A WWLP-aware program starts up, invokes a SOAP method on a supernode to locate Russian-Spanish dictionaries. Then, it contacts one or more of these dictionaries to search for words, synonyms, etc.

The basic goal is to standardize the client/server interface for dictionaries. They all provide the same basic services, but have slightly different front ends. So just doing this will make it easy to incorporate dictionary functions into many types of apps (and also make existing dictionaries more visible to internet users).

The idea is similar to an older TCP-based protocol called DICT, except that it is easy to implement in high-level languages, SOAP-aware scripting languages, etc. It also provides a discovery mechanism so you can automate the process of finding an Urdu-English dictionary, for example.

2. The distributed computing (or distributed human computing) project. The NS article mainly focused on this. The idea here is to enlist a large number of internet users to help build and maintain a dictionary (which will also be visible through the WWLP interface).

The goal here is to create a mechanism for collecting definitions and translations for words and phrases in less common language pairs (as well as for slang terms that are not covered by most formal dictionaries).

....

The goal in both cases is to make it easy to find and use dictionary services throughout the web, and create an incentive for people to build their own dictionaries. This is NOT a translation system, although it can be incorporated into translation software (for example, to extend the number of words covered).
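
Purely as an illustration of the discover-then-query flow described above, here is a small Python sketch. It is not the WWLP interface (which has not been published yet) and makes no SOAP calls: the supernode is faked with an in-memory REGISTRY, the dictionary servers with DICTIONARIES, and every hostname, entry, and function name is a made-up placeholder.

```python
# Stand-in for a supernode's registry: which servers claim to offer
# which language pair. Hostnames and entries are invented.
REGISTRY = {
    ("ru", "es"): ["dict-a.example", "dict-b.example"],
    ("ur", "en"): ["dict-c.example"],
}

# Stand-in for the dictionaries those servers would expose.
DICTIONARIES = {
    "dict-a.example": {"собака": ["perro"]},
    "dict-b.example": {"собака": ["perro", "can"], "дом": ["casa"]},
    "dict-c.example": {},
}


def find_dictionaries(source, target):
    """Discovery step: ask the registry who serves this language pair."""
    return list(REGISTRY.get((source, target), []))


def lookup(word, source, target):
    """Query every discovered dictionary and merge the answers,
    keeping first-seen order and dropping duplicates."""
    results = []
    for server in find_dictionaries(source, target):
        for translation in DICTIONARIES.get(server, {}).get(word, []):
            if translation not in results:
                results.append(translation)
    return results
```

In a real client the two dictionary lookups would be network calls; the point of the sketch is only that the application never needs to know in advance which servers carry Russian-Spanish.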

Thanks for your time.

Brian McConnell

PS - if you want more information, check out www.worldwidelexicon.org

Re:Some basic information omitted in NS article by Anonymous Coward · 2002-04-05 09:18 · Score: 0

Are you familiar with the Wycliffe Bible Translators? They may well be the largest single nonprofit source of information on oddball languages. They also have a number of trained linguists who are dedicated to serving God and man.
This would be a sideshow for them, but they might have some interest in making their work accessible, as long as it didn't take resources away from their primary task of translating the Bible into every language in the world. In the process, of course, they have to give every language a written form, and a dictionary. The side effect is that they are preserving records of many languages which would otherwise be lost.
Re:Some basic information omitted in NS article by t · 2002-04-05 09:24 · Score: 1, Flamebait

I've looked at DICT previously. Too bad it's defunct. To me the most useful first step would be a system that can integrate all of the many different dictionaries already in existence. Maybe with some kind of fallback to webster.com or something more appropriate.
Of course if all this work is only available through some server and not an iso that I can download then I'll pay no attention to it. All too often it has happened that these so called public databases disappear on the people who worked on them.
t.
Re:Some basic information omitted in NS article by blibbleblobble · 2002-04-05 10:53 · Score: 1

Like, for example, the Oxford English Dictionary, created by thousands of volunteers in an open-source effort, now available on the web as costly-subscription-only?
Re:Some basic information omitted in NS article by Proaxiom · 2002-04-05 11:18 · Score: 1

Are you sure about this? My understanding was that the OED was developed and is administered primarily by academics, who are well-paid for their services.
My alma mater had quite a bit to do with converting the OED to the digital medium.
Re:Some basic information omitted in NS article by dvdeug · 2002-04-05 15:00 · Score: 3, Informative

I've looked at DICT previously. Too bad it's defunct.

Why do you think it's defunct? The dict protocol works fine, and there are many dictionaries out there for it. dict.org is up and working, if not terribly well maintained. Debian has many packages, mostly named dict-*, that are dictionaries for dict, including a full English dictionary, the Jargon file, a Biblical dictionary and a Russian dictionary. www.freedict.de has a wide variety of bilingual dictionaries for dict.
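
For anyone who hasn't looked at it, the protocol (RFC 2229) is plain text over TCP, normally port 2628. Below is a minimal offline Python sketch of the two interesting bits: building a DEFINE command and pulling the definition bodies out of a response. It deliberately does no networking, the function names are my own, and it ignores most of the status codes a real client would handle.

```python
def build_define(word, database="*"):
    """Build a DEFINE command line; "*" asks the server to search all databases."""
    return 'DEFINE {} "{}"\r\n'.format(database, word)


def parse_definitions(response):
    """Split a DEFINE response into a list of definition texts.

    Each definition starts after a 151 status line and ends at a line
    holding a single ".". A leading ".." is unescaped to ".".
    """
    definitions, current = [], None
    for line in response.splitlines():
        if current is None:
            # Outside a definition: wait for the next 151 header.
            if line.startswith("151 "):
                current = []
        elif line == ".":
            # End-of-text marker closes the current definition.
            definitions.append("\n".join(current))
            current = None
        else:
            current.append(line[1:] if line.startswith("..") else line)
    return definitions
```

A real session would send the command over a socket, read until the final 250 (or a 552 for no match), and hand the text to something like parse_definitions.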

My Hovercraft is Full of Eels... by Mad+Bad+Rabbit · 2002-04-05 07:50 · Score: 1

Let's hope none of the volunteers accidentally
use Mr. Alexander Yalt's
Hungarian-English dictionary.

"I will not buy this tobacconist, it is scratched."

>;K

--
>;k

why this will never work by Anonymous Coward · 2002-04-05 07:51 · Score: 0

I'm not a translator, but during college I worked with a comparative lit professor who translated novels from Spanish into English. The problem with translation is wrestling with the subtle shades of meaning that every single word has and finding its perfect pair in the language you're translating into. Then you have to address the context in which the word was written: the larger sentence (what information is it trying to convey?), what mood it is trying to imply (much trickier), and finally whether this matches the author's style and the novel's tone (this is what truly makes translation an art).

This is a bad example but just so you get the idea, it's hard even English to English:

original:

John hurried to the shopping mall.

variants:

John made great haste to get to the shopping centre.

John ran to his destination, the shopping mall.

John rushed to the store.

John spared not the whip in perambulating to the suburban commercial district.

John ran off to waste time at the corporate copyright paradise.

John said all your mall are belong to us.

blah blah blah...

Distributed translation will be just fine for most short documents, but for the longer ones, shades of meaning will be lost and the patchwork of styles will be jarring to say the least.

Speaking of translation... by hsenag · 2002-04-05 07:52 · Score: 1

Check out this NewScientist feedback item. Or just jump straight to the google link they refer to. Can I get anyone a juice of lawyers?

Context, Poison by quinine · 2002-04-05 07:52 · Score: 1

It seems to me that this project has overlooked two tremendous stumbling blocks. The first involves context/ambiguity. Take the English, "it's pretty bad outside." Now, for an English speaker, this is no trouble, since the "it's" is generally held to be referring to the weather. Other languages lack such a frame of reference. Secondly, I believe that the "distributed" property of the system leaves it wide open to poor or intentionally incorrect translations, unless the system is employing some statistical method for finding the "mean translation" of a phrase out of a batch of candidates. While I appreciate this researcher's work on Machine Translation, I think that this might better be served by designing some type of meta-language with a superset of linguistic features from which native translations might be compiled.
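
As a sketch of what such a statistical filter might look like, here is a simple majority vote in Python. This is my own guess at a "mean translation" mechanism, not anything the project has described; the function name and the min_votes and min_share thresholds are arbitrary assumptions.

```python
from collections import Counter


def consensus(candidates, min_votes=3, min_share=0.6):
    """Return the agreed translation, or None if there is no consensus.

    candidates: translations submitted by different volunteers.
    min_votes:  how many submissions are needed before deciding.
    min_share:  fraction of submissions the winner must hold.
    """
    # Normalize so that "Perro " and "perro" count as the same answer.
    normalized = [c.strip().lower() for c in candidates if c.strip()]
    if len(normalized) < min_votes:
        return None
    winner, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_share:
        return winner
    return None
```

One random string among several honest answers gets outvoted, while a phrase with too few submissions, or no clear winner, stays unresolved instead of being published.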

Re:Context, Poison by quinine · 2002-04-05 08:06 · Score: 1

I also entirely fail to believe that a lexicon of dying tongues can be constructed over the Internet. The notion of tribesmen with PCs and `net connections brings a dreamlike smirk to my face. I think they call it FIELD linguistics for a reason.
Re:Context, Poison by t · 2002-04-05 09:34 · Score: 2

Take a look at It's raining.
OK, so what about It's raining? It's a kind of construction called a Dummy it. That is, the it has no meaning whatsoever (you're far from the first to be puzzled by it) and is used strictly as a placeholder, like the dummy hand in bridge, or the zero on 101. Why do anything that bizarre? Well, see, English Syntax has this Rule that says -- in ponderous and self-enforcing tones -- Thou Shalt Have A Subject In Every Finite Sentence. And thou must, indeed.

Handling 'it' is quite easy.
t.

Problem with "Universal Translator" by Kphrak · 2002-04-05 07:55 · Score: 2, Insightful

Yes, you can do a word-for-word translation of most words in any language. No, you'll need a very sophisticated system to get the meaning to a reader.

The main problem is that sentence structures are different, idioms get in the way, and words have more than one meaning. A human translator has the power to take a set of words, convert it to an idea, and put out a different set of words, something no machine can do.

Here's a lamebrained example: "The spirit is willing but the flesh is weak." Convert that to Russian and back and you might get, "The liquor will do it but the meat is bad." For a hands-on example, try converting the first few paragraphs of a news article into French using The Fish. On a personal note, I had a conversation with a German guy on ICQ once, using the fish. The results were...interesting. I also read Indonesian newspapers, and I assure you that a literal translator would hurt itself quite badly on this...let alone a less English-like language such as Arabic or Japanese.

That being said, why not use distributed human computing for the thing it's good at? Instead of translating words, how about sentences? You can get at the ideas much better this way. Those sentences that hadn't been translated yet could show up as literal words; those words that hadn't been translated would show up natively. I mean, if you've got human translators for this, you can do things that are not restricted to computers. I can think of a lot neater things the guy proposing this can do with this idea than what he's come up with so far.
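
Roughly what I mean, as a toy Python sketch: try the human-translated sentence first, fall back to word-for-word, and leave unknown words in the original language. The two tables and the function name are invented for illustration.

```python
# Hypothetical volunteer-built tables (English -> Spanish).
SENTENCES = {"where is the station?": "¿dónde está la estación?"}
WORDS = {"the": "el", "dog": "perro", "is": "está"}


def translate(sentence):
    """Prefer a whole-sentence translation; otherwise go word by word,
    passing through any word that has no entry yet."""
    key = sentence.strip().lower()
    if key in SENTENCES:
        return SENTENCES[key]
    return " ".join(WORDS.get(word.lower(), word) for word in sentence.split())
```

The word-by-word branch will still produce the usual literal mush, but it degrades gracefully, and every sentence a human adds removes one more case from that branch.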

--

There's no sig like this sig anywhere near this sig, so this must be the sig.

Re:Problem with "Universal Translator" by RetroGeek · 2002-04-05 08:19 · Score: 1

Instead of translating words, how about sentences?

Now THIS sounds like an idea.

Translate every sentence that is used, into every known language.

Yes, the DB would be HUGE, but its distributed. And I don't think the number would be that large (relatively speaking). Well, maybe just common sentences.

Ok, it would be large. And distributed.....

--

- - - - - - - - - - -
I am a programmer. I am paid to produce syntax not grammar. Deal with it.
Re:Problem with "Universal Translator" by Anonymous Coward · 2002-04-05 10:27 · Score: 0

That's why an interactive translator, such as UNL's editor is a good idea. The editor allows the user to resolve ambiguities at the time that he or she is entering the text, so that the text can be translated into the proper universal word and then the hopefully unambiguous translation can be further translated into the destination language with a minimum of mistranslations.
Re:Problem with "Universal Translator" by Anonymous Coward · 2002-04-05 10:40 · Score: 0

Hmmm...could have sworn I replied to this...post just got thrown away...

Already done in Monty Python? by Torgo's+Pizza · 2002-04-05 07:55 · Score: 1

Isn't this similar to the Monty Python sketch where a team of people work to translate the funniest joke in the world from English to German? One person accidentally saw two words and was put in the hospital for a few days.

Will their QA keep the trolls out? by BACbKA · 2002-04-05 07:55 · Score: 2

The article never elaborates on how QA will fight the trolls, which matters for any knowledge base compiled from sources of varying expertise (like comments on a /. article: some hit the nail on the head, some are incompetent, some are intentional trolls). Unfortunately, even robust technologies designed with such attacks in mind sometimes fall to clever poisoning attacks (see the /. article on Google bombing).

You need a lot of "mod" and "metamod"-like activities to work; it looks to me that the peer review system shouldn't be too "democratic" to succeed (i.e., there is always a need for some top-level superusers, who are trusted automatically because they are essentially the system builders).
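
A minimal Python sketch of that kind of not-too-democratic review, assuming each reviewer simply votes up or down and trusted users carry more weight. The weights, names, and threshold are all hypothetical.

```python
def review(votes, trust, default_trust=1.0, threshold=0.0):
    """Decide whether a submitted translation is accepted.

    votes: list of (reviewer, direction) pairs, direction is +1 or -1.
    trust: mapping of reviewer -> weight; unknown reviewers get
           default_trust.
    Returns True when the trust-weighted score exceeds the threshold.
    """
    score = sum(
        direction * trust.get(reviewer, default_trust)
        for reviewer, direction in votes
    )
    return score > threshold
```

With a founder weighted at 5.0, three throwaway accounts voting down cannot sink an entry the founder approved, which is exactly the property (and the risk) raised in the next paragraph.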

Does anyone have an example of such a system with its founders going berserk (say, think of CmdrTaco starting daily trolling :-) )?

--

VKh

Re:Will their QA keep the trolls out? by Amarok.Org · 2002-04-05 09:26 · Score: 2

Does anyone have an example of such a system with its founders going berserk (say, think of CmdrTaco starting daily trolling :-) )?

Example: April 1, 2002

--
-- "Other than that, how was the play Mrs. Lincoln?"

ha! by Joe+the+Lesser · 2002-04-05 07:56 · Score: 1

We can finally get started on that Tower of Babel project again!

--
"I only speak the truth"
Karma: null(Mostly affected by an unassigned variable)

weird reporting by prizzznecious · 2002-04-05 07:56 · Score: 1

Wouldn't it be standard to include a link to the site where you can sign up or at least find out more information about this thing? I find the lack of ligature vexatious, to say the least.

--

visit the hwky website for a lyrical genius infusion.

Re:weird reporting by t · 2002-04-05 09:39 · Score: 2

That's quite the stretch of the meaning of ligature eh?
t.
Re:weird reporting by prizzznecious · 2002-04-05 11:14 · Score: 1

Just because we're talking about computer stuff doesn't mean we can't be poetic.

--

visit the hwky website for a lyrical genius infusion.

Dictionary != Translator by Dominic_Mazzoni · 2002-04-05 07:57 · Score: 2

I think it's a great idea to harness the power of millions of people around the world all contributing a few minutes of their time, to create a gigantic any-language to any-language dictionary.

However, this will do nothing to aid in machine translation. You can't simply translate individual words from one language to another, or even short phrases. Translators such as Babelfish understand the basic rules of grammar in each language in order to handle fundamental differences in the way different languages put sentences together.

But Babelfish and other online translators are still a far cry from doing true translation, because they don't understand the text they're trying to translate.

Re:Dictionary != Translator by prizzznecious · 2002-04-05 08:03 · Score: 1

That's why you should have read the article. While there will be some instances of direct single word transliterations, there will also be phrases for context and likely even idioms.

Your run-of-the-mill gripe is exactly what this project is trying to address. Don't you think these guys already know about Babelfish?

--

visit the hwky website for a lyrical genius infusion.
Re:Dictionary != Translator by Dominic_Mazzoni · 2002-04-05 09:12 · Score: 2

I did read the article.

I'm not presuming they don't know about Babelfish; I just think that the end result of their effort will be a great language-to-language dictionary, NOT a useful translator.

Idioms help a little. They don't address more subtle context issues, or grammar.

I would maybe be willing to concede that a system like this could help to improve an existing translator by building up its library of words and phrases. But I think it would be totally useless for a brand-new language: without any a priori hand-coded rules of grammar built into the system, it would never be able to translate more than 10% of queries.

Unaddressed copyright issues by alewando · 2002-04-05 07:57 · Score: 2

When a machine generates a translation, there are no issues of copyright ownership, because machines are not authors in the statutory sense; the owner of the machine can claim copyright and move on.

When individual human translators get involved, there's an entirely different order of complication. Sure, it's possible to use licenses like the OPL (Open Publication License) to navigate these complications, but the compliance problems remain an obstacle to overcome. It'll be tough to remain competitive when babelfish and google don't have to put up with similar issues.

When this is added to all the other problems associated with massively distributed activities relying on humans to function, I just can't see how it'll succeed. Too bad, perhaps, but nonetheless true.

Distributed human computation? by jfengel · 2002-04-05 07:57 · Score: 2

From the original source (http://picto.weblogger.com)

While the SETI At Home Project taps the idle CPUs of millions of personal computers, the worldwide lexicon enlists the help of internet users who are logged in, but not chatting. Think of this as distributed human computation.

"Distributed human computation"? Is that like using up all those spare brain cells you weren't using right now?

Re:Distributed human computation? by carm$y$ · 2002-04-05 08:19 · Score: 1

Is that like using up all those spare brain cells you weren't using right now?

No, it's more like not going to bed at midnight and instead lurking around on some internet sites... wait a minute... never mind.

--
-- No sig today

Konqueror integration by Anonymous Coward · 2002-04-05 07:58 · Score: 0

I'm on a team that is working on integrating a "translate" button into Konqueror. Load up a foreign site, hit the button, and voila! In some other language. We expect to have DTP support within a month or so.

This might work... by Mysticalfruit · 2002-04-05 07:58 · Score: 1

If hundreds of people have nothing better to do with their time than translate other people's stuff.

The biggest problem I see is a majority of people wanting things translated and a minority of people being able to translate. Plus all the other issues. What if someone translates a sweet love letter into something that gets the person put in jail?

--
Yes Francis, the world has gone crazy.

If you actually want to sign up by prizzznecious · 2002-04-05 07:59 · Score: 4, Informative

then you should go to their site, which was completely unmentioned in the article: wwl page

--

visit the hwky website for a lyrical genius infusion.

Yes! I'll be one of the first volunteers... by brooks_talley · 2002-04-05 08:00 · Score: 1

...and then I'll reverse engineer the code and make sure it always returns results like "I would like to fondle your buttocks" or "I will not buy this tobacconist, it is scratched."

Heh.
-b

HOW to GET really BAD translations by maggard · 2002-04-05 08:00 · Score: 4, Insightful

First off I'm going to guess that 90% of the folks who will be posting gung-ho comments on this will be unilingual Americans. The folks posting against it will be those who're bilingual and have ever read the "same" document in both languages.

It doesn't work. If translating were so simple for machines to do they'd be doing a fine job. However good translation requires context, insight, emotional inflection, etc. Even then each and every one ends up different; sometimes subtly sometimes blatantly.

Just as machine translation sux at these so will distributed translation. Reading a paragraph or a page doesn't tell enough about the feel, flow, or tone of a document. There are numerous words and phrases that can be interpreted multiple ways between any two languages and will be, each time differently by each interpreter.

If you don't know this already then go and look up any document (books and short stories are easy to find, so is poetry) that has been translated more then once. Take a look at the different translations and ask yourself - "Are these really from the same source document?"

Now imagine trying to read something composed of alternating paragraphs or pages from each translation: Incoherence.

Distributed problem solving works for subjects with clearly defined data sets, methodologies, and standards; not human language.

--
I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I wont bother either.

Re:HOW to GET really BAD translations by quinine · 2002-04-05 08:09 · Score: 1

Syntactic translation is mentioned nowhere in the article. This researcher is attempting to build a LEXICON.
Re:HOW to GET really BAD translations by The+Ape+With+No+Name · 2002-04-05 09:13 · Score: 2

As a rare White multi-lingual American, I have learned how egregiously poor translations can be and have actually changed my entire research because of this.

BTW, knowing Klingon doesn't count as being multi-lingual unless you accept this as fact.

--
Comparing it to Windows will be a moot point, since El Dorado is going to have a 40% larger code base than XP.
Re:HOW to GET really BAD translations by dvdeug · 2002-04-05 15:10 · Score: 2

knowing Klingon doesn't count as being multi-lingual

Why not? Klingon is a language distinct from any other. There are Klingon speakers, and you can communicate with them no matter what other languages they may or may not know.
Re:HOW to GET really BAD translations by HiThere · 2002-04-06 05:52 · Score: 2

Unfortunately, it's unrealistically optimistic for even a lexicon. There aren't genuine correspondences, because different languages slice up the world differently even at the word level. Those that have the most history in common tend to have the most language in common (naturally). So you will be less likely to notice this if, e.g., all of the languages that you speak are, say, Germanic (English-Frisian-Dutch-Deutsch...). It becomes a bit more noticeable as you add in the Romance languages. But try Hebrew, Persian, Sanskrit ... and these are all closer than Chinese, or Japanese. Which are closer than the South Asian languages. And these are closer than the various African languages.

This may be plausible if one restricts oneself to computer users. But there are a lot of oriental computer users, so don't even count on that.

--

I think we've pushed this "anyone can grow up to be president" thing too far.

Translating words is one thing, by Anonymous Coward · 2002-04-05 08:01 · Score: 0

how are they going to translate correctly _in context_?

i.e. Never let a website go live that translates the word "movement" into French using the wrong context.

Easy answer to language deterioration. . . by czardonic · 2002-04-05 08:01 · Score: 1

Eventually the web will be so filled with bad grammar that the next generation will have no idea how to string a simple sentence together.

Three words: Distributed Grammar Checking

--
Takahashi Rumiko made beats! DON, taku, DON, taku. . .

Tainted Phrasebooks by dmaxwell · 2002-04-05 08:01 · Score: 2

Way to go guys! All of the SlashTrolls know about it now too. What I thought I asked:

"Where is the restroom?"

What the native speaker heard me say?

"I want to slowly and lovingly take your wife in the rectum."

I recall a Monty Python sketch where a guy was put on trial for fraudulent phrasebooks that did that sort of thing. Someone gave the phrasebook guy a tainted phrasebook from his language back into English and he kept insulting the judge. Hilarious.

How far can we trust this translation project once the trolls make a few choice "contributions"?

language wiki by Anonymous Coward · 2002-04-05 08:02 · Score: 1, Interesting

You know what would be good is a multi-language wiki where people continually change the mapping of word to meaning. That way the meaning of the word and its most appropriate cross-language equivalent would be "organic". Most static lexicons suck, because they are dry definitions without any cultural relativity.

Re:language wiki by JohnBE · 2002-04-05 09:29 · Score: 1

I covered Wiki in a document I wrote about an intelligent translation system last November; the organic nature is something that I believe I have fairly well covered:

http://www.freesoftware.fsf.org/cdf/

--
e4 e5

Who needs it :-P by kryzx · 2002-04-05 08:04 · Score: 2

Who needs it? You can already find out how to say "My God! There's an axe in my head!" in virtually every language on the planet right here.

I tried to post the translations themselves, but the "lameness filter" considered it too many "junk characters", even after I removed all the accents and umlauts and such. The lameness filter is lameness incarnate.

--
"I don't know half of you half as well as I should like, and I like less than half of you half as well as you deserve."

Who gave this troll an "Informative"? by maggard · 2002-04-05 08:05 · Score: 2

What machine translation has been missing is big dictionaries.

Nope. Have those. However, words, phrases, even concepts don't map 1:1 between languages.

We already have the grammar problem cracked--English can be expressed as a regexp

Mebbe in your lack-of-social-circles...

C'mon folks, this is a troll! Who the heck fell for it?!

--
I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I wont bother either.

Re:Who gave this troll an "Informative"? by HiThere · 2002-04-06 06:06 · Score: 2

I thought that it was humor. Look again. Has English been handled with a RegExp? Can you seriously think that anyone really believes that it has? Or would expect others to so believe? It must be humor.

--

I think we've pushed this "anyone can grow up to be president" thing too far.

universal META language by Traa · 2002-04-05 08:08 · Score: 1

Instead of the proposed novel yet stoopid approach of letting volunteers do the translating ("Sie sind sehr hübsch!" ---> bored kid ---> "you look like a pig!") why don't we get some experts together to once and for all design a universal META language. Create a dictionary for every language into the META language and from the META language into every language and voila, you can translate every language into every other language. To add a language, however obscure, you only need to add 2 translations (to and from the META language).

For n languages this reduces the need for having to have n*n dictionaries down to 2*n. (for example, to translate each of the 6500 languages mentioned on http://www.ethnologue.com/ you need only 13.000 dictionaries...instead of 42.250.000 if you do it the Babelfish way)
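
A quick check of that arithmetic, as a minimal Python sketch. The function names are made up for illustration. The n*n figure follows the post; the slightly stricter count of ordered pairs of distinct languages, n*(n-1), is shown alongside as an assumption about what "one dictionary per direction" would really need.

```python
def direct_dictionaries(n: int) -> int:
    """Dictionary count using the n*n figure from the post above."""
    return n * n


def ordered_pairs(n: int) -> int:
    """Stricter count: one dictionary per ordered pair of distinct languages."""
    return n * (n - 1)


def pivot_dictionaries(n: int) -> int:
    """One dictionary into the META language and one out of it, per language."""
    return 2 * n


n = 6500  # language count quoted in the post
print("direct (n*n):     ", direct_dictionaries(n))
print("direct (n*(n-1)): ", ordered_pairs(n))
print("via META language:", pivot_dictionaries(n))
```

Either way of counting the direct approach lands in the tens of millions, against 13,000 for the pivot design, so the post's point does not depend on which count is used.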

Re:universal META language by PD · 2002-04-05 08:17 · Score: 2

This sounds suspiciously like the idea that any unsolved computer science problem can be solved by adding another level of indirection.

--
If tits were wings it'd be flying around.
Re:universal META language by Anonymous Coward · 2002-04-05 10:33 · Score: 0

why don't we get some experts together to once and for all design a universal META language

Some experts already have.

Another article and discussion by sjmurdoch · 2002-04-05 08:25 · Score: 1

There's another article and discussion on Advogato

Of particular interest is that it discusses using trust-metrics, in a similar way to Advogato itself, so as to differentiate between good and not so good translators.

--
Steven Murdoch.
web: http://www.cl.cam.ac.uk/users/sjm217/

Why stop there? by l810c · 2002-04-05 08:29 · Score: 1

George Carlin's 7 dirty words:
Shit
Piss
Fuck
Cunt
Cocksucker
Motherfucker
Tits

The World Wide Lexicon will be forever grateful.

Huh by Anonymous Coward · 2002-04-05 08:32 · Score: 0

This seems like a really neat idea on the surface but kinda stupid when you begin to try to rationalize out exactly why somebody would want to "volunteer" to translate some random word or phrase. How exactly would it work? From what I've read the app would pop up a phrase to translate when the machine was idle. Well, usually the machine's idle because nobody's in front of it using it. So the typical case will be when somebody goes to sit down for a session that they'll see this pop-up. Well, hey, they don't want to sit there and translate some stupid word, they wanna surf or balance their checkbook at that moment. Something *has* to be in it for the "volunteer" or it just doesn't work.

Anyway, I think the whole thing's really a front to translate sections of The Third Tomes Of Korrath which their CEO, also an avid part-time archaeologist, needs to have done to ascend into the Ninth Sphere Of Power. But that's just *my* theory. :)

Could be expensive.... by mblase · 2002-04-05 08:32 · Score: 2

...a multi-language translation database called the World Wide Lexicon, using a distributed community of volunteers....

As soon as I read this, I immediately thought of Google's pigeon-based page-ranking technology. "I just hope those volunteers can type really really fast...."

www.logos.it by MS · 2002-04-05 08:34 · Score: 3, Interesting

Something related was already done about 6 years ago by Logos. It's not a network like Seti@Home, but it involves lots of people distributed all over the world. It still works - check it out!

ms

context and references in translation by xanthig · 2002-04-05 08:34 · Score: 1

I speak fluent Chinese and Japanese and occasionally moonlight as a professional translator, and countless hours of watching Star Trek have led to extensive hypothesizing on how to construct a universal translator. What's interesting about this article is that that is what they're trying to do, but they seem to be going off half-cocked.

It's not enough, and that's why I have big doubts about the usefulness of this project, to simply translate between languages with a one-to-one parity. Babelfish is a good example of the results of this.

Good language keeps in mind that language is completely symbolic and very referential. Words not only change meaning in context (think "booty" as pirate treasure vs. "booty" as T&A), they often symbolically refer to their use in other contexts, which leads to nuance.

In fact literacy can be thought of as the thorough understanding of the nuance of a given phrase. For example, if I say "screw your courage", it not only has meaning in the immediate context, but refers to a line in Macbeth as well. Literate understanding comes only from understanding the references, although without the referential meaning the intended meaning comes across as well. Sometimes referential meaning is the direct opposite of the meaning of the words taken alone - translation is tricky business.

Another thing to keep in mind is that whether we are aware of it or not all language has levels of politeness. In English this is often quite subtle: chug, drink, imbibe, quaff can all be used to mean the same thing but vary in politeness levels. In Japanese this politeness level in language is painfully obvious.

Keeping this in mind, a Universal translator, like the one mentioned in the article, could be an effective tool but it would have to be a lot more complicated. First, every word would have to be assigned a politeness level. Second, multiple meanings would have to be assigned to each word for multiple contexts. This would go a long way to moving something like this away from the Babelfish problem without having to do anything like correlating literary references between cultures.

Unfortunately, here they seem to be trying to create a distributed Babelfish focused on developing a dictionary for obscure languages using the labor of people who probably aren't even on the Internet.

Spanish, huh? by Anonymous Coward · 2002-04-05 08:35 · Score: 0

It also translates to culear in Spanish (Chile).

Thank God for the "Post Anonymously" checkbox... :)

There is no grammar for English by rufusdufus · 2002-04-05 08:35 · Score: 2

I used to believe in the whole idea of grammar, until I went into speech research and learned more about language. One of the big breakthroughs in speech recognition was when they went to hidden Markov models for language. This language modelling technique, now used in all modern recognizers, is statistical in nature, not grammatical [rule based]. Grammatical models are never flexible or robust enough to represent true spoken speech.

The fact is that English is an organic language, and has organic properties. It grows. It changes. It has fuzzy boundaries. We must expect language constructions to change with time--it has been changing all along! All the rules and regulations you learned about grammar are generally context sensitive, and do not hold up in all contexts, most notably spoken speech. The rules of grammar are artificial, really imposed by publishers as a standard, but they do not actually reflect the full spectrum of the language.
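
To illustrate what "statistical, not rule based" means here, a minimal Python sketch of a bigram model follows. This is an assumption-laden toy: a bigram model is far simpler than the hidden Markov models the post mentions, the three-sentence corpus is invented, and the function names are hypothetical. It only shows the shared idea that word sequences get probabilities estimated from observed usage rather than a yes/no verdict from a grammar.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark sentence start and end.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts


def probability(counts, prev, word):
    """Estimate P(word | prev) by relative frequency; 0.0 if prev was never seen."""
    total = sum(counts[prev].values()) if prev in counts else 0
    return counts[prev][word] / total if total else 0.0


# Invented toy corpus, for illustration only.
corpus = ["I went to the store", "I went to the pub", "I went home"]
counts = train_bigrams(corpus)

print(probability(counts, "went", "to"))     # "to" follows "went" in 2 of 3 cases
print(probability(counts, "went", "store"))  # never observed after "went"
```

Nothing in the model says which sequences are "grammatical"; it only records what speakers actually produced, which is why such models bend with usage in a way fixed rules do not.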

Re:There is no grammar for English by JohnBE · 2002-04-05 09:22 · Score: 2

Neologisms for neologisms' sake are a pain though. While I believe firmly in the organic nature of English (otherwise I'd still be writing with long and short s) I do dislike 'management talk' and suchlike because I feel that there are existing descriptive words covering the same subject. So by creating redundant words you can cliche the rest.

But ultimately most of the people that object strongly to overtly bad grammar and neologisms are the same people who 'had a go at' the great writers of our time. The writers having filled holes existing in their contemporary language by changing, bending and creating new rules. The pedants are the kind of people that extract some kind of self esteem from the minor foibles of others. Ideal teacher material I imagine.

The same people probably objected to all kinds of things and all.

--
e4 e5

Who the heck fell for it?! by Rupert · 2002-04-05 08:38 · Score: 2

You did.

Unless you are also a troll, in which case the answer would be me.

--

--
E_NOSIG

it wont work too well by Anonymous Coward · 2002-04-05 08:39 · Score: 0

Translation between human languages is an art, not a simple skill. Speaking several languages, and having done much translation in learning them, I know it's easy to mistranslate words or produce translations which are inaccurate copies of the source document. Accurate, idiomatically correct translations only really come from fluent human speakers of both languages, who are familiar with the text and the subject domain.

Already available at linguru.com by paulikoira · 2002-04-05 08:41 · Score: 1

It is an interesting idea. Very similar to one we have been working on for 3 years at Linguru. The difference is you can download and use our software today (if our bandwidth holds out :-) ) from Linguru. Our application is cross platform (pure Java) with installers for Linux, Solaris, Unix, Windows, and OS/X. A web browser is included for rapid lookups in a foreign language. You may edit any entry, ask questions, add words and translations. Your changes are distributed in real time (1-2 seconds) to all other users. Version Beta 3 (Welsh and English dictionaries are included) is available now and well tested after 2 years of careful work. Version 1.0 is in final debug as we speak. We really need your support to make this work especially if we are to continue with full support for minority languages. Share your thoughts here or by posting on our message board in the software. Comments, suggestions, and rants to me: Paul.Houghton@linguru.com

Re:Already available at linguru.com by ProduceGuy · 2002-04-05 10:36 · Score: 1

Linguru puts more emphasis on the user experience and less on the data collection. It provides parsers, form generators, a dictionary search engine, quick lookup, and such. It doesn't automatically solicit translations.

context and reference in translation by xanthig · 2002-04-05 08:42 · Score: 1

I speak fluent Chinese and Japanese and occasionally moonlight as a professional translator, and countless hours of watching Star Trek have led to extensive hypothesizing on how to construct a universal translator. What's interesting about this article is that that is what they're trying to do, but they seem to be going off half-cocked.

It's not enough, and that's why I have big doubts about the usefulness of this project, to simply translate between languages with a one-to-one parity. Babelfish is a good example of the results of this.

Good language keeps in mind that language is completely symbolic and very referential. Words not only change meaning in context (think "booty" as pirate treasure vs. "booty" as T&A), they often symbolically refer to their use in other contexts, which leads to nuance.

In fact literacy can be thought of as the thorough understanding of the nuance of a given phrase. For example, if I say "screw your courage", it not only has meaning in the immediate context, but refers to a line in Macbeth as well. Literate understanding comes only from understanding the references, although without the referential meaning the intended meaning comes across as well. Sometimes referential meaning is the direct opposite of the meaning of the words taken alone - translation is tricky business.

Another thing to keep in mind is that whether we are aware of it or not all language has levels of politeness. In English this is often quite subtle: chug, drink, imbibe, quaff can all be used to mean the same thing but vary in politeness levels. In Japanese this politeness level in language is painfully obvious.

Keeping this in mind, a Universal translator, like the one mentioned in the article, could be an effective tool but it would have to be a lot more complicated. First, every word would have to be assigned a politeness level. Second, multiple meanings would have to be assigned to each word for multiple contexts. This would go a long way to moving something like this away from the Babelfish problem without having to do anything like correlating literary references between cultures.

Unfortunately, here they seem to be trying to create a distributed Babelfish focused on developing a dictionary for obscure languages using the labor of people who probably aren't even on the Internet.
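
The two requirements named above (a politeness level per word, and multiple meanings per word keyed by context) could be sketched as a data structure like the one below. This is a minimal Python illustration, not anything the project describes: the class names, the 0 to 3 politeness scale, the context labels, and the levels assigned to each sense are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sense:
    context: str     # hypothetical context label, e.g. "piracy" or "slang"
    gloss: str       # meaning of the word in that context
    politeness: int  # hypothetical scale: 0 = crude ... 3 = formal


@dataclass
class Entry:
    word: str
    senses: List[Sense] = field(default_factory=list)

    def lookup(self, context: str) -> List[Sense]:
        """Return only the senses recorded for the given context."""
        return [s for s in self.senses if s.context == context]


# Example entry based on the "booty" case from the post; levels are invented.
booty = Entry("booty", [
    Sense(context="piracy", gloss="treasure", politeness=2),
    Sense(context="slang", gloss="T&A", politeness=0),
])

for sense in booty.lookup("piracy"):
    print(booty.word, "->", sense.gloss, "(politeness", str(sense.politeness) + ")")
```

Even this toy shows the cost the post hints at: every entry now needs human judgement about contexts and register, not just a single target word.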

Swimming pool analogy by Jucius+Maximus · 2002-04-05 08:44 · Score: 1

They may be able to control who enters the pool...

...but they cannot control who pees in it.

I would be worried about rogue translators or people using MSWord with all the overactive autocorrectors turned on. Once, one of these things changed the word 'elbow' into 'Ebola' without telling me! For quality translations, turn to paid professional translators or learn the language yourself. (Alas, it is harder to learn a spoken language than it is a programming language.)

i will be impressed when by Anonymous Coward · 2002-04-05 08:45 · Score: 0

it translates

1) eskimo, with all the different snow/ice etc
2) navajo (?) -- the native american language which US WWII encryption is based on (go watch Windtalkers)
3) klingon

Re:i will be impressed when by xtremex · 2002-04-05 09:49 · Score: 2

Technically, the multiple translations for snow/ice in Eskimo is a misnomer. Amerindian languages all have multiple words for almost everything. An example:
to cook has maybe 8 or 9 translations depending on the Amerindian Family...to cook fast, to cook raw meat, to cook fish, etc
My take on Amerindian languages being like this was for survival. There could be no doubt what the speaker was trying to convey. The snow thing with eskimo describes different kinds of snow, wet snow, slushy snow, snow that you could sled on, snow that a dog pissed on, etc. Tlingit (Washington State & Canada) has the same thing. The Tlingits are known for the Totem poles. They have different words for the verb "to fish"

--
If you're not a Liberal in your 20's, then you have no heart.If you're still a Liberal in your 30's you have no brain.
Re:i will be impressed when by Anonymous Coward · 2002-04-05 10:29 · Score: 0

This whole thing is called the Whorf-Sapir hypothesis, and is largely discredited in modern linguistics. For instance, how many words for snow are there in English? Let's see, there's snow, sleet, packed powder, etc. The Inuit terms are roughly the same, except perhaps slightly more detailed. Likewise the whole idea that the Hopi language cannot express the concept of time - Whorf's work on this has been very effectively dealt with. Typical slashdotters think that just because they can program C they know something about REAL languages.

en slovenska(slovak)... by localh0st · 2002-04-05 08:59 · Score: 0

the infinitive form would be jebat...jebac would be fucker...so, jebat ta, federalies!

--
Loopback Fighters- paving the way for the revolution, one instance of linux at a time.

Ok, but... by Anonymous Coward · 2002-04-05 09:11 · Score: 0

They should also include translations by sentences or phrases, not just "unrelated/out of context" words. Or you will get things like:

English: I love you.
Spanish Translation: Yo amo tu. -> (me love you)
It reminds me of Tarzan.

I can just imagine dinner at your house... by Anonymous Coward · 2002-04-05 09:14 · Score: 0

Grandma: Fuck! I've dropped my fork!

Grandpa: Mind your mouth, you cunt!

Grandma: Go fuck yourself, asshole!

Grandpa: Eat shit and die.

Grandson: You rude motherfuckers!

Granddaughter: Yeah - what a pair of shits.

"Fuck" in slashdot. by Anonymous Coward · 2002-04-05 09:15 · Score: 0

It means "moderator"

"The [moderator | fuck] that moderated the parent as 'Offtopic' is a total [moderator | fuck]."

navajo by JeanBaptiste · 2002-04-05 09:21 · Score: 1

I guess it is obscure enough that even I dont know how to spell it... besides, I dont think most /.ers care h0w u spe11 th1ng$

Super! by Anonymous Coward · 2002-04-05 09:25 · Score: 1, Funny

But will it be more useful than http://www.megablog.com/translate.php?

And what about UNL? by Anonymous Coward · 2002-04-05 09:26 · Score: 0

UNL stands for Universal Networking Language. It's a UNESCO project, and its goal is to create a computer language able to translate anything into every language of the world. You can check out www.undl.org for more info.

Re:And what about UNL? by JohnBE · 2002-04-05 09:38 · Score: 2

Licklider also did a lot of papers on this in the 1960s, Xerox PARC did huge amounts of research, and lots of other people including myself have worked (http://www.freesoftware.fsf.org/cdf/) on this problem. What I will say is that although lots of us have worked on it there are very few working systems (mine works a little but needs huge development); if these guys succeed it is good for us all.

So good luck Wolverhampton!

--
e4 e5

Oh boy! by Bjarke+Roune · 2002-04-05 09:33 · Score: 2

The implications for quantum computing are overwhelming!

--
Bjarke Roune

nitpicky... by Anonymous Coward · 2002-04-05 09:46 · Score: 0

Fuck you!:
"foda-se" or "vá se fuder" or "vai te fuder"

"foda-se" also means "fuck it"

"Foda!" or "É foda!" means "Shit!"

OTOH, "go to the whore who bore you" is "vá pra puta que o pariu", which is an entirely different story.

Note that:

a) when addressing two+ people: "vão pra puta que os pariu" -- although in English there's no change from singular to plural;
b) "c'os" is the contracted form of "com os" == "with the" -- doesn't make sense here, but sounds like "que os";
c) "pariu" is not exactly like "bore", it's used for animals (hence, offensive when applied to humans).

Hope this helps. :-P

Impossible. by Anonymous Coward · 2002-04-05 09:58 · Score: 0

Do you know why?

Because some concepts simply are not used in other cultures/languages.

In my language motherfucker makes no sense as an offense... we have others that wouldn't make sense in English.

We have _other_ list of words...

PS: But, of course, mf is terrible, don't say it in my language. ;-)

I hope ... by dynoman7 · 2002-04-05 09:59 · Score: 1

...it doesn't work as poorly as this beast.

--
Blarf.

Not possible by Tungz10 · 2002-04-05 10:01 · Score: 1

First, even between fairly similar languages, such as English and the Romance languages, there are significant differences that rule out any direct translation between the two.

Consider, for example, that many European languages have two forms of the word "they" to distinguish between males and females. Sometimes in English we use "they" to mean a single person whose gender is unspecified.

Nouns don't have exactly the same meaning; for example, /.ers will refer to their machine as a "terminal", which could also be interpreted as the location of their gate in an airport. Other languages might not require a subject in all their sentences, so to translate to English you have to figure out from the context who's doing the action.

What is proposed is to translate to languages much less related to the ones in common usage. Even parts of speech don't correspond. Some concepts that we think of as verbs could be adjectives in other languages (think "I drive my car" compared to "I am , and I'm in my car" type of thing). Many Native American languages are highly morphological. That is: concepts that we express in several words can be rolled into one in these languages.

By the way, all of the above assumes that your input is perfectly grammatically correct (whatever that means). Introduce human error, and the result is hard enough for a human, let alone any kind of machine.

I started taking linguistics courses to increase my comprehension of language (I'm monolingual and always had trouble with foreign languages). The main thing I learned is how incredibly complex language is, so massive that you can't comprehend all the inherent rules, yet our brain somehow knows how to process it.

Noble goals, like the Prague Manifesto. by Yekrats · 2002-04-05 10:04 · Score: 2

(Dang, left my flame suit at home. Oh, well.)

It seems like the creators of this system have noble goals, and I appreciate their efforts. It reminds me of Esperanto's Prague Manifesto. "Every language both liberates and imprisons its users, giving them the ability to communicate among themselves but barring them from communication with others."

I think anything that can bring the disparate world together is a good thing. But we wouldn't need technology like this if everyone got off their duff and learned a second language. For the purpose of learning a common second language, Esperanto is ideal. A smart kid like you can learn it in just a few hours of study.

I've used it to communicate with people from Brazil, Korea, and Germany, without having to learn Portuguese, Korean, and German. We just learned a simple middleware language to help us communicate. The Esperanto community offers Free Tutored Courses to help you get started. It's well worth the small investment to become bilingual.

But don't take my word for it. In the words of Tolkien: "My advice to all who have the time or inclination to concern themselves with the international language movement would be, 'Back Esperanto loyally.'"

-- Yekrats

--
Ceci n'est pas une pipe.

Seven words... by DEBEDb · 2002-04-05 10:27 · Score: 0, Offtopic

All your base are belong to us.

--

Considered harmful.

Definitions and parts of speech by airship · 2002-04-05 10:27 · Score: 2, Interesting

The best way to do this would be to take each source language sentence and first SPELLCHECK it (something rarely done on /.) then mark it up as to meaning and sentence structure. For example:

"I went to the store."

might become:

<noun struct="subject" def="first person pronoun">I</noun><verb tense="past" def="to go">went</verb>...

etc.
Granted, the first markup pass would be a killer, but subsequent translations could be automated. As an added bonus, kids would get to learn grammar again.
(Definitions should really be a URI to a universal dictionary, but then you knew that...)
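
As a rough sketch of what such a markup pass might emit, here is a small Python example using the standard library's xml.etree.ElementTree. The tag and attribute names are copied from the example above; the wrapping "sentence" element, the mark_up function, and the hand-written token list are assumptions for illustration. Only the two words annotated in the post are included, since the rest of the sentence is left elided there.

```python
import xml.etree.ElementTree as ET


def mark_up(tokens):
    """Build an annotated sentence from (tag, attributes, text) triples."""
    sentence = ET.Element("sentence")  # hypothetical wrapper element
    for tag, attrs, text in tokens:
        element = ET.SubElement(sentence, tag, attrs)
        element.text = text
    return sentence


# Hand-annotated tokens; in the scheme described, a human does this first pass.
tokens = [
    ("noun", {"struct": "subject", "def": "first person pronoun"}, "I"),
    ("verb", {"tense": "past", "def": "to go"}, "went"),
]

root = mark_up(tokens)
print(ET.tostring(root, encoding="unicode"))
```

A later translation step could then walk the tree and key off the "def" and "tense" attributes instead of the surface words, which is where the automation mentioned above would come in.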

--
Serving your airship needs since 1995.

The problem with this stuff..... by NeoSkandranon · 2002-04-05 10:59 · Score: 1

English- "I love you"
translated thanks to AltaVista, through Spanish, Italian, Japanese, leaves us with
"His Matrix"

...I don't know firsthand what some of you guys are talking about when you mention crappy translation of important documents, but I can only imagine...

--
If you can't see the value in jet powered ants you should turn in your nerd card. - Dunbal (464142)

best are people translators by panck · 2002-04-05 11:23 · Score: 0, Offtopic

I think it will be many years before a computer translation will ever be as good as a human translator!

--
"What thou shalt not, I shalt did!" -Bart Simpson

There's a way...and it has been done. by prozhead · 2002-04-05 13:47 · Score: 1

We call it "KudoZ".

Professional translators, in their spare cycles, translate each others' toughest terms in whatever language. Points, peer review and other techniques are used to incent and arrive at the "best" translations. So far, it's scalable (1000 questions / day.)

Please don't slashdot it too hard, or the moderators will start squashing you guys. :)

Re:There's a way...and it has been done (link) by prozhead · 2002-04-05 13:51 · Score: 1

http://www.proz.com/kudoz/

The Easiest Way by meggito · 2002-04-05 14:41 · Score: 2

The easiest way would be to have everything translate to one 'central' language, and from there have a reverse. That way you wouldn't need one two-way for every language each language had to contact (i.e., 50 per language), but rather one two-way for each language. I think something would be lost, but this makes the project infinitely easier to do, and to expand on (not having to write 50 programs to put in 1 more language).

In my opinion, the best approach (NOT best result), and the most likely to succeed.

In Croatian by Anonymous Coward · 2002-04-05 17:43 · Score: 0

Jebiga, jebes^

Pretty little girls school by HiThere · 2002-04-06 06:03 · Score: 2

There are supposedly 17 different meanings for "pretty little girls school", and that's before you get beyond syntax.

One interesting attempt at a language was Loglan. It had a computer grammar. It had regular syntax and simple phonetics. It was designed to be easy for anyone to learn (though significantly easier for English speakers, and secondarily for other Indo-European language speakers). Unfortunately, to my mind what this clearly proved was that there was no good theory of semantics.

Well, that was 20 years ago. Perhaps the theories of semantics have improved since then. I haven't been watching. But I really do have my doubts.

--

I think we've pushed this "anyone can grow up to be president" thing too far.

Semantic Web Interface? by Randym · 2002-04-06 18:27 · Score: 2

The goal here is to create a mechanism for collecting definitions and translations for words and phrases in less common language pairs (as well as for slang terms that are not covered by most formal dictionaries).

So wouldn't you want to also capture information that indicates, say, *metaphorical* usage? For example, "die Tote Hose", (dee TO-tah HO-sah) in German might be accurately rendered in the New York City dialect of American English as "Fuhgeddaboudit!" [It means -- literally --"the dead trousers" and -- metaphorically -- "old news", "not worth talking about", etc.] This indicates the necessity for some level of meta-information, which is precisely what the Semantic Web is all about.

It seems like this could benefit from a Semantic Web interface of some sort. As other posters have noted, capturing contextual information is vital to adequate translation.

Perhaps this Semantic Web interface could be a third component, somewhere between the first SOAP protocol and the second SETI-like protocol, designed to give volunteers some kind of contextual clues to increase the accuracy of their translation.

BTW, some posters have also raised the question of "Trolls". Perhaps this could be avoided by first asking volunteers to rate the accuracy of other volunteers' translations. Maybe having a high meta-mod score would lead to increased "first translation" opportunities and decreased "this must be checked" translations.
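
One way the rating idea could work, sketched in Python under stated assumptions: each volunteer carries a trust weight (something like a meta-mod score), each candidate translation collects 1 to 5 ratings, and the candidate with the best trust-weighted average wins. The function names, the weights, and the sample ratings are all invented for illustration; nothing here is taken from the project's actual design.

```python
def weighted_score(ratings, trust):
    """Trust-weighted average of (volunteer, score) ratings; None if no trusted rater."""
    total_weight = sum(trust.get(volunteer, 0.0) for volunteer, _ in ratings)
    if total_weight == 0:
        return None
    weighted_sum = sum(trust.get(volunteer, 0.0) * score for volunteer, score in ratings)
    return weighted_sum / total_weight


def best_translation(candidates, trust):
    """Pick the candidate translation with the highest trust-weighted score."""
    scored = {text: weighted_score(ratings, trust) for text, ratings in candidates.items()}
    scored = {text: score for text, score in scored.items() if score is not None}
    return max(scored, key=scored.get) if scored else None


# Hypothetical trust weights (0.0 to 1.0) and ratings (1 to 5).
trust = {"alice": 0.9, "bob": 0.8, "troll": 0.05}
candidates = {
    "old news": [("alice", 5), ("bob", 4)],
    "the dead trousers": [("troll", 5), ("alice", 2)],
}

print(best_translation(candidates, trust))
```

Because the low-trust rater's top mark carries almost no weight, the metaphorical rendering outscores the literal one here. How trust weights are earned in the first place is the hard part, and this sketch simply assumes them.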

--
DNA is a Turing machine. You, however, being dynamic and emergent, are not.

Thanks! by Anonymous Coward · 2002-04-08 00:28 · Score: 0

You were the only one who got it.

CYC by junge_m · 2002-04-08 08:00 · Score: 1

The dictionary/definition approach was already tried by the CYCorp. Reading their papers one gets an impression of the complexity involved as well as the potential--when it will finally be working. After the computer is able to understand a text, translation should be easy.

Up to then I used Google on my homepage.

Slashdot Mirror

Distributed Translation Project

216 comments