Babelfish Mutations
Zen Master Nate writes "You are probably familiar with the BabelFish translation page, where you can enter a phrase and have it translated to or from several languages. Quickly. Inaccurately.
This page automatically feeds the results back into the machine, resulting in exponentially erroneous translations." I've been feeding it quotes from American Pie, Star Wars and Seinfeld. I should get out more.
One can't help but wonder if such translation 'errors' don't hide deeper linguistic truths. Here's a couple of examples that ring true to me:
The first was the result of early ('first wave') AI research. Mid 50's or so, right at the height of the Cold War. For some reason, the Feds thought English/Russian translation would be easy as (warm apple) pie. Alas, the phrase:
"The spirit is willing but the flesh is weak."
came back from its round trip as:
"The vodka is strong but the meat is rotten."
Pretty good intelligence-gathering, I say.
Better still, some philosopher translated the line from Shakespeare:
"Many a bad wedding has been prevented by a good hanging." [possibly I have mangled this one myself]
into German. Some other philosopher translated it back as,
"It is better to be well hung than ill-tempered."
which is so true, it may be the one thing all slashdotters could actually agree on!
Shakespeare is great when run through Babelfish. For some reason it seems to do Macbeth reasonably well, but totally mangles Hamlet. "To be or not to be" would become unrecognizable, whereas I couldn't get it to mangle "Tomorrow, and tomorrow, and tomorrow". Actually I plugged the whole of those respective soliloquies in, and Macbeth still came out readable.
I've finally had it: until slashdot gets article moderation, I am not coming back.
I'm the author of the page in question. Several people have very kindly offered to host the thing, and the first has gone ahead and done it. So the original URL now simply redirects you to a random mirror (of which there are now *one*, but I expect more soon).
Thank you, Brent Meshier.
Jonathan
Hasn't this been posted before? Perhaps it would be a good idea to compare URLs in new and already posted articles to avoid multiple postings of the same subject/URL.
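For what it's worth, the URL comparison suggested above could look roughly like the sketch below. This is only a guess at one way to do it, not how Slashdot works: `normalize_url`, `check_story`, the `%seen` hash and the example.com URLs are all made up for illustration, and the normalization is deliberately crude.

```perl
use strict;
use warnings;

# Crude normalization (an assumption, not Slashdot's actual logic):
# drop any fragment, lowercase the scheme and host, drop a trailing slash.
sub normalize_url {
    my ($url) = @_;
    $url =~ s/#.*$//;
    $url =~ s{^(\w+://[^/]+)}{\L$1};
    $url =~ s{/$}{};
    return $url;
}

my %seen;    # normalized URL => title of the story that first used it

# Returns the earlier story's title if the URL was already posted,
# otherwise records the URL and returns nothing.
sub check_story {
    my ($title, $url) = @_;
    my $key = normalize_url($url);
    return $seen{$key} if exists $seen{$key};
    $seen{$key} = $title;
    return;
}

# Hypothetical submissions pointing at the same page.
check_story('Babelfish Mutations', 'http://WWW.Example.com/babel/');
my $dupe = check_story('Fun with Babelfish', 'http://www.example.com/babel');
print "Already posted as: $dupe\n" if defined $dupe;
```

A real check would probably also want to handle redirects and query strings, which this sketch ignores.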
"I love my job, but I hate talking to people like you" (Freddie Mercury)
Here's the perl script, for anyone who's interested -- it hits bf once at the start, plus twice for each language available. Just enter an English phrase, and you'll see it translated to the other languages and back.
Some choice examples:
English: Why does the author use the phrase "kick-ass" constantly?
Italian: Why the author uses constantly the d of the phrase "soccer-ass"?
English: I chop down trees, I eat my lunch, I go to the lavatory. On Wednesdays I go shopping, and have buttered scones for tea!
Spanish: Edge under trees, I eat my I have lunch, I I go to the service. Wednesday I am going to make purchases, and I have scones greased with mantequilla for the tea!
English: I'll get you, and your little dog, too!
French: I will obtain you, and your puppy, therefore!
German: I receive you and your small dog, also!
Portuguese: I will start, and its small dog, too much!
English: No blood, no foul.
French: No blood, no stinking.
Spanish: No blood, no revolting one.
The source follows. /. won't let me use <PRE>, and I'm not going to format it by hand. Use "View Source" to copy/paste it.
cheers,
mike
#!/usr/bin/perl -w
use Data::Dumper;
use HTTP::Request::Common 'POST';
use LWP::Simple;
use LWP::UserAgent;
use HTML::TokeParser;

$url='http://babelfish.altavista.com/cgi-bin/translate';

# Fetch the form once and collect every "xx_en" pair from its <option> tags.
print STDERR "Getting language list: ";
$page=get $url;
my $p=HTML::TokeParser->new(\$page);
while ($toke=$p->get_tag("option")) {
    next unless $toke->[1]{value} =~ /^(\w\w)_en$/i;
    $lang=$1;
    ($name=$p->get_text) =~ s/ +to +English.*//is;
    push @langs,$lang;
    $names{$lang}=$name;
    print STDERR "$name ";
}
print STDERR "\n";
die "No languages found!\n" unless @langs;

# Take the phrase from the command line, or fall back to `fortune`
# with its attribution stripped.
undef $/;
$text=join ' ',@ARGV or ($text=`fortune`) =~ s/\n\s+--\s+.*\n\s*.*\n?$//;
if (length $text > ($max=768)) {
    warn "Text longer than $max characters, truncating...\n";
    $text=substr($text,0,$max);
}
die "Text must contain words!\n" unless $text=~/\w/;
$text=~s/[.\s+]*$/./;

# Round-trip the phrase through each language and print the result.
printf "\e[1m%12s\e[0m: %s\n",'English',$text;
foreach $lang (@langs) {
    printf "\e[1m%12s\e[0m: ",$names{$lang};
    $trans=fetch($text,"en_$lang");
    print fetch($trans,"$lang\_en"),"\n";
}

# POST the text to Babelfish and scrape the translation out of the reply.
# The XYZZY sentinel appended to the text marks where the translation ends.
sub fetch {
    my ($text,$lang)=@_;
    $text=~s/[.\s+]*$/. XYZZY./;
    my $ua = new LWP::UserAgent;
    $ua->env_proxy;
    my $req = POST $url, [ doit=>'done', urltext=>$text, lp=>$lang ];
    my $page=$ua->request($req)->as_string;
    my $p=HTML::TokeParser->new(\$page);
    my $ok='';
    my @tokes=();
    my $toke=[];
    # Walk the text tokens until one contains the sentinel.
    while ($toke=$p->get_token and !$ok) {
        next unless $toke->[0] eq 'T' && $toke->[1]=~/\S/;
        $toke->[1] =~ s/\s+$//;
        $toke->[1] =~ s/^\s+//;
        push @tokes,$toke->[1];
        $ok=$1 if $toke->[1]=~/(.*\S)\s*XYZZY\./s;
    }
    $ok=~s/\s+/ /g;
    die "Bad Response!\n". Dumper(\@tokes) . "\n" unless $ok;
    return $ok;
}
__END__
French: A backed up penny is a gained penny.
German: A penny, which becomes secured, is an acquired penny.
Portuguese: A currency of a conserved cent is a currency of a cent earns.
Or, you can feed the newly translated English sentence through the next language again, for more strange stuff.
English: A penny saved is a penny earned.
French: A backed up penny is a gained penny.
German: Support penny is a won penny.
Italian: The penny of support is a gained penny.
Portuguese: The currency of a cent of the sustentation is a currency of a cent earns.
Spanish: The modernity of a cent of the sustenation is a modernity of a cent wins.
Fun :-)
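A minimal sketch of that chained variant, in case anyone wants to try it: each round trip's English output becomes the input for the next language, instead of starting over from the original phrase. The translator is passed in as a code ref, and `fake_translate` is a made-up stand-in that just tags the text with the language pair, so nothing here talks to a real service; `chain` is likewise a name invented for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real translation call: it "translates"
# by appending the language pair, so the chain is visible in the output.
sub fake_translate {
    my ($text, $pair) = @_;
    return "$text [$pair]";
}

# Feed each round trip's English output into the next language.
# Returns a list of [language, resulting English text] pairs.
sub chain {
    my ($translate, $text, @langs) = @_;
    my @steps;
    for my $lang (@langs) {
        my $foreign = $translate->($text, "en_$lang");
        $text = $translate->($foreign, "${lang}_en");
        push @steps, [ $lang, $text ];
    }
    return @steps;
}

for my $step (chain(\&fake_translate, 'A penny saved is a penny earned.', qw(fr de))) {
    printf "%12s: %s\n", @$step;
}
```

Swapping `fake_translate` for a sub that calls an actual translation backend should be the only change needed, assuming it takes the same (text, pair) arguments.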
Spoon not. Fork, or fork not. There is no spoon.
Mutate "This sentence is recursive." using German. It recurses. Don't worry, the script eventually gives up.
--
Win dain a lotica, en vai tu ri silota
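I don't know how the script on that page decides to give up, but one plausible approach is to stop when a round trip no longer changes the text, or after a fixed number of rounds, whichever comes first. A rough sketch under those assumptions follows; `mutate`, `$max_rounds` and the toy round trip are all invented for illustration.

```perl
use strict;
use warnings;

# Repeat a round trip until the text stops changing or a cap is hit.
# $round_trip is any code ref mapping English text to English text;
# $max_rounds is a hypothetical safety limit, not a value from the page.
sub mutate {
    my ($round_trip, $text, $max_rounds) = @_;
    my @history = ($text);
    for (1 .. $max_rounds) {
        my $next = $round_trip->($text);
        last if $next eq $text;    # fixed point: nothing left to mangle
        push @history, $next;
        $text = $next;
    }
    return @history;
}

# Toy round trip (an assumption for illustration): drop the last word
# until only one word is left.
my $toy = sub {
    my ($t) = @_;
    $t =~ s/\s+\S+$//;
    return $t;
};

print "$_\n" for mutate($toy, 'This sentence is recursive', 10);
```

The cap matters for phrases that never settle: without it, a translation that keeps changing on every pass would loop forever.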
try this recursive one:
"Pimping ain't easy!" in Italian.
Quoth the Penguin, "pipe grep more!"
Aahhhh... American Pie quotes ;-)
Someone needs to mirror the panix site - e-mail the guy and snag a copy of his perl script etc....
Anyways, here's my contribution to the humor:
Original:
And one time, at band camp, I stuck a flute up my p*ssy.
French:
And once, to the camp of tape, I stuck a groove to the top of my cat.
German:
And once, at tape stocks, I adhered a flute up mean Pussy.
Italian:
And once, to the encampment of wrap, I have attacked one rabbet on mine pussy.
Portuguese:
E a time, in the encampment of the band, I pierced a flute above of my pussy.
Spanish:
And once, in the field of the bandage, I stuck one flauta upon my kitten.
Notes: The Portuguese "E a time" should be "And one time" - dunno why it can't translate the Portuguese word for "and".... weird....
Some other words didn't work too well - these are more obvious though. (Like "flauta")
--TheOrangeSquid
Is it any wonder things seem so awry? We swim in a sea of confusion and don't have to think to survive
On his site:
English As She Is Spoken
FLASH!
I've been Slashdotted[tm]!
I was hacking away at some work-related stuff, when I got this note from an administrator at my ISP:
Subject: Your CGI featured on Slashdot
From: marcotte@panix.com
To: jdf@panix.com
Date: Sat, 28 Aug 1999 01:20:39 -0400 (EDT)
Your CGI which was featured on slashdot was causing all sorts of
problems on our servers. Since it needs to make many queries to
another server, it takes a long time to run. That coupled with the
fact that slashdot is a very high volume site caused your perl
processes to back up and cause both of our web servers to stall. It
is quite likely that babelfish was overloaded also.
I needed to rename it so that it wouldn't cause the servers to go down
again. You may want to put up some examples of what the script can do
instead. At least until the traffic levels off.
In general we need you to let us know when you expect a large increase
in traffic (I'm assuming you submitted the link to slashdot).
Thank you.
--
- Brian
Panix System Administrator
So, I'm honored to have been featured on Slashdot. Now, is there anyone out there who's willing and able to host my little toy?
In the meantime, if you like, please submit your email address via this form, and I'll send you a note when the script is up and running again somewhere.
Darn, and I wanted to try it out too.