Domain: altavista.com
Stories and comments across the archive that link to altavista.com.
Comments · 1,157
-
Re:Hmmm
I am invite?
To what, exactly? -
Try this for size
-
Re:Obvious answor
According to some information at the German news service Heise, the operational costs for the ring are about 1 million USD per day.
You probably want to reconsider your bid? :-)
As mentioned in previous posts, Heise has in-depth coverage of the story, mostly in German, but there is always a (...) fish.
-
Re:So.... confused...
This looks like gibberish to me, but here's a rough translation.
Steffi may not take MSN user off [07.12.2001 15:17 ]
Eddi likes Steffi. And Eddi knows well Photoshop. Thus it produced center this yearly falsified Porno pictures of the German tennis icon Stefanie count and made it available under its MSN.de[1 ] Community side of the world. The predominantly male part of the spectators had a first-class locker photo little hand vibrating to however not also be content, but could over the function "my photo center" additionally to buy.
A glowing Verehrer of the former tennis queen sent after benefit of such pictures a obszoene Mail to his Idol. This awkward kind of the affection stating encountered however little approval: Steffi.Graf switched its lawyer and within fewer hours was Eddis "Fakes OF of star" history -- at least with MSN. the Microsoft managers threw all found star Pornographen out, refused however the delivery of an omission explanation. Thus those wanted to prevent count by a punishment of 500.000 Marks also in the future that its face on different bodies is abused with MSN.
The regional court Koeln[2 ] decided now to favour of Steffi.Graf (Az: 28 O 346/01). Since MSN in its general trading conditions the rights to use leaves itself to stopped contents of transfers on from the Usern and Community contents by Frames and Logos to look, as if are ms slopes of offers, must the Microsoft GmbH ensure that itself no naked Steffis more in the Microsoft network raekeln. The Unterschleissheimer daughter of the software company from the USA tried everything, from Rezitationen of its AGBs and referring to non-liability up to the deportation of all debt to the company nut/mother in talking moon or on the InterNet altogether. But the court remained hard, gave Mrs. count Recht and took up the "concrete unlawful act to the prohibition tenor" for the provisional order. Even Microsofts reference that Mrs. count could not make a repetition danger convincing, did not let the court apply. The company already affirmed the repetition danger by its contradiction against the original omission explanation ( cgl[3 ] /c't) -
Re:The real reason why Microsoft lost...
The following is a Babel Fish translation of the German article.
---
Steffi may not take MSN user off
Eddi likes Steffi . And Eddi knows well Photoshop. Thus it produced center this yearly falsified Porno pictures of the German tennis icon Stefanie count and made it available under its MSN.de Community side of the world. The predominantly male part of the spectators had a first-class locker photo little hand vibrating to however not also be content, but could over the function "my photo center" additionally to buy.
A glowing Verehrer of the former tennis queen sent after benefit of such pictures a obszoene Mail to his Idol. This awkward kind of the affection stating encountered however little approval: Steffi.Graf switched its lawyer and within fewer hours was Eddis "Fakes OF of star" history -- at least with MSN. the Microsoft managers threw all found star Pornographen out, refused however the delivery of an omission explanation. Thus those wanted to prevent count by a punishment of 500.000 Marks also in the future that its face on different bodies is abused with MSN.
The regional court Cologne decided now to favour of Steffi.Graf (Az: 28 O 346/01). Since MSN in its general trading conditions the rights to use leaves itself to stopped contents of transfers on from the Usern and Community contents by Frames and Logos to look, as if are ms slopes of offers, must the Microsoft GmbH ensure that itself no naked Steffis more in the Microsoft network raekeln. The Unterschleissheimer daughter of the software company from the USA tried everything, from Rezitationen of its AGBs and referring to non-liability up to the deportation of all debt to the company nut/mother in talking moon or on the InterNet altogether. But the court remained hard, gave Mrs. count Recht and took up the "concrete unlawful act to the prohibition tenor" for the provisional order. Even Microsofts reference that Mrs. count could not make a repetition danger convincing, did not let the court apply. The company already affirmed the repetition danger by its contradiction against the original omission explanation. ( cgl
/c't) -
Why MS is (kinda) responsible
Ok. The linked article lacks relevant information. Here's a slightly longer version. You might want to translate it from German, using the fish.
The written reasoning of this judgement will only be released in two days. But when those images were posted, the old version of the MSN EULA was active. It stated that the copyright of any material posted via MSN would be transferred to MS.
If MS has the copyright on the material, they are IMO partly responsible for whether or where it is published. The same is not true for a normal ISP. MS can't have it both ways. (Although I kind of wonder whether an EULA like that is really enforceable in Germany.) -
Why is this surprising?
Everything new and exciting that gets done in the new releases of IE, Netscape and (yeeeeach) Mozilla has been done for longer and done better in Opera. Why can't you little slash gay idiot wanna be linux leprechauns get this? It's not hard, just check out the Opera site. The ads get you down? Keys are posted here every time a Browser War inducing post is written. Or hit up Altavista. Easiest place to find cracks for the noob. Simply put, Opera paves the way and the rest just ride the coatails. Wake up to that fact and you'll see. Just check it out and you won't go back.
-
What a bunch of morons!
It is a really good thing that the court has protected the movie studios by stopping people from linking to DeCSS!. I mean what kind of chaos could ensue if people could link to and find a copy of this evil program? I mean even companies like Disney would go out of business if people kept distributing this program! I am so glad that linking to DeCSS is a crime! I feel much safer now.
-
Broken Link
Of course most people probably know this, but the babelfish link should be: fish.
The editor left out the http:// -
Re:Video Games as Art
The German newspaper "Die Zeit" published an interesting article on video games and art some weeks ago, pointing out that some games could in fact be considered art.
The article is available online here. It's in German, so you might need a translator -
Serial ATA
Seems that those longing for SATA controllers may be in luck. Use the fish and go here: watch.impress.co.jp/pc/docs/2002/0514/3ware.htm.
-
Re:Looks nice
-
Re:Having seen the movie...
Heh. I wonder if the dialog could be improved by translating it to some other language in Babelfish and then translating back to English. It's worth a shot!
-
It's going partner now.
-
more positive reviews from overseas, too
check out this Babelfish of a European perspective. it's quite cavernous in its depth, with a fiery red tone throughout. overall, it's quite positive.
-
Did anyone notice the Air H" card?
There's a picture here, it appears to be a wireless card of some sort, only operating at 128kbps.
Here's more info on the card on the vendor's site.
Translated with the fish here. -
This sounds so familiar...
"Consumers have nothing to fear," says Brilliant Digital's Bermeister.
Using The Fish I was able to find two separate translations:
one: "All your base are belong to us!"
two: "Resistance is futile!"
This means something, I just know it. -
In Russia it's routinely used...
there is a lot of such production for example on
English translation of the site is, for example: Babelfish translated
So it's at least some prior art present...
-
Peru and OpenSource
Hi,
Peruvian Congressman Villanueva has proposed this law (in Spanish. Use the Fish) that will change the way Peru buys its software. The origin of the Law and its "travel" within the Peruvian Congress is in this timetable
Congressman Villanueva's Law will ask for any software to be bought by the Government of Peru to provide data in open formats. It will also ask for the source code and the ability to modify the code, to adapt it to the necessities of the Peruvian Republic.
The idea behind this is (liberal translation from Spanish):
"We, the Government, cannot allow any company -foreign or domestic- to ship software that can hide spyware. We, the Government, cannot allow a private company to own the data that belongs to the People of Peru. We, the Government, have special needs and obligations: provide the best 'bang for the buck', allow any Peruvian to audit the source code of our applications to make sure there's nothing hidden that endangers Peru, and to make sure that the data is available even if we change the software supplier. Any software that does not abide by this law will not be used by any Peruvian Government agency".
Also, check what Microsoft Peru had to say about it. And what Congressman Villanueva answered to them.
Go, Peru! -
Re:Douglas Adams predicted this circa 1979
And Babelfish does this for most web pages. As to the whole quote I reproduce it below:
"The Babel fish," said The Hitch Hiker's Guide to the Galaxy quietly, "is small, yellow and leech-like, and probably the oddest thing in the Universe. It feeds on brainwave energy not from its carrier but from those around it. It absorbs all unconscious mental frequencies from this brainwave energy to nourish itself with. It then excretes into the mind of its carrier a telepathic matrix formed by combining the conscious thought frequencies with nerve signals picked up from the speech centres of the brain which has supplied them. The practical upshot of all this is that if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language. The speech patterns you actually hear decode the brainwave matrix which has been fed into your mind by your Babel fish.
"Now it is such a bizarrely improbable coincidence that anything so mind-bogglingly useful could have evolved purely by chance that some thinkers have chosen to see it as the final and clinching proof of the non-existence of God.
"The argument goes something like this: `I refuse to prove that I exist,' says God, `for proof denies faith, and without faith I am nothing.'
"`But,' says Man, `The Babel fish is a dead giveaway, isn't it? It could not have evolved by chance. It proves you exist, and so therefore, by your own arguments, you don't. QED.'
"`Oh dear,' says God, `I hadn't thought of that,' and promptly vanished in a puff of logic.
"`Oh, that was easy,' says Man, and for an encore goes on to prove that black is white and gets himself killed on the next zebra crossing.
"Most leading theologians claim that this argument is a load of dingo's kidneys, but that didn't stop Oolon Colluphid making a small fortune when he used it as the central theme of his best-selling book Well That About Wraps It Up For God.
"Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation." -
Re:Long on Talk, Short on Substance
Well put. Ironic that while the article bashes the press and pundits who dare attach e to words, the author spends many column-inches predicting e-democratization.
How often do most Internet users take advantage of the fact that most major world newspapers are online, and a fish away from being comprehensible in their own language? I certainly can't speak for any Chinese, but the case for truth and light coming shining through the Internet seems vastly overstated to me. I think the appeal of putting the case for lux et libertas et machina is that you get to hack the firewall and call it progress, instead of cleaning up oozing wounds on people afflicted with AIDS.
I admire the concern for social problems and the desire to get the tech community (indisputably among the world's richest few percent) involved. Let's just remember: technology won't solve a problem unless the remainder of the infrastructure exists to do the task at hand. You could definitely build a massive shipping database in Equatorial Guinea, but that wouldn't get shipments anywhere any faster than the donkey walks.
-
important WineX info
-
Re:Translation
If you don't speak Japanese, use the fish!
-
more cyberterrorism links
here's one in German and its English translation
-
Re:What were the articles about?
-
We'll see if Sony's Vaio "U" beats the OQO...
Blurb here:
http://www.pcworld.com/news/article/0,aid,88563,00 . sp
Pictures to give an impression of size near the bottom of this page:
http://www.watch.impress.co.jp/pc/docs/2002/0311/sony3.htm
More interesting descriptions...
http://www.zdnet.co.jp/mobile/0203/11/n_vaiou.html
Babel Fish can help you with the Japanese text:
http://babelfish.altavista.com/~v
-
Re:BAD BAD BAD
it's chink babble for "sexual ass pussy" as translated by Babelfish. it is closer to "gender donkey cat", i'm told.
-
Translation
If you don't speak Japanese, use the fish!
-
Dictionary != Translator
I think it's a great idea to harness the power of millions of people around the world all contributing a few minutes of their time, to create a gigantic any-language to any-language dictionary.
However, this will do nothing to aid in machine translation. You can't simply translate individual words from one language to another, or even short phrases. Translators such as Babelfish understand the basic rules of grammar in each language in order to handle fundamental differences in the way different languages put sentences together.
But Babelfish and other online translators are still a far cry from doing true translation, because they don't understand the text they're trying to translate.
-
Problem with "Universal Translator"
Yes, you can do a word-for-word translation of most words in any language. No, you'll need a very sophisticated system to get the meaning to a reader.
The main problem is that sentence structures are different, idioms get in the way, and words have more than one meaning. A human translator has the power to take a set of words, convert it to an idea, and put out a different set of words, something no machine can do.
Here's a lamebrained example: "The spirit is willing but the flesh is weak." Convert that to Russian and back and you might get, "The liquor will do it but the meat is bad." For a hands-on example, try converting the first few paragraphs of a news article into French using The Fish. On a personal note, I had a conversation with a German guy on ICQ once, using the fish. The results were...interesting. I also read Indonesian newspapers, and I assure you that a literal translator would hurt itself quite badly on this...let alone a less English-like language such as Arabic or Japanese.
That being said, why not use distributed human computing for the thing it's good at? Instead of translating words, how about sentences? You can get at the ideas much better this way. Those sentences that hadn't been translated yet could show up as literal words; those words that hadn't been translated would show up natively. I mean, if you've got human translators for this, you can do things that are not restricted to computers. I can think of a lot neater things the guy proposing this can do with this idea than what he's come up with so far. -
Brilliant!
Who cares if it's accurate now or soon? Used often enough, and with plenty of user feedback about what's the right and wrong way to translate things, this could become a very nifty database, and hopefully better at what it does than babelfish, which is handy but more than that very amusing
:) -
Re:Hi-tech
And the babelfish translation:
Destined to those employees who only make excrement, the Office Throne comes with Computer, Lentium processor, access the Internet, telephone is clearly, fax. (the paper of the fax can be used to clean the cagadas ones)
Hilarious, if not 100% decipherable. -
Re:I don't think so
I seem to remember thinking the same thing about Altavista.... nobody can kill IT... I don't go back there unless I need something translated.
-
Correct babelfish link
The link to the babelfish translation was broken for me. However this link works fine.
-
Re:Sony did NOT leave!
Apparently the fish doesn't like direct linking, so you'll have to enter the URL yourself: AltaVista Babelfish
-
Sony did NOT leave!
Citing the ZDNET article: "On Sunday morning Sony started packing up its 27 PS2s. The show, in Hannover, Germany, officially finishes on Wednesday." This is very misleading if you read it sloppily. Sony did not leave the show. They removed the PS2s, no more no less. The entire Sony booth is 2000 square meters, only 100 square meters were dedicated to the PS2. The rest is still there. This article by German magazine c't explains the situation in much more detail. Use the fish translation if you don't understand German.
-
Re:Timeline
The proposition of this happening at any time in the future may be considered optimistic to say the least. All you need to do is look at the current methods using text, such as babelfish, and you'll realize how far there is to go. Many times people who make comments about so called Machine Translation have no clue what is actually involved in the process. It involves resolving between many ambiguities. This is heightened in spoken translation, where homophones must also be accounted for.
Take for instance the following sentence: "It's in the pen." Do you know what it was talking about? Would you even if I told you what "it" was? A pen could mean a pig pen, or it could mean a writing instrument. Granted, if you knew I was talking about a spring, you'd know it was a writing instrument, but teaching a computer to know that is no trivial task.
For more info, visit http://www.essex.ac.uk/linguistics/clmt/MTbook/ -
Re:Also try...
-
Is that really possible?
... that attempts to generate computer summaries of all of those news articles on different web sites. The project is called Newsblaster and the summaries are excellent
Is it really possible that a computer created an excellent summary of a news article? All the computer generated articles I've read were almost unreadable... Just look at babelfish translation. Is that really English? Hope that NewsBlaster will be much better!
-
Ultrafast searches using IE powertools
There's a much easier way. If you want even more speed then feed your search directly into the search engine's CGI using IE4 powertools "Quick Search". That's why I still use IE4 ;-) It's way better than the Google toolbar.
Simply feed in the URL and the location of the search parameters using %s and you're set - despite using Google several times a day I haven't seen its front page in months (except when I used my friend's computer).
For Google the script is:
http://www.google.com/search?q=%s&sa=Google+Search
I type g nuclear physics
For altavista boolean search the script is:
http://www.altavista.com/sites/search/web?q=%s&r=&pg=aq&search=Search
I type ava (nuclear NEAR physicists) AND scientist AND (glow NEAR in NEAR the NEAR dark)
For Google you can't beat getting good hits at the top, but then with Altavista the boolean queries are so good that I sometimes get *ONE* hit and it's exactly what I was looking for. For example, try this precision search. I'm surprised that none of the /. crowd PERL and shell hackers have mentioned using a script like this. Bajeeeeeeeesus. -
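For the script-minded, the keyword-plus-%s trick described above can be sketched in a few lines of Python. This is only an illustration, not how the IE4 tool itself works: the `SHORTCUTS` table and the `expand` function are made-up names, the two templates are the ones from the comment with the stray spaces removed, and encoding the query with `quote_plus` is an assumption about what the search engines expect.

```python
from urllib.parse import quote_plus

# Keyword -> URL template; "%s" marks where the typed query goes.
# Templates are taken from the comment above; the table itself is hypothetical.
SHORTCUTS = {
    "g": "http://www.google.com/search?q=%s&sa=Google+Search",
    "ava": "http://www.altavista.com/sites/search/web?q=%s&r=&pg=aq&search=Search",
}


def expand(typed: str) -> str:
    """Turn 'g nuclear physics' into a full search URL."""
    # The first word picks the template, the rest is the query.
    keyword, _, query = typed.partition(" ")
    template = SHORTCUTS[keyword]
    # URL-encode the query (spaces become '+') and drop it into the slot.
    return template.replace("%s", quote_plus(query), 1)


if __name__ == "__main__":
    print(expand("g nuclear physics"))
    print(expand("ava (nuclear NEAR physicists) AND scientist"))
```

An unknown keyword raises `KeyError` here; a real tool would presumably fall back to treating the whole line as an ordinary address.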
Re:only one thing seperates them for me
AltaVista Advanced Text-Only Search: 6801 bytes
Google Advanced Search: 12324 bytes, not counting the logo!
Right, it's nowhere close!
-
Re:The right tool for the job
I'm astonished. I use google for generic searches, but any time I need a specific answer, google is the one I definitely would not use, as it never returns the link I want in the first 3 pages.
So I have a list of twenty-something search engines I use for specific purposes as they all have their sweet spot.
Here are my top 7:
ask.com
altavista.com
findlaw.com
lycos.com
metacrawler.com
alexa.com
alltheweb.com
etc etc -
Re:only one thing seperates them for me
The advanced search page is all text, not even a banner ad so it's almost faster than google to load.
Huh? The advanced search page I see has not only their logo, but a banner ad, and some tables. So not only is it "almost faster" than google to load, it's nowhere close. -
They still have text-only
Go here.
-
google works for me
I'm usually satisfied with the search results I get at google. I suppose I'd say that if I find it, I find it at google.
I haven't used Altavista much, except for babelfish, but after reading this I may have to give it a try sometime. -
Googlepiphany
Each time we visit Google, it is with held breath. We have seen the bold 1990s freedom of the Internet dwindle into a thousand fragmented pieces where only the strong survive. Advertisements are everywhere, intruding into our mindscape. The tens of thousands of images a year we see, advertising everything from Goodyear-on-a-blimp to online gambling protruding out of your Yahoo mail, are all designed upon the principle of mindless repetition.
It is well understood that the more times you see an image, the more likely you are to purchase its related product when you are wandering down the store aisles, wondering what to purchase. You've had the moment when you're standing in front of seven different brands of raisin brans, and you opt for one or another, little calculating that the one you purchased was simply imprinted upon your brain more times in recent advertising.
Google strides like a valiant and noble knight, a Don Quixote on a mission from heaven, to clear the mindscape of all those lurching, fragmented thoughts: "buy me!" "buy me!" "buy me!"
Like a gift from another universe, where things are cleaner, and evaluated by merit rather than popularity, Google presents an elaborate algorithm for sorting websites into fields of clarity. So insightful is their methodology, other larger search engines have bowed to this upstart. Even the mighty Yahoo, the first big engine on the 'net, has Google under the hood. So do a dozen other search engines, and thousands of sites who have turned their proprietary search functions over to the agile Google churner. AltaVista, Lycos, metacrawlers, and a few other great ones keep the American principle of competition solid, yet here we behold the miracle of Google.
We programmers watched Google come from behind, for we needed a relevance-based engine long before anyone else did: we had to have it so we could put it in the hands of others who needed our services; we were developers: we knew the information was out there, and were willing to spend hours tracking it down. Somewhere along the way, we'd stumble across this small search engine called Google, and discover that it turned up amazingly relevant searches, time and time again. No advertising. Quick.
So we bookmarked it, then we earmarked it, and finally we began to deliver the most precious kind of advertising which can be earned: we told our friends about it. And we delighted in the lack of advertising. Truly a geek's machine; sleek and relevant.
We watched the Internet bubble come crashing down around its own self-exuberance; we all know at least one programmer humbled by the rapid withdrawal of venture capital.
And so we watch Google carefully now, knowing that it is still bearing fruit for its venture capital investors, yet also knowing that our economy is continuing to draw inward, and as carefully as we form our sentences regarding the future of our welfare... we hold our breath when we visit Google each day for its wealth of free, friendly, and advertising-free three billion interrelated facets of information.
We watched Google handle the September 11 tragedy, worried that it might spark them into becoming a news portal, since their cache ability made them compete with sites like CNN which were swamped with 50,000 hits per second... and we saw Google come out cleanly, building on the crisis in a noble, not-capitalizing-on-the-crisis, manner. Now you can visit Google and find current information; it's a portal, yet ever so quietly, since there are no advertisements. Portals have become synonymous with a barrage of advertising, so what do we call this gallant creature who will not stoop to capitalism?
It's just a humble search engine: A search engine which points the way into a future with a clean mindscape. We may not all make it there; spammers prove that they'll come into such a future kicking and screaming for attention, and since we know that we all have to arrive together or else we none of us can arrive, we tolerate them.
Yes, we hold our breath each time we visit Google, lest they make that sad plunge into our noisy world instead of rising above it. And we are continually surprised by the improvements which they are making. These are not trivial improvements, simple cosmetic additions; one by one they have expanded our notion of how powerful a search engine can be, how it can nimbly reach into the deepest crevices of the Internet and produce a slew of relevant information on obscure topics. Search within groups. Search for images. Search only for images which are wallpaper sized from sites in Europe and are black and white.
The essence of the Internet, the information revolution, has somehow been bestowed upon the novel minds working for Google. We look at their job offerings, and yearn for the day when we can deserve such benevolence as to work for Google. Certainly only the best of the best work for Google (or id). They play hockey in their parking lots, and eat catered food every day. Ah, there we begin holding our breath. We like to have fun at work, but too much fun is a sign of venture capital.
How do they do it, how do they keep going, and going, and going without losing integrity by selling ads or trying to do too much? Google quietly inspires us to consider a world without advertising. Oh, they take advertising alright, yet look at it: it's extremely targeted, intended to be relevant to the searcher. With a thick black line separating advertising and content. No advertiser images. None of this irrelevant barrage. Looking for a new ISP? Here's twenty links, and over here in the corner, ten folks who've paid us money to be listed when you search for ISPs. Google drew a distinct line between the advertiser content and their own content. And they steadfastly looked toward our needs when they tolerated no images. Text-based. Get the information into the hand of the gentleman while he needs it, and trust that he will come back later with a thank-you note in hand.
Well, here is one thank you note. I hold my breath each time I visit Google, and I use it extensively, and have for years. I was Googling when Google wasn't yet cool, and I'm delighted to see it surviving. I hope they remain solid in their condition of accepting no image-based advertisements, and pray they will continue to inspire us with clarity on the concept of what it means to serve.
The cache concept, now firmly entrenched in the way we conceive of the Internet, is perhaps the greatest aspect of the information revolution: You once published a site, but now it is defunct. Or your site is presently being slashdotted or DOS'd. No problem, visit the Google cache for the site, and there's your info, as clear and sometimes quicker than the original version. The folks at archive.org have taken this idea and run with it, yet I must admit the first time I realized how profoundly differently we were going to be processing information in the future came when I understood what Google was doing with their cache. I prayed then, and the prayer was answered, that the cache would not be shut down because of re-publishing rights issues. Now Google has enough momentum that it would take an act of Congress to shut off their caching.
Take a look at Google. Unlike most companies with bold pretty mission statements hiding inner corruption, Google somehow matches their ten operating principles with immediate proof. They do it right; they work hard for their money.
-
How I Learned to Stop Worrying and Love the Panopticon
How much ass does Google kick? All of it.
Remember when searching the Internet was hard? The dark days when we relied on dumb-as-sand machine intelligences, like those on the back-ends of AltaVista and Lycos, to rank the documents that matched our keywords? The grim era before Google, when searching was a spew of boolean mumbo-jumbo, NEAR this, NOT that, AND the other?
God, that sucked.
Lucky for the Internet, Google figured out the One True Way to make sense of the Internet, to defeat gamers of the system and send info-free brochureware plummeting to number n - 1 out of n results.
They did it with our help. Google's near-magical ordering of the Internet is built around the notion that computers are good at doing repetitive, uncreative things -- fetishistically counting things, for example -- and rotten at understanding why they're being asked to do these boring tasks. By contrast, human beings are great at understanding why they're doing something, but they're woefully deficient in the do-the-same-thing-perfectly-and-forever department.
AltaVista tried to get computers to do both the repetitive parts (capturing billions of documents) and the creative parts (figuring out what the documents are about). This yielded the largest collection of randomly organized documents in the world, a Web-accessible version of a library where all the books have been re-shelved by axe-grinding illiterates who wanted to make sure that no matter what you were looking for, you'd find porn.
Yahoo tried just the opposite, getting human beings to manually identify and describe all the documents comprising what was meant to be an exhaustive index of all the worthwhile pages on the Web. There were "scaling issues" involved in this laudable effort (for "scaling issues" here, substitute "catastrophic failures"), and over time, Yahoo's directory dwindled to an increasingly marginal sliver of the Internet's vastness. At the rate that Yahoo's army of indexers work, and at the rate that the Internet's unwashed horde of writers is adding to the noosphere, it's only a matter of a few years before every human being alive will have to pass his or her every working hour contributing to Yahoo's index, just to keep its sliver from dwindling into utter pointlessness.
Let humans do what they do; let computers do the same.
Google bridges the divide between human-generated indexes and machine-generated analysis.
Y'see, the Web is full of people like you and me, making links between documents; human beings, making decisions about documents, voting with their links. When I link to some arbitrary document, it's an indication that I think that it's in some way authoritative. When you link to a document I wrote, you're indicating that I'm in some way authoritative. The Internet is already structured in a meaningful way, but that structure is obscured. Google teases out the relationship between the URLs, examining the webs of authority: this person is linked to by 50,000 others, and he links to this other person over here, which indicates that person one is a pretty sharp individual, one who's inspired 50,000 human beings to take time out of their busy schedules to link to him; and person one thinks that person two is on the ball, which suggests that person two knows what she's on about.
It's a best-of-both-worlds solution. The computers at Google are asked to tirelessly count and re-count the number and destination of links on every page that the Googlebot can lay its user-agent on. Those links are made by human beings, doing what they do best, link by link, drip by drip, layering a film of order over the Internet.
The approach works well. Eerily well. Enter a couple of search terms, and biff-bam, the most authoritative documents containing those keywords are served up in an instant. Nearly every document on the Web has a human decision associated with it for Google to glom onto; that's because nearly every document on the Web has a human author. Human authors don't just put documents onto the Web; they put them into the Web, into the meshed hairball of incoming and outgoing links, indicating not only what keywords the document contains, but also who the document's author believes is authoritative, and vice versa.
It's quite elegant.
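To make the link-voting idea concrete, here is a minimal sketch of an iterative authority score in the style of a PageRank calculation. It is an illustration only, not Google's actual code: the function name `link_authority`, the damping value of 0.85, the iteration count, and the toy link graph are all assumptions made for the example.

```python
def link_authority(links, damping=0.85, iterations=50):
    """Score pages by who links to them (hypothetical helper).

    links maps each page to the list of pages it links to.
    damping and iterations are assumed, conventional-looking defaults.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally authoritative.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy graph (made up): two people link to "c", and "c" links to "d".
example_links = {"a": ["c"], "b": ["c"], "c": ["d"], "d": []}
scores = link_authority(example_links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In the toy graph, "c" outranks "a" and "b" because two pages vote for it, and "d" inherits authority because the well-regarded "c" links to it, which is the person-one, person-two relationship described above.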
An imperfect forgettery
Meatspace ASCII, the revered printed word, has many things going for it:
- It's high-resolution: Whether scrawled with a toddler's crayon or hammered out by a quaint, humming Selectric's print-ball, a traditionally printed word is an order of magnitude sharper and better-defined than the phosphors marching across your screen.
- It requires no specialized reader: A printed word can be read by any literate human being during daylight hours without any particular technological assist, specialized readers, or even electricity.
- It is hard to make obsolete: Printed works don't staledate the way that electronic words do. It's difficult to apply "digital rights management" schemes to the printed word that will stymie generations to come with bizarre cryptosystems that seek to circumvent posterity.
As someone in possession of tens of thousands of books, I understand why people get misty and sentimental about dead-tree libraries. As someone who has moved twice in the past 18 months, I feel compelled to point out that the printed word has a couple of major downsides:
- It is fragile: We print books on the same substrate we employ for cleaning our nether regions after excreting. Think about that for a second: Paper is considered degradable enough to flush billions of sheets of it down the crapper every day, and yet we entrust our precious words to a material that auto-incinerates if you put it into contact with oxygen.
Well, so what? We've got mass production techniques that will let us preserve our most important documents by making millions of copies of them. Which brings us to the next problem:
- It is bulky: Moving-box companies sell specialized shipping boxes for books, boxes that are smaller than all the other species of boxen. That's because books are freakin' heavy. They're made from trees!
Every year, storage media increases in density, decreases in size, and gets cheaper. I can fit all the hard drives of all the computers I've owned, plus all the floppies for all the computers that I owned before hard drives were common, onto the hard drive of my latest laptop, with storage to spare. Hell, most of that stuff will fit on my iPod! The data that previously occupied a roomful of storage devices now fits comfortably in my pocket.
In a world of degradable storage, replicating copies is the surest way to guarantee longevity. Whether your data is in atoms or bits, the more copies you make of it and the more widely you disperse it, the greater the likelihood that your data will persist forever. (That's why Jaron Lanier jokingly proposed encoding printed matter into the DNA of the notoriously prolific cockroach, as a means of ensuring archives through a nuclear war and beyond.)
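The arithmetic behind "more copies, more longevity" can be sketched in a few lines. This assumes, purely for illustration, that each copy is lost independently with the same probability; real copies that share a building, a format, or a publisher will not behave that way. The function name `survival_probability` is made up for the example.

```python
def survival_probability(p_loss, copies):
    """Chance that at least one copy survives (hypothetical helper).

    Assumes each copy is lost independently with probability p_loss.
    """
    if not 0.0 <= p_loss <= 1.0:
        raise ValueError("p_loss must be between 0 and 1")
    if copies < 0:
        raise ValueError("copies must not be negative")
    # All copies are lost with probability p_loss ** copies.
    return 1.0 - p_loss ** copies


# Even with a coin-flip chance of losing any one copy,
# a handful of dispersed copies changes the odds a great deal.
for n in (1, 3, 10):
    print(n, survival_probability(0.5, n))
```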
With bulky printed words, only the commercially successful (and hence prolific) and very lucky works are likely to survive the voyage through history. All the words we write try to crowd into the lifeboat, but only a lucky few survive.
The historical forgettery is something of a blessing, though. Many's the word that's been penned, in casual correspondence or published works, that is best forgotten. I know that I've written a few things I'd rather no one ever saw. Much of it is embarrassing; most of it is banal. History flenses away the great bulk of utterance and leaves behind a barely manageable archive that we can get our heads around.
Words-as-bytes need not be forgotten! Storage is cheap, storage is compact, and the lifeboat has got plenty of room for every jot and tittle keyed into the Internet. Brewster Kahle built an archive with several copies of the Web at different times, using off-the-shelf PCs and standard drives.
This is a good thing, but it's also a pain in the ass. Our embarrassing excesses, drunken rants, typos and brain farts and flames no longer vanish into our subconscious, but rather hang around like embarrassing relatives, undeniably ours, with us forever.
There's an upside, of course. The enduring presence of our publicly stated positions acts as an accountability system, making us own up to our errors and perhaps encouraging us to think carefully before putting our fingers on our keyboards. Old Usenet clients used to have a standard warning that would appear the first time you used Usenet to send a message, a dire warning to the effect that your words were about to pass from your computer and onto the computers of thousands of other people, and are you really sure that you've expressed yourself adequately?
Perfect surveillance
Jonathan Lethem's Motherless Brooklyn features Lionel Essrog, a private detective with Tourette's Syndrome whose obsessive-compulsive illness makes him ideal for long, boring stakeouts and wiretap parties. Once the compulsion to listen for a keyword in the soup of a rambling conversation or to continually re-check a staked-out doorway for a suspect has been planted in Lionel's Tourettic brain, he is unable to do anything except listen and watch until the compulsion has been satisfied.
Boring, repetitive, endless tasks don't actually require someone with a compulsive disorder to do them; computers can do them just fine. A computer can sieve through the torrent of packets passing over the Internet and look for keywords like "terrorism" and "anthrax" and "fissile" and "child-porn," then flag them for later consideration by law-enforcement officials at spooky three-letter agencies.
Law enforcement doesn't really need any specialized equipment to surveil the average netizen. Google does it better than anything else possibly could (dirty snitch), and it doesn't cost a cent.
But Google only acts on the public data that human beings are free to link to and that the Googlebot is free to discover. Private documents (email, instant messages, internal memos) are off-limits to Google. Even if you manually poured them down the Googlebot's throat, the absence of incoming or outgoing links to these documents means that they won't be placed in any meaningful context in the Googleverse.
Increasingly, law-enforcement agencies are pushing for (or owning up to) the creation of really creepy spyware projects like Echelon, Magic Lantern, and Carnivore, systems that are placed on your computer, at your ISP or at a major Internet backbone, and used to indiscriminately capture all of the data they encounter, shunting it off to shadowy bunkers where the secret masters of the universe can use it to shine a light up the skirts of your privacy and, possibly, that of criminals, too.
People are, rightfully, very upset about all of this. Continuous wiretapping of the entire Internet is a revolting idea, something like the Panopticon, a prison where the warders can see your every move from perfect obscurity. It's enough to make you want to draw your blinds and curl up under the sofa.
AltaVista for them, Google for us
But what do they do with all of that data that they collect? Filter it for keywords? Fat chance. The volume of false positives (e.g., people talking about child pornography who aren't child pornographers) far exceeds the volume of actual criminal activity. Even creaky old Lycos gave up on plain-old keyword matching a long, long time ago.
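A small sketch shows why plain keyword matching drowns in false positives. Nothing here reflects how any real surveillance system works: the watchlist, the sample messages, and the function name `flag_messages` are all invented for illustration.

```python
import re

# Hypothetical watchlist, borrowed from the keywords mentioned earlier.
WATCHLIST = ("terrorism", "anthrax", "fissile")


def flag_messages(messages, watchlist=WATCHLIST):
    """Return every message that mentions a watchlist word (hypothetical helper)."""
    # Case-insensitive match on any listed word, with no sense of context.
    pattern = re.compile("|".join(re.escape(word) for word in watchlist), re.IGNORECASE)
    return [message for message in messages if pattern.search(message)]


# Made-up traffic: none of these senders is a criminal.
sample = [
    "The anthrax scare dominated tonight's news.",
    "Lunch at noon?",
    "My thesis covers fissile material policy.",
]
for message in flag_messages(sample):
    print("FLAGGED:", message)
```

The matcher flags the news summary and the thesis note just as readily as it would flag a genuine threat, because a keyword carries no information about who is talking or why.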
Maybe they manually check it. After all, that approach worked for Yahoo, right? Oh, right, it didn't work. Scratch that.
Then they must use some hybrid approach: human editors and AI (Artificial Intelligence or Almost Implemented, take your pick) working in concert to tweeze out the most relevant material as quickly and efficiently as possible.
Right. AltaVista.
Poor bastards.
-
Re:Just a thought...
I wonder if babel would do a good job (i.e. readable) of translating whole pages at a time for this, which would make it rather easy to translate into numerous languages.
-
Re:SINCE WE'RE ON THE SUBJECT...
OK... apparently, I am a moron... well, maybe not a moron, but LAZY. I got off my arse and did some poking around. Look what I found.
I found a few application level proxies -
ftp.proxy - This looks very well done.
smtp.proxy - done by the same guy as tcpproxy below.
For the generic tcp proxy -
nportredird - This looks very promising.
aproxy - looks a little too simple, but it's perl! (English can be found via babelfish.)
tcpproxy - This one seems the most complete and designed for a firewalling environment.
I found a whole slew of different app "level" proxies (Quake, POP3, etc.), but most seemed a bit basic. Some of the POP3 ones were cool (proxy auth support).
I was not able to find a good udp proxy - with multi-source/multi-destination (proxy with an ACL). I've a small local port udp redirector (I have no idea where I got it) that I use on my home network, but it's not something I could use at work. So... there ya go.
-
Re: Another interesting article
Another interesting article concerning the kazaa/morpheus mystery can be found in the German mag telepolis (run by the well-renowned folks from heise.de). Beware, though, it's in German; you might want to try the fish for an albeit clumsy translation.
They basically appear to cite the zeropaid.com article mentioned earlier, but try to offer a more neutral comment on what the facts are, and what is speculation.