mbauser2 · Slashdot Mirror

The name isn't the problem, the poor usability is! on RSS Web-Feeds, The Next Big Thing? · 2004-03-01 17:31 · Score: 1

The name RSS isn't a problem: The terms HTTP, HTML, and World Wide Web don't make any sense to newbies, either. The Web didn't succeed because it had a better name than its predecessors (FTP, gopher, etc.); it succeeded because it was easier to use than other ways of using the Internet.

Before the WWW (protocols and software) pulled it all together, using the Internet required separate tools for separate protocols, and/or separate viewers for different file formats. (Not a problem for the average Slashdotter, but a barrier for newbies.) With a web browser, on the other hand, all the newbies needed to know was that one program showed them everything, and if they clicked on the blue text, something useful happened. "Using the Internet" was simplified to 2 programs: a web browser and an e-mail program, which usually came bundled together. It couldn't get much simpler than that. (10 years later, and most users are getting by with just a browser, e-mail, and instant messaging. As far as most users are concerned, the Internet is three programs.)

And even when proprietary formats that weren't integrated into browsers (like PDF and Flash) showed up, the software companies involved had the sense to use plugin-based solutions, which used the new formats' MIME types to launch helper applications inside the browser. Furthermore, webmasters using those formats had the sense to include warnings like "Viewing this file requires Acrobat Reader. Click here to download". The warnings are annoying after a while, but they work: People download the plugin once, and get back to their point-and-click lifestyle. All in all, it's fairly easy to use.

Compared to that, the inventors and implementors of RSS haven't done anything right.

Most links to RSS files are hidden behind obscure, uninformative links, like that stupid "XML" icon. That tells users nothing about what the link is for.

Should users actually click the link, they'll either get an XML file they don't understand, or an error message because their browser doesn't understand the MIME type. (Here's one of the unwritten rules of web design: If clicking on a hyperlink can't produce a useful result, it shouldn't be a user-visible hyperlink.)

If users actually figure out what the file is for, they still have to find, install, and use a program that doesn't integrate well with their browser experience. Most aggregators require users to cut-and-paste links from the browser to the feed reader. That's just awkward.

If technical people want RSS to catch on with non-technical users, they need to improve RSS's usability.

What webmasters need to do: Lose the stupid "XML" icons, and start telling people what that RSS link actually does. While you're at it, recommend some software so that users aren't fumbling in the dark. If you can, give your RSS file a MIME type besides application/xml, so that software authors can start making MIME-aware aggregators.

What software authors need to do: Integrate the subscription/aggregation software into the browser better. Making users launch a fourth program is not as usable as making their browser, e-mail programs, or instant messengers notify them of news. Also, write programs that are MIME-aware, so that when users click on an application/rss+xml file, the aggregator does something productive (popping up a box asking if the user wants to subscribe to a site). Making software that works in a pipe (and/or reads RSS files declared on a command line) wouldn't hurt, either, because it would allow the distribution of RSS by non-HTTP methods.
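For what it's worth, the pipe-friendly reader is not much code. Here is a minimal Python sketch, assuming an RSS 0.9x/2.0-style feed with un-namespaced <item> elements; the names read_items and main are made up for this example and don't come from any real aggregator.

```python
import sys
import xml.etree.ElementTree as ET


def read_items(stream):
    """Return (title, link) pairs for each <item> in an RSS 0.9x/2.0 feed."""
    root = ET.parse(stream).getroot()
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        items.append((title, link))
    return items


def main(argv):
    """Print one tab-separated line per item, from files or from a pipe."""
    # Use the files named on the command line; with no arguments, fall
    # back to standard input so the reader works at the end of a pipe.
    if argv:
        for path in argv:
            with open(path, "rb") as handle:
                for title, link in read_items(handle):
                    print(f"{title}\t{link}")
    else:
        for title, link in read_items(sys.stdin.buffer):
            print(f"{title}\t{link}")
```

Calling main(sys.argv[1:]) at the bottom of a script would turn this into a filter, so the feed could arrive by e-mail, FTP, or anything else that can write to a pipe.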

A new name isn't going to help RSS (it's been tried before). Better usability will.

(Yes, I know this lecture was long, but trust me, it was even longer the first time I ranted about syndication.)

Re:W3 compliance? on Yahoo! Vs. Google: Algorithm Standoff · 2004-02-25 16:58 · Score: 2, Insightful

Google doesn't give a rat's ass if a page complies with W3C standards. That would be a stupid way to run a search engine, because that would let junk sites boost their rank for superficial reasons while punishing relevant sites that have minor mistakes. Google is about content, first and foremost, and following standards doesn't improve content.

When it comes to web design issues, Google does not punish naive mistakes. If somebody's HTML is so weird that it must be an attempt at manipulation (like making an entire paragraph out of H1 elements), it might get penalized. Other stuff, Google doesn't care about, because any strategy that penalizes most of the web is counterproductive to their goals.

That said, Googlebot is a computer program, so it probably does a better job of parsing pages that are well-formed (in the XML sense), and otherwise "easy to parse". Following standards is a good way to achieve "easy parsability", so Google occasionally gives the "check your HTML" advice to people because it's easier than writing everything I just wrote.
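If you want to see what "well-formed in the XML sense" means in practice, here is a small Python sketch using the standard library's XML parser. It says nothing about how Googlebot actually parses pages; is_well_formed is just a name picked for the example, and the check is stricter than HTML validity (it rejects unclosed tags and HTML-only entities that browsers happily accept).

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # Mismatched or unclosed tags, stray ampersands, and similar
        # problems all end up here.
        return False
    return True
```

A page that passes this is about as "easy to parse" as markup gets; a page that fails it may still render fine in a browser.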

Deleting ccTLDs -- works 2/3 of the time! on Niue WiFi Network Gone, .nu TLD May Follow · 2004-01-11 23:03 · Score: 5, Informative

Actually, the status of .su is debatable -- IANA froze the domain so that no new .su domains could be created, but it was reopened by .su administrators a few years later, even though IANA & ICANN didn't recognize it as an active TLD. .su still isn't listed on IANA's public list of ccTLDs, but it's listed in the whois.iana.org database because .su's administrators are too stubborn to give up. (The .su root servers are also .ru root servers, which makes them hard to ignore.)

Using the ccTLD of a "deleted nation" is kind of iffy. The ccTLDs are supposed to be based on ISO 3166-1, and the ISO is allowed to reassign old codes to new nations. If IANA let ccTLDs outlive their nations, they'd increase the chances of having two claims to one ccTLD. Sooner or later, somebody would get accused of ccTLD-squatting.

For the record, ccTLDs have been successfully dissolved before: .cs in 1995 and .zr in 2001. (Also, I'm told .dd was dissolved when the two Germanies unified, but I'm not sure .dd was ever active to begin with.)

If the end of Niue's independence led the ISO to drop nu from ISO 3166-1, IANA and ICANN probably would try to freeze or delete .nu, depending on how active it remained and who was willing to keep managing it.

Keep in mind, though, ISO 3166-1 doesn't require political independence for a region to have a geographic code, because it's still useful for "distant regions" to have their own codes for non-Internet purposes (like air travel and shipping). There are completely uninhabited islands that still have ISO codes! As long as people are living on Niue (and New Zealand doesn't ask for deletion), the ISO will probably leave nu on the list.

They're trusting customer laziness, duh. on AOL's $299 PC · 2003-12-04 20:23 · Score: 4, Insightful

I used to work in a camera store that sold cell phones, too. (Don't ask me why a camera store would bother with cell phones, because I'm still not sure.) One thing I learned: mobile phones still cost the phone companies more than the 99 dollars they sell them for -- the companies are swallowing the cost of the phone because they hope to make it up with a few years of phone bills.

It works, too, because... (wait for it)... people don't want to change their phone numbers. (Another reason the cell companies dislike number portability). AOL is assuming the same thing will happen with their service -- customers will decide it's too much work to change it, stick with AOL for years, and repay AOL's investment.

(Besides, AOL is an evil megacorporation. If they don't make their money back, they'll just find a way to write it off on their taxes.)

Re:Step 1: Lose the Moveable Type on Spam Rapidly Increasing In Weblog Comments · 2003-10-27 17:56 · Score: 1

"anyone using proprietary weblog software such as Moveable [sic] Type has brought this on themselves"

What? Do you expect every person on the Web to write their own software? If so, why stop at weblogs? Why not make everyone write all their software so that we never have e-mail viruses, DDOS, worms, or other widespread attacks?

If you want to attack MT for being too easily spammed, that's one thing, but attacking users for not being programmers is just fucking stupid.

(Irony alert: I'm only on /. right now because I'm taking a break from testing the weblogging script I'm writing.)

I'll see your "Hmm", and raise you another "Hmm". on FBI Investigating Lamo Via Patriot Act Provision · 2003-09-29 20:05 · Score: 2, Informative

Sure, we Americans have more enumerated rights than you Brits, but we've also got a higher percentage of our population in prison than you. In fact, we've got the highest confirmed prisoner per capita rate of any country on Earth.

Numbers like that make me wonder if we're somehow missing the point here in the States. Rights on paper are nice, but they don't tell the whole story.

(Here's a big chart of imprisonment figures, if anyone wants details.)

Re:Great journalist acid test on FBI Investigating Lamo Via Patriot Act Provision · 2003-09-29 19:49 · Score: 1

What is scary about the article, if it is true, is that the FBI is using the Patriot act to demand that the journalists preserve their information to hand over to the Department of Justice and threatening them with prosecution for obstruction of justice if they refuse to comply.

You know what's even scarier? The fact that they're waving the PATRIOT Act at journalists in a case about financial damages to corporations, but the DOJ barely seems to care that Bob Novak is outing CIA agents. I wonder which crime would help terrorists more....

Wacky homeless hacker? You get the PATRIOT Act thrown at you.

Wacky right-wing columnist? You get a shrug.

I don't even know where to begin in complaining how wrong that is.

Re:Here's my essay on Mr Anti-Google · 2002-08-29 19:10 · Score: 1

So "Everyman", we meet again.

Time for me to critique your article again. I'm mostly going to cut-and-paste from our discussions earlier this month at Search Engine Forums and alt.internet.search-engines. Once again, here are the key problems with your essay:

The argument that PageRank buries sites in the results is useless. Any method of choosing sites out of a large database is going to bury listings at the bottom. Getting rid of PageRank won't change the fact that "somebody has to lose", it just changes how they lose. (Which, as everyone else here has pointed out, is the real problem you have with Google: You want someone else to lose.)

Nitpicking the definition of democracy is a cheap rhetorical trick. Go look up the definition of the word. It's both simpler and more complex than your inane "one man, one vote" definition -- democracy is based on the will of the people, but there are a lot of ways besides "one man, one vote" that the will of the people can be measured. (If democracy was as simple as you think it is, we'd have to disband the U.S. Senate for disproportional representation.)

You're attacking Google for not being democratic enough (by a shallow definition of democracy) when it's the most democratic major engine on the Web. Linkpop is one of two approaches (the other being clickpop) that make the opinions of others important in the algorithm, and the only one that judges content-producers based on the opinions of their peers. In essence, content-producers are judged by other content-producers, not by Google. Google is not a tyrant, Google is a naive populist that's trying to quantify peer review.

The repeated mentions of dot com failures and Altavista are an even cheaper rhetorical trick. You're repeatedly mentioning failed and suspect ("false promise") businesses to prime the reader into assuming Google must be crooked, too. Why don't you just ask if they've stopped beating their wives? It would be quicker.

You have a weird definition of "objective". (So weird, I'm not sure what it is.) The FTC definition of objective is "not influenced by money". Yours seems to be "not influenced by anything". You're never going to convince the FTC (or anybody else) that it's illegal to judge content by its reputation.

The complaint that Google doesn't crawl everything is just, well, stupid. Nobody crawls everything. Singling out Google for a "flaw" that every engine has is foolish. That's not a failing of Google, it's just the state of the technology today. For every webmaster who says search engines don't crawl enough of his site, there's a webmaster who says search engines crawl too much too fast. Google and the other engines are still trying to find the balance between the two extremes.

Wrapping things up, none of your proposed "reforms" will create a better engine, they'll just create a different engine. More importantly, it would be an engine that's more dictatorial than the current models, because it would depend on text-analysis. Text analysis is the most autocratic way of finding websites, because it's controlled entirely by one actor (the engine).

You're not arguing for democracy or for the people. You're just arguing for a return to the kind of autocracy you think will benefit you more. You want content-providers to matter less, not more, and to give all the power back to the engines. That's not a step forward.

As for cookies, you're insisting on singling out Google for something lots of sites do. Checking my Temporary Internet Files folder right now, I see 10-year or longer cookies from Altavista, Yahoo, and Looksmart, as well as from non-search companies like TV Guide, Sprint PCS, and Metafilter. Long-lived cookies aren't a government conspiracy, they're just a (widespread) sign of lazy programming.

You're just fixating on Google because Matt Cutts used to work for the NSA, and your obvious obsession with government intelligence agencies is affecting your judgement.

Re:Quantum Slashdot Effect on Google on Modern Day Search Engine Manipulations · 2002-08-15 06:25 · Score: 1

One of the Google ranking algorithms is based on how many people click on each offered link.

No, it's not.

Look at the source code of Google results pages. Google links directly to the pages listed. If Google was ranking by clickthrough, it would have to link through a redirect like Yahoo does. (Or, if you want to get really complicated, put a web bug on every page listed in the index, but that would require the cooperation of every webmaster with a site in the index. Not gonna happen.)
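Anyone can repeat that check on a saved results page. Here is a rough Python sketch that sorts the links on a page into ones pointing straight at other hosts and ones routed through the engine's own host. The host name passed in and the function name split_links are placeholders for the example, and this only looks at href values, so script-based click tracking is outside its scope.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def split_links(page_html, engine_host):
    """Split a page's links into (direct, via_engine) lists by hostname."""
    collector = LinkCollector()
    collector.feed(page_html)
    direct, via_engine = [], []
    for href in collector.hrefs:
        host = urlsplit(href).hostname
        # Relative links and links to the engine's own host could pass
        # through a click-counting redirect; links to other hosts go
        # straight to the listed page.
        if host is None or host == engine_host:
            via_engine.append(href)
        else:
            direct.append(href)
    return direct, via_engine
```

If the result listings all land in the first list, the engine has no redirect in the path to count clicks with.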

Google does throw in some link-tracking every few zillion results, but (according to them) that's for quality control, not click-ranking. If the random spot-checks show that people are having to dig deep into results, Google will tweak their algorithms, not just move sites up in arbitrary searches.

Re:My guess is that it's a problem with IP numbers on Modern Day Search Engine Manipulations · 2002-08-15 06:18 · Score: 1

If, indeed, Google is ranking links by IP address then it's a brutally flawed concept to begin with

I didn't say Google is "ranking links by IP", I said it's confusing 2 sites that have occupied the same IP address. In fact, Google is doing all its normal ranking procedures (text analysis and link analysis), but screwing up the very last step by associating the rank with the wrong URL. Yes, this is a big error, but it's easy to spot: If you click on a Google link, and the site you find is about what Google says it's about, this problem hasn't affected your search.

I actually doubt the problem is entirely Google's fault. Very few people have reported this problem, and whenever I've tried to help them, they turned out to be "strictly end-user" types who couldn't tell me anything useful about their server configuration. Therefore, I haven't been able to exclude the possibility that this Google "error" is prompted by misconfigured web servers.

You would be shocked at some of the silly misconfigurations enacted by commercial web hosting companies. For example:

Apparently a bunch of hosting companies have decided "404 Not Found" errors are obsolete, and started returning "403 Forbidden" responses when browsers/robots requested nonexistent files from the web host customers. Unfortunately (for their customers) Googlebot interpreted those responses differently when it comes to robots.txt. "404" meant "no restrictions, come on in", while "403" meant "stay the hell out". So a bunch of customers who didn't know anything about robots.txt (and shouldn't need to) suddenly got their sites kicked from Google, because their hosting company confused Googlebot.

(Google has, in fact, recently changed their policy on 403 errors because of mistakes like this.)
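To make the 404-versus-403 distinction concrete, here is a Python sketch of the older behavior as I described it. It is an illustration, not Googlebot's code: the function name and the returned labels are invented, and the handling of anything other than 2xx and 4xx is my own assumption. (Python's standard urllib.robotparser module makes a similar split: 401 and 403 mean "disallow everything", other 4xx codes mean "allow everything".)

```python
def robots_policy(status_code):
    """Map the HTTP status of a robots.txt request to a crawl decision."""
    if status_code in (401, 403):
        return "disallow_all"  # access denied: treat the whole site as closed
    if 400 <= status_code < 500:
        return "allow_all"     # no robots.txt (e.g. 404): no restrictions
    if 200 <= status_code < 300:
        return "parse_rules"   # a file exists: obey what it says
    return "retry_later"       # anything else: undecided (assumption)
```

A host that answers 403 for every missing file is, as far as a crawler following this rule can tell, asking robots to stay out of every site it hosts.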

He was too writing about PageRank. on Modern Day Search Engine Manipulations · 2002-08-14 18:19 · Score: 1

Funny, but I read an article that was talking about the ranking of pages on Google, not on the PageRank algorithm.

Here's how the original article author described Google's ranking:

The Google ranking technique, in a nutshell, is that every link provided to a site is a vote for the site, with the weighting of the vote being determined by the number of votes that the voting site itself has received

Compare that to Google's definition of PageRank:

PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."

They're the same explanation. Whether he called it that or not, the article author was writing entirely about PageRank. His theory is all about links (and the incorrect assumption that being on the same domain is a link). His article is about PageRank, and to pretend otherwise is deceptive.

This whole discussion is about the obscurity of Google's search result ordering, and how people are taking advantage of it (such as in the URL), and here you come coming to save the day and explain how the PageRank algorithm explains it all away. Oh, but wait, no it doesn't...it's only one of the many mysterious machinations at Google...Pick an argument and stick with it.

I have one and only one argument. You're just not bright enough to understand it. The argument, again, is: The links to a site are not the only criteria determining a page's placement in search results. It's not even the first criterion! (Go read this paper, and you'll see that Google does text ranking, then factors in PageRank.) That guy's entire theory is based on a flawed assumption.

quit spamming Slashdot with your link

I didn't include a single link in the post you're replying to.

And you're an anonymous, disingenuous idiot. on Modern Day Search Engine Manipulations · 2002-08-14 17:52 · Score: 1

He GAVE SPECIFIC EXAMPLES of the "page-to-page" relationship being disproven

He gave two bogus examples. Two searches on oddball terms where AOL and Geocities pages rank well aren't proof that pages get high rankings from their hosts. I could come up with thousands of instances where AOL and Geocities pages don't appear in the top 10.

Do a search on virtually anything, and a good portion of the results will come from aggregate sites: DO SOME TESTS, MORON.

I've probably done a great deal more tests than you or the guy who wrote that article.

Random Test #1: 24,200 results from AOL for the word "beer", but the top ten results for "beer" include no AOL listings.

Not-As-Random Test #2: 22,100 results from AOL for the word "Ford", but the top 10 results for "Ford" include no AOL listings.

Even-Less-Random Test #3: 3,020 results from AOL for "Britney Spears", but not a single result from AOL in the top 10 results for "Britney Spears" that the original article fixated on.

Are any of you Anonymous Cowards getting the point? AOL has so many members that there's probably an AOL page for almost any subject you can think of. If Google ranked pages high just for being on AOL, AOL would be topping every search. It doesn't. His theory is disproven.

This guy built a theory on a few examples, and didn't even bother to try to disprove his hypothesis. He didn't even spot the clue in his "Britney Spears" search. He's a nitwit. He doesn't know how to collect evidence and he doesn't understand two key facts about Google:

1) Google does text analysis, too, and it does it before taking PageRank into account. That page that tops the "Ford transmission" search really is entirely about Ford transmissions. It probably scores well in text analyses.

2) Not all searches are equal. If you use combinations ("ford transmission") or obscurer terms ("mantid"), it's easier for a small page to place high, because there's less competition overall.

Gee, it's not like something could have changed in 3 years.

I didn't say nothing has changed in three years. I said there's enough in the published papers to show that pages don't get their PageRank from being on AOL.

Many people are still unsure what the effects of domain names are in Google rankings, yet clearly they have a profound effect.

Hmm, let's see: If I wanted to know the effects of domain names in Google, where would I look? How about the very paper cited in the WebmasterWorld discussion you linked to? It says, among other things:

There are two types of hits: fancy hits and plain hits. Fancy hits include hits occurring in a URL, title, anchor text, or meta tag. Plain hits include everything else.

So, again, the Google team said back in 1998 that keywords in the URLs (and thus, keywords in domain names) are more significant than keywords in plain text. It's amazing what you can learn when you do the background reading.

PageRank is NOT A PUBLISHED ALGORITHM [webmasterworld.com].

You know, typing in all-caps doesn't actually make you smarter. There are actually two papers that have the basic algorithm published in them. It's been tweaked since then (probably, mostly by tweaking the dampening factor on pages flagged as problematic by other algorithms), but it's still the same basic equation.
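For anyone who hasn't read the papers, the basic equation from the 1998 Brin and Page paper is PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where T1..Tn are the pages linking to A, C(T) is the number of links going out of T, and d is the damping factor (that paper mentions 0.85 as a usual value). Here is a bare-bones Python sketch of iterating it over a toy link graph. The function name, the fixed iteration count, and the treatment of pages with no outgoing links are choices made for this example, not a claim about what Google runs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterate the basic PageRank equation over a small link graph.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    rank = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the (1 - d) baseline.
        new_rank = {page: 1.0 - damping for page in pages}
        for source, targets in links.items():
            if not targets:
                continue  # a page with no outgoing links casts no votes
            # A page's vote is split evenly among the pages it links to.
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

Note that the graph is page-to-page: nothing in the equation knows or cares which host a page lives on.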

(While we're at it, if you actually read Webmasterworld on a regular basis (instead of pulling that link out of Google, like I'm sure you did), you would know that not a single sane, professional webmaster in the world believes that AOL sites get the gigantic boost that the original article author claimed. There's been a gigantic amount of research done on this subject by a considerable number of professional search engine optimizers, and every single one disagrees with this nitwit.)

Re:not right on Modern Day Search Engine Manipulations · 2002-08-14 14:30 · Score: 1

As the article stated, the empirical evidence clearly shows that obscure little pages earn the base ranking of their encapsulating site (there was a mantids example as well as a Ford Transmission example : Neither of which can be reasonably explained by actual links to them)

That's not empirical evidence; that's two anecdotes and a flawed premise. PageRank is not the only criterion for ranking pages. Pages get ranked by text analysis first, then PageRank. Those pages are probably ranking high because they've got good text scores on the subjects, either because they use the keywords on the page, or other sites link to them using the keywords.

If you use the Google toolbar to check the mantid search results, you'll actually see lower-PR sites in the results ahead of the page this guy is fixating on. That is empirical evidence that PageRank isn't everything, and that high-ranking results aren't necessarily the highest-PageRank pages, especially for obscurer search terms.

In fact, it's not a big deal to have brand-new pages, with only one or two incoming links, land in the top 30 results at Google, especially with oddball searches like "ford transmission". (I did it last month for the term "lens filters".) In situations like that, it only takes a few good links (one of the transmission page's links is from About.com) to push a page into the top 10.

AOL has the largest userbase in America. If all it took to hit the top of search results was an AOL page, AOL would be the top of every search!

My guess is that it's a problem with IP numbers. on Modern Day Search Engine Manipulations · 2002-08-14 14:15 · Score: 1

While Google gives Shavlik extra bonus points for those looking for Britney Spears,

Google doesn't give points for "looking", it gives rank for "linking". Shavlik has a high PageRank because lots of people have linked to it (or Google thinks they've linked to it).

it seems likely that they probably also apply those bonus points for any other search as well : i.e. If Shavlik puts up a page on monkey mating, they'll start off with a very high score due to their Britney Spears earned bonus.

Yes, a page's PageRank affects its ranking in all relevant searches, but the point is that Google shouldn't be showing Shavlik in a search for "Britney Spears" at all, unless it has indexed some text that associates Shavlik with Britney Spears. Google results are a "two-pass" system: Google analyzes text to find which pages are relevant to a search, then uses PageRank to finish ranking them.
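Here is the "two-pass" idea as a toy Python sketch. The real text scoring and the way Google combines it with PageRank aren't public in this detail, so the term-frequency score and the multiplication below are assumptions made purely for illustration, as are the function and parameter names.

```python
def two_pass_rank(query, pages, pagerank, limit=10):
    """Rank pages for a query: text relevance first, then link score.

    pages maps URL -> page text; pagerank maps URL -> link score.
    """
    terms = query.lower().split()
    if not terms:
        return []
    scored = []
    for url, text in pages.items():
        words = text.lower().split()
        # Pass 1: text relevance. Pages missing any query term are
        # dropped, no matter how high their link score is.
        counts = [words.count(term) for term in terms]
        if not all(counts):
            continue
        text_score = sum(counts) / len(words)
        # Pass 2: weight the surviving pages by their link score.
        scored.append((text_score * pagerank.get(url, 0.0), url))
    scored.sort(reverse=True)
    return [url for _, url in scored[:limit]]
```

Under this model a high-PageRank page never shows up for a query unless the first pass finds text tying it to that query, which is why the interesting question is where the "Britney Spears" text came from.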

In fact, looking at Google's cache of Shavlik's home page, we can see that Google thinks other sites are linking to Shavlik using the words "Britney Spears". That is why Google is associating the site with Britney Spears.

There is some anecdotal evidence that Google's robot (Googlebot) can get confused when IP numbers are reassigned. Googlebot caches IP lookups longer than normal: If an IP number gets reassigned from one domain to another, Google (temporarily?) thinks both domains are the same site, and mixes up their listings. Given that shavlik.com is being confused with a defunct domain, it may have accidentally inherited the IP address of britneyspearsnow.com.

(I have now used the words "Britney Spears" more in this post than I have in any conversation in my life.)

Nitwit didn't do the reading. on Modern Day Search Engine Manipulations · 2002-08-14 13:57 · Score: 0, Flamebait

He says:

I do not have access to Google's page ranking technology, and apart from some partial details on their site, they keep their ranking techniques tight lipped to avoid intentional rank manipulating. As such, everything I say in this article is purely speculative based upon analysis of search results for various terms and phrases

No details? They published the algorithm in 1999! If he looked it up, he would have understood PageRank is a page-to-page relationship (not site-to-site), and avoided the idiotic statement "Is it really a democracy that every page on these megalinked aggregate sites become premiere voices of their topic?".

Apparently, this moron didn't even search Google -- the paper is the third result for a search on "PageRank". Why are we taking search engine advice from some imbecile who doesn't use search engines?

The Google team publishes more inside information than any search engine. There's a whole ODP category for Google research papers.

To put that into perspective, there are some 750 pages dealing with mantids that are linked from Google, and that limit is simply because that's the maximum results that Google will return for a particular search term.

That's not even true. Google will return up to 1000 results in a search. Can this guy even count?

There are a lot of better resources about Google on the Web. Why did Slashdot go with this guy?

Re:disposable == cheap? on Hop-On Hops Back On the PR Bandwagon · 2002-07-29 13:31 · Score: 1

Will the signal strength be any good, especially in rural areas, with these disposable phones?

Hop-On's phones (like those of most of the budget networks) are digital only, meaning reception will be nonexistent in most rural areas. The digital networks are still concentrated in urban areas and highways connecting them. (Serious nationwide coverage still requires a dual-band phone that can switch to analog mode in the country.)

Since Hop-On is planning to buy access to an established network (probably Sprint or Verizon, according to the CNN article), signal strength and coverage probably won't be much different than the major digital carriers. Just don't expect it to work in the middle of the desert.

Re:Jurisdiction? on Latest UDRP Stupidity: Unix.org, Canadian.biz · 2002-07-11 04:19 · Score: 1

The UDRP has no built-in appeal process. One go-round is all you get. On the other hand, Paragraph 4, Section K says the UDRP doesn't negate your rights to use regular courts. That's pretty much just stating the obvious: ICANN isn't really a government, you know.

In that ICANN lacks an army or a police force of its own, national governments have a fair amount of authority. While they can't necessarily make ICANN do what they want, they can exert legal control over registrants, registrars, and registries within their jurisdictions.

So the Egyptian government would have potential authority about any case involving an Egyptian resident (as complainant or respondent), and a lot of control over the .eg top-level domain. (Worth noting: Most of the ccTLDs aren't using ICANN rules for domain disputes yet. ICANN is trying to get everyone on board with the UDRP, which many ccTLD operators see as an American powerplay.)

In the United States, the Anticybersquatting Consumer Protection Act lays out the laws about domain disputes, and can be used to challenge UDRP decisions in court. Since the com, net, org, and biz registries are US-based, U.S. courts are now the main venue of appeal for UDRP decisions.

Re:Government challenge? on Latest UDRP Stupidity: Unix.org, Canadian.biz · 2002-07-11 03:53 · Score: 2, Interesting

Possibly a poor choice of examples. Anheuser-Busch actually does claim to have exclusive rights to use the words "king" and "kings" when referring to beer in the United States. They smacked down a beer store in Oregon this year for using the phrase "Beer of Kings" in an advertisement. As the market owner puts it in the article, "Basically, she [AB's rep] told me that anything to do with beer and kings, they owned".

(In Europe, "Beer of Kings" is actually the centuries-old slogan of Budějovický Budvar N.P., the Czech brewery that Anheuser-Busch stole the name from in 1876. Budvar didn't officially give AB permission until 1911, meaning Anheuser-Busch built its empire on trademark infringement. It's a small irony, but a painful one.)

That's a completely bogus question! on What's It Like to be Google's Boss Techie? · 2002-06-20 09:43 · Score: 1

You don't understand how Google works at all, and you've created a completely useless hypothetical situation. Submitting that question to Google is a waste of time.

Google doesn't rank "sites", it ranks "pages" (URIs), and it doesn't rank them according to "hits". Google ranks a page according to the number (and type) of links to the page, combined with analysis of the page content.

In fact, topic-based pages often beat "news" pages in Google results because they're more stable. Pages that stick around longer generally acquire more incoming links.

Re:opting out on The Wayback Machine, Friend or Foe? · 2002-06-19 11:12 · Score: 1

Those instructions are seriously messed up. Those META values will keep Google (and at least a dozen other search engines) from including a page in search results. You'll make the web page completely "unfindable" in the major search engines.

Web page authors who want to prevent a page from being archived/cached should use this tag:

<meta name="ROBOTS" content="NOARCHIVE">

That'll stop Google, the Internet Archive, and a couple of other caching sites from caching the HTML file, while still allowing them to index it for other purposes.
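To show the difference between "don't cache" and "don't index", here is a small Python sketch that reads a page's robots META tags and reports what a well-behaved robot would be allowed to do. It assumes the overly broad values in question were NOINDEX-style directives; the class and function names are invented for the example, and each real robot decides for itself which directives it honors.

```python
from html.parser import HTMLParser


class RobotsMeta(HTMLParser):
    """Collect the directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_permissions(page_html):
    """Return (may_index, may_cache) for a page's robots meta tags."""
    parser = RobotsMeta()
    parser.feed(page_html)
    may_index = not ({"noindex", "none"} & parser.directives)
    # A page that may not be indexed will not be cached either.
    may_cache = may_index and "noarchive" not in parser.directives
    return may_index, may_cache
```

With NOARCHIVE alone the result is (True, False): still findable in search results, just not cached.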

(As an aside: "stealing your pages"? Yeah, right. Have your pages disappeared from your server? Is the Wayback Machine claiming they wrote your pages? Get a grip.)

Archiving since September 1996 on The Wayback Machine, Friend or Foe? · 2002-06-19 10:59 · Score: 1

I'm killing 2 quotes with one fact:

"where did they get such old copies of my websites"

and

"I know for a fact that they have pages back at least as far as 1996"

ia_archiver (the bot that collects files for the Internet Archive) was unveiled in September 1996, just a few months after the Archive was founded.

Here's a copy of the original robot announcement from 5 Sep 1996.

Alexa ~= Wayback Machine on The Wayback Machine, Friend or Foe? · 2002-06-19 10:50 · Score: 1

The Internet Archive and Alexa were founded more-or-less simultaneously by Brewster Kahle in April 1996. (I'm really surprised you haven't heard of Alexa. It's old news by now.)

Alexa crawls the web with a bot named ia_archiver as part of their site analysis. archive.org and alexa.com are legally separate organizations, but Kahle runs both, and Alexa still donates a copy of everything they crawl to the Archive.

On the other hand.... on Tech Support Getting Even Worse · 2002-04-29 12:57 · Score: 1

I used to do tech support for a domain registrar that moved its support department from the IT side to the Sales/Marketing side. After years of telling us that "TSRs don't do sales, tell the customers to order online", Sales decided that TSRs do do sales, but didn't bother to give us any sales software (or real sales training). If we convinced a customer to buy a .biz domain, we had to log onto the customer site and use the customer's password to order it for them.

Then, a month later, the company started firing TSRs who didn't meet a sales quota that they never told the TSRs they had. Bastards.

(For the record, I'm not bitter. I'm consumed with hate and anger. My therapist says there's a difference.)

In fact, farther down in the FAQ. on Google Releases Web APIs · 2002-04-12 08:31 · Score: 1

Question 15 of the Technical FAQ is:

15. What if I want to pay Google for the ability to issue more than 1,000 queries per day?

Google is only offering the free beta service at this time. If you would like to see Google develop a commercial service, let us know at api-support@google.com.

So, yeah, they're definitely interested in it if the developer community is interested in it.

Searches coming from Canada are affected, too. on Google Publicizes DMCA Takedowns · 2002-04-12 08:22 · Score: 1

Scroll down. The DMCA notice is at the bottom of the results page.

In theory they would only need to remove the 'offending pages' from the .com results since the DMCA or whatever laws don't apply in other countries... in theory

And the DMCA does affect Google sites outside the U.S., because they're all drawing their results from a database compiled within the United States.

It doesn't matter where the end-user is. American servers == American jurisdiction.

Comments · 78