The Anti-Thesaurus: Unwords For Web Searches
Nicholas Carroll writes: "In the continual struggle between search engine administrators, index spammers, and the chaos that underlies knowledge classification, we have endless tools for 'increasing relevance' of search returns, ranging from much ballyhooed and misunderstood 'meta keywords,' to complex algorithms that are still far from perfecting artificial intelligence. Proposal: there should be a metadata standard allowing webmasters to manually decrease the relevance of their pages for specific search terms and phrases."
as Natalie Merchant says: "Because the night belongs to us"
This sounds like a good plan, but I don't think anyone would be willing to risk having their page show up lower in a search when someone was intending to find it. Plus, anyone who finds the page in a search by accident is just a new potential customer.
t to the hirdpost
NO AC, FIRST POST WITH INTEGRITY
+ Donald Gunth
+ Email: dgunth@quicktek.net
"Caffeine is the greatest lubricant ever created." -ESR
All the other first post attempts can suck it! This is the only first post that counts, mother fuckers!
Oh well, back to dowloading pr0n...
Pr0n K1ng
Just shitlist any site that is obviously reaching for hits? If a porn site has the words "Alan Turing" in its metadata and doesn't mention anything about Turing later in the site, list them as not being allowed to participate in your search.
Hell, an engine that did that would almost be useful.
Ok, now is that the kind with the three spikes on their heads or the kind with the really long tails that can swim? Either way I'd love to have been able to see one of those things face to face.
Oh, wait.... Wrong "aurus".
heldlikesound
Cloud City Digital: DVD Production at its cheapest/finest
Your idea's been tried before. It got lost somewhere down there amongst all the other stupid ideas.
This post in support of tr0ll tu3sd4y. Come on board and join the fun.
Oh well, back to dowloading pr0n...
Pr0n K1ng
Google seems to do a good enough job of filtering out irrelevant responses as it is.
Proposal: there should be a metadata standard allowing webmasters to manually decrease the relevance of their pages for specific search terms and phrases.
Okay, pretend I'm a webmaster. What's my incentive to have my page show up LESS in anyone's search results?!
If someone didn't want my site, why do I care if they get it? And if someone wants my site, I don't want to take any chance with an "anti-thesaurus" that might end up excluding my site!
Well it's not as good/effective an idea as what this fellow is suggesting, but you can have a lot of fun with people based on their Referer fields. for instance, use it to just bounce them back to their queries, or bounce them to a different query (one for porn sites is always fun), or bounce them to a more relevant page, or fuck with them however you like. If you've ever had to set up Apache to block people from linking your images, you already know how to do it.
Wouldn't it be better to put more effort into describing what a site IS about, rather than what it ISN'T?
After all, if you describe your site, a good search engine will use this information well (so you shouldn't get too many erroneous hits). However, if you list your non-words, a bad search engine will just see this list and treat them as keywords!
When I first read this, it seemed like a good idea. However, it quickly dawned on me that this is a solution in search of a problem. How many people are actually complaining about too many hits to their web site?
Please forgive me for mentioning capitalism on Slashdot, but a website that receives many misdirected hits is perfect for targeted marketing. Think of the possibilities: if your web site is getting mistaken hits for "victor mousetraps," sell banner ads for "Revenge" brand traps and make a killing on the click-throughs. With a little clever Perl scripting, determine which banner ad to show based on which set of "wrong keywords" show up in the referer. Companies will pay a lot of money for accurately targeted advertisements. Selling these ads would undoubtedly pay the whole bandwidth bill and probably make a profit to boot.
So no, unwords are not necessary. Unless you're running a website off a freebie .edu connection and aren't allowed to make a profit off of it. Otherwise you're just throwing money away.
~wally
Not such a bright idea to whine about too much traffic on your website and then get a link to your site from a slashdot article.
Mod my comments down. It'll be fun.
If I think that this is just a retarded stupid idea.
The people whose web pages are being thrust to the top of the query lists are the people who are polluting the metadata and other tags for the sole purpose of getting their sites higher in the search lists.
So lemme get this straight: you want all good and honest people (who aren't causing the problem in the first place) to opt out of common searches (which they'd never want to do), and this will thus remove the legitimate entries from the pool of results, returning an even more polluted list from your search engine.
Am I missing something here?
Although there are a few people who would be helped by removing absolutely irrelevant queries, the vast majority would actually suffer if they used this.
If God gave us curiosity
when it realizes that all the TERRORISTS have to do is put the following bit in their HTML: to conceal their web-based activities....
Marking up pages with information about the meaning of the terms on them is the main thrust of the work on semantic web - see http://www.daml.org/ (for DAML - the DARPA Agent Markup Language), http://www.semanticweb.org/ (One of the main information sources) and finally the new W3C activity on the subject: http://www.w3.org/2001/sw/.
How far, how fast it will go is another matter but there's certainly a lot of interest in creating a more "machine readable" web.
.sig
you fuck him, not me. i'm not like that. fag
The main power technique, at least on google, is utilizing quotes and AND/OR to limit search results. Rather than spewing a line of text, enclosing specific "phrases" often gives more accurate results.
Then again, I have been able to simply cut n' paste error messages into the groups.google.com form and immediately receive accurate, useful hits. I think that though the internet and webpages are generally disorganized and uncentralized, an outside entity can impose order given enough bandwidth, time, energy and intelligence. In the future, web services, probably based on CORBA and SOAP, will allow sites to return messages to searchers or indexing services, thus doing away with a lot of the mystery in the current system.
All that said, I have had excellent luck with google finding about 95% of all the information I have searched for in the past couple months, showing that a well-written spider and intelligent classification and rating can circumvent the problem of so much untagged, nebulous information.
The internet is something like the world's largest library where anyone can insert a book and random organizers may (if they wish!) go through and make lists, hashes and indexes of the information for their own card catalogs. Right now, each search service maintains its own separate list! The crawler is like a super-fast librarian who can peruse the book. The coming paradigm will be fewer, more accurate and useful catalogs along with books that "insert themselves" into these schemes intelligently and discreetly after a validation of informational content.
I reckon his site can handle the superfluous hits.
My friend found that one of the highest things people were finding his webcomic by was "Digimon Porn"... And his comic has no "digimon" or "porn" about it...
With all the terabytes a day coming into the Wayback Machine (http://web.archive.org), plus the tons and tons of stuff they have from ancient times (as far back as 1996!) it would be awesome if it were searchable. Even some kind of mundane type of search. Sure, Google's index is great, but this blows Google way out of the water. I've found sites in there I made in middle school and never wanted to see again, but data is data.
NerfOnline - Because Nerf Guns aren't just for kids -
[nt]
For example, if I'm looking for info on a Toyota Supra and too many Celica-related pages come up, I'll type:
toyota supra -celica
On a related note, does anyone feel that Google's built-in exclusion list of universal keywords (a,1,of) is really aggravating when Google excludes those words in phrases?
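As a rough illustration of the exclusion operator used in the query above, here is a minimal Python sketch of how a search tool might handle a leading minus sign. The function names `parse_query` and `matches` are made up for this example, and real engines tokenize and rank far more carefully than this.

```python
def parse_query(query):
    """Split a query into required terms and excluded terms ('-' prefix)."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            required.append(token)
    return required, excluded


def matches(text, query):
    """True if text contains every required term and no excluded term."""
    words = set(text.lower().split())
    required, excluded = parse_query(query)
    has_all = all(term in words for term in required)
    has_none = not any(term in words for term in excluded)
    return has_all and has_none
```

With this sketch, `matches("Toyota Supra turbo specs", "toyota supra -celica")` is true, while a page that also mentions the Celica is filtered out.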
I just heard some sad news on talk radio - open source hero Mike Bouma was found dead in his San Francisco home this morning. There weren't any more details. I'm sure everyone in the Slashdot community will miss him - even if you didn't enjoy his work, there's no denying his contributions to the open source comunity. Truly an American icon.
If you replace <meta name="keywords" content="mickey mouse"> by <meta name="nonwords" content="bestiality mouse-fucking zoophilia kinky ....">, you might draw more Disney lovers and fewer perverts to your site, but I suspect your HTML file will grow quite a lot bigger ...
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
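For what it's worth, here is a minimal Python sketch of how an indexer could honor such a tag. It assumes a hypothetical `nonwords` meta tag modeled on the existing `keywords` one; no such standard exists, and the class name `MetaTerms`, the function `relevance`, and the scoring rule are invented for illustration only.

```python
import re
from html.parser import HTMLParser


class MetaTerms(HTMLParser):
    """Collect terms from <meta name="keywords"> and the hypothetical
    <meta name="nonwords"> tag."""

    def __init__(self):
        super().__init__()
        self.terms = {"keywords": set(), "nonwords": set()}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in self.terms:
            content = (attr.get("content") or "").lower()
            # Accept comma- or whitespace-separated term lists.
            self.terms[name].update(t for t in re.split(r"[,\s]+", content) if t)


def relevance(html, query_terms):
    """Toy score: one point per matching keyword, zero if any unword matches."""
    parser = MetaTerms()
    parser.feed(html)
    score = 0.0
    for term in query_terms:
        term = term.lower()
        if term in parser.terms["nonwords"]:
            return 0.0  # the page has opted out of this term
        if term in parser.terms["keywords"]:
            score += 1.0
    return score
```

An engine could just as well demote rather than exclude; returning zero is only the simplest choice for a sketch.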
Not only could it be used to make some pages better, but it would also be interesting to see how it would dumb down legal jargon such as laws, to see if the average person can read them without banging their head against the wall repeatedly over a parking ticket.
From: frankie3327@aol.com
To: staff@cs.here.edu
Subject: help!
i have a lexmark 4590 and it wont print in color.
it only makes streaks. also the paper always
jams. how do i fix it? please reply soon!
The senders never had any connection to the college or the department. We'd reply telling them we had no idea what they were talking about, and that they should seek help elsewhere. It was rather annoying.
We eventually figured it out. The department web site maintains a collection of help documents for users of the systems. One of them talked about how to use the department's printers, what to do if you have trouble, etc. At the bottom it listed staff@cs.here.edu as the contact address for the site.
You've probably guessed it by now. That page came up as one of the top few hits when you searched for "printing" on one of the major search engines (I forget which one). Apparently lusers would find this page, notice that it didn't answer their question, but latch on to the staff email address at the bottom, as if we were an organization dedicated to helping people worldwide with their printers. Furrfu!
I think we reworded the page to emphasize that it only applied to the college, and we haven't received any more emails lately. But if we could have kept search engines from returning it, that would have been even better. Since in our case the page was intended for internal use, we don't care whether anyone can find it from the Internet. Our real users know where to look for it.
So in answer to your question: When a search engine returns a page that doesn't answer the user's question, the user will often complain to the webmaster. That's a clear incentive to the webmaster not to have the page show up where it's not relevant. Also, it's not the goal of every site simply to be read by millions of people; some would rather concentrate on those to whom it's useful.
So I would suggest that he could think about checking the referrer, as this site is showing, and maybe direct all users that come from a search engine to a page where he offers a search engine that is limited to his site. Since the referrer also includes the whole search string, he could maybe even use it to fill out his search form.
I would even prefer this method because it often happens to me that I enter a site via a link from a search engine and then find out that the result page is just a part of a frameset and is missing properties like Javascript variables. If I redirected search engine users to a defined starting point on my site they would have less trouble (Don't start a discussion about the sense and use of frames here :-) )
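The referrer trick described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the query parameter names in `QUERY_PARAMS` are guesses (each engine uses its own), and the `/search` path for the site's local search form is hypothetical.

```python
from urllib.parse import urlparse, parse_qs, urlencode

# Assumption: parameter names search engines commonly use for the query.
QUERY_PARAMS = ("q", "query", "p")


def search_terms_from_referer(referer):
    """Return the search string embedded in a Referer URL, or None."""
    if not referer:
        return None
    params = parse_qs(urlparse(referer).query)
    for name in QUERY_PARAMS:
        values = params.get(name)
        if values:
            return values[0]
    return None


def local_search_url(terms, path="/search"):
    """Build a URL that pre-fills the site's own search form (hypothetical path)."""
    return path + "?" + urlencode({"q": terms})
```

A request handler could call `search_terms_from_referer` on the incoming Referer header and, when it returns something, redirect the visitor to `local_search_url(...)` instead of the bare frame page. The same extracted terms could drive the banner selection idea mentioned earlier in the thread.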
THE OFFICIAL TACO-SNOTTING FAQ
By The WIPO Troll, $Revision: 1.9 $
What is "Taco-snotting?"
Good Lord. And what is a "Circle-snot"?
Ewwwww. Why have I been receiving emails from CmdrTaco asking me if he can Taco-snot me?
I can't stop receiving these emails from CmdrTaco!?
Have you ever been Taco-Snotted?
That's horrible. Does "Taco-snotting" have anything to do with CmdrTaco's "special taco"?
Does Jon Katz get involved in any of this? I thought he was a paedophile, not a homosexual.
What's that screaming I hear coming from your basement?
No, thanks. I'm already CmdrTaco's boi toi.
________________________________________
READER COMMENTS
by egg troll on 2001.11.18 22:27 (#2582054)
Having masturbated *twice* to this post, I'm still incredibly aroused! Come over for a Taco Snot. I'll be wearing my crotchless Clifford the Big Red Dog outfit!!
For more info check out this
by Anonymous Coward on 2001.11.18 12:03 (#2580822)
add more links to goatse and to cowboineal's site to make it better. a link to rotten.com would be nice too
by Anonymous Coward on 2001.11.18 12:18 (#2580832)
and a link to michael's site and to jon katz's site if he has one and homo's site. i dont know what else to say. maybe a few links to phallic.org they have nice penis pictures! a link to the planet quake site or whatever. really make the reader feel this faq really answers their questions. oh yeah, and when you talk about cmdrtaco snotting you, say he brought you to "orgasm after sweaty orgasm". describe it more is all i'm saying. and use more italics and bolding! and when you talk about jon katz shitting or whatever have a link to fecal japan on rotten.com
other wise a great job wipo troll! keep up the good work!
by Wil Wheaton on 2001.11.18 6:41 (#2580438)
Hi. Let's be buddies.. butt buddies.
--
WIL WHEATON DOT NET
by dead_puppy on 2001.11.18 5:33 (#2580342)
Here is an e-mail I received a week ago:
From: malda@slashdot.org
To: puppy_dead@hotmail.com
Subject: were where you last friday?
I thought we where supposed to meet at Backdoor's at 8-ish, sugar-lips? You could've at least told me that you could'nt make it! I was even in my favorite pink skirt for you, honey-cup... next time, you could be more considarite and tell me you cant come... bastard.
--
CmdrTaco (malda@slashdot.org)
You finding Ling-Ling's head?
by Big_Ass_Spork on 2001.11.18 4:53 (#2580300)
I do it wrong
Laying here in the shadows of my room, I squint up at my love. My Ms. Portman. I am sore and tired after fucking her for eight solid hours. My chapped and aching dick is soaking in grits to relieve the pain. She gets on her knees and starts lapping the grits up out of the bowl. She places her beautiful hands on my penis and starts to lick the grits off my achy piece.
Massaging my nutsack she....
WAIT, I DO IT WRONG!!!!
Yanking my dick out of her mouth I throw her to the ground and shove it in to her gaping freshly fisted ass. [goatse.cx]
"OH BIG ASS SPORK!! Fuck my ass, fuck my ass good. DEEPER, my stallion, deeper!! Make a Beowulf cluster of sperm on my back!!"
"Imagine a Beowulf cluster of this baby!"
I DO IT WRONG!!!!
---
All your Sporks are belong to Big_Ass_Spork! What you say?! All your Sporks are belo... forget it...
by j0nkatz on 2001.11.17 22:54 (#2579596)
I just heard some sad news on the radio -- famous queerbait Rob Malda was found dead in his Holland home this morning. The details were a bit hazy, but it seems that he drowned in jizz while Taco Snotting his friend Hemos. I'm sure everyone in the
I wanna Open Source sex so it won't be worth a shit either.
by Anonymous Coward on 2001.11.15 6:38 (#2567601)
No no no, the correct term for that is "donkey-punch". I have eye-witnessed this amazing eye-popping event demonstrated on unsuspecting hose-monsters by my frat brothers in the past.. .
by AbsoluteRelativity on 2001.11.15 5:31 (#2567457)
The WIPO Troll
Slashdot and the Karma Lottery - News for uber monkeys, by uber monkeys.
by Anonymous Coward on 2001.11.13 9:27 (#2557632)
Oh, man that's just sick !
by Anonymous Coward on 2001.11.13 9:03 (#2557604)
TELL ME WHERE I CAN GET AN ANONYMOUS proxy please WIPO Troll. Maybe later i will join you in a snotting at my place.
by vikool on 2001.11.13 7:43 (#2557495)
what is this bull shit,i feel offened that some people feel so so senseless to post stuff like these esp when such a tragic incident has occured
by I.T.R.A.R.K. on 2001.11.11 22:38 (#2551890)
Where the fuck do I sign up?!
- I throw rocks at retarded kids
"Adequacy.org: Where congenital stupidity is not an option, but a requirement."
by Anonymous Coward on 2001.11.11 21:53 (#2551753)
this shit is hilarious..keep up the good work.
by rockwood on 2001.11.11 21:49 (#2551746)
OMG! That is the most disgusting thing I have ever heard! WHo in their right mind would sit down and waste the time to construct such a replusive story. I guess I'll be skipping lunch and dinner today.. and possibly tomorrow also. The game doesn't affect reality. Reality affects the game.
by Anonymous Coward on 2001.11.11 14:43 (#2550701)
dude, this is crap-flood material if i ever saw it.
duuuuuuuuudddddddddddddeeeeeeeee.
by Anonymous Coward on 2001.11.11 8:16 (#2550266)
horny_rob_6969@hotmail.com
Ah, so that's what the alt.binaries.pictures.erotica.horny-rob newsgroup is about!
by egg troll on 2001.11.11 5:34 (#2550024)
+5, Arousing
For more info check out this
by Anonymous Coward on 2001.11.11 4:39 (#2549891)
WINNER>
by Anonymous Coward on 2001.11.11 4:37 (#2549887)
I love you. Why do you use your bitchslapped account, rather than signing up for a new account to post at +1 before getting bitchslapped by the censors here? I guess I should speak for myself, but I don't want to log out and lose all my slashdot customization properties, nor do I want to lose my 50 karma yet.
by Anonymous Coward on 2001.11.09 9:19 (#2542412)
you fucking rock! right down to the expanded cvs id!
WIPO trolls > linux
________________________________________
J. Wipo Troll, Esq.
Crapflooder Associates
Slashdot.org
Someone quick!, I have a program due in PROLOG in about 5 hours!
ok, I just need to convert a string to all caps so I can compare it to its reverse (simple palindrome program)
I've gotten everything to work except converting the string to all caps, or all lowercase, or finding a caseless compare statement. Any 1 of the 3 will work and save my ass.
Thanks for the help!!!
... you could just get people to switch to Google instead.
On my idea notepad I said this:
"Technique to negate words in a document for increased searching. For instance, include files that cause a phrase like 'How we converted to XHTML 1.0' to show up on every page. Only the page with actual information, should show up in search, not every page with the include file."
[news for me, stuff that doesn't matter]
To further clarify, search engines should search for patterns of words which indicate a phrase is being over-used. May be very difficult, but I think recognizing include files/libraries might be feasible.
[news for me, stuff that doesn't matter]
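One crude way to approximate the include-file idea above is to treat any line that appears on most pages of a site as boilerplate and leave it out of the index. The Python sketch below assumes pages are plain-text strings and uses an arbitrary 80% cutoff; the function names and the threshold are illustrative choices, not anything a real engine is known to use.

```python
from collections import Counter


def boilerplate_lines(pages, threshold=0.8):
    """Return lines that occur on at least `threshold` of the given pages."""
    counts = Counter()
    for text in pages:
        # Use a set so a line counts at most once per page.
        counts.update({line.strip() for line in text.splitlines() if line.strip()})
    cutoff = threshold * len(pages)
    return {line for line, n in counts.items() if n >= cutoff}


def strip_boilerplate(text, boilerplate):
    """Drop boilerplate lines from one page before indexing it."""
    kept = [line for line in text.splitlines() if line.strip() not in boilerplate]
    return "\n".join(kept)
```

Note the limitation: the page with the actual information only stays findable if its own body mentions the topic outside the shared include.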
Extensions: Unless you are modifying the java interpreter, even the 'core' libraries (on my platform, anyway) must be in the classpath. So 'extending' the language consists of putting a jar file in the classpath. C# has the same thing, called the global assembly cache. - now, before you say, yes, but you have to add a reference to it, I want you to remember that you have to reference every assembly you use, including System.dll - there is a (customisable) set of references appended by default by the c# compiler.
Dynamic class loading: you skip over Reflection everywhere, as far as I can see, and here is no exception: I have written an app that finds all the .dll's in a directory, instantiates each class in those dll's that implements an interface or has a certain (custom) attribute, and then calls methods and responds to events from those classes. It is possible, using reflection's emit classes, to have your code write those classes before calling them. I have used this same thing to accept url's of web services to call them dynamically (for testing). How is it possible you missed something so major to the language? (check out Assembly.Load(), Object.GetType(), and Type.Invoke..)
It makes me wonder if I can trust the research done on the rest of the article. Thanks for the effort, much of it is very well written... but if I can't trust it all, it's not much use to me.
Sincerely, Mike Bouma
"Officials acknowledge that there are very few examples of terrorists actually using public records to glean sensitive information, but they say that the terrorist attacks prove the need for extraordinary caution."
"We have to get away from the ethos that knowledge is good, knowledge should be publicly available, that information will liberate us," said University of Pennsylvania bioethicist Arthur Caplan. "Information will kill us in the techno-terrorist age, and I think it's nuts to put that stuff on Web sites."
"Indeed, chemical and water industry groups are lobbying the Bush administration to curtail regulations providing public access to the operations of public facilities, data that environmentalists say are critical to ensuring safety."
I use filenames all the time on google to find what I want. Sometimes I get lucky and find the file in a directory, with many other files related to the files I am looking for. Another added bonus is I don't have to wade through annoying banner ads or popup windows.
If someone wants to commit a violent act, they can easily succeed WITHOUT a "how to" manual. They may not get away with it but that hardly matters if the violence results in deaths.
Take away documentation on bridges, buildings, weapons and whatever you want. They'll ALWAYS figure out another means of attack that wasn't considered.
In fact, the current state of affairs can be considered a side effect of their attack that the terrorists probably hadn't considered but is surely welcome news to them regardless. Terrorism has infected America and its effect is spreading from within. Terrorists attack our way of life. We'll destroy our way of life by trying to protect ourselves from another such attack.
How about this: Let's just completely dispose of the Bill of Rights, right now, in the name of national security! I mean, really, we may all die because of the freedoms it allows. Do away with freedom and we'll live forever. Freedom isn't all that it's made out to be anyway. Take Cuba and China for example. They're wonderful places to live. All the people throughout history that died fighting for their freedom must have been idiots, huh? The people that died for America's freedom and ultimately the Constitution and Bill of Rights. What a waste when all they've done is ensure our death at the hands of someone that has learned to build a bomb from publicly available information.
I prefer to die free, fighting for freedom, than to "live" shackled and bound.
The problem isn't information availability. The problem is how we treat each other that can infuriate someone to the point of hatred.
Given a particular word on a particular website, it's fairly easy to decide if it's relevant or not. How? By looking for links to that website from other websites which mention the same word. That's the idea behind Teoma and a number of other search algorithms. Sites which "unintentionally" get hits for unrelated topics simply don't register on these engines. Link analysis provides much more accurate metadata, because it's based on other people's opinions.
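To make the link-analysis idea concrete, here is a toy Python sketch: a page's relevance for a term is the number of other pages that both link to it and mention the term. This is not Teoma's actual algorithm (which the source does not describe); the data layout and the function name `link_relevance` are assumptions for illustration.

```python
def link_relevance(pages, target, term):
    """Count pages that link to `target` and also mention `term`.

    `pages` maps a URL to {"text": str, "links": list of URLs}.
    """
    term = term.lower()
    score = 0
    for url, page in pages.items():
        if url == target:
            continue  # a page's own wording does not count as an outside opinion
        mentions = term in page["text"].lower().split()
        if mentions and target in page["links"]:
            score += 1
    return score
```

A site that only mentions a word itself, with nobody else linking to it in that context, scores zero here, which is the behaviour the comment describes for unintentional hits.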
Another problem with metadata in general, of which spam is but one symptom, is the fact that creators of content often have no idea of how their content appeals, or fails to appeal, to other people. Did Mahir have any idea that his name would become a top-ranked search term? Does anyone have any idea how his content should be ranked for a given search term (besides number one, of course)?
What is the number one piece of metadata found in spam messages? This is not spam.
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
We presume to live in an open society, where both personal and information freedom are allowed and valued. We must all simply accept that the risk to such a society is that those freedoms may be abused by some.
The only remedy is everyone's constant vigilance that those freedoms are taken advantage of OR unduly restricted, if we want to maintain our open society. It's not a cliche, it's a truism.
The approach of restricting such public infrastructure information is hopeless. It cannot work. A few years ago one of the major scientific journals looked at possible vulnerabilities to terrorist attack in North America. They came up with a laundry list of infrastructure weaknesses that would be both crippling and impossible to defend against without restructuring the entire continent and imposing a security-obsessed state.
My favorite was the electrical grid. A key type of transformer - of which there are not many in demand and thus are not easily replaced - apparently normally sits out in the middle of nowhere with only a chain-link fence protecting it. One person with a deer rifle and a single bullet could destroy it.
Such a risk is impossible to reasonably defend against at this point. You just have to accept and realize it exists.
On a related subject, I've been looking for a domain name that is a) easy to remember and b) does not generate a zillion hits if you type the name in a search engine. (and c) is not a silly long string of words).
:o(
;o)))
It's funny how most people think that common word domains are valuable, but forget that a name that jumps out as the only result when typed into a search engine is pretty valuable too. Especially if it is spelled the way it sounds.
Maybe not the best example, but since the four-letter .com names are practically all gone, I was going to register duxo.com. Unfortunately one of the many domain hogs got it the day I was going for it.
I got another one though, but it's not up yet so I won't tell what it is!
Movies, such as "The Dambusters", are clearly usable as training material for further attacks, and must be withdrawn, lest the secrets of how to utterly obliterate the landscape escape the control of the UK and US Governments.
Other movies, such as "Robin Hood: Prince of Thieves", encourage hostility to the duly-established authority. "Star Wars" even does so, openly in the name of religion! Incitement and fermentation of potential dissidents.
Photography from aircraft is used in archaeology to locate buried ruins. The same technique could easily have a more sinister purpose. To prevent terrorists from gaining potentially useful intelligence, all aerial photography needs to be banned!
Super-cooled and/or parallel arrays of computers can perform amazing feats of computation. To protect secret information, there is no recourse but to ban the use of electricity.
Thoughts cannot be monitored, and there are no accurate or reliable methods of profiling. The primary chemical involved in terrorist thinking is oxygen, which must now be reserved for official use only.
ObTrivia: NASA tested a satellite designed to detect intelligent life on other worlds, by pointing it at Earth. The probe returned a definite negative. Might we now consider the possibility that the probe was, indeed, correct?
I hate to see restrictions on information availability, but one must understand that it is the unbalanced distribution of information that gives one entity power over another - Privacy advocates should not expect free access to information...
I believe total, omnidirectional, societal transparency is the way forward, given the existence of surveillance technology - rather than that, we seem currently headed to a "Big Brother" scenario, with a ruling body which has total access to surveillance of the public, but a public with no access to surveillance of the ruling body. This gives far too much relative power to the government.
A trivial example: The CCTV networks that have sprung up all over the country in the U.K. should be real-time public-access. And there should be public-access cameras in police stations too. That way, everybody can watch everyone else. It would also have fringe benefits - supposed to be meeting someone? Check if they're there yet by patching into a monitoring station... I would predict that in such a society, ordinary day-to-day privacy concerns would not be much of an issue - some oddball getting off on watching people use the toilet would also know that he was being watched, and this would make all but the strangest people behave decently... And everyone would know who the weirdos were...
David Brin explores this in his book, The Transparent Society. Chapter one is available online here. I urge, strongly, that people read it before mouthing off on either issues of freedom of information or privacy, since otherwise, they may not be aware of the logical inconsistencies of their position. Don't eat yellow snow
Conspiracy theorists of the world unite! A few bigshots decided to let September 11 to happen, or even encouraged it to happen, in order to pull a quick one on the American public and legalize an incredibly powerful and invasive government reminiscent of Orwell's 1984 while encouraging a stupidity in the public. Restriction of liberties will continue until everyone of intelligence moves to Australia. Sound likely? Timmy
This article was published in the Sydney Morning Herald (an Australian news source). It's an editorial piece regarding the changing face of the Web (which, despite what some say, is still fairly dominated by US content). It details how some popular search engines are chopping and changing information to create 'a morally acceptable view of the world'. Anyone who's read 1984, that book that keeps getting mentioned now, will recognise something called revisionist history.
As for the common "this wouldn't have stopped the 911 terrorists" remarks. So? Are you saying that we should wait to implement such a measure until the terrorists realize that sensitive information is readily available to them directly from the Great Satan itself? Until it's too late? Again? Do not condemn yourself to a purely reactive and therefore inefficient government. One where every policy must be written in the blood of innocents who died because it was not enacted soon enough. The government should be looking at how it "does business" everywhere and reform/restructure areas where that "business" could aid potential terrorists.
The truth is that the current government realizes that instead of sitting on its ass, like the previous administration, government needs to be continuously reforming itself. The government that the current administration inherited was badly in need of reforms that the previous administration had promised in 1992 but never carried through with. Multiple bureaus like the patent office, FAA, DOE, and INS are badly in need of serious reform. And they have been in need of it since long before 9-11.
Now the question is, if some of these regulations are changed to reflect that the "public" may not be safe, what are they going to be changed to? For instance, if the public cannot audit waste and water facilities directly, through what mechanism can they do so? Authorized public auditors? These need to be changes and reforms, not restrictions on the rights of the citizenry. To do such a thing would be a step backwards.
But if we could have kept search engines from returning it, that would have been even better. Since in our case the page was intended for internal use, we don't care whether anyone can find it from the Internet. Our real users know where to look for it.
http://www.robotstxt.org/wc/exclusion.html
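For readers who haven't used the robots exclusion standard linked above, here is a small Python sketch that checks a rule with the standard library's `urllib.robotparser`. The host name, the path `/help/printing.html`, and the crawler name `ExampleBot` are made up for the example; the real page's location isn't given in the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to skip one internal help page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /help/printing.html
"""


def allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Keep in mind that robots.txt is only a request: well-behaved crawlers honor it, but nothing enforces it, and it hides a page from every query rather than from selected terms.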
More hits is almost NEVER a bad thing for a site's main purpose (getting people to see it, and hopefully take an interest in what's there)
For just the same reason as the automotive industry has made clean fuel vehicles standard, and the very way our capitalist world operates. For the time (money) it takes to implement this thing to make the world a better place, the costs can not be substantiated. Granted, if a lot of sites did this, there would be more time for everyone to spend playing with their dog rather than dig through irrelevant search results. But Joe webmaster's company is never going to pay him to do it, and he's not going to spend his free time doing it when he could be spending time with his dog.
That's the way the world is working right now, and people who want to change the world to a better place will probably spend their time doing other things rather than putting unwords in their web documents.
Saving bandwidth, perhaps? For a hobbyist's website hosted cheaply (and thus having a low transfer limit), it might be quite desirable not to attract too many visitors who aren't actually interested in the site's contents. Of course, that's not a very common scenario; good search engines will give such sites a low priority anyway because they're not linked to very often.
The illegal we do immediately. The unconstitutional takes a little longer.
--Henry Kissinger
Definition: What is commonly referred to as "laser eye surgery" actually refers to two separate procedures. Photorefractive Keratectomy (PRK) was the first procedure developed, followed by the newer Laser-Assisted In Situ Keratomileusis (LASIK). There are other procedures as well, but they are much less common than LASIK and PRK. PRK is an older procedure and is rapidly being replaced by LASIK for most patients. PRK involves using a specially tuned laser to burn the surface of the eye until it matches a predefined shape set out by the doctor. Since this procedure affects the cornea (the outer layer of the eye), it is often associated with a great deal of pain. However, the amount of pain varies from person to person, and high-strength painkillers (codeine) will usually be enough for most people. A more detailed explanation can be found here.
The procedure I undertook was LASIK. This surgery involves cutting a corneal eyeflap, and using the same method as PRK to alter the shape of the layer underneath the corneal flap directly. After the re-shaping takes place, the original flap is replaced. Since there are no nerves where the burning takes place, it does not involve any pain. A more detailed explanation can be found here.
Risks: The most realistic risk of the procedure is that you will have to undergo the procedure a second time. Sometimes the doctor will be conservative and under-correct the eyes. In other cases the procedure is successful, but the eyes will regress over time and require some fine-tuning in a second operation. Unfortunately the eyes need to heal before another procedure is done, and there can be several months of wearing interim glasses with a prescription different from the original one. This risk varies depending on your original prescription and the accuracy of the doctor performing the procedure.
The other realistic risk is a slight degradation in nighttime vision. The halo that accompanies bright lights at night will be enlarged for many people, but again it varies from person to person. For the first several months, it may be difficult to drive at night and to read backlit signs. However, this goes away for the vast majority of people after several months.
Another risk is termed "haze" and refers to a slight degradation in clarity, while still retaining good vision. I understand this sounds contradictory, but having experienced this myself, it does in fact occur. You retain 20/20 vision, but everything looks as though it is covered with a thin film; it is similar to a difficulty in focusing. Again, in most cases this does not last more than a few months.
With the PRK procedure, there is risk of infection because the surgery is done to the exposed eye. The clinic will provide anti-inflammatory drops that should prevent most problems. With the LASIK procedure there is a risk of shifting the eyeflap, so it is critical not to rub your eyes for the first two weeks after the procedure is finished.
There are other risks though. In a small percentage of cases, patients will continue to require glasses, often with a different prescription. Also, in very few cases the eyes will not be able to achieve 20/20 vision even with glasses. According to my research, there have been no cases of blindness from anyone undergoing laser eye surgery as of this writing.
For a quantifiable estimation of risks, visit this page. Many clinics will tout results much better than this, so these should be considered very conservative.
Benefits: The main benefit is obvious: you can see! But there are countless benefits that are difficult to quantify or predict before the surgery, so I will list some of my personal favorite post-op benefits.
Sunglasses. Before, I had to have prescription sunglasses made, which meant a purchase of several hundred dollars and a limited choice of style and lens type.
Activities. Water-skiing, scuba-diving, swimming: many sports are possible with glasses, but are much better enjoyed with corrected eyes.
Simple Things. The pleasure of waking up in the middle of the night and reading the alarm clock without fumbling for glasses is enormous.
Kissing. It sounds silly, but it is really great not to poke your partner in the face with cold glasses while kissing.
Others. Not experiencing the panic of losing or breaking glasses or contact lenses.
In short, the feeling of having vision unencumbered by an external device is simply wonderful. And you can't quantify that!
Where to go: If you decide to undergo the procedure, or are trying to decide if it is right for you, choosing your clinic is an important step. Before my surgery I visited three different clinics, and had a radically different experience in each one. The first clinic was The Laser Center (TLC), which actually denied me surgery on the basis that my glasses prescription had shifted within the last three years. I appreciated that a great deal, so two years later, I visited the same clinic again. However, they were undergoing a complicated change of management, so I stuck with the doctor instead of the clinic, and visited The Pacific Laser-Eye Center (Pacific). The price quoted to me by Pacific was $4,000 (CAD) for both eyes, much greater than others in the area.
So I visited a local clinic (which I will leave unnamed) for the purposes of comparison. The difference was outstanding. At the local clinic, the price quoted was $1,900 (CAD), but I was appalled at the business practices. The doctor approved me for the procedure before reviewing my medical history, and when I confronted him on this he derided it as unimportant. The success rate that he predicted was much higher than predicted by TLC or Pacific, which I read as hucksterism, not a genuine prediction. So I chose the more expensive clinic with the doctor that I trusted, even though the cost was more than double the cheaper place. If you are investigating laser-eye surgery, I strongly encourage you to ensure that you choose a reputable clinic. Try to find other people in your area who have had the procedure done, and investigate the various clinics as much as possible.
The surgery: I chose to have the LASIK procedure, because I preferred the personal risk of rubbing my eyes over the external risk of infection. I visited the clinic on a Thursday morning, and arrived bright and early, and very nervous. First the doctor did a final inspection on my eyes to ensure that nothing had changed in the past days since he last saw me. Then the assistants took me into a separate room for eye-numbing drops, where I awaited my surgery. When the room was prepared I went in and lay down underneath the scary-looking laser machines.
At this point I was exceptionally nervous, so they gave me little stress balls to grip. The first step was to cover one eye, and to put a little clamp around the other eye, which felt like a little pinch. This made me unable to blink or move my eyes to any great degree. Then they instructed me to "look at the red light" and the cutting started, as they cut a little circle in my eye around the area. It was quite scary as all of a sudden the little light I was looking at went all fuzzy - the corneal flap was removed! Now the re-shaping was ready to start, and I was strongly reminded to look straight ahead at the red light. Well, at this point my blood was pumping so strongly and I was breathing so hard that they could not continue.
They turned off all the fancy devices, and gave me a sedative, which was great. Thankfully they stopped before I could do any damage, and I felt much better after a few moments. Now they re-attached the eye clamp, and began the re-shaping procedure. The buzzing noise of the laser, and the smell of my own eyes burning, was a bit disconcerting, but it only took a few moments, and they re-attached the flap and patched me up to do the other eye. The second eye went without a glitch and I was off the operating table within perhaps 15 minutes. Afterwards, they did some tests to make sure everything was A-OK, and sent me packing. The entire procedure took only 60 minutes including prep time, my freak out, and the actual surgery.
Results: As I have stated throughout this article, my results were a resounding success. Right after the procedure, my eye doctor reported that I had 20/20 vision. Of course, I was patched up at the time, and was terrified of shifting my eyeflaps... but I could see like a normal person. These results were confirmed the next day, and again the following week, month, and ½ year. I returned to work four days after the surgery, and resumed a normal schedule after a week. I'd do it again in a heartbeat.
Note: This topic has been covered before in a different format on Kuro5hin.org, some other excellent testimonials to laser-eye surgery can be found here, here, and here.
Webmasters, however, should be careful with these new "anti-words", as when they mix with their word counterpart, a gigantic explosion results.
In the old days of the internet, back when it was run by the government, you could literally be expelled from using it if you ever did this. Now it's standard practice, and many schools ban the newsgroups, the very fabric of how the internet got started and a source of valuable learning materials. Why? Well, thank these porn spammers! Boy, does that piss me off more than anything else. Anyway, I think the indexing metadata idea is a good one for web searching. It will make searching for valuable data a lot easier and give AOL users a reason to switch. You might hate AOL, but the users I know who use it say everything is organized right in front of you at your fingertips. No searching needed. If you ever need to search for something specific, you can always find what you need immediately. This is quite difficult with the world wide web unless you know exactly where to look.
http://saveie6.com/
Porn sites who promote (through a variety of means) the words "free, porn, sex" and the like and then demote "pay, fee, membership, credit card".
This proposal will not make the indexing of sites more reliable. If anything it will add to the common confusion associated with meta keywords. Yes, it is quite a nice idea in theory, but I can't see anyone wanting to exclude words from being searched. The main point in the proposal was that the author felt guilty about pulling in people who had entered search terms that appeared on his page. One would ask why he is publishing information on the internet if he doesn't want people to look at it. A better solution would be to get people to use search engines properly. As an example I will use the stalking on the internet term. If people put these words into google and come up with his page, then perhaps they should have modified their query to something like "stalking on the internet", and they may not have found his page. On the other hand, if his page contains the phrase "stalking on the internet", it might be just what the seeker was looking for.
To this proposal I say nay. Or perhaps oink.
Just in case you, the reader, don't understand why this is so ridiculous, let me try and spell it out: It is safe to assume any given website is already not about 99.999999999999999999% of all conceivable topics.
Writing down which topics your site isn't about is about as smart as wearing a nametag of all the names that definitely aren't yours.
Specs and Bugs and Proc
did you have the page disallowed for search engines? if something is for internal use only, you really ought to have dropped in a robots.txt to exclude it altogether.
if more people used robots.txt, a lot of 'only useful to internal users' sites would drop right off the engines, leaving relevant results for the rest of the world...
just a thought......
Screw you all! I'm off to the pub
Surely this kind of issue is what Tim Berners-Lee and the W3C are trying to address with the Semantic Web.
The problem with content on the web today is that while it is perfectly readable by humans, it is incomprehensible to machines. If Tim and Co get their way, and I for one would love to see the Semantic Web catch on, then we can get rid of kluges like the Anti-Thesaurus, HTML meta keywords and the like.
-- "So, what's the deal with Auntie Gerschwitz et all?"
A long time ago (in a galaxy far away) I kept a playlist of my radio show. I had one page per month. One month I played Porno for Pyros' "Pets" twice. Guess which web page in our department had the highest hit count for the next year...
Backups are for wimps. Real men post their data in comments and have slashdot mirror it
Presumably the same could be done for <meta name="keywords"> in HTML.
-- Ed Avis ed@membled.com
In some jurisdictions, you get into trouble if a search engine refers to one of your pages when you enter a trademark (and you are not entitled to use that trademark). This way, you could easily tell search engines not to list your pages when such a trademark is present in the query. Complying with court orders wouldn't be a major problem any more.
However, you could show some information if people visit with a certain Referrer header, directing them to more useful pages. This works in the majority of cases, and it doesn't need much cooperation from the search engines.
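A rough sketch of the referrer idea, in Python. It assumes the search engine passes the query in a `q` URL parameter and that the server sees it in the HTTP `Referer` header (that is how the header name is actually spelled); the function names and the list of unwanted terms are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs

def search_terms_from_referer(referer):
    """Return the lowercased query words found in a search-result referrer URL.

    Assumption: the engine puts the query in a 'q' parameter.
    """
    query = parse_qs(urlparse(referer).query).get("q", [""])[0]
    return query.lower().split()

def needs_redirect_notice(referer, unwanted_terms):
    """True if the visitor arrived from a query containing an unwanted term."""
    unwanted = {t.lower() for t in unwanted_terms}
    return any(term in unwanted for term in search_terms_from_referer(referer))
```

A page handler could call `needs_redirect_notice(...)` and, when it returns true, show a short note pointing the visitor at more useful pages.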
You forgot to mention Jon Katz's "docking" games, where he places his chopper head to head with another chap, and rolls the other guys foreskin over his own circumcised end ("docking"), providing him with fantasies of actually having his own forskin ...
"Making linux GPL was the best thing I ever did" - Torvalds. I'd hate to see the worst thing...
XML is the correct way to develop search engines. You mark up your page with a standard XML-based schema, and
For example, you may enter your car with make, color, year, price, engine size etc. on your own web page.
As long as a standard XML schema exists, I could then search the net, using a standard XML search engine, for all pages with cars for sale falling in a certain price range, certain country/state, colour etc. etc. - no B2B centre needed!
"Making linux GPL was the best thing I ever did" - Torvalds. I'd hate to see the worst thing...
Isn't this what robots.txt is for? You disallow all search engines apart from your own from indexing pages that you don't really think people outside your department will want to see. Think how long it would take to put excluded words into every page of your site when a single line in robots.txt would suffice :)
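For anyone who wants to check what such a file would do before deploying it, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `/internal/` path and the `DeptSearchBot` agent name are all invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the department's own crawler may fetch everything,
# every other robot is asked to stay out of /internal/.
ROBOTS_TXT = """\
User-agent: DeptSearchBot
Disallow:

User-agent: *
Disallow: /internal/
"""

def can_fetch(agent, url):
    """Ask the parsed rules whether `agent` is allowed to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)
```

Note that this only tells well-behaved robots to stay away; it does nothing to hide the pages from people.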
Take for example a search for the string tar, which will yield documents containing:
tar -zxf update.tgz, or cp update.tar update.old, or roofing tar, or jeg tar en øl nu
Each instance of tar above has a different meaning, but the same spelling. When you get into misspellings, spelling variations, and conjugation, then the actual concept is even harder to associate with a given range of strings.
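To make the point concrete, here is a tiny Python sketch of a purely string-based search run over those four snippets. The function name is made up; the matcher is a plain whole-word regex, which is an assumption about how a naive engine might work.

```python
import re

DOCUMENTS = [
    "tar -zxf update.tgz",        # the archiving command
    "cp update.tar update.old",   # a file extension
    "roofing tar",                # the sticky black stuff
    "jeg tar en øl nu",           # Danish/Norwegian verb
]

def string_search(term, docs):
    """Return every document containing `term` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [doc for doc in docs if pattern.search(doc)]
```

Searching for `tar` returns all four documents, because the matcher has no idea which meaning was intended.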
Even Google searches are for strings and not concepts, but Google's ranking algorithm relies on which pages get the most links from pages that also get the most links. However, you'll still get different results for color vs. colour and tyre vs. tire. Because the algorithm only reflects how people have chosen their links, it does, from time to time, give unusual associations. ;)
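The "links from pages that also get links" idea can be sketched as a simple power iteration. This is a textbook-style simplification, not Google's actual implementation; the damping factor of 0.85 and the iteration count are conventional assumptions, and the function name is invented.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified link-based ranking.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Pass this page's score on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

With `{"a": ["c"], "b": ["c"], "c": ["a"]}`, page `c` ends up on top because two pages point at it, and `a` beats `b` because its one link comes from the well-linked `c`.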
Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
2. Some sites have menus on each page listing every topic on the site. You search on a word and get every page in the site returned, including those that mention the topic only in the menu. A tag such as this <nonsearchable> </nonsearchable> surrounding the menus might aid in solving this problem.
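A minimal sketch of how an indexer might honor such a tag, in Python. The `<nonsearchable>` element is the hypothetical tag suggested above, not part of any HTML standard, and the regex approach assumes the tags are not nested.

```python
import re

# Matches one (non-nested) <nonsearchable>...</nonsearchable> region.
NONSEARCHABLE = re.compile(
    r"<nonsearchable>.*?</nonsearchable>", re.IGNORECASE | re.DOTALL
)

def indexable_text(html):
    """Drop everything inside <nonsearchable> regions before indexing."""
    return NONSEARCHABLE.sub(" ", html)

def page_matches(html, term):
    """True if `term` occurs outside the non-searchable regions."""
    return term.lower() in indexable_text(html).lower()
```

A page whose menu mentions every topic would then only match on terms that appear in its real content.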
Unfortunately, these problems are always better solved by stronger search engines. Even though it is several orders of magnitude harder for a search engine to figure out that those things aren't important, it's several orders of magnitude easier to get google to do it than it is to convince 10 million web page maintainers to do it.
Jack Valenti and the MPAA are to technology as the Boston strangler is to the woman home alone
I believe that most search engines would implement this by not indexing those words for that page. It is the only way to do it without increasing the load on the search engine. The other way, no matter how efficiently implemented, would add processing needed to produce results. This means more machines need to be added to the clusters.
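The index-time approach could look roughly like this in Python. How a page would actually declare its unwords is not defined anywhere, so here they are simply passed in alongside the page text; the function name and data layout are assumptions.

```python
from collections import defaultdict

def build_index(pages):
    """Build a word -> set-of-URLs index, skipping each page's unwords.

    pages: dict mapping URL -> (page text, iterable of unwords).
    """
    index = defaultdict(set)
    for url, (text, unwords) in pages.items():
        excluded = {word.lower() for word in unwords}
        for word in text.lower().split():
            if word not in excluded:
                index[word].add(url)
    return index
```

Because the excluded words never enter the index, lookups at query time cost exactly what they did before.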
Very few webmasters complain about users finding their site because of bad search results. Most of them are happy to have traffic.
Most web sites don't have meta tags, but most web designers do want their clients to see impressive hit counts in their traffic reports. Ummm, so who thinks web designers are going to take the time and trouble to add a feature that will decrease traffic?
Oh you capitalist-thinkers. Spare a thought for Geocities/ Hypermart users who have to start shelling out money if they cross a certain hit threshold.
"there should be a metadata standard allowing webmasters to manually decrease the relevance of their pages for specific search terms and phrases."
So, in other words... businesses will want to reduce their exposure on the web? I don't think so.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
Picking out the "irrelevant" words is much harder than creating tags that contain the most relevant ones, which is the main point of meta-tags. Most of us have brains that are trained to pick out what is important, not the opposite, so few people would bother to implement this. Language is hard, computers are dumb and few people have been willing to "explain" language to them to make search smarter. In other words, nothing like this works on a significant scale if much effort has to go into it. Tagging important words can be semi-automated with summarization software, which will accomplish much more in terms of relevancy ranking than tagging the ones to ignore. And by the way, this proposal misunderstands robots.txt. The point isn't to conceal the existence of pages, it is to tell *robots*, not people, to stay away from them. (I'm the owner of the mailing list for it.)
too much ... makes one blind. 8-)
stronger search engines
The more traditional search engines (not google?) have protections against sites that do extreme things to get to #1 in the hit list. They have protections against repeating one word a lot of times (META="sex, sex,sex"). Repeating your "exwords" in the normal meta tag that many times should trigger the search engine's "spam alert" and decrease the search relevance.
There were a couple of interesting papers at the ACM's SIGIR this year that use only the anchor text that points to a webpage to get a description of the pointed-to page, and they could do some cool things like language translations with just that data.
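A toy version of that repetition check, in Python. Real engines do not publish their thresholds, so the limit of two repeats here is purely an assumption, as is the function name.

```python
from collections import Counter

def looks_like_keyword_spam(keywords, max_repeats=2):
    """Flag a comma-separated meta keywords string that repeats a term too often.

    max_repeats is an arbitrary cut-off chosen for illustration.
    """
    terms = [t.strip().lower() for t in keywords.split(",") if t.strip()]
    counts = Counter(terms)
    return any(count > max_repeats for count in counts.values())
```

The `"sex, sex,sex"` example trips the check, while an ordinary keyword list does not.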
Does Google even use metadata? I thought their big thing was external linking.
Those who fail to understand communication protocols, are doomed to repeat them over port 80.
I know of at least one web page that has been very carefully constructed so that search engines won't find it, but people who know what they're looking for will find it easily.
With no subject-specific keywords, however, unless you do know what the author is talking about, you won't have any idea what she's so pissed off about.
No, don't ask: I am routinely pissed off for the same reason, and will not post the URL here.
I wouldn't mind if searches for my name brought up my current web page, rather than the one I had in 1995. But that's another matter.
...laura
Matteo Ricci (he's listed in a bibliography; there is no info to speak of)
While I have occasionally found a source I needed from a hit on a bibliographic entry, one of my pet-peeves, even on Google, is long lists of nothing but bibliographic entries. Usually it's a pretty clear sign that there isn't much on the topic available on the Internet, but sometimes I just need to change my search terms slightly.
But I think nonword is a bad idea. If the website's editors decide to keep a word, and Google's page-rank technology shows it to me, I'm willing to check it out.
Well some docs are here, and the mod_rewrite reference is here.
Here is a goofy example that does a redirect back to their google query, except with the word "porn" appended to it. As an added bonus, it only does it when the clock's seconds are an even number. (Or do the same test to the last digit of their IP address). Replace the plus sign before "porn" with about 100 plus signs and they won't see the addition because each plus sign becomes a space. The "%1" refers to their original query.
Here's another one that checks the user-agent for an URL, and then redirects to it. This keeps most spiders and stuff off your pages since they usually put their URLs in the User-Agent:
Anything you can think of is possible. I think you can even hook it into external scripts.
It's even worse than a lack of incentive to decrease relevance. There's actually a strong incentive not to: advertising.
CPM ads pay the same regardless of relevance. CPC ads tend to pay *even more* for visitors who aren't interested in your content, since they're more likely to click on the ad on the way out.
I googled around a bit and found a Java applet and browser plugin that can do this, but does anyone know of a straight-up IIS service-level configuration method of disabling "image theft," much like the method for apache described in the howto above?
Links to FAQs, HOWTOs appreciated!
For a search engine at a single site, this is very useful. You watch the queries and results. If a page doesn't show up, but it should, you add the search terms to the keywords. If it shows up, but you don't want it to, what do you do? Create an anti-keyword field.
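A sketch of how a single-site search might use both fields, in Python. The record layout (`url`, `text`, `keywords`, `anti_keywords`) and the all-terms-must-match rule are assumptions made up for the example.

```python
def site_search(query, pages):
    """Return URLs of pages matching every query term.

    A page is skipped outright if any query term is in its anti-keywords.
    Extra keywords count as if they appeared in the page text.
    """
    terms = set(query.lower().split())
    results = []
    for page in pages:
        anti = {k.lower() for k in page["anti_keywords"]}
        if terms & anti:
            continue  # the maintainer asked not to match on these terms
        words = set(page["text"].lower().split())
        words |= {k.lower() for k in page["keywords"]}
        if terms <= words:
            results.append(page["url"])
    return results
```

Watching the query log then becomes a matter of adding a term to one field or the other.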
as someone who attends the school in question i can assure you that nobody on the staff is smart enough to create a valid robots.txt file.
funny, really
I don't have mod points right now, but has anyone else noticed this? If you use a wheel mouse under Windows, you do your mod and then "wheel down" to click the moderation button. If you don't remember to click away from the mod box first, you end up giving the poor person a completely different mod than you intended.
Maybe this is only an Opera issue?
Waltz, nymph, for quick jigs vex Bud.
There's a standard evolving, but nobody's using it.
http://dublincore.org/
-- Ender, Duke_of_URL
IANAL and I don't have specific knowledge of this occurring, but really, what's to stop it from happening?
My suggestion to anyone is that they develop three good domain names that they would be happy with. But for god's sake, do it *offline*! Don't search for them, don't try them in your browser, and don't tell anyone what they are. *Then* just go register one or all of them. Don't wait, don't search, and don't even breathe until they're yours.
Oh, and don't forget to trademark the language in those URLs (can't be plain English remember). If someone sees your new URL and likes it, they could register the TM if you don't. Then they can sue you for ownership of the domain, since you're clearly infringing on their TM; and they'll probably get the domain in the end.
Hey, I don't make the rules...
And my favorite word today is don't.
Please mod this post only if you think others should/n't read this. I have enough ego^H^H^Hkarma. Thanks!
But search engine spammers can do the same thing: buy a bunch of other sites and put links to their target site.
Table-ized A.I.
Apparently, they would prefer that people searching for "BSML" did not turn up my web page. I wonder if they've tried to get the Boston School for Modern Languages to change their name, too?
Now isn't the whole point of properly using XML and namespaces to disambiguate coincidental name clashes like this? If LabBook thinks there's a problem with more than one language named BSML, then they obviously have no understanding of XML, and aren't qualified to be using it to define any kind of a standard.
Maybe LabBook should put some meta-tags on their web pages to decrease their relevance when people are searching for "Bull Shit" or "Modern Language".
-Don
========
From: "Gene Van Slyke" <gene.vanslyke@labbook.com>
To: <don@toad.com>; <dhopkins@maxis.com>
Sent: Monday, November 12, 2001 10:36 AM
Subject: BSML Trademark
Don,
While reviewing the internet for uses of BSML, we noted your use of BSML on http://catalog.com/hopkins/text/bsml.html.
While we find your use humorous, we have registed the BSML name with the United States Patent and Trademark Office and would appreciate you removing the reference to BSML from your website.
Thanks for your cooperation,
Gene Van Slyke
CFO LabBook
========
Here's the page I published years ago at http://catalog.com/hopkins/text/bsml.html:
========
BSML: Bull Shit Markup Language
Bull Shit Markup Language is designed to meet the needs of commerce, advertising, and blatant self promotion on the World Wide Web.
New BSML Markup Tags
CRONKITE Extension
This tag marks authoritative text that the reader should believe without question.
SALE Extension
This tag marks advertisements for products that are on sale. The browser will do everything it can to bring this to the attention of the user.
COLORMAP Extension
This tag allows the html writer complete control over the user's colormap. It supports writing RGB values into the system colormap, plus all the usual crowd pleasers like rotating, flashing, fading and degaussing, as well as changing screen depth and resolution.
BLINK Extension
The blinking text tag has been extended to apply to client side image maps, so image regions as well as individual pixels can now be blinked arbitrarily.
The RAINBOW parameter allows you to specify a sequence of up to 48 colors or image texture maps to apply to the blinking text in sequence.
The FREQ and PHASE parameters allow you to precisely control the frequency and phase of blinking text. Browsers using Apple's QuickBlink technology or MicroSoft's TrueFlicker can support up to 65536 independently blinking items per page.
Java applets can be downloaded into the individual blinkers, to blink text and graphics in arbitrarily programmable patterns.
See the Las Vegas and Times Square home pages for some excellent examples.
Take a look and feel free: http://www.PieMenu.com
The wheels of government and commerce would grind to a halt were they not well lubricated with Bull Shit. So I created the Bull Shit Markup Language and published the BSML web page years ago, putting it in the public domain for the good of mankind. Now somebody has finally taken it seriously, and is trying to monopolise BSML!
He who controls BSML controls the Bull Shit... and he who controls the Bull Shit controls the Universe!
http://catalog.com/hopkins/text/bsml.html
Does anyone know of any prior art pertaining to Bull Shit and Markup Languages? What about VRML -- Maybe I could get Mark Pesce to testify on my behalf? c(-;
Here's a list of the huge faceless multinational corporations I'm up against:
http://www.labbook.com
"IBM, NetGenics, Apocom, Bristol-Myers Squibb, Wiley and other leaders of the life sciences industry support LabBook's BSML as the standard for biological information".
To paraphrase Pastor Martin Niemöller:
First they patented the Anthrax Vaccine
and I did not speak out
because I did not have Anthrax.
Then they patented the AIDS Drugs
and I did not speak out
because I did not have AIDS.
Then they patented Viagra
and I did not speak out
because I already had an erection.
Then they came for the Bull Shitters
and there was no one left
to speak out for me.
-Don
Take a look and feel free: http://www.PieMenu.com
'nuf said
Ok. I'll say some more. For most searches, google's algorithm does a tremendous job of bringing the relevant sites to the top of the list.
In fact, when I look for product info and don't get the manufacturer's site first in the list, I consider that a strike against them - i.e. their web presence is put into question.
"No matter where you go, there you are." -- Buckaroo Banzai
Remember the band 'The The' from the '80s? It would seem to be damn near impossible to find them via normal search techniques. :-)
I did a quick test, here are the results:
Yahoo: A (listed the band site via their web site listings; official site was 4th in list)
Google: F (quoting didn't help)
Northern Light: C (found relevant matches, but the official site was nowhere to be found on the first 2 pages)
altavista: A+ (official band site was #1 in list)
Nowadays, you need to think about "searchability" when picking the name for just about anything. That is, assuming you want to be easily found on the web.
I guess that's where dopey marketing names like 'Itanium' actually make sense. Very unambiguous search criteria.
"No matter where you go, there you are." -- Buckaroo Banzai
Imagine a company hiring hackers to break into competitors' sites to put important keywords in the unthesaurus.
For example, what if you hacked 3com's site to put the words 'ethernet' and 'network' in their unthesaurus. It's unlikely that a professional company like Linksys or others would do this, but it is entirely possible.
You could argue that meta keywords should take precedence, but I'm sure the hacker would remove those words from the meta keyword list.
"No matter where you go, there you are." -- Buckaroo Banzai