This Boring Headline is Written for Google
prostoalex writes "The New York Times is running an article on how newspapers around the country find their Web sites more dependent on search engines than before. The unexpected effect? Witty double entendres, allusions and sarcastic remarks are rewritten into boring straight-to-the-point headlines that rank higher on search engines and news-specific search engines. From the article: 'About a year ago, The Sacramento Bee changed online section titles. "Real Estate" became "Homes," "Scene" turned into "Lifestyle," and dining information found in newsprint under "Taste," is online under "Taste/Food."'"
Used to be to start a fire you took two sticks of about the same size and .....
We don't do that anymore. Just like companies that hope to market their news agencies have got to stop depending on search engines to reel in traffic. The sites that attract visitors through searches and make revenue by serving ads are established and have consumed the available market share.
To be successful doing what they do, one of them has to go under right around the time you have something similar already seeding in search engines. It's quite a long waiting list, folks.
Look at the explosion of over a million .eu domains, many of which are going to be those article-wiki type affiliate marketing sites, and search engines are already crawling them. Sorry guys, but the days of putting up hundreds of pages of content and waiting for Google to do your marketing are gone.
If you want to reach a niche news market you need to hit people during rush hour in their cars with radio advertisements, or find another way of luring them to your site, and when they arrive your titles had better not be crafted for Google.
Don't re-write the titles; take the hint that what you're doing just isn't working. Either change your marketing strategy or re-evaluate the fiscal sanity of continuing to publish.
Insanity is doing the same thing over and over and over again yet expecting different results. The market is flooded - get creative in your advertising and MORE creative with your content and you may enjoy some success. Otherwise the sad fact is .. nobody is going to find you.
Go take a look at shitlance and search for "need articles, need articles re-written, SEO content author". Trying to succeed doing what they're doing is like punching yourself in the nuts until you pass out.
Completely *wrong* direction, imho.
I'm boring, straight to the point, and can't be creative even if my life was on the line. Hire me!
PHP121 Instant Messenger - Web Based Instant Messenger
Personally, I can think of nothing that would improve newspapers more than getting rid of those idiotic puns often seen in headlines...
Stop by my site where I write about ERP systems & more
Witty or sensational headlines don't just deceive search engines.
Human readers can get fooled just as easily. Here's an example:
I was doing research to show that Kryder's Law (a kind of super Moore's Law for hard disks that says bit densities have increased by a factor of 1000 in 10.5 years, meaning a doubling every 13 months) is no longer being achieved by hard drive manufacturers. Instead I discovered that Kryder's Law was just a creation of Wikipedia's overenthusiastic editors, who misinterpreted a single Scientific American headline. Wikipedia editors accidentally invented the "law", and it isn't even correct.
You can read about it at my site here: http://www.mattscomputertrends.com/Kryder's.html
The search engines are doing us all a favor getting rid of this problem.
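For anyone who wants to check the arithmetic behind the "factor of 1000 in 10.5 years" figure quoted above, here is a minimal sketch. It assumes smooth exponential growth, and the function name `doubling_time_months` is made up for illustration. It says nothing about whether the growth figure itself is right, only what doubling time it implies.

```python
import math


def doubling_time_months(growth_factor: float, years: float) -> float:
    """Months per doubling for a quantity that grows by growth_factor over the given years.

    Assumes steady exponential growth (an assumption, not a claim about real drives).
    """
    doublings = math.log2(growth_factor)  # how many times the quantity doubled
    return years * 12 / doublings


months = doubling_time_months(1000, 10.5)
print(f"{months:.1f} months per doubling")
```

A factor of 1000 is just under ten doublings, so the result lands a little under 13 months, which matches the rounded figure in the comment.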
Why yes! I do taste food...
If a site's content is good, people come regardless.
Slashdot's popularity is an anomaly though...
why on earth would you write an article about the style of headlines in Google's news aggregation? it really isn't like Google is creating its own summary by mashing all the aggregated news articles together. some reporter somewhere wrote that dry headline.
An old-timer with old-timey ideas.
I wish I had mod points right now.
Some formulation of the hard disk law has been around long before the SciAm article. It seems to me that some Wikipedia author remembered such a variation, went looking for "verifiability" found the SciAm article, slapped Kryder's name onto the "Law" and voila! Kryder's Law was born!
I thought most journalists were already "creative" enough without needing to put miserable puns in their headlines.
(What the hell does one find in a "Scene" or "Lifestyle" section anyway?)
"Some formulation of the hard disk law has been around long before the SciAm article"
Yes, and that law was called Moore's Law. I think the role of an encyclopedia is to document, not invent.
to get around this difference in presentation / semantics using CSS (Cascading Style Sheets)?
For example, when I use image-based text for nicer fonts on websites I make, I can use CSS to give the tag an image, and shift the plain-text version of the content off to the side so it doesn't appear on graphical browsers. Slashdot itself uses these techniques in its HTML.
There are, in fact, lots of methods CSS provides to get around this problem, most of which are even supported by junky browsers like MSIE.
Maybe now the articles will be written in a manner which actually resembles a story rather than having a fistful of facts crammed down your throat in a burst of staccato-like phrases. It would be quite an innovation for the newspapers to tell stories that make you want to read them rather than wrap your fish. Might even include some room for style to enter into the picture.
For some reason my fountain pen doesn't work here.
Would it be that hard to develop a standard (perhaps much like meta-tagging), giving one set of data easily digestible by the bots (and not displayed to the human reader), while retaining an entertaining writing style for human consumption? Computers don't always have an easy time digesting data a human would find simple to understand, and vice-versa. Shouldn't that generally be acknowledged by design? (Disclaimer: I don't do much work with web design. If you do and you know why this hasn't been done or won't work, please let me know.)
To fight the war on terror, stop being afraid.
"Sex" turned into "Scatting on a midget who's being busy with a horse"
May I suggest that everyone tag this story "thankgod"
This is the doubleplusgood way to go. Using too many words can only lead to thoughtcrime.
I thought the boring, machine-readable stuff (i.e., not just headlines) was supposed to be in metadata. No need to do a hatchet job on a descriptive or witty title. Of course, I just may be an old codger in Internet time.
What's more, I thought the whole point of Pagerank was to make your page associated with what others think your page is about... that if your obituary about Gene Pitney is entitled "Tulsa star: The life and career of much-loved 1960's singer." it'll show up in a search for Gene Pitney because (hopefully) that string will be indexed from the page body, and that as other people associate your page with Pitney, irrespective of the <title>, that obituary will float towards the top. And if they use your witty title, not only will you get more popular for "Gene Pitney", but for "Tulsa Star" as well.
But there are unwashed masses that do use other search engines, but I thought the last people to rely absolutely on metadata were Alta Vista and WebCrawler.
One might ask the same about birds. What ARE birds? We just don't know.
I was going to do a rant about how Google hires Deep Thinkers who can crack bizarre mathematical conundrums, but na'atheless can't write a search algorithm that dredges up the history of toilet paper, but it turns out they've nailed that one. More or less. If you want the sources instead of the dissertations, it ain't so charmin'.
``Tension, apprehension & dissension have begun!'' - Duffy Wyg&, in Alfred Bester's _The Demolished Man_
Is it really a surprise to you that Wikipedia misinterpreted the headline? Your typical Wikipedia editor is no more human than a severely autistic searchbot. Humor, allusion, style, the subtle artistry of prose--it's all lost on those poor binary-thinking hypercompulsives.
More straightforward headlines are better anyway. Creativity spent on headlines is spent in the wrong place. The article content should be creative.
the author didn't seem to consider the possibility that readers prefer this..
i personally would rather actually know what articles are about based on their headlines, than be tricked into reading something by a misleading headline. most headlines aren't "creative", so much as they are "dishonest" in the newspaper.
i skim through my university's paper every other week, and i usually am reminded why i don't read it more often.
-- lol pwned
That's in essence what happened to BMW.
Google doesn't like you presenting different data to their search engine than the user would find if they visited. And I can easily see why. Sites would abuse the heck out of it.
See this link amongst many.
http://news.bbc.co.uk/1/hi/technology/4685750.stm
http://lkml.org/lkml/2005/8/20/95
(notice my to-the-point headline)
Really, not only is it good for search engines, it's good for my brain's relevance filter for trying to see if I care about the story the headline points to.
Start Running Better Polls
They have to write informative headlines now.
This is really only tangentially about search engines. It's really about people finding things by searching, rather than by browsing, today.
It used to be a potential reader would be standing in front of a magazine stand, or leafing idly through a newspaper. To grab that reader, a witty, slightly hard-to-understand headline was great - it catches your attention and makes you at least look closer since you want to know what that mysterious piece is actually about. And thus you made the single-copy sale, and perhaps, in time, sold a subscription.
Today we increasingly don't start by picking up a paper and looking within for what we want; we find things by searching for what we want and end up on any one of a large number of newspaper and magazine sites. The choice of paper isn't the start of the process - the search is. And when we search, that witty off-color headline is going to mislead us since it doesn't actually contain the key terms that would indicate relevance. Making headlines and summaries clear, straight and to the point isn't about pandering to search engines, but about adjusting to the changing behavior of the readership.
It's the reader behavior that has changed. The search engine angle is just a smokescreen.
Trust the Computer. The Computer is your friend.
I personally like them. Give me some dry wit - or "32 Scoot to Shoot with Plane Aflame" (see comments above) - over a boring summary of the facts any day of the week. Personally, I'm apt to think this is symptomatic of the decay within our society - but then again, I'm apt to think that over the latest Steven Spielberg movie as well, so go figure. Really, it harkens back to a day when those who read the paper, read the entire newspaper, and thusly would know the entire news. The headlines were there more to prepare your mind for the inevitable than to attract the reader's eye. This USA Today trend of posting full color buzzwords on the front page, so Joe Schmoe can skim it and knows what names to drop around the water cooler today, has got to stop.
-1 Flamebait out of the way, it's time to go for my weak attempt at +1 Insightful:
Wouldn't it be relatively simple for Google to allow newspapers the use of "alt" or "meta" tags for their headlines? Considering there's a small, reasonably finite number of trusted news sources, couldn't some sort of whitelist be easily implemented?
I can never find George M. Cohan to explain the unintelligible "witty" headlines to me when I come across them.
Obvious solution: use images to display the witty section names (scene) and alt text and hidden span text displaying the boring name (lifestyle). With a little work, the same could be applied to headlines.
The contexts of the medium have changed very fundamentally. Instead of (comparatively) infrequently delivered paper newspapers, readers (consumers) have on-demand access to the news source. The consumption of newspapers has become dramatically less of a literary activity and more of a computational activity: Instead of a quiet evening mulling over the news in an easy chair, more are inclined to rapidly take in as much news as is relevant and necessary. This is a natural evolution towards increased efficiency. The only thing we're losing with this adaptation is creativity, which is, in this case, effectively linguistic and textual innovation. It raises the question of how important (if at all) creativity is to news writing and reading. The intuitive answer is that it is irrelevant to the efficient consumption of news. In the "evening newspaper" paradigm, creativity is often a basic marketing tool: take the NY Post for example. How many people do you see reading the NY Times versus the Post on the subways? It's not just the (substantial) cost difference, it's the attention-grabbing, sensational tabloid headlines. The NY Post has created a niche for the masses that works the same way any sensational media does, and in these cases efficiency is irrelevant to selling the "news" in the first place. My conclusion would be that in the context of online news media, speed and efficiency determine pervasiveness, and therefore advertising revenue. In contrast, paper news media need not expect to be consumed as quickly and efficiently as online. Clearly, publications like the NY Post and the sprawling magazine industry draw their crowd with a certain kind of innovation and creative content (like it or not).
Limina.Log
Way back when, they used to shout the news to sell their papers. "Extra! Extra! Read all about it!" anyone? Of course, way back when, they also started a war to sell their papers...
Newspapers should focus on the news. Unfortunately, ours are trying to provide entertainment, sensationalism, titillation, thrills, and witticisms. Let's hope that, after the gimmicky double-entendre headlines are gone, we can also get rid of these other misfeatures of journalism. And, yes, the NYT is one of the biggest offenders.
Sometimes it's very satisfying to obnoxiously say "I told you so". Because this is basically what I said would happen in a comment here january last year (I wrote, among other things, about sites adapting their design -- if not wording -- to Google).
Apparently you haven't seen much of the Internet.
Animated tentacle midgit milfs... with horses?
Search Engine optimisation is a contradiction in terms
How does anybody, not to speak of web designers, get the stupid idea that one has to optimise one's website for search engines anyway? Isn't that totally backwards? I should optimise my website for *users* and their experience and the general web standards. If the search engine is too stupid to find content on my site that is relevant to a search, then it certainly isn't my job to optimise for them. That's the job of search engines themselves. That's where the name comes from.
Guess why Altavista missed out when Google appeared. They had the more optimised search engine.
I always thought (and still think) that so-called web designers who offer their customers 'search engine optimisation' (whatever that's supposed to be) are the used-car sales and multilevel marketing lot of the IT field. Some shady semi-professionals offering some non-product. Whenever I'm finished building a Web CMS site for customers I take the time to feed the URL into the searchbots so they do the first scan of the site more quickly, but that's it. If anyone comes to me bickering about the bad search results a search engine comes up with, I usually tell them that if the search engine sucks, they should use a different one. It's that simple, really.
Bottom line:
If you're doing *anything* on the web, forget about search engines and just build a good site. If your site is good and the search engine is good, both will find each other fast. All else is just bogus.
We suffer more in our imagination than in reality. - Seneca
This is bad news... these puns are quite entertaining at times. The subject of this post is an example of one of my favorites: British Left Waffles on Falklands.
I find it hard to believe that posters don't see the value in this sort of word-play. For goodness sake, as a computer scientist, language and grammar are highly important and our wordplay sets us apart from the machine!
-Starfishprime
Come ON, people. When a newspaper has an article titled "Something Fishy About Springdale's New Winter Festival" is there ANY part of you that's fooled for even a millisecond by the pun?
It seems to have become the law that every paper must do this for every headline possible. It makes me want to rip the paper into shreds and piss on them.
Bless your little hearts, Google, if you are indeed having this effect. Give me a straightforward headline over an insipid one any day of the week.
I object to that article, and to the next reply.
...anyone should be able to read a headline and quickly get an idea of what the story's about. Much better to have some snarky news editor misleading us to get us to read their stupid story.
I, for one, welcome "boring, straightforward" news headlines. After all, it's news. Not commentary, not opinion. If I see a newspaper section marked "Scene" I'm not likely to know what it's about.
but have you considered the following argument: shut up.
Snakes on a Plane!
Truthful words are not beautiful; beautiful words are not truthful. Good words are not persuasive; persuasive words are not good
--Lau Tsu
Time flies like an arrow. Fruit flies like a banana.
Oh no! Not the Sacramento Bee! That most famous of all newspapers! Whatever shall I fucking do?
"Google doesn't like you presenting different data to their search engine than the user would find if they visited. And I can easily see why. Sites would abuse the heck out of it."
Except it's a damn lazy proxy measure of spam. Instead of measuring "WHAT" (spam or ham), they measure "HOW" (how is this text delivered). If it's not visible to a user, it must be spam; if it is visible to a user, it's ham. They're hoping that the user will verify the text for them.
I think the following sentence needs to be capitalized:
THEY BANNED BMW FOR HAVING ONTOPIC BMW RELATED KEYWORDS ON THEIR SITE.
You can take their side in this if you wish, but I think they should collect invisible text, calculate the probability of that text being related to the on page text and either discard it or include it based on that calculation. i.e. let sites provide them with the missing information in meta tags, but test it for spam/ham.
That way, instead of having to put every related phrase to a site on a page, you can just tell the damn search engine what the page is about and the newspapers can continue to write text for their readers in the "Real Estate" section, while telling Google the section is about "Home Gardens Houses Condominiums Apartments Flats Chalets", or "Made for TV Movie 2002, Starring Jim Nabors, Nina West"
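To make the "test it for spam/ham" idea above a bit more concrete, here is a rough sketch. It is not how Google or any real search engine works. The function names, the 0.5 threshold and the sample strings are all invented for illustration, and plain word overlap stands in for whatever probability model a real system would use.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphabetic words in the text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_relevance(declared_keywords: str, visible_text: str) -> float:
    """Fraction of the declared (hidden) keywords that also occur in the visible text."""
    declared = tokens(declared_keywords)
    if not declared:
        return 0.0
    return len(declared & tokens(visible_text)) / len(declared)


def accept_keywords(declared_keywords: str, visible_text: str, threshold: float = 0.5) -> bool:
    """Keep the hidden keywords only if enough of them match the page (threshold is arbitrary)."""
    return keyword_relevance(declared_keywords, visible_text) >= threshold


# Hypothetical page text and keyword lists, made up for this example.
page = "Three-bedroom houses and condominiums for sale; apartments to rent downtown."
on_topic = "houses condominiums apartments flats"
off_topic = "casino pills lottery houses"

print(keyword_relevance(on_topic, page))   # 3 of 4 declared words appear on the page
print(keyword_relevance(off_topic, page))  # 1 of 4 declared words appears on the page
```

Exact word matching is the weak point here: it would reject honest synonyms ("Homes" for a "Real Estate" section), which is exactly the case the comment cares about, so a serious version would need some notion of related terms rather than identical ones.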
They've already ruined search engines for us with useless tossing of keywords everywhere, sometimes even giving up all facade and simply placing a few pages of every keyword they could think of (most of which are unrelated to the entire site) on the bottom of the page. Blogs aren't helping either. One guy even figured out how he could make a profit by selling the use of a bunch of blogs to generate keywords linking to a particular site and tricking things such as google's page rank.
Right now, I'm ready just to give up. We are stuck with search engines, so we may as well simply give up all hope of them doing what they were meant to do and just accept that they are still at least better than things like keyword systems (AOL) and having to rely on a web full of portals everywhere.
...or you're tired as hell from the cliche "witty remarks" and idioms in the titles?
I mean how about the limb/fin fish fossil link from yesterday. We had titles like "Fish out of water found" and "Fish gets out on a limb".
How retarded is that? How informative is it? I say yay for search engines.
... has the worst puns and is the best online newspaper ...
and it's called cloaking. evil?
Those aren't headlines. Those are section headers. Headlines are the titles of individual articles, not whole pages or sections. All this means is lifestyle editors have one less outlet for creative expression when doing a redesign, and in return the newspaper becomes marginally more comprehensible, especially for people who don't read it on a regular basis, ie, users who find it through search engines because of interest in a topic rather than an interest in a region, which is what local newspapers are organized by. Big deal.
I prefer straight-to-the-point titles: no bullshit, no jokes, no long words, just the facts condensed, articles in laconically written text and no speculation, embellishments, or distortions. I just don't want to read articles written to conform to the author's worldview, or long interpretations from the author's point of view.
Relevance is what I see as important, search engines too (it's easier to trick a search engine, though).
sorry for bad handwriting, -t am mapkinase by the way
1. hail google
2. i do not care fir witty Section titles : they are witty when you read the section title for the f'rs7 time
3. 'I do hot care for favourite news source: I look for information where it exists So 1 do not have personal attachment for newspapers
4. loved The historial reference about the telegraph in the article
5. one Of the best news and The besr articles at slashdct that I have read in a month
Even your own example show that problem's not limited to Google or even the web. The title "Will and Grace" implies that the show is actually about a couple with the names Will and Grace. Rather, if you actually get to the content, it's a situation comedy about a fruit and his turd-burglar acquaintances who play their jokes off of the, paron the pun, straight man (in this case a woman).
Changing headlines or other descriptions to get by is nothing new, even when used to lure people into sexually deviant content like in the show you describe or in the web site you describe. Why should it be news even for Google?
As citizens of this democracy, workers in this nation, and technologist hobbyists, it's hard for all of us to find time to read anything from start to finish. So they're right on that point: the headline is often all you really get out of news. Funny thing is, I know lots of people who are more interested in Matt Drudge's headlines than the NYTimes headlines. He writes better headlines than the NYTimes. They're more timely, more relevant, and often more witty.
Stick that in your Google and search it.
http://tinyurl.com/4ny52
Don't blame Google, they use an algo that tries to pick up on the search patterns of the searchers. Blame the searchers for using such boring search words instead. If you want to see what end users are typing in for searches check out http://www.nichebot.com/, which tracks the most common keyword phrases that are searched. See if a keyword search pulls up as popular, or if some real boring phrase ranks higher.
Can I bum a sig?
http://science.slashdot.org/science/06/04/08/2017246.shtml
By which I mean - you, the readership of /., and especially those of you complaining that witty/interesting/deceptive/crushingly unfunny headlines should be destroyed for the sake of clarity and efficiency of data delivery, are not the target market of that sort of journalism anyway. Try thinking of the puns as like tags, conveying extra information. They are, in their own way, micro-editorials which tell you some small amount about the organisation/individual's attitude to the subject.
My second point is this would be another example of the wrongness of how we relate to technology (something Google ought to be more sensitive to than the average - look at GMail's tagline), namely that technology often ends up requiring us to reshape and restructure our world to fit the demands of something we originally created to.. er.. assist in making our world easier to deal with.
fortune -o
Having said that, this boring headline business doesn't seem to have affected The Register. They usually have some clever ones.
planet texture maps and more
Are both the pleasures of reading and the enjoyment of creativity truly lost to so many in the "information age"? For such a largely technical community I'm baffled why more people aren't suggesting improvements on search engine ranking and categorization methods. After all, such methods are neither permanent nor unchanging. Certainly, we collectively have and can develop ways to organize and use keywords that search engines could take better advantage of, rather than just giving so much emphasis to the titles. Web content may be information but it's not just information. Google and the like should add some other dimensions into their search results, such as which articles are more likely to have unique points, humour, seriousness, blandness, or wit, etc.
:-)
(P.S. I only put the d0rky title to annoy)
No one in a newsroom aims to write alternate usage or text. Ever. Period.
News people are lazy as hell. Even if you built them a robust admin system that made the entire process of entering the alternate text and usages a breeze, they still wouldn't use it.
Journalism went to hell in a handbasket after the search engine became a common feature in everyday life. The main reason is that journalists are lazy.
Don't believe me? Watch the idiots swoon around Scott McClellan at the White House press gaggle every day.
These people reprint AP wire copy, press releases and even junk faxes verbatim because they're too lazy to do what they're being financially compensated to do.
I scream. You scream. I assume that means we're both acquainted with the problem. We proceed.
It is an ages-old rule in journalism that one writes for the least common denominator (or close to it). This explains why the articles in Playboy magazine are written at a 10-year-old reading level...
...when the primary consumer of newspaper articles shifted from intelligent (human) news consumers to the largely autistic (stoopid) search engines, the content had to be dumbed down to their level.
When you let a computer program do the evaluating for you, the results will only be as sophisticated as the algorithm that processed the data. Lacking the breadth of knowledge that humans have, current search engines lack the sophistication of a human reader, missing instances of simile, metaphor, et cetera. Does this surprise anyone?
Be your own media filter!
Yes, but with Wikipedia sometimes it fails and this is one case where it seems to have failed spectacularly. By "some formulation of the hard disk law" I meant some rule of thumb stating the increase of hard drive space over time, not that there are ones concerning the improvement of a technology.
I find it very ironic that while newspapers are re-writing their headlines to be more machine recognizable -- at the same time they're complaining that Google and the search engines are 'free-loading' on their hard-made content.
You can't have it both ways. Either you want your stuff to be recognizable by search engines, or you want the search engines to leave you alone.
Sheesh
Lost at C:>. Found at C.
"I hate to advocate drugs, alcohol, violence or insanity but they've always worked for me" - HST
The Sydney Morning Herald has not only replaced its old-style "meaningless without context" headings with "boring" ones, but it's stuck them into its URLs - which is another SEO idea.
Danny.
I have written over 900 book reviews
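As an aside on the "headlines in URLs" point above, this is roughly what turning a headline into a URL slug looks like. It is only a sketch: `headline_to_slug` is a made-up name, and this is not the Sydney Morning Herald's actual scheme, which the comment doesn't describe.

```python
import re


def headline_to_slug(headline: str) -> str:
    """Lowercase the headline and collapse every run of non-alphanumerics into one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", headline.lower())
    return slug.strip("-")  # drop any leading or trailing hyphen


print(headline_to_slug("This Boring Headline is Written for Google"))
# this-boring-headline-is-written-for-google
```

The point for search is simply that a plain, keyword-bearing headline yields a plain, keyword-bearing URL, while a pun yields a URL that says little about the story.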
Our local newspaper ran a pun competition, with $1,000 going to the best pun.
:(
I thought up some good ones and entered ten times. I was convinced that one of them would win. But, no pun in ten did
-- Trinity in high heels carrying a whip: The donimatrix - there is no spoonerism
Ladies and gentlemen! Witness the marvel, the horror, the mind numbing curiosity that is... Teh T3chn0_b0y!
Half adolescent virgin nerd, half networkable thin client, he is the modren man!
Watch in astonishment as he and his kind mutilate and forever change the manner in which we humans will converse with each other.
Stare in wonder as the written language is electo-digitized and re-analogized for human consumption!
See the wonder of the end of common english!
See Teh T3chn0_B0y!
Xserv
"I love lamp."
Now if they could only do this for job ads, so when I search for "not telemarketing", crap like "enumeration-type work" doesn't show up.
- RG>
Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
This man tells the truth; I have seen it inside and it was the same. The problem is that a proper democracy is wholly dependent on an honest and aggressive (i.e. not lazy for one) press. Unfortunately, the effects on a democracy which does not have a press as such in place can be seen in our current situation.
People search for information, not puns or idioms. The more direct the information, the easier it is to understand, as well.
Twinstiq, game news
for Sandra Bullock to do tasteful nude scenes, so newspapers can use the headline STARK BULLOCK NAKED.
I tend to use witty headlines when possible on my own web site, and it doesn't half make the search results look strange.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
I think the Karma should just be ditched.
All it amounts to is a censorship by mob.
Don't agree with Global Warming or some part of it or just the proposed solution? Don't post with your username because some pseudo-scientist will accuse you of being an ignorant ass and mod your post down to troll.
Pick any other topic and then post contrary to the "popular" view and the same thing happens. Next thing you know, your Karma is in the toilet.
As far as I've seen it in action, it's not being used to sort out informative posts from mindless rants, but to suppress contrary opinions.
Don't believe me? Just browse any of the politically charged topics and browse the 0 and below posts. Sure, you will find mindless rants, but you will also find many many comments that are contrary to conventional wisdom and have been consigned to the "Below your threshold" world.
In the case of BMW, it wasn't invisible text. Everything was visible, I believe. The thing was that they would use a Javascript-based redirect so that browsers wouldn't even show that page at all, they would immediately go to another page. Only search engines would see that page.
As to the capitalized sentence, that's just a detail. Google penalized them for skirting their system, not for being spammers. Google doesn't want sites to do this, presumably because they fear they would have trouble making an automated way to rank these sites as if they were viewed by the user; it would increase Google's overhead significantly. I don't have a problem with this.
I think the idea of testing keywords for spam will only be as successful as testing email for spam. That is, less than 100% successful, and those that succeed in fooling Google are massively rewarded with hits. So I don't think your suggestion is feasible.
http://lkml.org/lkml/2005/8/20/95
is that computational linguistics still hasn't been able to make reasonable progress into Pragmatics; but then again, neither has plain-old-offline linguistics, so that's not unexpected.
Is there nothing Bayesian/connectionist we can do? Some sort of probabilistic contextual indicator of meaning? With-what-certainty-do-I-as-a-machine-believe-this-to-be-sarcasm-or-wordplay?
It's still basically a mystery how we understand metaphor and sarcasm as quickly as we do (despite the Gricean notion that they involve some kind of reanalysis, there's no processing delay: an argument, some say, for a presemantic pragmatics...)
Something with a semantic web could probably determine what was going on in wordplay.... and might shed light onto how we as humans understand these "problematic" (from a generative/UG point of view) utterances. Maybe then we could get past issues like the following sentence:
Time flies like an arrow; fruit flies like a banana.
I'm needfully vague here, as I myself am not (currently) a CL...
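For what it's worth, the "with what certainty do I as a machine believe this to be sarcasm" idea can at least be sketched with a naive Bayes classifier over words. This is only a toy under loud assumptions: the training sentences, the two labels, and the function names `train` and `posterior` are all made up for illustration, and word counts alone come nowhere near real pragmatics.

```python
import math
from collections import Counter

def train(examples):
    """Count words per label and documents per label from (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts

def posterior(text, word_counts, label_counts):
    """Return P(label | text) under naive Bayes with add-one smoothing."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(label_counts[label] / total_docs)  # prior
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize in a numerically stable way so the values sum to 1.
    peak = max(log_scores.values())
    exp_scores = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {label: v / norm for label, v in exp_scores.items()}

# Hypothetical toy training data, invented for this sketch.
EXAMPLES = [
    ("oh great another meeting", "sarcastic"),
    ("yeah right that will work", "sarcastic"),
    ("the meeting starts at noon", "literal"),
    ("that plan will work fine", "literal"),
]

word_counts, label_counts = train(EXAMPLES)
probs = posterior("oh great that will work", word_counts, label_counts)
print(probs)
```

The output is a probability per label rather than a hard yes/no, which is the "certainty" part; whether anything like this scales to real wordplay is exactly the open question.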
Sometimes it's very satisfying to obnoxiously say "I told you so". Because this is basically what I said would happen in a comment here in January last year...
Wow! What a coincidence! Just yesterday I was posting about how pundits and prognosticators never really qualify their predictions with any kind of time-line so that the predictions can extend into the distant future where they'll either be proven right or no one will care.
Use the punny headline and put the boring message into the tags. So easy, so W3C, so lost in the swirling whirlpool of XML-related WS-* standards.
"As to the capitalized sentence, that's just a detail. Google penalized them for skirting their system, not for being spammers."
Do the rules exist in isolation? Or do they have a greater purpose?
If you accept that Google's spam rules are intended to make the search results better, I don't see how you can defend the application of that rule, which removed BMW from a bunch of searches for which it is the only good result.
"I think the idea of testing keywords for spam will only be as successful as testing email for spam. That is, less than 100% successful,"
We can discuss possible approaches till the cows come home, but I think the real fix is competition. I've seen this pattern before, and more competition is always the fix for it: it drives new ideas and pushes smart people to try harder.
It sounds like a good idea to me to be able to read a headline and tell something about the contents. Now we need to come up with something that can check whether the story is true or just made up.
Moore's law was about transistor density (and extended to CPU speeds).
I can throw myself at the ground, and miss.
The problem with Google today is that it offers very little convenience when it comes to finding the context of the word you are looking for.
If you were doing a search on "first base", for instance, you could easily be talking about baseball or the first step in sexual intimacy. Although Google Sets does a nice job of helping you find similar words, it doesn't do enough to help you find things in the context you need, nor is it integrated well with the rest of their search engine.
Also, there is the problem with exact matches. If I did a search for motherboards, I might well want mainboards as well without realizing it. Since people that refer to them as motherboards don't call them mainboards, you'll seldom if ever find one by doing a search on the other.
What this is ultimately calling for is a new kind of search that is as much about context as about the word itself. Google is not ready for the human language (let alone html). "You're killing me" and "you're killing me" are not the same thing and if I have to parse the instances of murder to find the things about humor, I no longer have a real search engine. It does great as a pattern matching algorithm with link weighing.
Of course, I suppose Google Desktop will alleviate this by weighing which links people have open the longest. The ones I either leave open or bookmark are generally the ones with the greatest amount of validity in my searches.
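To make the motherboards/mainboards point concrete, here is a rough sketch of query expansion with a hand-written synonym table. Everything in it is an assumption for illustration: the `SYNONYM_GROUPS` table, the `expand_terms` and `search` helpers, and the sample documents are invented, and this says nothing about how Google actually works.

```python
# Hypothetical synonym groups; a real system would need a far larger table.
SYNONYM_GROUPS = [
    {"motherboard", "motherboards", "mainboard", "mainboards"},
]

def expand_terms(query, groups=SYNONYM_GROUPS):
    """Return the query words plus every word sharing a synonym group with them."""
    terms = set(query.lower().split())
    expanded = set(terms)
    for group in groups:
        if terms & group:
            expanded |= group
    return expanded

def search(query, documents, groups=SYNONYM_GROUPS):
    """Return documents containing at least one expanded query word (OR matching)."""
    wanted = expand_terms(query, groups)
    return [doc for doc in documents if wanted & set(doc.lower().split())]

# Made-up sample documents.
DOCS = [
    "Cheap mainboards for sale",
    "Baseball rules for first base",
    "Best motherboards of 2006",
]

hits = search("motherboards", DOCS)
print(hits)
```

A plain exact-match search for "motherboards" would miss the mainboards page; the expanded one finds both. It does nothing for the "first base" ambiguity, which needs context rather than synonyms.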
Now if only slashdot would follow suit!
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
harkens back to a day when those who read the paper, read the entire newspaper, and thusly would know the entire news.
Right! Because if it's not in the paper, it didn't happen!
Speaking of news, did you see? They're increasing the chocolate rations, we're getting 25 grams now!
Google to allow newspapers the use of "alt" or "meta" tags for their headlines? Considering there's a small, reasonably finite number of trusted news sources
Limited to ["Fox News", "White house press releases"]?
No thanks.
You can't take the sky from me...
Oh yeah. It will get you blacklisted from Google because it is against their rules.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
I have read through the comments, and I haven't seen anyone mention (I could have missed it) the major change that's brought this about. Search engines are the outward evidence of a totally different way to use information. It used to be you would pick up a paper or turn on the TV news and see what someone else had planned for you. Now, it's information on demand. That's an immense change.
Oh - the Times article's own headline will be ineffective to search engines.
I've written more about this on my blog: http://www.geofffox.com/MT/archives/2006/04/09/its_not_smart_to_be_clever.php
These days, users become subscribers so that they can get first post and fool the moderators into thinking that what they wrote was insightful. Rather than discuss, as mentioned in the article, how a witty title that perhaps employs humor or puns is rewritten to something mundane so that a search engine can pick up on common keywords, people these days are engaging in what Linus Torvalds calls little more than a public wanking session trying to post comments more insightful than the rest.
We don't all do that.
I, for one*, prefer to find posts that have been highly ranked (thus increasing the visibility of my reply), post a reply (usually with a slightly different subject line, to attract even more attention amidst the sea of other posts also trying to siphon attention off the same parent - but most of them having the same subject line) and go for humor.
In all seriousness I know that all that doesn't do a damn bit of good for Slashdot's serious discussion of the topic... But what can I say? It's fun. Like a game.
* "I, for one, welcome our boring headline overlords" would be the standard joke mandated here... but I don't do that.
---GEC
I'm but the humble pupil, seeking to snatch the scratchbuilt pebble from the master's fully articulated hand
Why can't they just use hidden HTML comments or tags to coordinate topics? Slashdot just updated Slashcode to do this after 8 years or something.
Some time in the 1990s everyone went from nice elegant resumes on wedding-invitation-grade paper to long, buzzword-filled, machine-readable resumes stored in text files. The reason then was that big companies fed all resumes into resume databases and then did keyword searches when looking to fill a position. It was no longer important to have a standout hard copy; it had to be something that would get caught by the keyword search.
Extrapolating out, we will eventually have newspaper articles that read like the meta tags in trolling porn ads that show up when searching for kitchen appliances. (Actually, probably the single best thing about Google is the removal of such "search engine porn spam"!)
Think Deeply.
I don't see any conflict there. Have a headline that tells people what the story is about (so they decide whether to read it or not), then have a well written article to back it up.
I think I'll really miss all the completely lame and useless "witty" headlines from newspapers. Reading the exquisitely cliched headlines in the newspapers used to give me the motivation to hit myself in the head with a hammer repeatedly in the morning. Now where am I going to find that motivation?
What a nasty attitude. "Since foreigners speak English, this must lead to dumb English."
In fact, it's going to be the opposite: the foreigners who speak English tend to be more educated than those who don't.
I've been writing "scientific English" for years now. I actually had to learn to write short and simplistic sentences, because so many English speakers were complaining about "long sentences" and "long words". If anything, learning English has forced me to accept oversimplifying everything in order to get my point across.
I keep reading and hearing that news agencies have issues with search engines such as Google, and are threatening legal action because sites like Google News get a "free ride" on news items which they merely link to, without doing the "hard work" of uncovering the story. Now I read that they are trying to make it easier for Google to index their stories??? Did I understand this article correctly or am I missing something?
Much better than 'Cold Front Enters the Southern United States'.
Inane headlines have always bothered me. I don't like to click on a link only to find that the story has nothing to do with what the headline led me to surmise. Anything that increases the clarity of headlines is progress.
I believe the purpose of the rule is to ensure that Google can continue to do automated evaluation of web pages and that that evaluation represents well what a user would evaluate the page as if they read it.
Thus, they penalize anyone who makes this automated evaluation more difficult (or perhaps exceedingly difficult), even if they did the "wrong thing for the right reasons". It isn't a value judgement about individual cases, it's just an attempt to maintain the viability of their business model.
Also, I disagree that BMW is the only good result when searching for BMW. In most cases they are the best result though.
http://lkml.org/lkml/2005/8/20/95
If I see one more lame headline about a movie hitting #1 that makes some lame pun based on what the movie is about, I'm going to start solely reading Google News.
Seriously, how could anyone think that it'd be a bad thing to give reporters motivation to make titles into relevant summaries of the article? If Google's news crawler can't determine what the article is about, would a casual news reader glancing through headlines have any better chance?
Perhaps I've misread this article; correct me if I'm wrong.
This is my most-wanted feature on Slashdot - applying a Karma penalty to posts made within a certain amount of time of the article's posting. I would probably apply a -3 and then relax it from there. The earliest posts on Slashdot, even those marked "5", would 99% of the time be worth nothing if posted at any other time - and that is what they are truly worth.
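A quick sketch of what that penalty could look like, assuming (my numbers, not anything from Slashcode) that the -3 relaxes linearly to zero over the first 60 minutes and that scores stay clamped to Slashdot's usual -1 to 5 range. The function names and the 60-minute window are hypothetical.

```python
def early_post_penalty(minutes_after_story, max_penalty=3, relax_minutes=60):
    """Penalty that starts at max_penalty and relaxes linearly to 0.

    relax_minutes is an assumed window; the original suggestion only says
    to start at -3 and relax it from there.
    """
    if minutes_after_story < 0:
        raise ValueError("a comment cannot predate the story")
    elapsed = min(minutes_after_story, relax_minutes)
    remaining = 1 - elapsed / relax_minutes
    return round(max_penalty * remaining)

def adjusted_score(raw_score, minutes_after_story, floor=-1, ceiling=5):
    """Apply the early-post penalty, then clamp to the comment score range."""
    score = raw_score - early_post_penalty(minutes_after_story)
    return max(floor, min(ceiling, score))

print(adjusted_score(5, 0))    # a "5" posted the instant the story goes up
print(adjusted_score(5, 90))   # the same "5" posted well after the window
```

Under these assumptions a first-minute "5" shows up as a 2, so it still beats the default threshold but no longer sits at the top of the page purely for being early.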
Hah! I mock your five-digit user ID!
Slashdot's karma system is strictly a middle-down mechanism, in that its effectiveness only applies to the lower parts of the system. For what it was originally meant to do, weed out trolls/spam/FP bastards, it does well, but turned into a promotion system as well as a punishment system it's imperfect.
I've had a hand in designing more than a few user-based content control systems, and while I like Moderation and M2 a lot, I think it really got pointed in the wrong direction with calling it 'Karma' and publishing comment ranks, et al., for the very reason you state: "What's the point in posting a comment if nobody will read it". A system based on pushing bad stuff down as efficiently as possible seldom works at pushing good stuff up.
However, this brings up a very, very important question: then why are you posting it, because you have something important to say or because you want to be moderated highly? Playing for the audience, like posting a comment just to get modded up, is as bad as, if not worse than, the stuff moderation was designed to prevent, because it makes it that much harder to filter out the noise, and it makes everything more suspect in the end.
In a way the primitive FOAF stuff on Slashdot could help with that, but it would take some complexity, kind of like the eigenvector paradigm that Advogato tried way back when. The problem there is I have never yet seen a trust matrix done well that can handle unclean (e.g. ye olde random) users being a part of the system.
And what any of this has to do with Google, I have no idea!
Hilary Rosen's speech was about her love of money and her desire to roll around naked in a pile of money.
"show me an alternate BMW German dealership finder that isn't BMW?"
If I search for BMW I must be looking for a dealership? Again, I disagree that BMW is the only good result when searching for BMW.
What if I'm looking for parts? Service (independent)? BMW's airplane engine division (now sold I think)?
Like I said, that web page isn't always the best result. I'd rank it #1 though, just like Google (usually) does.
As to BMW's page full of graphics Google can't search, well, I didn't see the UN Convention on Human Rights talking about everyone's guarantee for their web site to be ranked accurately in Google regardless of design. If they don't rank well in Google they need to take steps to fix it, and steps that Google isn't going to punish them for. If they use too many graphics for Google to be effective, then they have to take the consequences.
I do agree the reality is that Google doesn't always rank a page the same as a user would if they were to read it. But that is what Google strives for, and they have a business reason to try to convince the authors of significant web pages to play by a set of rules that allows their business of ranking using user-visible text to remain reasonably accurate. So they took steps to try to cajole BMW into playing by those rules that would benefit Google. They succeeded, as an 800lb gorilla usually does.
http://lkml.org/lkml/2005/8/20/95
Google has jumped the shark.
Not about this specific case.
Google needs to be able to perform automated page ranking. If people do what BMW does it can easily thwart them.
They're willing to accept a small failure to prevent a large failure of their business model.
You're really not getting this.
http://lkml.org/lkml/2005/8/20/95