Google To Create "Blog" Search; Potentially Remove From Main
Skyshadow writes "Google, search engine of choice for pretty much everyone, has announced that it will begin a separate index for blogs and remove them from the normal index, handling them instead in much the same way as their usenet archives. This will hopefully put an end to the recent difficulties locating primary source material among the mountains of blogs which are clogging the ratings system." There have been comments from elsewhere that say they won't be removing them, but that remains to be seen.
This is great. I hate looking for something only to find some dude's stupid blog with info that doesn't help me.
lousy words is gettin in the way of my pictures
I went to battle MC Escher but drew a blank
Is there any chance of having an RSS feature for journals, for everyone or even just subscribers?
Thing is, some of these blogs actually contain some pretty handy info from time to time, as blogs are becoming more and more used as a cheap and easy alternative to a content management system imho ....
Veni, Vidi, Velcro!
I, for one, am sick of searching material only to find that the page is some asshat's blog. Nothing against blogs, but you never know where this material came from.
OTOH, what constitutes a 'blog'? Is Slashdot a blog? Is this a blog? The lines are constantly being blurred, and I'm not sure it'll be easy for google to make that distinction.
My journal has hot
Nobody wants to read your blog and this just proves that point!
What about personal sites that may seem like blogs? For example, mine: I have a blog, but later on I plan to add some more content and such. Hopefully it doesn't remove my site from the main index, or will at least return it once the site becomes useful.
I've left to find myself. If you happen to see me, please, keep me there until I return.
This is a great idea, especially since many issues have much more commentary than source content. I love the quote "But what happens when the weblog fad dies down?"
However, I hope they maintain links between the main search and the blog search. Finding primary sources, then a button linking to all blog comments on this topic would be a great research tool.
Most of the useless information people put into blogs. Although, when you search for information, would you want to search 2 different locations? This is the whole claim to Google's fame. I have found that many times people post how-to's in their blogs along with other information.
If it ain't broke...don't fix it
-Rob
has already confirmed this as false..
our register friends had slow news day instead..:)
Don't Tread on OpenSource
I've found some of the best information on blogs. I have no problem with them making a blog specific search, but like the Linux specific search I hope relevant sites can still be found from the main search. It would be a pain to have to search every individual google engine for one bit of info. As it is now, I can use the main search and be pretty sure that I'm going to get a relevant result regardless of what category the site falls under. If I'm looking up what IIRC stands for, I don't really care if I get the info from a JoeBlow's blog or from howstuffworks.com.
Everyone is entitled to their own opinion. It's just that yours is stupid.
Now I can find and read about people's mundane activities more efficiently.
e.g.
9:30am :- ate some toast
9:40am :- went to the toilet
10:50am :- left the toilet to check the number of hits on my blog
11:45am :- got a phone call, was wrong number. Might get a real call one day.
and so on and so forth..
Be you Admins? nay, we are but lusers!
I also like the analogy made by the article to the voting system where a page votes for a topic: an expert site on turtles voting for turtles once a day every year vs. a blog mentioning turtles once in that same period leads to the expert site winning.
I hate liberals. If you are a liberal, do not reply.
I think you've probably seen it on here. Incidentally, the relevant article is actually linked from the "related links" box. For once, it's not Slashdot at fault!
So google is finally using some drano to unclog their blogs, eh? Well as always, using acids does some damage for a greater good.
-illumina+us "I put on my robe and wizard hat..."
I just hope that Google does at least say "Hey, you might be able to find what you're looking for on our blog search" at the end or something - like they do now with Google Answers. I do applaud their effort to make their database even more relevant though, and it is yet another reason I have to admit to being a shameless whore for Google.
I wonder if this is also intended to stop Googlewashing? Google has a history of trying to 'play fair' - and the power of a few well connected blogs to basically 'take possession' of any term works against that philosophy.
The other day, Slashdot had a database problem while running an article on an Oregon Bill that promoted Open Source. I thought I'd see the google cache and so:
2 39.shtm l?tid=103
a &q=Mi crosoft+Slashdot+Oregon&btnG=Search+News
Went to news.google.com
Searched for: Microsoft Slashdot Oregon
It returned a link: Oregon Bill Would Require Open Source Consideration - 18 Apr 2003.
Curiously, the link was wrongly pointing to a March 6 article!! I reported this at Slashdot and got Offtopic-ed to hell, but just now, I tried again, and surprise! the mistake still exists:
Here's the link:
http://slashdot.org/articles/03/03/06/1815
in response to:
http://news.google.com/news?hl=en&edition=us
Very intriguing indeed.
If you keep throwing chairs, one day you'll break windows....
Even the next prime minister of Canada is blogging
if you look at the # 1 answer for EVERY search on googoo these daze, ucann see why there's a knead to reduce the options/results to exclude most of the relevant answers.
This is going to be nice...It's quite annoying to get a bunch of weblogs when you're looking for something
:)
They are probably going to have to expand their cluster in order to add another cache....Hopefully it won't impact search times...but google usually does a good job at adding in their new grids relatively well.
Microsoft does something similar...they cache everything that is !(pro_microsft) and (Linux) the problem is they don't let you search this index
Here is a mirror:
Google to Create Blog Search Engine?
It's 10 PM. Do you know if you're un-American?
Am I the only one who thinks it is funny to see all the anti-blog comments every time a weblog related story is posted? IMHO, Slashdot is a weblog.
I think I originally found Slashdot on RobotWisdom-- yet another weblog. But that was a couple of years ago...
maybe it'll solve CmdrTaco's troubles about him getting emails from people looking to crack hotmail.
The One Rule Of Chess You'll Ever Need: Don't play someone who carries a kit in their bookbag.
One of the biggest newspapers in Norway, where I live, has recently said they believe blogs to be the new 'killer app' for delivering information on the net. The problem with that is that the threshold for publishing 'news' is so low, anybody can do it. This makes it very difficult for people to find the info they are looking for. At the same time there is no guarantee the info is useful or even correct. A good reputation will be more and more important for businesses and sites on the net.
This move by Google tells me newspapers in Norway aren't the only ones seeing how influential blogs will/could become. This is a truly great step forward if Google could come up with a way of rating the different blogs. That way you could easily find serious tech-blogs.
Wonder what rating /. would get though ;)
Be like the twenty-second elephant with heated value in space-Bark!
Each blog usually consists of hundreds if not thousands of pages. That increases the percentage of 0.032% somewhat.
Big brother doesn't control blogs.
As a previous poster briefly mentioned, what exactly is a blog? Would Slashdot forums be considered a blog? What about the myriad of ezboard message board forums out there, as well as other discussion websites? If the answer is no, it would be seemingly difficult and perhaps only of minor benefit to separate just the true "blog" sites while ignoring the other sites.
And what about ebay? Quite often I am searching for info on an old piece of electronics I've picked up someplace, and I do a Google search, hoping to find information about the item. Well, all I get in return are ebay links to a similar item that was sold on ebay a few months ago. And even then, I click on the link, hoping to see what the item sold for (and thus get an appraisal), but the auction has been removed from the database due to it being several months old. Why index ebay pages? It's really frustrating.
Loomis
"The television is the retina of the mind's eye" - Videodrome
Google doesn't index user sigs, so stop trying to "Google Bomb" with them.
I also reply below your current threshold.
Separating the blog index helps people do more precise searching. Most blogs contain too much personal information.
I really don't mind finding blog links when I search for something, as they usually at least link to some relevant sources.
On the other hand, it is really a pain to search for help on something, and instead of getting a useful, authoritative document, I'll get a half-dozen archived unanswered mailing list posts from people with the same problem. I would much rather Google address this dilution from mailing lists.
The requested URL
Hopefully they'll do the same with that zillion of mailing lists...
What exactly is a Blog? Can anyone answer? What makes it different from a personal web site full of links and comments, such as has existed on the Web for more than 5 years? What makes it something "new"?
I work at a company that has a blog-like recap of political news of interest for our clients and friends. If google tries to separate all sites with blog-like content, won't this naturally reduce my rank without actually increasing the source of information? Or am I missing something? How is google going to search for blog-like sites?
I think you're confusing a weblog with a "livejournal". A weblog is similar to slashdot (or warblogging.com and back-to-iraq.com). In fact, my weblog (http://privon.com) deals with politics, science, and civil rights as well as opinion pieces I've written about various issues. A weblog is another source of information.
What you're thinking of is commonly called a "livejournal" and it's exactly that - a journal. Some blogs are also journals. For example, I've got two 'blogs'. One is the one I mentioned above. The other is slightly more journal oriented, with me posting about things I've done that my family and friends (and possibly others) might find interesting. For example, I've recently posted about visiting the Trek Bicycles Demo Day as well as some of my latest photography experiences.
It might be beneficial for you to review your definition of a blog. Blogs can be an excellent source of information, not just a diary.
neurostar
Wouldn't it be better if they included blogs in their searches by default and then had a 'remove blogs from this search' link?
I think this solution would make everyone happy.
Why not post them where they belong, blogs are electronic diaries.
"Dear Blog, I can barely keep my eyes open. It's not even 10pm! I've done a good amount of walking today but nothing spectacular... ........."
thank God the internet isn't a human right.
Why not just create a "-source" flag or, as has been suggested, "-noblog"? Why are blogs being marginalized as any less authoritative than other hits? Why is using "-" (eg: ["trading cards" -hockey]) utilized for weeding out certain criteria but not employed here when the goal is the same? Could we at least have a flag for combining the two results?
A comparison is being made between blogs and the newsgroups which are worlds apart in a number of different ways not the least of which is the thread-nature of the groups.
What defines a blog, anyway? What defines a not-blog? Is CNN.com a blog? Is it not a blog because many people write for it, because of the number of hits it gets or because it has press credentials? Which category does indymedia.org fit into?
Will I only get news results when I search for "ferret care?"
What if the source IS a blog? If the subject IS the blog, will a news site reporting on the blog wind up in the main search results while the subject itself -- the blog -- be only in the blog search?
My
Limekiller
I love this idea... and I have been waiting for something like it for some time...
Think about it... I would love to search the blogosphere to see how widespread certain news items have become, or how widespread a certain opinion is...
You could use something like this to measure the spread of ideas (at least within a vocal and technologically suave minority).
Ehh, the point of this message is to inform the uninformed of the wonderfulness of Google News. It automatically features prominent headlines from all over the web, and you can search for topics, keywords, etc. in the search bar and have results sorted by relevance or date. News articles are mostly excluded from the normal index, which makes Google News the best headline locator on the Internet, by far.
There's no indication whether or not blogs will be left in or out of search results. This is very different from USENET, which was never part of the web in the first place. Orlowski is far from an unbiased source on this, having published many articles critical of bloggers in general. While two sources are cited which are critical of the effect that blogs have had on the google ranking algorithm, none are cited which show the contributions personal publishers have made to the info-sphere.
Far more authoritative sources than I have already weighed in on this.
While there's certainly a lot of inane content available in blog form, this isn't really any different than it was before. I have never had to wade through 500 pages of results to find an original source either. The whole thing reeks of FUD to me. Methinks that Orlowski and Roddy have their own axes to grind.
Howard Dean for president
Alright, fair enough - but how do you identify a weblog? They can do this for blogger/blogspot/whatever that they bought, and maybe standard software like Movable Type etc. But what about sites based on slash, phpnuke or totally custom code? And where does a weblog begin and a news site end?
Filtering out usenet news is relatively easy, but weblogs? Mhhh, I shall remain sceptical until I see it implemented.
A very valid point, mod parent up. I've faced the same problem. Incidentally, I haven't faced any problem from "mountains of blogs" clogging up the "ratings system": few people will link to a blog if it is content-free, so IMHO pagerank is enough for filtering out useless blogs. OTOH, pagerank doesn't work very well on mailing list archives, because links to the archives as a whole say nothing about how useful an individual post is likely to be.
If they really want to make their search engine useful, they ought to separate out Web archives of mailing list discussions. Blogs usually link back to where they got the story, so with only a little digging, you can find the original material. Mailing list discussions, though, are often out of date, irrelevant, and lacking in easy-to-follow references. They annoy me much more when I'm looking for things on the Web.
I think google should also separate the family history
stuff. Often when I'm looking for info I'll get countless pages about
people who don't exist any more and I spend a lot of
time sifting through all the results.
How is this flamebait? I don't think the poster meant any harm by it; it's a legitimate question. Anyone who has used slashdot's search feature knows how poor it is. Slashdot should license google searching technology to use. Oh, that's right. OSDN's stock is in the negative numbers. They actually OWE money to their stockholders. Hehehe
Thanks for making Google a link to Google's web site. I would never have been able to find it! Maybe I could have googled for it. Oh wait, nevermind.
I am guessing they will just skip and index separately the large blog sites that contribute to vitiating google's page ranking results. It's conceivable that the page rank system can be used to distinguish ranking anomalies characteristic to these sites and thus weed them out.
I don't think this will affect people running blog-like pages on their own sites though, if that is the case.
How often does the phrase "current mood" appear?
How often does the phrase "listening to" appear on the same page as "current mood"?
Does "George Bush" or "shrub" appear on the same page as "dictator", "simian", or "ass"?
Is Wil Wheaton mentioned on the page?
It's a start. Google will have to pay me for more...
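For the curious, the checks in the list above could be sketched as a crude scoring function. This is only an illustration: the name `blog_score`, the weights, and the threshold are invented assumptions, and nothing here reflects how Google actually classifies pages.

```python
import re

def blog_score(page_text: str) -> int:
    """Return a crude 'blogginess' score; all weights are made-up numbers."""
    text = page_text.lower()
    score = 0

    # How often does the phrase "current mood" appear?
    mood_hits = text.count("current mood")
    score += 2 * mood_hits

    # Does "listening to" appear on the same page as "current mood"?
    if mood_hits and "listening to" in text:
        score += 3

    # Does one of the names appear alongside one of the insults?
    names = ("george bush", "shrub")
    insults = ("dictator", "simian", "ass")
    has_name = any(name in text for name in names)
    # Word boundaries keep "ass" from matching inside words like "class".
    has_insult = any(
        re.search(r"\b" + re.escape(word) + r"\b", text) for word in insults
    )
    if has_name and has_insult:
        score += 1

    # Is Wil Wheaton mentioned on the page?
    if "wil wheaton" in text:
        score += 1

    return score

def looks_like_blog(page_text: str, threshold: int = 3) -> bool:
    """Hypothetical cutoff: anything scoring at or above the threshold is a blog."""
    return blog_score(page_text) >= threshold
```

A diary-style page with both phrases clears the threshold easily, while a plain reference page scores zero. Real pages would need far more signals than these four.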
Drat! Foiled again! Gone are the days when I could put devious meta tags in my blog template to make sure that I pop up on Google. I love it when people search for something only to find MY blog pop up. HA HA HA HA HA!
No really...You know, my blog doesn't really say anything either. I just want people to read my rantings. Is it really that much of a waste of time to see my blog pop up? How long does it take to click on the search result only to find that it doesn't have what you want? Huh? Like about 30 seconds? Now I gotta get shoved into some obscure category search. Really now...what are you doing searching Google that's so important that you need to save a few seconds here and there?
The problem with Google's thinking is that weblogging is becoming more and more prevalent. It's so easy for a rookie office worker to earn some extra cabbage by telling his boss "hey, I can set us up a web site real easy." You watch: more and more web logs will be seen on the web standing in as business and government web sites. Are they to be shoved into a category search as well?
Mr. Bond, they have a saying in Chicago: Once is happenstance. Twice is coincidence. The third time is enemy action.
Forbes story on Google, May 12, via yahoo finance.
Heh, I only seem to have this problem with Linux problems. I know this is going to be modded troll, but it's true.
Like if I search on some Samba error I'm getting, I'll find a question in a mailing list by some guy with the same problem, but there'll almost never be an answer.
I don't need no instructions to know how to rock!!!!
Feedster.com (formerly known as Roogle, for RSS Google) is a blog search engine that has been around for a while now. It'll be interesting to see how Feedster does once Google comes out with their engine. If it's shot to oblivion, it won't be the first time Google dominated a search engine niche.
Linux at home
I can just see the red, blue, yellow, and green logo...BLOOGLE. Will the new term for searching blogs explicitly be "Bloggling", or will it be "Bloogling"?
thank god for that.
I am the Alpha and the Omega-3
There is a wealth of categorization systems out there. Generally, they "position" the sites in an imaginary, highly-dimensional space, depending on whether keywords occur (and how often/prominent etc.), and on certain structural properties of the documents. You can then try to define separating hyperplanes, which are functions that divide the ("feature") space into separate compartments, so you can group documents together.
Usually, these systems are trained on a set of sample documents that are already categorized, in this case, for instance, a thousand blog pages and ten thousand non-blog pages.
An example for this would be Support Vector Machines and Joachims' text classification algorithm.
Relevant keywords (from the field) to look for include "Maximum Entropy Models", "classifiers", "categorization", "Bayesian *" (whatever), "Neural Network Classifiers", "Data Mining"...
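To make the "separating hyperplane" idea concrete, here is a minimal sketch. It swaps in a plain perceptron rather than a Support Vector Machine, since a perceptron also learns a linear separator and fits in a few lines. The vocabulary, the four training samples, and all function names are invented for illustration.

```python
# Toy vocabulary: each page becomes a vector of keyword counts (an assumption,
# real systems use far larger feature sets).
VOCAB = ["current mood", "listening to", "posted by", "abstract", "references"]

def features(text):
    """Position a document in feature space by counting vocabulary terms."""
    text = text.lower()
    return [float(text.count(term)) for term in VOCAB]

def train_perceptron(samples, epochs=20):
    """samples: list of (text, label) pairs, label +1 (blog) or -1 (not blog)."""
    w = [0.0] * len(VOCAB)
    b = 0.0
    for _ in range(epochs):
        for text, label in samples:
            x = features(text)
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if label * activation <= 0:
                # Misclassified: nudge the hyperplane toward this sample.
                w = [wi + label * xi for wi, xi in zip(w, x)]
                b += label
    return w, b

def predict(w, b, text):
    """Return +1 if the document falls on the 'blog' side of the hyperplane."""
    x = features(text)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Invented, pre-categorized training documents.
TRAIN = [
    ("Current mood: tired. Listening to: the radio. Posted by me", 1),
    ("Posted by Joe. Current mood: happy", 1),
    ("Abstract: we study turtles. References: [1]", -1),
    ("Abstract and references for a turtle care guide", -1),
]

w, b = train_perceptron(TRAIN)
```

After training, `predict(w, b, some_text)` places a new page on one side of the learned hyperplane or the other. With only four samples this is a toy; the thousand-versus-ten-thousand training set described above is what would make it useful.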
This means that I won't be able to find /.'s address any more using my super-spiffy google search button in mozilla. How will I live any more!?
Slashdot IS a blog. Because we're not talking about some Google employee sitting around and making a judgement call on every link on the net, it's obviously going to be automated by robots.
Slashdot, like other blogs, pollutes search engine searches with their "permalinks," which, although they might be useful, certainly constitute a blog. In fact, one of the problems with blogs and search engines is that they generate thousands of clickable hyperlinks effortlessly. It's great for someone reading a blog and trying to bookmark a certain section - it's terrible for the guy who wants information on combatting spam through more effective use of his SMTP server and has to search through 30 pages of /. and K5 chatter to find some substance.
Certainly, Google's criteria for what defines a blog might be helpful, but it seems to me like you're subjectively deciding which blogs are legitimate news sources and which are "some kid rambling on." Say whatever you like about the legitimacy of /., but make no mistake about it, it's a blog.
Perhaps it's more of an issue with technical questions. I constantly use Google to look for answers to, among other things, technical questions. More often than not, I find an answer or at least a lead that gets me pointed in the right direction. Oddly enough, they're usually from archived mailing lists if I do a web search. And I find that the quickest route is often via Google's usenet search. So yea... maybe a separate mailing list search might be a very useful thing indeed.
As an aside, my most recent dead end involved a Win2K error that's been popping up on one of my boxes. Usenet is full of variations on this error reported over the years without any good answers to what causes it. That doesn't mean that my Linux and Solaris searches are always gems - but it does suggest that such dead ends can be found for almost any platform on a case by case basis.
The Blogs are sites which are Hubs, i.e. they contain a lot of outgoing links to diverse sites. The source sites are usually Authorities, i.e. a lot of other sites link to them. IBM had developed an engine called Clever, at around the same time as google, that gives separate ranks for Hubs and Authorities.
I miss my rubber keyboard.(Homepage)
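The hub/authority split mentioned above is usually presented as Kleinberg's HITS algorithm. A minimal sketch of that iteration follows, assuming a tiny made-up link graph; the page names and the function name `hits` are illustrative, and this is not a description of Clever's actual implementation.

```python
def hits(links, iterations=50):
    """links: dict mapping each page to the list of pages it links to.

    Returns (hub, authority) score dicts, each normalized to unit length.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # A page's hub score is the sum of the authority scores it links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded across iterations.
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth

# Invented example: two blogs pointing at source sites.
example_links = {
    "blog_a": ["turtle_site", "news_site"],
    "blog_b": ["turtle_site"],
    "turtle_site": [],
}
hub_scores, auth_scores = hits(example_links)
```

In this toy graph the blogs end up with all the hub weight and the source sites with all the authority weight, which is the separation the comment describes.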
I must use google hundreds of times a day and it seems to be as good at finding what I'm looking for as it always has been. I like the idea of being able to search only blogs, but is there a need to remove them from the main index too?
All these specialized search engines are nice (usenet, images, blogs), but I still want the ability to search everything at once. Being able to find everything under the sun by typing "g [text]" in my browser's location bar is the best part about google to me. Please don't complicate it needlessly :).
Game... blouses.
Perhaps it's more of an issue with technical questions.
;)
totally. If I'm researching some odd compiler error message- the last one was about a struct not being completely defined, or something cryptic like that. (It had to do with typedef'ing the struct, after which you don't have to say "struct" anymore) All that led me to was an AIX mailing list, where their fix was to comment the struct out. Morons
But more often than not I just get other people who HAVE the problem, yet no solution.
In the future, I would want to not be isolated from my friends in the Space Station.
i'd prefer it if google got rid of all the 1.5billion pages of crap, and just showed 1.5billion pages of results that weren't crap.
Uh, first mistake. There's some lengthy disclaimer on the net by someone who makes the correlation that his/her (it's gotta be a her) blogger is no different than a paper diary hidden underneath someone's bed. Wish I could cough up the URL, since it comes off as so pretentious.
Mind you, I find it 100 times easier to read his/her blogger from the comfort of my own home, as opposed to breaking in someone's house and ganking a Teen Girl Squad-like diary.
In my opinion, anyone's public-entry [blogger || lj || dj || diaryland ] is no different than a web page. Sure, you may only intend to write it for your friends("My ex-boyfriend sucks! Tori rules!"), but ANYONE can see it. Make sure you remember that when you reveal personal details about yourself. If a non-friend reads that, you can't accuse him/her of being a stalker when it's there for anyone to read.
but filtering out ephemeral content in general would be good -- blogs would be included in this. so would mailing list archives, news stories, online stores, auctions, discussion groups, etc.
when i'm searching, i almost always prefer a page that somebody authored and put up as a permanent resource (or as permanent as the web allows). the top-level pages of the ephemeral sites would probably be good to keep in the main index, though i'm not sure how you index, e.g., the /. homepage.
-esme
Um, the fact that you work for a developer of Blogging Software that offers a theme called "Slashesque" wouldn't be coloring your opinion here, now, would it?
Actually, from your perspective, do you see Google's move here as somehow "de-legitimizing," or at least taking some of the wind out of the sails of the Blogging fad? What kind of features are your users looking for in the "next-generation" of web-diary gear?
As for my "subjective" decision, no, what I'm saying is that if the person "blogging" derives revenue at the end of the day for his "blogging" work, then his efforts are a class apart from the amateur Web diarists, and Google should not ignore them the way they will be (albeit politely, seating them at the "kids' table") ignoring the "amateur" bloggers.
I've never had any problems with blogs, but the archived mailing lists are what really bugs me. Searching for something, only to have the first 10 pages of hits be duplicates in various archives of a list makes finding relevant information a bit more difficult.
http://www.masturbateforpeace.com/
I should make it clear that I'm not making a statement as to the legitimacy of weblogs. God knows, my little open source project is mostly used by people as a personal outlet for their own stuff, not companies. Installations of it are probably some of the most guilty of the so-called "search engine perversion" mentioned in these articles. I don't know that separating blogs from google searches is a bad idea.
However, I firmly believe that slashdot IS a blog, as is K5, as are many legitimate news sources, and they will probably be filtered as well by whatever googlebot determines what is and isn't a blog. Just something to consider.
There is a well-known "Longbet" page (Longbets.org), and bet number 2 is:
"In a Google search of five keywords or phrases representing the top five news stories of 2007, weblogs will rank higher than the New York Times' Web site."
It's a bet between Dave Winer (Userland.com) and Martin Nisenholtz (New York Times Digital).
So there will be no winner in this bet? BTW - most people agree with Dave Winer.
Read the full bet story
This Is Not a Sig
Oy. If Slashdot had managed to perform even a minimum amount of editorial diligence (which, pot, here's kettle, is what the Register rails on bloggers for not doing), they'd have found pretty quickly that this article is yet another installment in Andrew Orlowski's (an up-and-coming Dvorak-wannabe) ongoing jihad against weblogs. Don't believe the hype.
Dunno. I hadn't really noticed this until this story showed up. I remember something in the reg some weeks ago, but it didn't really make an inroad. What do you people search for that has been googlewashed? My google searches are mostly technical stuff, which is probably the reason why I hadn't noticed this. Bloggers normally don't write about tech stuff and the few ones who do, write useful things, so it's not a problem.
Enquiring minds want to know.
However, I firmly believe that slashdot IS a blog, as is K5, as are many legitimate news sources, and they will probably be filtered as well by whatever googlebot determines what is and isn't a blog.
Nope. Google has already got most of those sites tagged as "news" and therefore can rather easily exclude them from the "blog" category if such be their wish.
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
So Google is going to stop indexing blogs along with all other web pages in order to reduce worthless hits to queries. Why should this be such a big deal?
I am looking through the posts here and can't get over the obvious power that Google is now exerting over the Web. Other posters have all sorts of tips and tricks for trying to get Google's attention or manipulate it when necessary. The reason they have put in the time and effort to do so is that web sites that don't get picked up by Google just don't get visited.
A few months ago someone posted and mentioned Google-Watch.org This site is worth a visit. Since Google has become so many web users' default search page, Google now wields so much power that they are bending the web around them to suit their own corporate purposes. There are rumours of Microsoft buying Google. You want a frightening monopoly scenario? Try combining a desktop OS monopoly with the cookie-based records of millions of web users and Microsoft's well-known penchant for hyper-aggressive marketing and you have a potentially lethal Force on the web. A friend of mine was running an improperly-registered version of MS Office and following an upgrade of Internet Explorer, his MS Office was disabled. Of course, when I run Windows Update on a machine, I am assured that MS is not collecting any information about that system. Sure.
Who believes that this change in Google's policy will leave blogging unaffected? Obviously many bloggers enjoy having their content show up on Google and now that is being dampened. What is next from the almighty Google? What other form of web content will they choose to dump into the Internet's equivalent of Siberia?
All of this makes me wonder if there isn't some latent human tendency to put all power into one entity. Surely Linux users, many of whom are bloggers and serious Google users, should see there is tremendous danger in continuing to allow Google to exert so much power over the Web.
In principio erat Verbum.
Perhaps a simple solution would be an "Include Blogs" checkbox next to or underneath the Google search box. Also, you should be able to set your Google prefs to have it checked/un-checked by default.
Determining what is and what is not a blog will be a lot harder than determining what is and is not in a newsgroup.
I think this is a bad idea. Google has made a mistake if they think what we currently call "blogs" are a novelty item. Blogs are the future of the web, even if a lot of people are using the technology for toy purposes today.
I want to be able to search the entire web in a single index, blogs and all. If PageRank is giving too much noise and not enough signal due to blogs, then fix PageRank.
OK,
- B
http://www.bradheintz.com/
- updated
Instead of always having dynamic pages, how about having the CMS systems spit out static pages, using the CMS only to create them in the first place? They would run 10X faster and maybe survive Slashdottings!
I feel that the Register is putting too much emphasis on the negative aspects of blogging. Blogging is a legitimate way to help rank pages. Upper 10% intelligent people who mine the web everyday discern their favorite sites, take the effort to put it on a site, and as a result, should be given credit for making the quality of google searches what it is. I agree, there may be cases where this clique will help links bubble up that only satisfy their specific needs. But aren't they customers too? I bet Blog writers use google more than anybody else. Also, can't they come up with a compromise like just discounting blogs' PR value? Rather than separating them as a sort of "bastard part of the net" why not just put a discount value or some diminishing function on the number of blogs that point to a certain site. I agree journals have become something separate from plain old web pages. But they shouldn't be written out completely or shunned into their own separate space. The voices of the bloggers matter in the value of the content that bubbles up, they're not just a "bunch of kids" linking for "vanity" and participating in some "fad" like the stodgies would like to have you believe.
Philosophistry
That's a really good question. What kind of input are you giving Google? Are you being specific enough? I mean...when I want lyrics, I usually type "lyrics Flinch". If I'm interested in reading speculations on whether Hussein is really dead, I type "Hussein dead" (and I'm not as apt to get anything like this, as when I search for "Saddam Hussein").
What happens to writers and websites that publish exclusively through blog tools like diarist.com and Blogger?
Google has also announced it will have a category for Press Releases.
Free Web based FTP
I agree. I'd like a separate 'search mailing list' feature so as to distinguish between 'primary documents' and 'newsgroup-like' stuff.
Google hasn't announced any such thing, at least as far as removing weblog content from the main search is concerned. If you read the article, you'll note that it's Orlowski speculating about a Slashdot comment, of all things - specifically, a comment from the William Gibson blog thread. evhead posted about this Register article on Friday.
The Slashdot summary says "Google, search engine of choice for pretty much everyone, has announced that it will begin a seperate index for blogs and remove them from the normal index," but the FA says "It isn't clear if weblogs will be removed from the main search results, but precedent suggests they will be." This is the author, Andrew Orlowski's own interpretation, not a statement from Google; and IMHO, it's a bad misreading of just what the precedent is, since he goes on to talk about their Usenet archive ("Groups"), something which is naturally and fundamentally separate from the web. It would've made more sense to cite the "Images" and "News" tabs, whose results, AFAIK, are not at all filtered out of the main search results.
Personally, I've never run into the alleged problem, either.
Share and Enjoy: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
Chill out before you break those fragile wrists pounding your desk about some 1s and 0s.
I hate hate HATE the word "Blog". It's weblog. The word is WEBLOG. It makes me nuts that the drooling masses have now standardized this ridiculous mutation of the perfect word for something into what is in actuality a comic-book sound effect.
To the person who coined this brilliance, fuck you. To all of you simpletons that have adopted its use, fuck you more.
Which is why Google is not eliminating them entirely, just moving them over to their own search.
Google hasn't announced any such thing. The whole story is made up by Andrew Orlowski, the IT equivalent of "The weekly world news" (Elvis gives birth to 300 pound baby; Google to filter blogs). He manufactures attention grabbing stories because it draws a lot of links and traffic from places that don't bother to check stories (like slashdot).
Here's a comment from Ev, of Blogger/Google.
In much the same way as 6 degrees of kevin bacon works, I'd venture to say that you can find more than half of all useful web sites within six degrees of web linking. The interesting thing will be that blogs do form their own little distinct cluster. This also applies to other aspects of computer networks, as well as certain human beings.
now we need to go OSS in diesel cars
This is good because the mass of content in the existing index has become quite diluted. Google has been trying to fix that as of late and it has been quite a headache for them.
This shouldn't surprise many people, but as far as I know, Orlowski is full of crap. Again. If Google didn't find that blogs improved the results (and I don't know, I would assume they test these things, like, constantly), do you suppose they'd increase the frequency at which they crawl them, or decrease it? Yes, that's what I think.
However, I hope they maintain links between the main search and the blog search. Finding primary sources, then having a button linking to all blog comments on this topic, would be a great research tool.
With all the cross referenced porn links floating to the top of PageRank now "accounted for", it looks like they're filtering out what might well be the next largest group of cross-referenced material: the blogger. With blogs filtered into a side channel, one wonders what will be identified as "polluting" PageRank next?
By the time they've finished Google will look like... um.... Yahoo!
Ian.
A physicist is an atom's way of thinking about atoms
As the "purchaser" of their service, I'm entitled to try to reduce the price to whatever level I think is fair by any legal means available. While I personally don't find text ads intrusive--they are a fair price for the service--other people may not, and you're foolish to try to convince them otherwise. The only right and wrong in the commercial world is the price at the end of the day.
It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
they've announced no such thing. doesn't anyone actually follow the links? so, it starts out with a guy saying that they're gonna start a blog search and it gets twisted one step at reuters one more step at the register, and by the time it gets to slashdot it's a formal announcement that blog search results will be excluded from main results.
/. was one of the original weblogs. I don't recall seeing anything even close to this type of format before slashcode was released, and then the idea was copied widely.
That said, I agree the Google move is a good one. But I still think blogs are useful when searching for info. If I find a blog while searching for a news article, I may not care what the person has to say about the article, but if they provide a link to the story I was looking for, where's the harm?
Updated frequently ... "posted by" ... dates ... hosted on one of the popular blogging sites ... Links to and is linked from other weblogs
Sounds like the news sections of most SourceForge.net projects I've run into. They're updated frequently (release early, release often), the maintainers frequently post status updates on given dates, SourceForge.net has a lot of them, and they link to other projects that use their code or that contribute code that they use.
Is SourceForge.net a blog?
Will I retire or break 10K?
I guess this will make it difficult to change the name from 'blog' to 'weblog' in every day speech, now that Google might be using it on the front page. Blog has got to be one of the dumbest buzz words to come out of the net yet.
But most likely in a couple years when the blog fad dies down we can stop cringing at the word.
J
Blog
Hella
Boxen
hm... what else...
A Web Page created by a person is usually created for a task in mind - Showing off a project ... A Blog is usually created as a online journal or diary, often for a group of friends.
What about one hostname whose HTTP space contains both a project web page and a related online journal? Such as my site or Forgotten's site or a random SourceForge.net project?
Will I retire or break 10K?
Let's assume that the Register is actually correct, and that blogs are reducing the usefulness of Google's famous PageRank algorithm for indexing web pages. How should Google respond?
First, they could update the PageRank algorithm so it works better for the WorldWide Web of today, in 2003, and not just for the WorldWide Web of several years ago. This would recognize that the WorldWide Web is everchanging and that the same search engine algorithm is not going to work well year after year without continuous tweaks and updates.
Second, they could take "troublemaker" sites--all the sites that are different from standard issue sites as they were designed in 1999, meaning blogs--and move them out of Google's index for the WorldWide Web and into their own, "junk" category. Over time, this would chop more and more of the valuable content out of Google's index, and would cut Google off from blogs, where most innovation on the web is occurring today. Google may not think blogs are important to its search engine business, but as each day passes the blog revolution marches on, with Google or sans Google. They will cut themselves out of the new revolution at their own peril.
Blog techniques are taking over the web. It may look like a fraction of a percent on paper, but blogs are where the action is today. Web publishing is going to be more and more decentralized. Blogging is fun, even if you don't get a lot of hits. Blogging is informative. Whenever I do a search on Google, I specifically look for blogs, because I know they are current and up to date, and talking about the same thing that I'm interested in.
Google apparently is motivated by two concerns: (1) Google is unwilling or unable to fix PageRank to cut out some of the so-called "blog noise." The only way they can deal with the everchanging web is to cut it into pieces. Alternatively, (2) blogs don't pay Google money, and thus they want sites that either pay them money now or will in the future to appear in front of blogs. The more money that's in it for Google, the higher Google will rank them. Unfortunately, people will see this as the corrupt practice that it is over time, and Google's credibility will suffer.
If Google really decides to move blogs into their own category, it will be a significant departure. It will be segregation. Separate and unequal. If so, it would be the writing on the wall for Google. At that point, I would be forced to recommend that people cash out of their Google stock.
As for Google's competitors, like Alta Vista and Teoma, they must be licking their chops now. Google is talking about blogs as if they were a chink in its armor. I'd exploit that to the utmost.
I was just using Teoma the other day. It's improving a lot. Google ought to be worried. Google should fix its technology to stay ahead. Instead, as the Register is reporting, they are contemplating segregation as a political strategy to maintain market dominance. That's quite disappointing. It's a strategy that will certainly not work in the long run.
they need to filter out all these websites that are created like "http://www.searchingspammers.com/download_windows_updates.php" and such.. very annoying when you need information and all you get is fake websites.. blogs haven't bugged me much at all.
I've left to find myself. If you happen to see me, please, keep me there until I return.
I don't mind the mailing list archives so much as seeing the same mailing list archived on 50 different sites and scrolling past all 50 listings of a particular message on Google.
Gates' Law: Every 18 months, the speed of software halves.
I hope they will fetch blog contents via RSS.
It lets them download only incremental updates,
and it is more traffic-efficient than downloading
and indexing full HTML pages.
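A minimal sketch of the incremental part, assuming an RSS 2.0 feed whose items carry a pubDate. The function name new_items and the sample feed are made up for illustration, and nothing here reflects how Google's crawler actually works.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def new_items(rss_xml: str, last_crawl: datetime):
    """Return (title, link) pairs for feed items published after last_crawl.

    Assumptions: RSS 2.0 layout (<item> with <title>, <link>, <pubDate>),
    RFC 822 style dates, and a timezone-aware last_crawl.
    """
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        if pub is None:
            continue  # undated items are skipped in this sketch
        when = parsedate_to_datetime(pub)
        if when.tzinfo is None:
            # Dates without a usable offset are treated as UTC (an assumption).
            when = when.replace(tzinfo=timezone.utc)
        if when > last_crawl:
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


if __name__ == "__main__":
    sample = """<rss version="2.0"><channel><title>demo</title>
    <item><title>old post</title><link>http://example.com/1</link>
    <pubDate>Mon, 12 May 2003 10:00:00 +0000</pubDate></item>
    <item><title>new post</title><link>http://example.com/2</link>
    <pubDate>Wed, 14 May 2003 10:00:00 +0000</pubDate></item>
    </channel></rss>"""
    print(new_items(sample, datetime(2003, 5, 13, tzinfo=timezone.utc)))
```

Only the items newer than the last crawl need to be indexed; everything else in the feed can be ignored.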
a world in progress...
It's not just weblogs he's against: just search on "El Reg" for "google" and see what strange articles you come up with "google news is edited by humans yet google claims it is by computer program - but who programs the computers?" style article. The author? His surname starts with O...
For example: try searching google for "key biscayne triathlon trilogy" (no quotes). You will only find useless links. Or try finding a bike ride we have here, "great tour of coconut grove" -- same story.
Yet I've had decent content for both of those topics, with photos, in pages that have dates in them, here and here.
Or, if you want yet another example...try searching for "Arnaldo Cohen Jacksonville" -- Arnaldo Cohen is a pianist...and I have a page up with qt movies of the performance, here -- and google doesn't have it.
I see googlebot on my "blog" all the time -- but nothing's ever added to the index, or if it is, it stays there for 1 day and then disappears.
Alltheweb.com, which is powered by fastsearch's engines, has nearly all my stuff indexed, even though they have about 1 billion fewer pages indexed than google.
So what's the story? Am I being excluded because Google thinks I'm a blog? Or do Google's crawlers suck?
If Pagerank works the way it's supposed to, this won't happen...you'll only see the posts that a lot of other people are linking to.
I fucking hate blogs.
Like i give a fuck what some average joe fucktard in missurah thinks about this or that.
Good riddance.
blog sites?
geocities
Tripod
words like "blog" "weblog", "last summer I went on vacation to" etc
The above mentioned sites can have very useful information. I wholeheartedly agree about news having lousy sources (see Joey Skaggs' website).
Even if you use search criteria, what's to prevent some doehead from getting his own domain name? (It's only about $8 from a good name registry service)
Personally, I think it may be useful to have a blogging search, but it will be near impossible to remove many blogs besides the ones listed on major websites.
void
I've tried the "lynx" browser that works fine. Hope this isn't intentional, because it makes it really difficult for visually-impaired people if they try to block text-based browsers (ad revenue worries, maybe?)
No word from Google yet...
Rather than separating stuff, why not make it a series of choices using check-boxes. Example:
Include Web-pages: [X]
Include Blogs: [ ]
Include Usenet: [X]
And so forth. You can get better combos this way. If they add other "web types" in the future, you can combine searches without having to go to each one. They could still include a dedicated listing if they want, but I hope they don't hard-wire their data that way to prevent or reduce multi-factor searches in the future.
Even more generic would be to have a pull-down list of the "strength" of each search. Thus, if you wanted weblogs included, but given less weight, you might assign it a lower number. Zero would be the same as a no-check above. However, this is perhaps too confusing to most users.
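Here is roughly what the "strength" pull-down could do behind the scenes. This is only a sketch: combine_results, the source-type labels, and the relevance numbers are all hypothetical, and a weight of zero behaves like an unchecked box.

```python
def combine_results(results, weights):
    """Merge hits from several source types using per-type weights.

    results: list of (url, source_type, relevance) tuples (made-up format).
    weights: dict mapping source_type to a strength; 0 or missing excludes it.
    """
    scored = []
    for url, source_type, relevance in results:
        weight = weights.get(source_type, 0.0)
        if weight <= 0:
            continue  # same effect as leaving the checkbox unticked
        scored.append((relevance * weight, url))
    # Highest weighted relevance first.
    scored.sort(reverse=True)
    return [url for _, url in scored]


if __name__ == "__main__":
    hits = [
        ("http://example.com/page", "web", 0.5),
        ("http://example.com/blog-post", "blog", 0.9),
        ("http://example.com/usenet-msg", "usenet", 0.4),
    ]
    # Blogs included at half strength, Usenet switched off.
    print(combine_results(hits, {"web": 1.0, "blog": 0.5, "usenet": 0}))
```

The plain check-boxes are then just the special case where every weight is either 0 or 1.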
Table-ized A.I.
I've never had a problem with my web search results being "polluted" with irrelevant blog posts, but I have had a problem where a blog post that matches my search has scrolled off of the blog's main page. Instead of reducing the rank of blogs, I think Google should try to return the correct archive link for the front-page post that matches my search.
The shareholder is always right.
According to the article, Google just recently purchased Blogger, which made a set of Blogging tools. I'm sure there are certain characteristics shared by almost all blogs made with Blogger software. This will allow Google to at least take a large portion of definite blogs out, and then refine the system as it goes.
if(!toilet_paper) roll.replace(new roll);
Many services have come to rely on Google's API, and in turn, on Google's organization of web content. For example, the automated search service Google Alert may have to reconfigure their service to meet the new demands.
i have a weblog. i think what i say is every bit as relevant as the corporate-sponsored extended mind-control advertising that CNN calls news.
and who is google to say my ideas aren't as important?
they have a democratic system for rating popularity, it's based on "links as votes". it works well... people try to trick the system - but by and large...it works.
maybe they can improve the system by actually allowing real voting? or improving the accessibility of the voters?
but not deciding that "some journalism is better than others"...
It's authoritarian, abuse of power, and.... evil
because they need to improve their ranking system. some sites that *look like* weblogs contain excellent journalism and real info. who's to say what an important blog is versus an unimportant one? some nazis who work at google?
Anyone who has been on the internet for any length of time knows that weblogs are a form of spam. Just another platform for egomaniacs to rant and rave, and clog up search engines with garbage.
Stupid Toggers - all they want is Google Juice!
J-Log: Journalism News, Media Views
Knee-jerk ad-blocking will only kill SPONSORED content on the net. And quite frankly I'd rather vote with my eyes and let the "freeware" services die than listen to one more quick word from their sponsors. I can live without Google. I can live without CNN.com. I can live without Slashdot. And I absolutely refuse to support those who _can't_ live without them on MY eyeball time.
In case you nodded off during economics class, the basic principle of a capitalist society is that the consumers do whatever the hell they want. I don't want to watch ads, so I'll block them. And before you start whining about how I'm "stealing" from the advertisers, just remember that the SITE OWNER sold a CHANCE to reach me to the advertisers. The fact that I get "content" for free is merely coincidental.
Any time you _demand_ that consumers should do something not in our own best interests, you'll get a quick reality enema.
http://www.longbets.org/2
...they remove the blogs from main, then re-incorporate the highest-hitting blogs from the new search back into the main? Then you may not miss a relevant and useful blog while avoiding the one that is mainly about some highschool girl and funny text messages that she got from her friends?
My blog can kick your blog's ass
From the /. blurb: "has announced that it will begin a seperate index for blogs and remove them from the normal index"
Please to read first paragraph of article before posting.
I find blogs especially annoying when I'm searching for a relatively obscure MP3 (i.e. not on P2P) with Google, and I get a bunch of hits, but they're all from "Listening to:" headers on blog entries.
Now if they'd just do something about those *$&@* "WinAmp generated playlist", "my MP3 collection", "files you can request via email", and "sorry the files got taken down but I'm leaving the page up because I'm an asshat" pages, but you can't have everything I guess.
Omnes arx vestrum sunt adiuncta nobis.
While Google is good at providing access to information in multiple languages, its news search is monolingual (English), at the moment. A new blog section might similarly mean a new section for English-based blogs only - not the approach I would like to see.
Rather than wait to see how Google will handle this and the issue of determining what is and is not a blog, this would be a good time for people involved in the affected industries and media to set forth practical definitions of blogs, klogs and whathaveyou and make recommendations.
Sooner the better.
I'm getting two fairly strong messages here. Firstly, there is a need for a useful search mechanism for retrieving the gold from the mess of Blog information. (And other forum-based or otherwise ephemeral content.)
The second message is that these ephemeral sources do not follow the same rules as other sites. My (normal, non-blog) website contains strategy articles and so-on for a computer game (Heroes of Might and Magic); that material is neither going to change nor to move, and I link to places that I do not expect to change or move, and to the best places I know of, rather than wherever I encountered things; and MapHaven is not part of an incestuous web of linking. Whereas with blogs, their links are distributed rather more freely; and the often deleterious effect on Google searches has been noted. And people here are also complaining of archived forum posts or web-message-boards polluting Google results with opinions rather than results.
I think it is only sensible that Google, in its quest to provide the most useful links for its users' searches, handle the different kinds of sources differently. Frankly, I'd be surprised and disappointed if they didn't improve their methods in this sort of way sooner or later. (Sooner or later someone else would - and Google would probably fall from its pinnacle as the most used search engine.)
Rachel
It would also be nice if they can cancel or down rank listing sites, these are mainly sites created by affiliate suckers with hundreds of pointers and zero content.
It's very annoying when searching for some topic to have heaps of these listing sites pop up in the search results. In my experience, these sites are more annoying and create more interference than blog entries.
I assume that because of their cross linking they get higher rankings, in many cases higher than the direct search targets. I mean for example, when searching for a brand name, you might get some of these listing sites higher than the site of the brand owner.
It would be nice if google people can find algorithms that can identify and penalize such sites - for example, pages with a great number of pointers and little content, or pages that have a big number of keywords from a collection of reference directories such as brand names or manufacturers etc. or maybe making a distinction between 'clean' pointer listings and those with bogus affiliate id URL's can offer a reasonable solution.
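The first heuristic (lots of pointers, little content) is easy enough to sketch. This is only an illustration of the idea, not anything Google is known to do; the function name and both thresholds are made-up assumptions that would need tuning on real pages.

```python
from html.parser import HTMLParser


class _LinkTextCounter(HTMLParser):
    """Counts <a href=...> tags and words of text in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1

    def handle_data(self, data):
        # Crude: counts all text, including any inline script or style text.
        self.words += len(data.split())


def looks_like_listing_page(html: str, min_links: int = 20,
                            max_words_per_link: float = 5.0) -> bool:
    """Flag pages with many links and very little text per link.

    Both thresholds are arbitrary placeholders, not measured values.
    """
    counter = _LinkTextCounter()
    counter.feed(html)
    counter.close()
    if counter.links < min_links:
        return False
    return counter.words / counter.links <= max_words_per_link


if __name__ == "__main__":
    farm = "".join(
        f'<a href="http://example.com/{i}?aff=123">buy {i}</a>' for i in range(50)
    )
    article = "<p>" + "word " * 500 + '</p><a href="http://example.com/">source</a>'
    print(looks_like_listing_page(farm), looks_like_listing_page(article))
```

A real filter would also have to spare legitimate link directories, which is where the distinction between 'clean' listings and affiliate-id URLs would come in.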
or, maybe you should work for the team of fascists at Google that will decide the difference between "blogs" and "news sources"?
... not "real" news sources.
and by the way... kuro5hin and slashdot are technically "blogs"
yep, rather than letting this indy media form mature and learning how to democratically rate the good ones (yours) from the bad (mine).... they will simply crush it and give what little power they have back to corporate mindf*ck behemoths like CNN...
note: my blog is, umm, clearly listed... and it's bad - but not as bad as my slashdot posts...
This move attempts to put power back in the hands of CNN and places like "the register". The register, really is nothing more than a weblog... and poorly written one at that.
Of course they will rejoice that Google is seeking to destroy the democratic power of a million internet voices, cross-linking, meme-propagating. Articles like this one have scared the pants off of media giants. Recently, the New York times blasted "technology" as the real source of "deceptive journalism".
"I don't want 5000 blogs that have the word stalin in them somewhere that on the large are just pointless ramblings by nobodies."
is my site pointless? is it a rambling by a nobody? what about slashdot?
what we need is for google to fix its "avowedly democratic" ranking system... so that the more important sites stand out... regardless of whether they look like a "weblog"... not have some fascist sit around and decide for the rest of us what is and what isn't important enough to be called "news"
You might want to try Gmane, a mailing-list to news and back gateway which has a quite useable web interface as well.
Of course it only carries groups that a) at some time subscribed there and b) do want to be carried, but for example open source projects are covered quite nicely.
Wolfgang
So should Google strip out all news as "weblogging" except AP and original journalism?
No.
If people link to slashdot stories... instead of to the original source.... it's because the slashdot story, or perhaps the comments, are perceived as more relevant/interesting than the original. Many times, authors interject their own opinion, or bring together multiple links into a single framework.
An even better example of "weblog as news" is kuro5hin. Occasionally, real news gets published at kuro5hin by reporters who have witnessed crimes and walked in marches.
The only real problem is that Google's ranking system works at the "site level" and not at the "story level"
So the *whole site* often gets ranked up because of *one story*... dragging all the crappy stories with it.
The solution to this is a more *granular pagerank* system that cleverly incorporates tags. *NOT* the exclusion of important media sources from Google's engine!
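A sketch of what ranking at the "story level" might look like: the simplified textbook PageRank power iteration, with each node being an individual story URL instead of a whole site. This is not Google's production system, it leaves the "tags" part out entirely, and the URLs, damping value, and iteration count are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping url -> list of outbound urls.

    Assumption: every node is a single story or page, so one popular story
    does not automatically lift the other stories on the same host.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Pages with no outbound links spread their rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {
        "blog-a.example/post1": ["news.example/good-story"],
        "blog-b.example/post7": ["news.example/good-story"],
        "news.example/good-story": [],
        "news.example/filler-story": [],
    }
    for url, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(f"{score:.3f}  {url}")
```

In this toy graph the well-linked story ends up ranked above the filler story on the same host, which is the behavior being asked for.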