Google To Create "Blog" Search; Potentially Remove From Main
Skyshadow writes "Google, search engine of choice for pretty much everyone, has announced that it will begin a separate index for blogs and remove them from the normal index, handling them instead in much the same way as their usenet archives. This will hopefully put an end to the recent difficulties locating primary source material among the mountains of blogs which are clogging the ratings system." There have been comments from elsewhere that say they won't be removing them - but that remains to be seen.
lousy words is gettin in the way of my pictures
I went to battle MC Escher but drew a blank
Is there any chance of having an RSS feature for journals, for everyone or even just subscribers?
Thing is, some of these blogs actually contain some pretty handy info from time to time, as blogs are becoming more and more used as a cheap and easy alternative to a content management system imho ....
Veni, Vidi, Velcro!
I, for one, am sick of searching material only to find that the page is some asshat's blog. Nothing against blogs, but you never know where this material came from.
OTOH, what constitutes a 'blog'? Is Slashdot a blog? Is this a blog? The lines are constantly being blurred, and I'm not sure it'll be easy for google to make that distinction.
My journal has hot
Nobody wants to read your blog and this just proves that point!
What about personal sites that may seem like blogs? For example, mine. I have a blog, but later on I plan to add more content and such. Hopefully it doesn't remove my site from the main index, or will at least return it once the site becomes useful.
I've left to find myself. If you happen to see me, please, keep me there until I return.
This is a great idea, especially since many issues have much more commentary than source content. I love the quote "But what happens when the weblog fad dies down?"
However, I hope they maintain links between the main search and the blog search. Finding primary sources, then a button linking to all blog comments on this topic would be a great research tool.
Most of the useless information people put into blogs. Although, when you search for information, would you want to search 2 different locations? This is the whole claim to Google's fame. I have found that many times people post how-tos in their blogs along with other information.
If it ain't broke...don't fix it
-Rob
I've found some of the best information on blogs. I have no problem with them making a blog specific search, but like the Linux specific search I hope relevant sites can still be found from the main search. It would be a pain to have to search every individual google engine for one bit of info. As it is now, I can use the main search and be pretty sure that I'm going to get a relevant result regardless of what category the site falls under. If I'm looking up what IIRC stands for, I don't really care if I get the info from a JoeBlow's blog or from howstuffworks.com.
Everyone is entitled to their own opinion. It's just that yours is stupid.
Now I can find and read about people's mundane activities more efficiently.
e.g.
9:30am :- ate some toast
9:40am :- went to the toilet
10:50am :- left the toilet to check the number of hits on my blog
11:45am :- got a phone call, was wrong number. Might get a real call one day.
and so on and so forth..
Be you Admins? nay, we are but lusers!
I also like the analogy made by the article to the voting system where a page votes for a topic: an expert site on turtles voting for turtles once a day every year vs. a blog mentioning turtles once in that same period leads to the expert site winning.
I hate liberals. If you are a liberal, do not reply.
I just hope that Google does at least say "Hey, you might be able to find what you're looking for on our blog search" at the end or something - like they do now with Google Answers. I do applaud their effort to make their database even more relevant though, and it is yet another reason I have to admit to being a shameless whore for Google.
I wonder if this is also intended to stop Googlewashing? Google has a history of trying to 'play fair' - and the power of a few well connected blogs to basically 'take possession' of any term works against that philosophy.
Am I the only one who thinks it is funny to see all the anti-blog comments every time a weblog-related story is posted? IMHO, Slashdot is a weblog.
I think I originally found Slashdot on RobotWisdom-- yet another weblog. But that was a couple of years ago...
maybe it'll solve CmdrTaco's troubles about him getting emails from people looking to crack hotmail.
The One Rule Of Chess You'll Ever Need: Don't play someone who carries a kit in their bookbag.
One of the biggest newspapers in Norway, where I live, has recently said they believe blogs to be the new 'killer app' for delivering information on the net. The problem with that is that the threshold for publishing 'news' is so low, anybody can do it. This makes it very difficult for people to find the info they are looking for. At the same time there is no guarantee the info is useful or even correct. A good reputation will be more and more important for businesses and sites on the net.
This move by Google tells me newspapers in Norway aren't the only ones seeing how influential blogs will/could become. This is a truly great step forward if Google could come up with a way of rating the different blogs. That way you could easily find serious tech-blogs.
Wonder what rating /. would get though ;)
Be like the twenty-second elephant with heated value in space-Bark!
As a previous poster briefly mentioned, what exactly is a blog? Would Slashdot forums be considered a blog? What about the myriad of ezboard message board forums out there, as well as other discussion websites? If the answer is no, it would be seemingly difficult and perhaps only of minor benefit to separate just the true "blog" sites while ignoring the other sites.
And what about ebay? Quite often I am searching for info on an old piece of electronics I've picked up someplace, and I do a Google search, hoping to find information about the item. Well, all I get in return are ebay links to a similar item that was sold on ebay a few months ago. And even then, I click on the link, hoping to see what the item sold for (and thus get an appraisal), but the auction has been removed from the database due to it being several months old. Why index ebay pages? It's really frustrating.
Loomis
"The television is the retina of the mind's eye" - Videodrome
Google doesn't index user sigs, so stop trying to "Google Bomb" with them.
I really don't mind finding blog links when I search for something, as they usually at least link to some relevant sources.
On the other hand, it is really a pain to search for help on something, and instead of getting a useful, authoritative document, I'll get a half-dozen archived unanswered mailing list posts from people with the same problem. I would much rather Google address this dilution from mailing lists.
I work at a company that has a blog-like recap of political news of interest for our clients and friends. If google tries to separate all sites with blog-like content, won't this naturally reduce my rank without actually increasing the source of information? Or am I missing something? How is google going to search for blog-like sites?
I think you're confusing a weblog with a "livejournal". A weblog is similar to slashdot (or warblogging.com and back-to-iraq.com). In fact, my weblog (http://privon.com) deals with politics, science, and civil rights as well as opinion pieces I've written about various issues. A weblog is another source of information.
What you're thinking of is commonly called a "livejournal" and it's exactly that - a journal. Some blogs are also journals. For example, I've got two 'blogs'. One is the one I mentioned above. The other is slightly more journal oriented, with me posting about things I've done that my family and friends (and possibly others) might find interesting. For example, I've recently posted about visiting the Trek Bicycles Demo Day as well as some of my latest photography experiences.
It might be beneficial for you to review your definition of a blog. Blogs can be an excellent source of information, not just a diary.
neurostar
Wouldn't it be better if they included blogs in their searches by default and then had a 'remove blogs from this search' link?
I think this solution would make everyone happy.
Why not just create a "-source" flag or, as has been suggested, "-noblog"? Why are blogs being marginalized as any less authoritative than other hits? Why is "-" (e.g. ["trading cards" -hockey]) used for weeding out certain criteria but not employed here when the goal is the same? Could we at least have a flag for combining the two results?
A comparison is being made between blogs and the newsgroups which are worlds apart in a number of different ways not the least of which is the thread-nature of the groups.
What defines a blog, anyway? What defines a not-blog? Is CNN.com a blog? Is it not a blog because many people write for it, because of the number of hits it gets or because it has press credentials? Which category does indymedia.org fit into?
Will I only get news results when I search for "ferret care?"
What if the source IS a blog? If the subject IS the blog, will a news site reporting on the blog wind up in the main search results while the subject itself -- the blog -- be only in the blog search?
My
Limekiller
I love this idea... and I have been waiting for something like it for some time...
Think about it... I would love to search the blogosphere to see how widespread certain news items have become, or how widespread a certain opinion is...
You could use something like this to measure the spread of ideas (at least within a vocal and technologically suave minority).
Ehh, the point of this message is to inform the uninformed of the wonderfulness of Google News. It automatically features prominent headlines from all over the web, and you can search for topics, keywords, etc. in the search bar and have results sorted by relevance or date. News articles are mostly excluded from the normal index, which makes Google News the best headline locator on the Internet, by far.
There's no indication whether or not blogs will be left in or out of search results. This is very different from USENET, which was never part of the web in the first place. Orlowski is far from an unbiased source on this, having published many articles critical of bloggers in general. While two sources are cited which are critical of the effect that blogs have had on the Google ranking algorithm, none are cited which show the contributions personal publishers have made to the info-sphere.
Far more authoritative sources than I have already weighed in on this.
While there's certainly a lot of inane content available in blog form, this isn't really any different than it was before. I have never had to wade through 500 pages of results to find an original source either. The whole thing reeks of FUD to me. Methinks that Orlowski and Roddy have their own axes to grind.
Howard Dean for president
<span title="you know, in order to spread more 'Google censors Evhead' suspicions"></snip></span>
<!-- Andrew Orlowski strikes with another brilliant theory designed to get attention from bloggers (even though the number of their readers is of course "statistically insignificant"). Well shit, I'm biting.
Based on Eric Schmidt's mentioning of a blog search, Orlowski suggests that Google will remove blogs from the main index.
This shouldn't surprise many people, but as far as I know, Orlowski is full of crap. Again. If Google didn't find that blogs improved the results (and I don't know, I would assume they test these things, like, constantly), do you suppose they'd increase the frequency at which they crawl them, or decrease it? Yes, that's what I think.
Too bad my headline isn't any truer than the Register's.-->
Alright, fair enough - but how do you identify a weblog? They can do this for blogger/blogspot/whatever that they bought, and maybe standard software like Movable Type etc. But what about sites based on slash, phpnuke or totally custom code? And where does a weblog begin and a news site end?
Filtering out usenet news is relatively easy, but weblogs? Mhhh, I shall remain sceptical until I see it implemented.
If they really want to make their search engine useful, they ought to separate out Web archives of mailing list discussions. Blogs usually link back to where they got the story, so with only a little digging, you can find the original material. Mailing list discussions, though, are often out of date, irrelevant, and lacking in easy-to-follow references. They annoy me much more when I'm looking for things on the Web.
Thanks for making Google a link to Google's web site. I would never have been able to find it! Maybe I could have googled for it. Oh wait, nevermind.
The thing about a Blog is the simple fact that while it may contain information that is of value to a person, most of the time it is simply a day-to-day journal of random thoughts and events.
A Web Page created by a person is usually created for a task in mind - Showing off a project (case mods, hacks on furbys, peep surgery), a fan information page (Dr. Who, Anime, Star Trek, Babylon 5), or a page created for a group (Local SCA Group, Computer User's Group, MMORPG Guild Page).
A Blog is usually created as a online journal or diary, often for a group of friends.
What tends to trip off the search engines are the Blog sites that link to other people by common interests. WWW.Livejournal.com allows you to have links by friends and by common interests. Were I to have a blog with them and set up Star Trek as one of my interests, then I'd likely end up with several hundred names of people that also like Star Trek.
Google goes out and farms new sites. It hits so-and-so's blog in Livejournal. It sees a link mentioning Star Trek and follows it...then it sees about 1000 more ST links... 1001 ST links that likely won't have a dang thing about Star Trek on the pages (unless someone happens to brag about how he scored ST:TNG season 1 on ebay for a song).
More and more people are blogging, and this is why blogs (which have been around for quite a while) are now starting to become a concern for the search engines trying to improve the signal-to-noise ratio.
I like Google's idea. One of the reasons blogs link together is often so people can network with people who share common interests. If you don't and just want to learn about Star Trek, you can find real information by going to the main page, while the people looking for fellow ST fans can go to the blog page.
Makes sense to me
Phoenix
-- Wiccan Army, 13th Airborne Division "We will not fly silently into the night"
How often does the phrase "current mood" appear?
How often does the phrase "listening to" appear on the same page as "current mood"?
Does "George Bush" or "shrub" appear on the same page as "dictator", "simian", or "ass"?
Is Wil Wheaton mentioned on the page?
It's a start. Google will have to pay me for more...
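For anyone who wants to play with the checklist above, here is a minimal sketch in Python. The function name `blog_score` and the point values are invented for illustration; only the phrases come from the list, and nothing here reflects how Google actually classifies pages.

```python
import re

def blog_score(text: str) -> int:
    """Score a page against the tongue-in-cheek checklist; higher means more blog-like.

    The weights are arbitrary assumptions, not tuned on any data.
    """
    lowered = text.lower()
    score = 0

    # One point per occurrence of "current mood".
    mood_hits = lowered.count("current mood")
    score += mood_hits

    # Two extra points if "listening to" shares the page with "current mood".
    if mood_hits and "listening to" in lowered:
        score += 2

    # One point if a name and an insult appear on the same page.
    names = ("george bush", "shrub")
    insults = ("dictator", "simian", "ass")
    has_name = any(name in lowered for name in names)
    # Word boundaries keep "ass" from matching inside words like "class".
    has_insult = any(
        re.search(r"\b" + re.escape(word) + r"\b", lowered) for word in insults
    )
    if has_name and has_insult:
        score += 1

    # One point for a Wil Wheaton mention.
    if "wil wheaton" in lowered:
        score += 1

    return score
```

A page scoring zero is not proven to be a non-blog, of course; the cutoff for calling something a blog would have to be picked by hand.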
I can just see the red, blue, yellow, and green logo...BLOOGLE. Will the new term for searching blogs explicitly be "Bloggling", or will it be "Bloogling"?
There is a wealth of categorization systems out there. Generally, they "position" the sites in an imaginary, high-dimensional space, depending on whether keywords occur (and how often/prominently etc.), and on certain structural properties of the documents. You can then try to define separating hyperplanes, which are functions that divide the ("feature") space into separate compartments, so you can group documents together.
Usually, these systems are trained on a set of sample documents that are already categorized, in this case, for instance, a thousand blog pages and ten thousand non-blog pages.
An example of this would be Support Vector Machines and Joachims' text classification algorithm.
Relevant keywords (from the field) to look for include "Maximum Entropy Models", "classifiers", "categorization", "Bayesian *" (whatever), "Neural Network Classifiers", "Data Mining"...
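To make the "separating hyperplane" idea concrete, here is a rough sketch in plain Python. It uses a perceptron, which is a much simpler linear classifier than an SVM, and the vocabulary, training texts, and function names are all made up for illustration. A real system would train on thousands of labelled pages and far richer features.

```python
from collections import Counter

def features(text, vocab):
    """Map a document to a point in feature space: one keyword count per axis."""
    counts = Counter(text.lower().split())  # naive whitespace tokenizer
    return [counts[word] for word in vocab]

def train_perceptron(samples, vocab, epochs=20):
    """Learn a hyperplane (weights, bias) from (text, label) pairs.

    Labels are +1 for blog and -1 for non-blog.
    """
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in samples:
            x = features(text, vocab)
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            # A misclassified (or undecided) sample nudges the hyperplane toward it.
            if label * activation <= 0:
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
    return weights, bias

def is_blog(text, vocab, weights, bias):
    """Report which side of the hyperplane the document falls on."""
    x = features(text, vocab)
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0

# Hypothetical toy data, standing in for a real labelled corpus.
VOCAB = ["mood", "today", "posted", "manual", "specification", "reference"]
SAMPLES = [
    ("current mood tired posted today", 1),
    ("today i posted about my mood", 1),
    ("reference manual for the smtp specification", -1),
    ("specification and reference manual", -1),
]

WEIGHTS, BIAS = train_perceptron(SAMPLES, VOCAB)
```

Calling `is_blog("my mood today", VOCAB, WEIGHTS, BIAS)` then checks which compartment of the feature space a new page lands in.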
Slashdot IS a blog. Because we're not talking about some Google employee sitting around and making a judgement call on every link on the net, it's obviously going to be automated by robots.
Slashdot, like other blogs, pollutes search engine searches with their "permalinks," which, although they might be useful, certainly constitute a blog. In fact, one of the problems with blogs and search engines is that they generate thousands of clickable hyperlinks effortlessly. It's great for someone reading a blog and trying to bookmark a certain section - it's terrible for the guy who wants information on combatting spam through more effective use of his SMTP server and has to search through 30 pages of /. and K5 chatter to find some substance.
Certainly, Google's criteria for what defines a blog might be helpful, but it seems to me like you're subjectively deciding which blogs are legitimate news sources and which are "some kid rambling on." Say whatever you like about the legitimacy of /., but make no mistake about it, it's a blog.
Perhaps it's more of an issue with technical questions. I constantly use Google to look for answers to, among other things, technical questions. More often than not, I find an answer or at least a lead that gets me pointed in the right direction. Oddly enough, they're usually from archived mailing lists if I do a web search. And I find that the quickest route is often via Google's usenet search. So yeah... maybe a separate mailing list search might be a very useful thing indeed.
As an aside, my most recent dead end involved a Win2K error that's been popping up on one of my boxes. Usenet is full of variations on this error reported over the years without any good answers to what causes it. That doesn't mean that my Linux and Solaris searches are always gems - but it does suggest that such dead ends can be found for almost any platform on a case by case basis.
but filtering out ephemeral content in general would be good -- blogs would be included in this. so would mailing list archives, news stories, online stores, auctions, discussion groups, etc.
when i'm searching, i almost always prefer a page that somebody authored and put up as a permanent resource (or as permanent as the web allows). the top-level pages of the ephemeral sites would probably be good to keep in the main index, though i'm not sure how you index, e.g., the /. homepage.
-esme
I've never had any problems with blogs, but the archived mailing lists are what really bug me. Searching for something, only to have the first 10 pages of hits be duplicates in various archives of a list, makes finding relevant information a bit more difficult.
http://www.masturbateforpeace.com/
Oy. If Slashdot had managed to perform even a minimum amount of editorial diligence (which, pot, here's kettle, is what the Register rails on bloggers for not doing), they'd have found pretty quickly that this article is yet another installment in Andrew Orlowski's (an up-and-coming Dvorak-wannabe) ongoing jihad against weblogs. Don't believe the hype.
Did you report it to the bug database - where it ISN'T OFFTOPIC?
Determining what is and what is not a blog will be a lot harder than determining what is and is not in a newsgroup.
I think this is a bad idea. Google has made a mistake if they think what we currently call "blogs" are a novelty item. Blogs are the future of the web, even if a lot of people are using the technology for toy purposes today.
I want to be able to search the entire web in a single index, blogs and all. If PageRank is giving too much noise and not enough signal due to blogs, then fix PageRank.
Google hasn't announced any such thing, at least as far as removing weblog content from the main search is concerned. If you read the article, you'll note that it's Orlowski speculating about a Slashdot comment, of all things - specifically, a comment from the William Gibson blog thread. evhead posted about this Register article on Friday.
Updated frequently ... "posted by" ... dates ... hosted on one of the popular blogging sites ... Links to and is linked from other weblogs
Sounds like the news sections of most SourceForge.net projects I've run into. They're updated frequently (release early, release often), the maintainers frequently post status updates on given dates, SourceForge.net has a lot of them, and they link to other projects that use their code or that contribute code that they use.
Is SourceForge.net a blog?
Will I retire or break 10K?
It's not just weblogs he's against: just search "El Reg" for "google" and see what strange articles you come up with, such as a "Google News is edited by humans yet Google claims it is by computer program - but who programs the computers?" style article. The author? His surname starts with O...
Rather than separating stuff, why not make it a series of choices using check-boxes. Example:
Include Web-pages: [X]
Include Blogs: [ ]
Include Usenet: [X]
And so forth. You can get better combos this way. If they add other "web types" in the future, you can combine searches without having to go to each one. They could still include a dedicated listing if they want, but I hope they don't hard-wire their data that way to prevent or reduce multi-factor searches in the future.
Even more generic would be to have a pull-down list of the "strength" of each search. Thus, if you wanted weblogs included, but given less weight, you might assign it a lower number. Zero would be the same as a no-check above. However, this is perhaps too confusing to most users.
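A quick sketch of how the per-type "strength" idea could work, in Python. Everything here is assumed for illustration: the function name `merge_results`, the tuple layout, and the notion that each hit arrives with a relevance number already attached. A weight of zero behaves like an unchecked box.

```python
def merge_results(results, weights):
    """Combine hits from several source types into one ranked list of URLs.

    results: list of (url, source_type, relevance) tuples.
    weights: dict mapping source_type to a strength; zero or missing excludes it.
    """
    scored = []
    for url, source_type, relevance in results:
        weight = weights.get(source_type, 0)
        if weight > 0:
            # Scale the hit's relevance by how much the user trusts its source type.
            scored.append((relevance * weight, url))
    # Highest weighted score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored]
```

With blogs at half strength, a blog hit needs roughly twice the raw relevance of a regular page to outrank it, so a really good blog post can still surface.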
Table-ized A.I.
...they remove the blogs from main, then re-incorporate the highest-hitting blogs from the new search back into the main? Then you may not miss a relevant and useful blog while avoiding the one that is mainly about some highschool girl and funny text messages that she got from her friends?
My blog can kick your blog's ass