Slashdot Mirror


Google To Create "Blog" Search; Potentially Remove From Main

Skyshadow writes "Google, search engine of choice for pretty much everyone, has announced that it will begin a separate index for blogs and remove them from the normal index, handling them instead in much the same way as their usenet archives. This will hopefully put an end to the recent difficulties locating primary source material among the mountains of blogs which are clogging the ratings system." There have been comments from elsewhere that say they won't be removing them - but that remains to be seen.

18 of 304 comments

  1. blogs.google.com? by fewnorms · · Score: 5, Insightful

    Thing is, some of these blogs actually contain some pretty handy info from time to time, as blogs are becoming more and more used as a cheap and easy alternative to a content management system imho ....

    --
    Veni, Vidi, Velcro!
  2. yay and aaah by DaLiNKz · · Score: 3, Insightful

    What about personal sites that may seem like blogs? Example: mine. I have a blog, but later on I plan to add more content and such. Hopefully it doesn't remove my site from the main index, or will at least return it once the site becomes useful.

    --
    I've left to find myself. If you happen to see me, please, keep me there until I return.
  3. Yes! But will there be a metasearch? by DeHar · · Score: 5, Insightful

    This is a great idea, especially since many issues have much more commentary than source content. I love the quote "But what happens when the weblog fad dies down?"

    However, I hope they maintain links between the main search and the blog search. Finding primary sources, then a button linking to all blog comments on this topic would be a great research tool.

  4. Re:'Bout time by NReitzel · · Score: 3, Insightful

    That's what makes google valuable, now isn't it? They consistently do a good job (better than most) of separating the wheat from the chaff from the link farms.

    --

    Don't take life too seriously; it isn't permanent.

  5. Personally.. by xchino · · Score: 5, Insightful

    I've found some of the best information on blogs. I have no problem with them making a blog-specific search, but as with the Linux-specific search, I hope relevant sites can still be found from the main search. It would be a pain to have to search every individual google engine for one bit of info. As it is now, I can use the main search and be pretty sure that I'm going to get a relevant result regardless of what category the site falls under. If I'm looking up what IIRC stands for, I don't really care if I get the info from Joe Blow's blog or from howstuffworks.com.

    --
    Everyone is entitled to their own opinion. It's just that yours is stupid.
  6. Will they at least link to the new search? by gleffler · · Score: 4, Insightful

    I just hope that Google does at least say "Hey, you might be able to find what you're looking for on our blog search" at the end or something - like they do now with Google Answers. I do applaud their effort to make their database even more relevant though, and it is yet another reason I have to admit to being a shameless whore for Google.

  7. Re:'Bout time by Qzukk · · Score: 4, Insightful

    Probably the distinction they will make will be between publicly-available blogging space (livejournal, deadjournal, pitas, and so on) and a personal website that is or contains a blog. This would be the easiest way, since it comes down to setting aside a few hostnames for the new search engine to crawl.
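
    A rough sketch of that hostname test in Python. The host list and the function name `is_hosted_blog` are assumptions made for illustration; nothing here reflects how Google actually decides.

```python
from urllib.parse import urlsplit

# Hypothetical list of public blogging hosts, taken from the examples above.
BLOG_HOSTS = {"livejournal.com", "deadjournal.com", "pitas.com"}

def is_hosted_blog(url):
    """Return True if the URL's host is a listed blogging host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in BLOG_HOSTS)
```

    A personal site that merely contains a blog would fall through this test and stay in the main index, which is the distinction being described.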

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.
  8. Re:'Bout time by arvindn · · Score: 5, Insightful
    what constitutes a 'blog'?

    I was wondering about that too. It's not black and white, of course, especially when you want to automate it. I can think of several indications that a page is a blog; some weighted linear combination of these factors should work well enough in practice if you spend some time tweaking the weights:

    • Updated frequently
    • Keywords like "blog", "weblog", "posted by", "comments", "permanent link", and so on.
    • Got dates all over the place
    • Is hosted on one of the popular blogging sites (blogspot, lj, /. journals...)
    • Links to and is linked from other weblogs.
    This last factor is important. If you start from a rough heuristic and execute an iterative algorithm, similar to how they calculate pagerank, your blog detection algorithm will get better.
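
    The two steps above (a weighted score, then iterative refinement over links) could be sketched roughly like this in Python. The feature names, the weights, the damping value, and both function names are made up for illustration; this is a toy under those assumptions, not PageRank itself and not anything Google has described.

```python
# Hypothetical hand-tuned weights for the non-link signals listed above.
WEIGHTS = {
    "updated_frequently": 0.2,
    "blog_keywords": 0.3,
    "many_dates": 0.15,
    "known_blog_host": 0.35,
}

def base_score(features):
    """Weighted linear combination of feature values (each between 0 and 1)."""
    return sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)

def propagate(base, links, damping=0.5, iterations=20):
    """Repeatedly blend each page's own score with the mean score of the
    pages it links to or is linked from, so linked pages pull each other
    toward similar "blogginess"."""
    neighbours = {page: set() for page in base}
    for src, dst in links:
        if src in neighbours and dst in neighbours:
            neighbours[src].add(dst)
            neighbours[dst].add(src)
    scores = dict(base)
    for _ in range(iterations):
        updated = {}
        for page, own in base.items():
            near = neighbours[page]
            # Pages with no known links simply keep their own score.
            link_score = sum(scores[n] for n in near) / len(near) if near else own
            updated[page] = (1 - damping) * own + damping * link_score
        scores = updated
    return scores
```

    With this setup, a page that looks neutral on its own but is linked with an obvious blog ends up with a higher score than an unlinked neutral page, which is the effect the link factor is meant to capture.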
  9. Ummm... no by neurostar · · Score: 3, Insightful

    I think you're confusing a weblog with a "livejournal". A weblog is similar to slashdot (or warblogging.com and back-to-iraq.com). In fact, my weblog (http://privon.com) deals with politics, science, and civil rights as well as opinion pieces I've written about various issues. A weblog is another source of information.

    What you're thinking of is commonly called a "livejournal" and it's exactly that - a journal. Some blogs are also journals. For example, I've got two 'blogs'. One is the one I mentioned above. The other is slightly more journal oriented, with me posting about things I've done that my family and friends (and possibly others) might find interesting. For example, I've recently posted about visiting the Trek Bicycles Demo Day as well as some of my latest photography experiences.

    It might be beneficial for you to review your definition of a blog. Blogs can be an excellent source of information, not just a diary.

    neurostar
  10. Blur by limekiller4 · · Score: 4, Insightful

    Why not just create a "-source" flag or, as has been suggested, "-noblog"? Why are blogs being marginalized as any less authoritative than other hits? Why is "-" (e.g. ["trading cards" -hockey]) used for weeding out certain criteria but not employed here when the goal is the same? Could we at least have a flag for combining the two results?
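
    To make the idea concrete, here is a small Python sketch of a query parser that treats "-noblog" as a flag next to ordinary "-term" exclusions. "-noblog" is only the flag suggested in this thread, not a real Google operator, and `parse_query` is a hypothetical name.

```python
import shlex

def parse_query(query):
    """Split a query into included terms, excluded terms, and flags.

    Quoted phrases stay together; "-word" excludes a term; the
    hypothetical "-noblog" is collected as a flag instead.
    """
    include, exclude, flags = [], [], set()
    for token in shlex.split(query):
        if token == "-noblog":
            flags.add("noblog")
        elif token.startswith("-") and len(token) > 1:
            exclude.append(token[1:])
        else:
            include.append(token)
    return include, exclude, flags
```

    Under this sketch, blogs would stay in one index and the flag would only filter results at query time, rather than moving them to a separate engine.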

    A comparison is being made between blogs and the newsgroups which are worlds apart in a number of different ways not the least of which is the thread-nature of the groups.

    What defines a blog, anyway? What defines a not-blog? Is CNN.com a blog? Is it not a blog because many people write for it, because of the number of hits it gets or because it has press credentials? Which category does indymedia.org fit into?

    Will I only get news results when I search for "ferret care?"

    What if the source IS a blog? If the subject IS the blog, will a news site reporting on the blog wind up in the main search results while the subject itself -- the blog -- is only in the blog search?

    --
    My .02,
    Limekiller
  11. Don't forget Google News... by crashnbur · · Score: 3, Insightful
    ...remove them from the normal index, handling them instead in much the same way as their usenet archives...
    One would think that the Google blog search would work more similarly to the Google News search, which searches headlines from online news publications all over the web from all over the world. Google Groups is, as you know, just usenet... Google News, however, like the new Google blog search, will be indexing sites on the world wide web (ostensibly removed from the normal index).

    Ehh, the point of this message is to inform the uninformed of the wonderfulness of Google News. It automatically features prominent headlines from all over the web, and you can search for topics, keywords, etc. in the search bar and have results sorted by relevance or date. News articles are mostly excluded from the normal index, which makes Google News the best headline locator on the Internet, by far.

  12. Blogs removed from google = FUD by lysurgon · · Score: 5, Insightful

    There's no indication whether or not blogs will be left in or out of search results. This is very different from USENET, which was never part of the web in the first place. Orlowski is far from an unbiased source on this, having published many articles critical of bloggers in general. While two sources are cited which are critical of the effect that blogs have had on the google ranking algorithm, none are cited which show the contributions personal publishers have made to the info-sphere.

    Far more authoritative sources than I have already weighed in on this.

    While there's certainly a lot of inane content available in blog form, this isn't really any different than it was before. I have never had to wade through 500 pages of results to find an original source either. The whole thing reeks of FUD to me. Methinks that Orlowski and Roddy have their own axes to grind.

  13. They should separate mailing list archives first by Anonymous Coward · · Score: 5, Insightful

    If they really want to make their search engine useful, they ought to separate out Web archives of mailing list discussions. Blogs usually link back to where they got the story, so with only a little digging, you can find the original material. Mailing list discussions, though, are often out of date, irrelevant, and lacking in easy-to-follow references. They annoy me much more when I'm looking for things on the Web.

  14. Re:'Bout time by Anonymous Coward · · Score: 5, Insightful

    I, for one, am happy to search for material only to find that the page is some asshat's blog...

    because what is important, in my point of view, is to GET THE ANSWER to what I'm looking for.

    And if the answer is in a weblog that belongs to "Linux-freaks.Adhzerbahidjan", it still is the answer I'm looking for...

    I mean, answers to things like "Proftpd doesn't seem to accept fxp connections" or "why the hell is this part of my distro not working as I wish" can only be offered by people having the same problem and discussing it in a blog.

    Another reason I prefer weblogs to, say, IRC is that I don't have to humiliate myself asking "basic" questions to the 15-year-old guru nicknamed "EvilRootBeer"; I just have to parse a few blogs and get my answer without ANY fine manual to read.

    "Nothing against blogs, but you never know where this material came from." Because you KNOW where the news from CNN is coming from? I mean, they show proof and research material every time they air a show or a major groundbreaking news item ("Mass destruction weapons found in Iraq", "Terrorist Bretzel Fails Coup d'Etat"...)

    At least with blogs and the net, you can try and cross check the data, whereas with tv, you usually only gulp some more mountain dew.

    I just wish you had to find your Linux docs using the manuals provided on the distro and absolutely no other access to raw data...

  15. Bullshit. Please read. by sethadam1 · · Score: 4, Insightful

    Slashdot IS a blog. Because we're not talking about some Google employee sitting around and making a judgement call on every link on the net, it's obviously going to be automated by robots.

    Slashdot, like other blogs, pollutes search engine searches with their "permalinks," which, although they might be useful, certainly constitute a blog. In fact, one of the problems with blogs and search engines is that they generate thousands of clickable hyperlinks effortlessly. It's great for someone reading a blog and trying to bookmark a certain section - it's terrible for the guy who wants information on combatting spam through more effective use of his SMTP server and has to search through 30 pages of /. and K5 chatter to find some substance.

    Certainly, Google's criteria for what defines a blog might be helpful, but it seems to me like you're subjectively deciding which blogs are legitimate news sources and which are "some kid rambling on." Say whatever you like about the legitimacy of /., but make no mistake about it, it's a blog.

  16. Re:journals by fjordboy · · Score: 3, Insightful

    I'm not concerned about the journals so much as forums and discussion boards in general. The blogs don't bother me nearly as much as looking for something on google and finding that the first 30 results are just people spouting out opinions on message boards... not unlike usenet. I've had to sift through page after page of forums and discussions to find the real information. I'm all for adding a blog.google.com or something, but I think that doing a similar thing with discussion boards and forums would be a good idea as well.

    However, I think there is a potential problem with blogs that also contain real content or at least original content. A lot of people have regular webpages that they just update regularly in a blog fashion... will there be a separation?

  17. Re:journals by AndroidCat · · Score: 3, Insightful
    Putting "127.0.0.1 pagead.googlesyndication.com" in /etc/hosts did the trick, though...

    You might want to use 0.0.0.0 instead. That way you won't get an access attempt on localhost. I usually only block annoying ads (x10) or privacy problems (doubleclick). I don't see the point in blocking Google's text ads.
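
    For anyone with a long block list, a small Python sketch that rewrites hosts-format text from one sink address to the other. The function name `retarget_hosts` is made up, it only transforms a string (it does not touch /etc/hosts itself), and how 0.0.0.0 is handled can vary by operating system.

```python
def retarget_hosts(text, old="127.0.0.1", new="0.0.0.0"):
    """Point blocked hostnames at `new` instead of `old` in hosts-format text."""
    out = []
    for line in text.splitlines():
        # Look only at the part before any comment when deciding what to change.
        fields = line.split("#", 1)[0].split()
        # Leave blanks, comments, and the real localhost entry untouched.
        if len(fields) > 1 and fields[0] == old and "localhost" not in fields[1:]:
            line = line.replace(old, new, 1)
        out.append(line)
    return "\n".join(out)
```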

    One day I'm going to put a mini-server on 127.0.0.1 that serves up cute cat pictures instead of blocked banner ads. :^)

    --
    One line blog. I hear that they're called Twitters now.
  18. Offtopic... by jasno · · Score: 3, Insightful

    I've never had any problems with blogs, but the archived mailing lists are what really bug me. Searching for something, only to have the first 10 pages of hits be duplicates in various archives of a list, makes finding relevant information a bit more difficult.

    --

    http://www.masturbateforpeace.com/