Slashdot Mirror


Google Expanding To IRC?

AnimeFreak writes "According to this article in The Register, Google has apparently been involved in a little bit of activity in various IRC channels. Asked by IRC Junkie, Google said they're researching ways to improve their service and that the activity is only temporary. Could this mean an ability to search for information that is contained on IRC? Services such as Netsplit.de and Search IRC already exist, and both let you get information from various IRC networks. Is Google trying to replicate what both these sites have done?"

34 of 208 comments

  1. Terms. by saintlupus · · Score: 5, Funny

    "Search for w4r3z complete. Results 1-10 of eleventy billion:"

    --saint

    1. Re:Terms. by NightSpots · · Score: 2, Interesting

      Wouldn't it also be nice for google to have an IRC interface to their search engine?

      Google bots in popular channels. It could work.

  2. I believe there is already such a service by caston · · Score: 2, Informative

    and believe it or not it's called xgoogle.com

    --
    Beings aspergers AND pulling chicks... I enjoy the challenge!
  3. Concerned by FreeLinux · · Score: 4, Insightful

    The "information" on IRC is 99% crap. I'm concerned that, by integrating IRC searching in Google, the signal-to-noise ratio of Google will go way down. If, however, Google keeps it as a separate service like Usenet, I suspect that it will go away due to lack of interest.

    Who really wants to search IRC, except the Justice Department?

    1. Re:Concerned by SEWilco · · Score: 2, Insightful
      Who really wants to search IRC, except the Justice Department?

      The new guidelines, billed as a response to the September 11 terrorist attacks, permit the Bureau to engage in the "proactive collection of information on threats to the national security," displacing an older policy that obliged the FBI to have a specific investigative purpose before collecting information on individuals or groups. "FBI on look-out for foreign government hackers"

      Government workers on IRC sounds like a good idea to me. The more time on IRC, the less time they're messing up important things.

    2. Re:Concerned by platypus · · Score: 2, Interesting

      The advantage of IRC, though, compared to the Web, is that it is more reliable - in a very weak sense, but nonetheless.

      Think of Google's PageRank algorithm: it is in great danger of being made useless by link farms.

      That is because Google has problems separating link farms from "real" pages which link to each other and, by that, provide each other some trust (PageRank).

      With well populated IRC channels, Google's bots can have a higher trust that these channels are not artificial, like the link farms are.

      Although you are right that the information to be found there is crap in most cases, I could imagine Google harvesting known good help channels (linux-help, etc.) for URLs which are posted in conversations ("#bla-expert shouts: If you want to know more about bla, goto www.bla-project.org/documentation"), in order to better qualify web pages.
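      A minimal sketch of that harvesting idea, assuming the bot already has (channel, nick, text) tuples and a hand-picked set of trusted channels. The names `harvest_urls` and `rank_urls`, the loose URL pattern, and the "count distinct nicks" heuristic are all illustrative assumptions; nothing here is known to be what Google actually does.

```python
import re
from collections import defaultdict

# Assumption: a deliberately loose URL pattern, good enough for a sketch.
URL_RE = re.compile(r"(?:https?://|www\.)[^\s\"'<>)]+", re.IGNORECASE)


def harvest_urls(messages, trusted_channels):
    """Map each URL to the set of nicks who posted it in a trusted channel.

    `messages` is an iterable of (channel, nick, text) tuples.
    """
    posters = defaultdict(set)
    for channel, nick, text in messages:
        if channel.lower() not in trusted_channels:
            continue  # ignore channels we have no reason to trust
        for match in URL_RE.findall(text):
            url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
            posters[url.lower()].add(nick)
    return posters


def rank_urls(posters):
    """Order URLs by how many distinct nicks mentioned them, most first."""
    return sorted(posters, key=lambda url: (-len(posters[url]), url))
```

      Counting distinct nicks rather than raw mentions is one cheap way to keep a single chatty user (or bot) from inflating a URL on their own.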

  4. When will this end? by AtariAmarok · · Score: 5, Funny

    2005 - Google indexes all the things ever said on soap operas and talk radio.

    2007 - Did you forget what you said in your high school cafeteria in 1998? Don't worry, Google now has it indexed.

    2010 - Lost your car keys? Don't worry, Google knows. Just do a search and you will find them.

    --
    Don't blame Durga. I voted for Centauri.
    1. Re:When will this end? by presroi · · Score: 4, Funny

      2012 - No more questions to ask? No problem: google will find a new question for you.

    2. Re:When will this end? by cygnusx · · Score: 2, Funny
    3. Re:When will this end? by Richard_at_work · · Score: 3, Informative

      bash.org dude! Bash.org!!

  5. searching the irc by jlemmerer · · Score: 3, Insightful

    Well, how do you build up a reliable IRC database? I mean there are many servers and bots and so on on IRC, and most of them deal with warez and therefore are only up temporarily. So if Google really wants to build an IRC search engine they have to find a way to get rid of the dead links, and also of links that point to illegal copies (you can be sued for pointing to warez, can't you? See the DeCSS case).
    I personally would be glad, for IRC is a little bit, well, unstructured, and a search engine would definitely do some good, but the problems of building a database and an interface on top of it seem enormous to me.

    --
    ".Sig Stealer" was here
    1. Re:searching the irc by That's+Unpossible! · · Score: 4, Interesting

      Well, how do you build up a reliable IRC database?

      Have your bots sit in channels worth archiving. Break logs down into manageable chunks (hourly, by size, etc), and index them. Searches pull up these chunks of log with your search terms highlighted.
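      Roughly what that could look like, as a hedged sketch: log entries are assumed to be (datetime, nick, text) tuples, chunks are hourly, and the index is a plain in-memory word-to-chunk map. The function names and the `*term*` highlighting style are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

WORD_RE = re.compile(r"[a-z0-9']+")


def chunk_by_hour(entries):
    """Group (timestamp, nick, text) entries into one chunk per clock hour."""
    chunks = defaultdict(list)
    for when, nick, text in entries:
        hour = when.replace(minute=0, second=0, microsecond=0)
        chunks[hour].append(f"<{nick}> {text}")
    return dict(chunks)


def build_index(chunks):
    """Inverted index: word -> set of chunk keys (nicks get indexed too)."""
    index = defaultdict(set)
    for hour, lines in chunks.items():
        for line in lines:
            for word in WORD_RE.findall(line.lower()):
                index[word].add(hour)
    return index


def search(query, chunks, index):
    """Return (hour, lines) for chunks containing every query term,
    with matching whole words wrapped in asterisks."""
    terms = WORD_RE.findall(query.lower())
    if not terms:
        return []
    hours = set.intersection(*(index.get(term, set()) for term in terms))
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    results = []
    for hour in sorted(hours):
        lines = [pattern.sub(lambda m: f"*{m.group(0)}*", line) for line in chunks[hour]]
        results.append((hour, lines))
    return results
```

      Returning the whole chunk, not just the matching line, keeps the surrounding conversation visible, which is the point of chunking in the first place.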

      I mean there are many servers and bots and so on on IRC, and most of them deal with warez and therefore are only up temporarily. So if Google really wants to build an IRC search engine they have to find a way to get rid of the dead links, and also of links that point to illegal copies

      Ever try searching for warez on Google Groups? Good luck. They don't archive the binary newsgroups, and it is simple to weed out the posts that contain binaries in regular newsgroups.

      Google is pretty smart, let's wait and see what they come up with.

      --
      Ironically, the word ironically is often used incorrectly.
    2. Re:searching the irc by mrtroy · · Score: 2, Insightful

      First off...I assume Google will only be using the major networks, which are permanent.

      Secondly, there are many servers, and bots...but how does this relate to an IRC database?

      Servers and bots don't talk much. And I would assume Google would be ignoring all mode changes.
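      Dropping mode changes and other noise is straightforward if the logger sees raw protocol lines. A small sketch, assuming lines in the usual `:nick!user@host COMMAND params` shape; `parse_channel_message` is a hypothetical helper name, and real servers send more variations than this handles.

```python
def parse_channel_message(raw_line):
    """Return (nick, channel, text) for a channel PRIVMSG, else None."""
    line = raw_line.rstrip("\r\n")
    if not line.startswith(":"):
        return None  # no sender prefix, e.g. "PING :server"
    prefix, _, rest = line[1:].partition(" ")
    command, _, params = rest.partition(" ")
    if command.upper() != "PRIVMSG":
        return None  # MODE, JOIN, PART, NICK, numerics, ... are all dropped
    target, _, text = params.partition(" :")
    if not target.startswith(("#", "&")):
        return None  # a private message to a nick, not channel chatter
    nick = prefix.split("!", 1)[0]
    return nick, target, text
```

      Anything that returns None simply never reaches the log, so server chatter and mode changes cost nothing downstream.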

      Next, IRC is not all about warez. It's the first GOOD chat system, and I still prefer it to any IM, hands down.

      And what the hell do you mean IRC is unstructured? There are networks, which have servers, which have channels and users. Users can belong to channels. What's the problem?

      And google would likely just be doing some logging of channels, a simple channel listing would be redundant.

      --
      [I can picture a world without war, without hate. I can picture us attacking that world, because they'd never expect it]
  6. it's called xgoogle.com by theGreater · · Score: 3, Interesting

    Well, yes and no. xGoogle is designed largely around finding shared files on IRC IIRC (always wanted to do that). As far as I know, it depends not upon channel content, but on server/channel names and perhaps M'sOTD.

    -theGreater Pedant.

  7. Google is listening to the heartbeat by presroi · · Score: 4, Insightful

    Well, recalling where I get my "news" from (read: 90% useless but funny content via links), IRC (IRCnet, which is popular in Germany) is an incredibly fast way to distribute links.

    Assuming that google is interested in finding new sites as soon as possible, they should crawl the irc channels.

    This does not mean that they are going to index it.

  8. The original scoop on this story... by Punchinello · · Score: 5, Informative

    It seems Tony Collen had the original scoop on this story. It is more informative than the Register link.

    If you scroll down his original web log on this topic you will see Google's first official acknowledgment of their IRC activity.

    --

    Remember... ZG9uJ3QgZm9yZ2V0IHRvIGRyaW5rIHlvdXIgb3ZhbHRpbmU=

  9. Perhaps Google can now answer the all-important... by tuffy · · Score: 5, Funny

    ...a/s/l?

    --

    Ita erat quando hic adveni.

  10. does that mean... by zr-rifle · · Score: 5, Interesting

    that spam will extend itself to irc?
    Thousands if not millions of bogus irc channels with specific keywords inserted in the topic only to attract hits on the main google search page?

    --
    Hack your mind out of its sandbox.
  11. Already? by chendo · · Score: 3, Funny

    XGoogle.ORG -> Error: Cannot Connect to Data Base
    Too many connections


    Slashdotted already? We slashdotters are more dangerous than a beowulf cluster of... something.

    --
    Founder of Mirror Moon - Tsukihime Game Trans
  12. the last line in the article got me thinking.. by WegianWarrior · · Score: 3, Informative

    How IRC users would react to a bot from microsoft.com is an exercise left to the reader.

    If IRC is anything like what it was when I last passed through, not many will even notice - or they will attempt to engage the 'bots in "virtual intimate acts".

    Of course, there would always be someone - likely a Mac or Linux user - who will notice and scream about how MicroSoft is 'spying' on the IRC network, which in turn would lead to several more or less well-informed blogs writing about it, which in turn will lead to a /. headline close to "Micro$oft trying to take over IRC, will shut out 3rd party clients"...

    --
    Everything in the world is controlled by a small, evil group to which, unfortunately, no one you know belongs.
  13. Bot vs. Bot by matchlight · · Score: 5, Insightful

    The IRC admins, at least for most of the better channels, will simply set up a config to kick/ban the Google bot. Many channels don't allow non-human connections unless set up by the channel admins. Unlike the annoying spammers who use legit and stolen access points, Google will likely come from a single legit source, making the process of denying access easier.
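    To illustrate why a single source is easy to shut out, here is a sketch of wildcard ban-mask matching in the usual `nick!user@host` form. The `crawler.example` domain and the helper names are invented for the example, and the plain case-insensitive compare is a simplification of real IRC case mapping.

```python
import re


def mask_to_regex(mask):
    """Translate an IRC-style ban mask ('*' = any run, '?' = one char) to a regex."""
    parts = []
    for ch in mask:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # brackets etc. are literal in nicks
    return re.compile("".join(parts), re.IGNORECASE)


def is_banned(hostmask, ban_masks):
    """True if the full nick!user@host string matches any ban mask."""
    return any(mask_to_regex(mask).fullmatch(hostmask) for mask in ban_masks)


# Hypothetical: one mask covers every bot coming from the same domain.
bans = ["*!*@*.crawler.example"]
print(is_banned("gbot!crawl@node7.crawler.example", bans))
print(is_banned("alice!a@dsl.isp.example", bans))
```

    A spammer hopping between unrelated hosts needs a new mask for each one; a crawler run from one organisation's address space is caught by a single entry.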

    Google shouldn't be trying to find more content, they should be working on filtering out the mass of garbage sites that already exist.

  14. The Buzz In IRC... by SEWilco · · Score: 2, Funny
    Google Labs does have to keep busy. I wonder what they're up to.
    • Identify authoritative IRC participants, their information and related web sites?
    • Identify stupid IRC participants, and reduce the importance of their information and related web sites?
    • IRC Rent-An-Expert Service?
    • GoogleNatter: The Bot that makes you sound authoritative.
    • GoogletyMooglety: The IRC filter that lets you hear only the good stuff.
  15. p2p search by Manos+Batsis · · Score: 2, Interesting

    With the importance of Google in our everyday lives steadily increasing, I don't dare to think of what might happen if Google et al. stop being our good friends at some distant point. Centralized repositories are just not the way to go; we need a distributed, user-base owned search engine. Maybe in the next Matrix movie...

    1. Re:p2p search by mregit · · Score: 2, Funny
      ChatScan. Feb, 2001.

      ChatScan was an Israeli enterprise that claimed 10 million in funding. They joined a bot to IRC channels. The bot broadcast live channel text to their website. The idea was, people could scan down a list of pre-selected channels, see which had interesting conversations, then go and join them - or just watch from the website.

      Users who found what they thought was private conversation up on the web were outraged. IRC channel owners and admins agree with you 100% - they considered this unwanted and unauthorized intrusion a gross invasion of privacy, and banned the bots. The bots came back on new IPs and were banned again. When they came back on a variety of IPs, IRC admins got together and put up a list so everyone could be sure to ban every last one.

      Logging and archiving classes, guest speakers, technical chats and special events is a great idea, IF the people putting on said event WANT it logged. But those who do, already put the transcripts up on the web.

      IRC networks compete for users. Users definitely will not stay on networks or in channels if they think there is a chance of casual conversation being logged. Doesn't matter if Google tries, or some new startup. It won't fly.

  16. once again proving nothing online is private... by *weasel · · Score: 4, Insightful

    like archiving email, usenet, and web traffic before it - this is simply a reminder that nothing you type through an open network is -private-. this is a lesson most of us should have learned a long time ago.

    but this isn't an invasion of privacy. there's no expectation of privacy when you log onto a public chat board. just as there's no expectation of privacy should you decide to walk naked through a park.

    the best you can hope for online is pseudonymity.
    but that's out the window with the combined power of google. which is quickly becoming the internet's inadvertent Big Brother.

    the primary difference being, google works -for- the people just as much as it works -against- the people.

    --
    // "Can't clowns and pirates just -try- to get along?"
  17. Google The Movie by wo1verin3 · · Score: 5, Funny

    Bill Gates: Speak.

    Neo: The search engine Google has grown beyond your control. You cannot stop him -- but I can.

    Bill Gates: And if you fail?

    Neo: I won't.

    --- several scenes later ---

    Google: Mr. Anderson! Welcome back, we missed you.

    * Google pauses and looks around at the multitude of web sites and irc channels he has cached

    Google: Like what I've done with the place?

    Neo: It ends tonight.

    Google: I know it does, I've had some researchers figure out the answer for me. That's why the rest of me is just going to enjoy chatting on irc while we fight. I've seen the logs and irc'ers already know that I'm the one that beats you, so they're just gonna download from some leet xdcc bots.

  18. Google for Cyber Sex source:irc by jsse · · Score: 5, Funny

    Now we've a new category of stuff to search for other than p0rn. :)

    bloodninja: Ok baby, we got to hurry, I don't know how long I can keep it ready for you.
    j_gurli3: thats ok. ok i'm a japanese schoolgirl, what r u.
    bloodninja: A Rhinocerus. Well, hung like one, thats for sure.
    j_gurli3: haha, ok lets go.
    j_gurli3: i put my hand through ur hair, and kiss u on the neck.
    bloodninja: I stomp the ground, and snort, to alert you that you are in my breeding territory.
    j_gurli3: haha, ok, u know that turns me on.
    j_gurli3: i start unbuttoning ur shirt.
    bloodninja: Rhinoceruses don't wear shirts.
    j_gurli3: No, ur not really a Rhinocerus silly, it's just part of the game.
    bloodninja: Rhinoceruses don't play games. They f*cking charge your ass.
    j_gurli3: stop, cmon be serious.
    bloodninja: It doesn't get any more serious than a Rhinocerus about to charge your ass.
    bloodninja: I stomp my feet, the dust stirs around my tough skinned feet.
    j_gurli3: thats it.
    bloodninja: Nostrils flaring, I lower my head. My horn, like some phallic symbol of my potent virility, is the last thing you see as skulls collide and mine remains the victor. You are now a bloody red ragdoll suspended in the air on my mighty horn.
    bloodninja: Goddam am I hard now.


    (Original post from bash.org)

  19. I can only imagine it now... by SageMadHatter · · Score: 4, Funny

    *Goes into new google IRC search mechanism and searches for term "Warez"*

    Result: "Warez" is a very common word and was not included in your search

    Mad Hatter

  20. Actually... by Short+Circuit · · Score: 2, Insightful

    The idea of searchable IRC logs kind of scares me. An investigative team need only go to Google to search for discussions by someone with the nickname "l33t".

    Of course, IRC logs are already out there, often made available by the denizens in charge of the channel in question. But they're not hooked up to a common database.

    The speed of information dissemination is great for research and development, but that applies to both you, and people who want to learn about you.

    I've mentioned several times on IRC that I have a brain disorder (Asperger's syndrome, specifically), but I may have been operating under the assumption that the information wasn't important enough to be spread around to twenty or thirty Googleable sites. To be honest, I don't care who knows, which is why I'm saying it here.

  21. That would be awesome cool actually by Karma+Sucks · · Score: 2, Interesting

    For example, I would like to search and browse the chatter on the SUSE acquisition and KDE vs Ximian situation on #gnome @ irc.gimp.org.

    If Google could allow me to do that, that would be fantastic.

    As an aside, does anyone know of IRC logs for #gnome?

    --
    (Please browse at -1 to read this comment.)
  22. Google + IRC = Better Ranking method? by jasonhamilton · · Score: 2, Insightful

    The odd thing is that people are reporting the robot joining channels, doing /whois on users and more. What value could the /whois info from random users have? The only thing one can safely say about this whole situation is: Google is doing some testing on IRC.

    Personally, this is how I look at it: Google ranks websites according to many criteria, ranging from keyword density, keywords and text placement on the page to incoming links and what the text within the links says. What use could IRC have? It is possible that topics being actively discussed in real time could be used to help boost rankings for subjects that are currently hot, similar to how Google currently gives a temporary scoring boost to pages newly added to its index. This is, of course, pure speculation.

    As others no doubt have already thought -- actually posting private user information would be useless, as ChatScan, despite several million in funding, couldn't pull it off with the IRC community two years ago. However, using that information to derive popular subjects might work. OTOH, Google can likely glean similar information from the millions of searches users enter into their search engine each day.

    --
    SearchIRC - Now with live chat directory!
  23. Indexing URLs, not conversations by lordscarlet · · Score: 2, Interesting

    From the information I've seen, Google is capturing URLs in channels, not the actual conversations.

  24. Re:Interesting by sokk · · Score: 2, Informative

    I visited the site, but it only indexes channel names, not discussions. E.g. you can't search for: "Cannot open /dev/dsp" +quake 3. I should've mentioned it, but I meant searching past discussions.

  25. No proper Format by kyndig · · Score: 2, Interesting

    I have reviewed several logs of IRC chat rooms, and have not yet seen a good log format. Reading something like:

    klax: So what'd you eat for dinner
    bryan: Does anyone know how to recompile a kernel?
    ray: I had french fries and a beer

    provides little to no format. Google currently caches PDF files, and should your search return a PDF file, all your keywords are highlighted. I would imagine that Google would use this same approach for their log format system, yet even this does not provide a friendly browsable view. I don't have any recommendation for a proper format, as I have not seen any well-formatted logs.

    --
    My Thoughts, Kyndig