Slashdot Mirror


Could IBM Shake up the Search Engine World?

overshoot writes "IBM has just tossed a bucket of chum into the whole search showdown, which Microsoft thought was between them and Google. Apparently, IBM Research has developed a 'key facts' search technology (as distinct from 'key words') over the last several years. Now they're going public with it -- by putting it on SourceForge under an OSS license!" (According to the article, it's expected to show up on SourceForge by the end of this year, not immediately.)

193 comments

  1. End of search engine wars? by thc69 · · Score: 0

    ...resulting in 100% consolidation?

    Heheh...not.

    --
    Procrastination -- because good things come to those who wait.
  2. Answer: No. by Anonymous Coward · · Score: 0

    That is all.

    1. Re:Answer: No. by Anonymous Coward · · Score: 0

      I wholeheartedly agree, sir.

  3. Slow down IBM ... by Anonymous Coward · · Score: 2, Funny

    The search bar on your site barely works as it is.

    1. Re:Slow down IBM ... by scotty1024 · · Score: 1

      I agree, I try their search tool at www.ibm.com every so often and I still to this day have to use Google to find anything on their web site. My money is on them donating it to FOSS so someone can fix it for them.

    2. Re:Slow down IBM ... by ccbutler · · Score: 1

      no doubt!

      trying to search within IBM's own web site is a joke.

    3. Re:Slow down IBM ... by Lord+Pillage · · Score: 1

      No, they just designed it that way.

      --
      try { Signature mysig = new CleverAttempt(); } catch(NonCleverSignatureException e) { postanyway(); }
    4. Re:Slow down IBM ... by Anonymous Coward · · Score: 0

      IBM does often develop software they will never use themselves, and this does not always mean they disapprove of the result! But... this SDK is "an all-Java(TM) implementation" and I know that the Java stuff IBM writes is either brilliant or it sucks (big time)!!!

  4. SourceForge proposal... by RoadkillBunny · · Score: 2, Funny

    It will be funny if sf.net denies them. But then, I guess they got a deal with them already.

    --
    Cheers,
    RoadkillBunny
    1. Re:SourceForge proposal... by GoldAnt · · Score: 1

      I've always thought google adequately searched for me. AFAIK there isn't anyone else with the same amount of resources dedicated to searching...?

    2. Re:SourceForge proposal... by B3ryllium · · Score: 1

      I think you missed the story where Yahoo outran Google ...

    3. Re:SourceForge proposal... by maxwell+demon · · Score: 1

      If I understand the linked article correctly, the new thing is that they don't just look for the occurrence of certain words, but try to get some (probably very basic) sort of meaning of the thing. Which I think could give a big advantage esp. for exclusion searches (e.g. Einstein -physicist).

      --
      The Tao of math: The numbers you can count are not the real numbers.
    4. Re:SourceForge proposal... by GoldAnt · · Score: 1

      I could see how that could be helpful... Don't know how many times I've tried to search for something and gotten clobbered with a more common meaning of the word.

  5. LOL by Anonymous Coward · · Score: 0

    Now FOSS will destroy Google as well as Microsoft.

    Companies that are going bankrupt (Like 321 Copy software company) or CloneCD should also release their programs under FOSS before going under to destroy their opponents as well, and everyone except the information monopolists benefit.

  6. ok but by Anonymous Coward · · Score: 5, Funny

    I'll stick to letting Google know every single detail of my life thanks.

  7. Yay. by Sinryc · · Score: 4, Funny

    Yay, now EVERYONE can make their own Search Engine and say how they are SO much better than everyone else's!

    --
    Yay, I have a sig.
    1. Re:Yay. by TheOtherAgentM · · Score: 2, Funny

      I plan to make mine far inferior, but drive people to use my search engine with spyware.

    2. Re:Yay. by b0r1s · · Score: 2, Insightful

      Size of index, speed (requiring hardware, content nodes, etc), tuning (algorithms may be alike, but small tuning makes all the difference with the SEO spam going around), and anti-abuse (worms searching for phpBB urls are bad, m-kay) will keep this from being a 'free perfect search for everyone' tool.

      --
      Mooniacs for iOS and Android
    3. Re:Yay. by gstoddart · · Score: 4, Interesting
      Yay, now EVERYONE can make their own Search Engine and say how they are SO much better than everyone else's!

      Well, let's just hope it becomes one big, honkin' FOSS project.

      Search technology is huge. Having it available which apparently can index conceptual links as opposed to literal links is astounding.

      I say smart move on IBM's side. Get all the publicity of opening up really cool tech to the open-source community, then proceed to make a gazillion dollars in professional services gigs, and get the added benefit of everyone making your tech better because it's useful.

      Provided this isn't steamingly fresh technology (unlikely from IBM really) they should see some interest in this.

      I for one, can imagine a nice bunch of associative content, and am wondering how much resources this might require to run on a machine and I'm going to go RTFA. =)
      --
      Lost at C:>. Found at C.
    4. Re:Yay. by coop0030 · · Score: 1

      I for one, am not excited about the fact that any Joe Shmoe could send out robots to index my pages. If there are thousands of robots indexing my pages every day I am going to have a pretty large bandwidth bill to pay.

      Let's hope it is complicated enough that not everyone will be able to set up their own search engine easily.

      I would be excited though if it was a single large open source entity that works on a competing search engine. That would be neat!

    5. Re:Yay. by Anonymous Coward · · Score: 0

      Have you not heard of robots.txt?
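For anyone who hasn't: a well-behaved crawler reads robots.txt and skips whatever it is told to skip. Here is a minimal sketch of that check using Python's standard `urllib.robotparser`. The rules, the bot names (`ExampleBot`, `OtherBot`) and the example.com URLs are made up for illustration, and nothing here stops a crawler that simply ignores the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ban one named crawler entirely,
# and keep every other crawler out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(text):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks before every fetch.
print(parser.can_fetch("ExampleBot", "http://example.com/index.html"))
print(parser.can_fetch("OtherBot", "http://example.com/index.html"))
print(parser.can_fetch("OtherBot", "http://example.com/private/data.html"))
```

The file is purely advisory, so it only helps with the bandwidth bill when the robots actually honor it.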

    6. Re:Yay. by sentanta · · Score: 1

      If this is licensed under an Open Source license wouldn't Google, Yahoo, etc take whatever is worthwhile and incorporate it into their existing search algorithms?

      --
      The Big Yuan - tracking mainland China
    7. Re:Yay. by Karzz1 · · Score: 1

      Bill? Is that you?

      Seriously though, isn't that how msn search gets the skewed usage statistics that it does (ok, I digress, IE is *technically* not spyware..... yet).

      --
      Beware of he who would deny you access to information, for in his heart he dreams himself your master.
    8. Re:Yay. by Anonymous Coward · · Score: 0

      omg lol i didtn reeliz how gay usign windoze n propierty softwair wuz til i strted reedin slashodt

      windoze-fre zone!

    9. Re:Yay. by pembo13 · · Score: 1

      Yes.

      --
      "Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
    10. Re:Yay. by jurt1235 · · Score: 1

      There are already thousands of bots and spiders busy on the web. Some really ridiculous ones, so this one more will not really matter.

      --

      My wife's sketchblog Blob[p]: Gastrono-me
    11. Re:Yay. by Basje · · Score: 1

      Not if IBM patents it.

      That way either
      1. Google, MS, Yahoo etc can use the open source implementation (which is a licence to use the code including the patented stuff), possibly requiring opening their own codebase or
      2. they licence the patents from IBM

      Remember IBM still has the largest patent portfolio.

      --
      the pun is mightier than the sword
    12. Re:Yay. by Bimo_Dude · · Score: 1
      worms searching for phpBB urls are bad, m-kay

      Man, I've really got to slow down while reading. I misread this as, "worms searching for PHB urls are bad, m-kay."

      Hmmmm... :)

      --
      "Teleporting Rodents with D-Cell Battery Displacement" theory -- IgnoramusMaximus (692000)
    13. Re:Yay. by LifesABeach · · Score: 1

      Do you think that IBM is still pissed about Windows 3.1, and losing all that PC market share to the convicted of Redmond?

    14. Re:Yay. by shokk · · Score: 1

      Yay! Now web sites can be hit by 100x the irrelevant search engine traffic instead of a few like Yahoo and Google that actually matter. This is a DoS in the making. I'm sure there will be more than a few that decide to ignore robots.txt.

      --
      "Beware of he who would deny you access to information, for in his heart, he dreams himself your master."
  8. Open Source by Anonymous Coward · · Score: 0

    IBM is really into open source lately

  9. IBM? YOU SERIOUS? by Anonymous Coward · · Score: 0

    Their software is horrible. Ever worked with DB2? It sucks bigtime.

    1. Re:IBM? YOU SERIOUS? by Glooty-Us-Maximus · · Score: 1

      Is that why powerhouses such as Ebay use it as well as other IBM products such as Websphere?

    2. Re:IBM? YOU SERIOUS? by oopsdude · · Score: 2, Interesting

      IBM has always been cozy with eBay; as I recall, eBay's logo said "powered by IBM" for quite a long time.

    3. Re:IBM? YOU SERIOUS? by Anonymous Coward · · Score: 0

      Just because a company uses some software does not mean it's any good.

    4. Re:IBM? YOU SERIOUS? by Glooty-Us-Maximus · · Score: 1

      And how does that detract from the quality of IBM's products? I'm sure IBM may have given them discounts for some free advertising in the form of that logo, but I don't feel that Ebay would have gone with an inferior product which would cost them money through downtime or the inability to handle a higher number of users.

  10. Spotlight! by Anonymous Coward · · Score: 0

    How will this compare with, say, something like Spotlight?

    1. Re:Spotlight! by Anonymous Coward · · Score: 0

      It feels much snappier than Spotlight.

    2. Re:Spotlight! by Anonymous Coward · · Score: 1, Funny

      Depends on watts and type of bulb.

    3. Re:Spotlight! by FLAGGR · · Score: 2, Insightful

      Damnit, are you talking about spotlight in Tiger? There's a huge goddamn difference between a desktop indexing search and an internet search engine. My god. The scale is like, so insanely different (and if the Apple PR has said anything about it being scalable to the likes of an internet search, then I'm selling my mac, NOW) How does this compare to spotlight? How does an apple compare to an orange? How does the color red compare to the number 7.623? How does 6 in the afternoon compare to the goatse man?

    4. Re:Spotlight! by FLAGGR · · Score: 1

      Goddamn it, just read the article. The slashdot summary was slightly misleading. I still hold that they are vastly different technologies, but a little less so now that I RTFA. Sorry.

    5. Re:Spotlight! by seramar · · Score: 1

      "How does 6 in the afternoon compare to the goatse man"

      By about 180 degrees.

      --
      australian project gutenberg is better than the original.
    6. Re:Spotlight! by bradkittenbrink · · Score: 1

      Don't apologize, How does 6 in the afternoon compare to the goatse man? just became my new .sig.

  11. Google? by daviq · · Score: 0

    This is where Google comes in and buys IBM out in full.

    --
    Go to the w3.org and put Slashdot.org through the validator.
    1. Re:Google? by confusion · · Score: 2, Informative

      I'm guessing that IBM having a 50% higher market cap, 30X Google's revenues and $110B in assets doesn't come into play here?

      Jerry

  12. Long hard road. by UlfGabe · · Score: 0, Redundant

    I applaud IBM for taking this stance and entering the hotly contested search engine world.

    More competition is better. I would enjoy more innovation. They do have a hard long road to follow however, and they may find it difficult.

    Check out my journal if interested in a difficult problem.

    --
    Check journal for info on Anti-TextBook, an idea by me.
    1. Re:Long hard road. by Carnildo · · Score: 1

      The search algorithm is just a minor part of running a search engine. The key part, which Google has down pat, is getting the results from a metric buttload of web pages, doing it fast, and doing it for a very large number of people at once.

      --
      "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
    2. Re:Long hard road. by drawdevm2000 · · Score: 1

      Yeah but I doubt IBM made a web search engine like Google, I bet you it's just a single site search engine, purely for one site only. But then again you never know.

      --
      GSRG.org Senior Administrator and Director
    3. Re:Long hard road. by carl0ski · · Score: 1

      After reading the blurb, there was not even a mention of web based. The OSS community is itching for an effective desktop search algorithm; maybe this is it. Software to index anything.

  13. The only thing IBM is going to do... by ninja_assault_kitten · · Score: 0, Troll

    Is die a very slow and painful death...at least in the software market.

    1. Re:The only thing IBM is going to do... by guaigean · · Score: 1

      Why exactly? Because they are gaining more supporters through community offerings?

      --
      Microsoft Sucks, F/OSS Rocks. I get mod points now right?
  14. http://almaden.ibm.com/cs/crawler by Urgo · · Score: 5, Informative

    wfp2.almaden.ibm.com - - [08/Aug/2005:15:48:34 -0400] "GET /robots.txt HTTP/1.0" 200 69 "-" "http://www.almaden.ibm.com/cs/crawler [fc7]"
    wfp2.almaden.ibm.com - - [08/Aug/2005:15:48:38 -0400] "GET / HTTP/1.0" 200 41317 "-" "http://www.almaden.ibm.com/cs/crawler [fc7]"

    I've been getting once a day connections on my server from ibm for quite some time now (a year or so). Doesn't surprise me in the least. :)
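If you want to check your own logs for the same visitor, here is a rough sketch that parses lines in the format shown above (host, timestamp, request, status, size, referer, user agent) and counts hits per user agent. The regular expression and the function names are my own assumptions for illustration, and real logs may contain lines this pattern does not cover.

```python
import re
from collections import Counter

# The two access-log lines quoted in the comment above.
LOG_LINES = [
    'wfp2.almaden.ibm.com - - [08/Aug/2005:15:48:34 -0400] '
    '"GET /robots.txt HTTP/1.0" 200 69 "-" '
    '"http://www.almaden.ibm.com/cs/crawler [fc7]"',
    'wfp2.almaden.ibm.com - - [08/Aug/2005:15:48:38 -0400] '
    '"GET / HTTP/1.0" 200 41317 "-" '
    '"http://www.almaden.ibm.com/cs/crawler [fc7]"',
]

# Assumed layout: host ident user [time] "request" status size "referer" "agent"
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def hits_by_agent(lines):
    """Count requests per user-agent string, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["agent"]] += 1
    return counts


for agent, hits in hits_by_agent(LOG_LINES).items():
    print(hits, agent)
```

Note that the crawler asks for /robots.txt before it fetches the front page, which is what you would hope to see from a polite bot.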

    --
    Belive in Technology and AMAZE yourself. -- RIP ZDTV/TechTV
    1. Re:http://almaden.ibm.com/cs/crawler by anagama · · Score: 1

      Riuniti on ice - so nice.

      --
      What changed under Obama? Nothing Good
    2. Re:http://almaden.ibm.com/cs/crawler by muzza · · Score: 1

      almaden.ibm.com

      I had to look at that twice because the first time I read laden.bin.com... they really are declaring search engine jihad!

    3. Re:http://almaden.ibm.com/cs/crawler by Kafka_Canada · · Score: 1

      Yahoo Akhbar, Yahoo Akhbar...

      --
      Fuck it
    4. Re:http://almaden.ibm.com/cs/crawler by johnnytv · · Score: 2, Funny

      /usr/local/bin/laden

      --
      Install, Then Run
    5. Re:http://almaden.ibm.com/cs/crawler by kinema · · Score: 1
      "GET / HTTP/1.0"
      Why HTTP v1.0 and not v1.1?
  15. Information wants to be free... by Kloog · · Score: 0, Offtopic

    as in freedom, not free as in beer.

    1. Re:Information wants to be free... by WillAffleckUW · · Score: 1

      as in freedom, not free as in beer.

      isn't that supposed to be bheer?

      However, while I agree that Information wants to be free, I prefer cider myself.

      --
      -- Tigger warning: This post may contain tiggers! --
    2. Re:Information wants to be free... by Anonymous Coward · · Score: 0

      In Soviet Russia, they have information about free beer.

  16. not a web search engine by sled · · Score: 5, Insightful

    From TFA: "While simple but powerful keyword searches have revolutionized how Internet users locate and retrieve information, IBM is looking to transform how office workers sift through the piles of data stored inside organizations."

    The posting implies that IBM is entering into competition with MS and Google. I saw no indication that IBM intends to launch a web search engine.

    1. Re:not a web search engine by CypherXero · · Score: 0

      Both Microsoft and Google have both online search engines, and local computer search tools.

    2. Re:not a web search engine by b0r1s · · Score: 3, Informative

      The Google appliance is marketed (if not in the online docs, at least in person) as an enterprise tool for organizations to search their internal data. While this certainly isn't their primary revenue stream, this tool would in fact compete with that aspect of Google's business.

      --
      Mooniacs for iOS and Android
    3. Re:not a web search engine by SlashEdsDoYourJobs · · Score: 1

      One of Google's products is an intranet appliance for "sifting through the piles of data stored inside organisations". This would put IBM in direct competition with them in that market. Public search isn't the only thing that Google does, you know.

  17. Re:ASFR Successfully Trained Students by Anonymous Coward · · Score: 0

    omg i think i love you.

  18. Finally some competition by Device666 · · Score: 2, Insightful

    Now I think Microsoft has a big problem... Now they really should start becoming innovative... And google finally could have a nice open source competitor. This will increase innovation in giant leaps and of course would make it hard for microsoft ever to beat Google.. This will be a worthy test of the power of open source!!!

    1. Re:Finally some competition by Donny+Smith · · Score: 2, Insightful

      > Now I think Microsoft has a big problem...

      How's that?
      This software has 0% market share (and that was with all the IBM's sales, support and development efforts).
      They couldn't make a dent in the market (why do you think they're releasing it to open source if it's so good)?

      >And google finally could have a nice open source competitor.

      I don't think so. Those search engine guys are mean mother fuckers - thousands and thousands of full-time engineers working on solely one task - improving their search products/services.
      On the other hand, IBM tosses out a semi-working product to the loosely connected community to debug.
      My guess is that they simply realized they're unable to compete so now they just hope to prop up DB2 or WebSphere sales.

      Come on, have you ever tried to find anything on IBM's own site? It's laughable - they can't make it work on their own fucking web site!
      When was the last time you heard about their open source ViaVoice (or whatever that thing they released to open source few years ago was)?

      >This will increase innovation in giant leaps and of course would make it hard for microsoft ever to beat Google.

      Oh - the open source search software will finally push Google to innovate, which in turn will leave MS in dust.
      Until that happens (and while Yahoo indexes 2.5 times more docs than Google), Google's engineers will be on collective vacation, taking it easy while allowing this open source search engine to get its shit together.

    2. Re:Finally some competition by Frit+Mock · · Score: 1

      "And google finally could have a nice open source competitor."

      There has been a *possible* OSS competitor for quite some time ... although the project still lacks a few thousand people running that peer-to-peer based search engine.

      Take a look at http://www.yacy.net/yacy/ and try it out.

    3. Re:Finally some competition by mforbes · · Score: 1

      I don't think so. Those search engine guys are mean mother fuckers - thousands and thousands of full-time engineers working on solely one task - improving their search products/services.

      <snip>

      Google's engineers will be on collective vacation, taking it easy while allowing this open source search engine to get its shit together.

      Make up your mind. Are they on vacation or are they working solely on improving their search engine? (leaving out any comments about your use of such colorful language)

      --

      Allegedly real newspaper headline from 1998:
      Man Struck by Lightning Faces Battery Charge

  19. IBM has so much unpublished advanced research by snotclot · · Score: 2, Interesting

    IBM is pretty crazy when it comes to advanced research in any of its fields.

    I have heard stories from researchers there that IBM has its own terminology for a lot of technical EE/CS stuff, as they discovered it way before the world did but were so secretive they didn't publish any of it.

    I'm not surprised if IBM has enough tech in search to seriously knock down Google!

    This OSS thing comes as a surprise, as it contradicts their secretiveness about their research.

  20. chum and guns by Burz · · Score: 4, Funny

    a bucket of chum into the whole search showdown,

    This is an awful mixed metaphor. How does Slashdot expect its readers to navigate the treacherous IT seas with such poorly-seasoned and half-baked information?

    1. Re:chum and guns by Soko · · Score: 0

      I think the reference is from here:

      Amy, I think you're going to earn a place as our Official ASR Sysadmin's Chum. In a secondary, particularly bloody-minded sense of the word.

      Steve VanDevender


      First thing that I thought of.

      Soko

      --
      "Depression is merely anger without enthusiasm." - Anonymous
    2. Re:chum and guns by overshoot · · Score: 5, Funny
      This is an awful mixed metaphor. How does Slashdot expect its readers to navigate the treacherous IT seas with such poorly-seasoned and half-baked information?

      It's easy when you're three sheets to the wind, even if you pepper your reply with editorial condiments. Anyway, the goose is sufficiently sauced to be worth a gander.

      --
      Lacking <sarcasm> tags, /. substitutes moderation as "Troll."
    3. Re:chum and guns by CaptainCarrot · · Score: 5, Funny
      I know! It throws a monkey wrench into that entire kettle of fish! There's no foothold you could sink your teeth into! It blows your mind from the ground up!

      ...and so forth.

      --
      And the brethren went away edified.
    4. Re:chum and guns by Anonymous Coward · · Score: 0

      a bucket of "chum" ? I don't even know what that is. But it's one 'h' away from making me vomit :(

    5. Re:chum and guns by Anonymous Coward · · Score: 0

      I'm not sure if you're fishing for comments (or trolling...), but I'll comment just for the halibut... Some of these articles are as if people bandit together and are just taking pot shots at big companies. Some are even submitted anonymously by posses who are so yellow-bellied that they can't show the whites of their eyes as they shoot off their mouths. Don't sit perched on your high horse. Take it with a grain of salt.

    6. Re:chum and guns by Hektor_Troy · · Score: 1

      And remember, that a penny saved is worth two in the bushes. Oh, and don't cross the road, if you can't get out of the kitchen.

      --
      We do not live in the 21st century. We live in the 20 second century.
    7. Re:chum and guns by Tsu+Dho+Nimh · · Score: 1
      Hey, give the guy a break! He was writing that blurb BEFORE HIS MORNING CUP OF COFFEE!!!!

      I wuz there.

    8. Re:chum and guns by sharkey · · Score: 1

      Maybe, but this IS Slashdot. After all, you can lead a yak to water, but you can't teach an old dog to make a silk purse out of a pig in a poke.

      --
      "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
    9. Re:chum and guns by Omnieiunium · · Score: 1

      It was so bad, I thought it said cum.

    10. Re:chum and guns by freewaybear · · Score: 1

      I love mixed metaphors! what's a meta for?

      --
      Registered Linux User #404114 [url=http://www.punkoiska.com][img]http://img406.imageshack.us/img406/4379/posbannercf5.g
  21. what about yahoo!? by dezmund · · Score: 5, Insightful

    MSN thought it was between them and google?
    http://news.yahoo.com/news?tmpl=story&u=/cmp/20050722/tc_cmp/166401634
    sorry bill, but if anything it's between yahoo (22% share of all searches) and google (47%).

    Not to mention most of those MSN searches (12%) are from IE users who don't know how to change their browser's start page.

    1. Re:what about yahoo!? by Exitar · · Score: 1

      A friend of mine for example.
      I expect this post to be modded informative of course...

    2. Re:what about yahoo!? by Anonymous Coward · · Score: 0

      Not to mention most of those MSN searches (12%) are from IE users who don't know how to change their browser's start page.

      And that's just like your parents who couldn't figure out how to put the condom on right and they had little "accident" dezmund.

      Same difference. If you can "assume" that they are from IE users who don't know how to change their start page, then I can assume you were born by your mother rolling over in the wet spot.

    3. Re:what about yahoo!? by f3773t · · Score: 0

      I agree there that it should have read between google and yahoo. Hardly anyone I know uses the msn search ... I don't have stats but the people I talk to either yahoo or google.

      I do both ... put the same search string into both ... for text searches they are equal. Stuff like "Jotun cathodic protection" or "wellstream flexible end fittings" will turn up good matches on both ... but google image search is still miles ahead of yahoo!

      One other thing to note is the article mentions that it
      "plans to give away key search technologies for corporate data retrieval that use concepts and facts instead of simpler "keyword" searches relied upon by consumer Web companies such as Google Inc."

      The fact that both google and yahoo can turn up meaningful stuff for "wellstream flexible end fittings" may mean they already have such capabilities ... or perhaps it means that keyword searches are just as effective.

    4. Re:what about yahoo!? by Punboy · · Score: 3, Funny

      Plus, those who think that the address bar is for system commands (and are thus afraid of it) and the search-bar is where you type in the website address o.O

      My grandparents are weird.

      --
      If you like what I've said here, and want to read more, go to http://www.krillrblog.com
    5. Re:what about yahoo!? by ciroknight · · Score: 1

      Uh where do you get your numbers and do you work for Yahoo/Google/MSN/etc.?

      Last I heard it's pretty darned impossible to tell just how many searches are processed by which search engine unless you are actually within those companies and have access to that company's numbers.

      It's possible to get averages from websites by referral beacons, but some engines list sites higher than others, some are enhanced by paid ads, etc. etc. It's just not scientific at all to post percentages of what you don't know.

      So, in MSN's eyes, it could very well be between them and Google. And in Google's eyes they may rule the world. And in Baidu's eyes, the world may not be enough ;).

      --
      "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
    6. Re:what about yahoo!? by bhtooefr · · Score: 1

      I know someone who KNOWS how to change her browser start page (although at our school, she couldn't unless she used another browser), yet STILL uses MSN. She told me that she doesn't use Google because she doesn't trust it. Fair enough, but MSN instead? WTF is THAT?

      And it's not MS love, either - I even had her using Opera for a while, and it wasn't even b/c of security. Of course, this was all at school, and they took down the public share, so Opera was harder to use...

    7. Re:what about yahoo!? by WWWWolf · · Score: 1

      Oh, a lot of people I know type URLs to the google search box...

      I do, too, but only to get that ever-important "cache" or "show as HTML" link =)

    8. Re:what about yahoo!? by eqkivaro · · Score: 1

      How do you distinguish between those who don't know how to change their start page and those who choose MSN as their start page?

  22. Get it now by QuantumG · · Score: 3, Informative

    Unstructured Information Management Architecture SDK. The UIMA SDK (Software Development Kit) is an all-Java(TM) implementation of the UIMA framework, and it supports the implementation, description, composition, and deployment of UIMA components and applications. It also supports the developer with an Eclipse-based development environment that includes a set of tools and utilities for using UIMA.

    Go you crazy Java dudes, go.

    --
    How we know is more important than what we know.
  23. What information really wants by InfiniteWisdom · · Score: 1

    ...is for you to stop anthropomorphizing it.

  24. I, for one, ... by kaan · · Score: 2, Funny

    I, for one, welcome our new chum-tossing search-engine overlords...

  25. This means K-... by TransEurope · · Score: 1, Offtopic

    KDesksearch?

    KDeskfinder?

    Koogle?

    Kahoo?
     
    ...in the next KDE :D

    1. Re:This means K-... by Anonymous Coward · · Score: 0

      I don't think Krap is taken yet

    2. Re:This means K-... by Anonymous Coward · · Score: 0

      Kaboom

    3. Re:This means K-... by Anonymous Coward · · Score: 0

      I don't think Krap is taken yet

      A good dose of kexlax will help that.

    4. Re:This means K-... by TransEurope · · Score: 1

      Kvista or Koutlook would be great *g*

    5. Re:This means K-... by Anonymous Coward · · Score: 0

      Actually, they're already working on it. It is called Tenor. http://dot.kde.org/1113428593/.

      Not everything for KDE begins with K!

  26. What is still missing... by sploxx · · Score: 1

    is a P2P layer on top of this complete with efficient, distributed and secure search. A good P2P search engine is still missing and (IMHO) one of the more important things needed, last but not least for political reasons (privacy, censorship etc.).

    That would make it possible to give back control of every aspect of the 'web experience' to the user.

    Ok, I'm dreaming :-)

    1. Re:What is still missing... by nostriluu · · Score: 2, Informative

      You could start with this: http://www.yacy.net/yacy/

    2. Re:What is still missing... by sploxx · · Score: 1

      Looks interesting, thanks for the link! :)

  27. Just ignore the link in the slashdot item by hackwrench · · Score: 5, Informative

    The important information is simply the url http://www.alphaworks.ibm.com/tech/uima/

    1. Re:Just ignore the link in the slashdot item by Anonymous Coward · · Score: 0
      UIMA, huh?

      not to be confused with the infamous IUMA.

    2. Re:Just ignore the link in the slashdot item by SnprBoB86 · · Score: 5, Informative

      Definitely read/skim the SDK User's Guide http://dl.alphaworks.ibm.com/technologies/uima/UIMA_SDK_Users_Guide_Reference.pdf

      The annotator premise is almost too simple; it's brilliant.
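To give a feel for the premise, here is a toy sketch in Python. It is not the UIMA Java API, only an illustration of the general idea: each annotator scans the text and records typed spans (a label plus begin/end offsets), and a pipeline just collects what every annotator found. The `Annotation` class, the helper names, the regexes and the sample sentence are all made up for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A typed span over the text: a label plus begin/end character offsets."""
    type: str
    begin: int
    end: int


def regex_annotator(type_name, pattern):
    """Build a hypothetical annotator that labels every match of a regex."""
    compiled = re.compile(pattern)

    def annotate(text):
        return [Annotation(type_name, m.start(), m.end())
                for m in compiled.finditer(text)]

    return annotate


def run_pipeline(text, annotators):
    """Run each annotator over the text and merge results in document order."""
    annotations = []
    for annotate in annotators:
        annotations.extend(annotate(text))
    return sorted(annotations, key=lambda a: (a.begin, a.end))


# Made-up sample sentence, used only to exercise the sketch.
TEXT = "Contact IBM Research in Almaden before 2005-12-31."

PIPELINE = [
    regex_annotator("Organization", r"\bIBM\b"),
    regex_annotator("Date", r"\b\d{4}-\d{2}-\d{2}\b"),
]

for ann in run_pipeline(TEXT, PIPELINE):
    print(ann.type, repr(TEXT[ann.begin:ann.end]))
```

A search layer could then index the typed spans rather than the raw words, which is presumably where the "key facts" versus "key words" distinction comes from.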

      --
      http://brandonbloom.name
    3. Re:Just ignore the link in the slashdot item by freewaybear · · Score: 1

      Who, Uima Thurman?

      --
      Registered Linux User #404114 [url=http://www.punkoiska.com][img]http://img406.imageshack.us/img406/4379/posbannercf5.g
    4. Re:Just ignore the link in the slashdot item by SkjeggApe · · Score: 1

      Whoa... That sucker is 342 pages, filled with examples, "tutorials", javadocs, etc, etc.. Maybe this is worth looking into..

  28. But how is it different? by Anonymous Coward · · Score: 0

    The name is different, but how are "key facts" different than "key words"? The article only seemed to say that it will be used by businesses internally.

    On a nostalgic, somewhat related note, does anyone remember Scopeware? Unfortunately, it seemed that when that story was posted, most of the /. crowd didn't really RTFA and dismissed it as being silly. They were trying to do exactly this kind of thing - provide intelligent ways for businesses to manage their heaps of data. That company is dead and gone now but it seems like everyone is starting to pursue this kind of thing, like MS with their relational database file system thingy in Longhorn. I guess they were just a few years too early.

  29. The "Don't Be Evil" Contest... by ScentCone · · Score: 1

    ...will sure light up. There will be so many people trying to out-do the not-doing-evil of all of the other search engines that they'll have to resort to being evil just to prove how not evil they are.

    --
    Don't disappoint your bird dog. Go to the range.
  30. Just a thought - distributed search by Eightyford · · Score: 1

    I'm not sure if this is feasible as it would be hard to ward off spammers, but is there any chance that we could see an OSS distributed search system that works like SETI@HOME?

    Maybe I'll patent it, before Epicrealm does...

    1. Re:Just a thought - distributed search by Petrushka · · Score: 1

      I'm not sure if this is feasible as it would be hard to ward off spammers, but is there any chance that we could see an OSS distributed search system that works like SETI@HOME?

      I think distributed search is an interesting idea, and worth exploring, but very likely to be prone to all sorts of exploits. My guess is spam would be the least of the worries. It'd be feasible, I'm sure, to get people to voluntarily make their desktop a "node" in a massively distributed search engine (another thread has been suggesting names starting with K; one that leapt to my mind is KaZeArch ... ho ho), but:

      1. bandwidth!!!!!! do you really want a kajillion nodes around the world zapping your server every second? OK, this could be dealt with with a little forethought.
      2. the possibility of security holes; now, of course one can take precautions against that, but basically the more things can develop holes, the more will. Security holes in OSes are bad enough; I can imagine one hole in an app that makes all computers part of a huge interdependent network could cause a heartstoppingly dire catastrophe. Again, interdependency would be something to be avoided.
      3. Is it really worth it? I suppose that depends on how scalable the commercial search engines are.

      So, interesting idea, raises some obvious problems which are I'm sure soluble, needs more knowledge than I have to work out the feasibility. These are just initial thoughts, ... $0.02 worth.

    2. Re:Just a thought - distributed search by Anonymous Coward · · Score: 0

      Do we really need to marry distributed computing to search just because they're the 2 hot technologies of the moment?

      Distributed computing is only really valuable when the cost of combining results from multiple nodes is less than the cost of performing the calculations on a single (master) node. For search engines, which basically perform lookups in an inverted index, you'd really have to be doing some really crazy shit to make distribution worthwhile.

      Besides, the cost to each node wouldn't just be bandwidth and computational power. It would have to be disk space as well, because without a local index to examine, the whole point is moot. You're NOT going to be sending bits of the index out to individual nodes for them to examine them and send back the results -- that would slow search to a crawl.
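
      To make the "lookups in an inverted index" point concrete, here is a minimal sketch. The corpus and the names build_inverted_index and search are hypothetical, and real engines add ranking, compression and much more; the only assumption is a map from each term to the set of documents that contain it.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical toy corpus, for illustration only.
DOCS = {
    1: "IBM releases search framework",
    2: "Google search engine index",
    3: "IBM mainframe hardware",
}
INDEX = build_inverted_index(DOCS)
print(search(INDEX, "IBM search"))
```

      Each query is a handful of dictionary lookups plus a set intersection, which is why shipping pieces of the index to volunteer nodes per query would likely cost more than it saves.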

  31. huh? by pokka · · Score: 1

    which Microsoft thought was between them and Google.

    Where did this come from? It certainly wasn't part of the article. With BAIDU's IPO, and Yahoo expanding its index count to 20B pages (almost 4x Google's count), I seriously doubt that anyone in the search engine business thinks they can predict who will dominate in a few years - it's possible that the next "pagerank killer" is written by some CS grad students or by a search engine company that hardly anyone has heard of (yet).

    1. Re:huh? by a+gash · · Score: 1

      Pagerank is beautiful for its simplicity, but it is a specific implementation of the search layer. What IBM is touching on here is not a search layer, it's a parsing layer on top of the search engine. That parsing layer will be where the search companies fight it out over the next few years. The concept is called Natural Language Search and I'm sure all the big boys have been working on it for some time now. IBM hasn't hit it here, but they definitely just took a step ahead of Google et al.

      Ask Jeeves tried a new implementation of it this May, but it's still very raw. It is probably the first implementation we've seen of Kozoru's natural language processing engine (although there is no mention of it anywhere). I've been working on the problem as well, and am quite pleased with my results. Hopefully I'm about to start that company that hardly anyone has heard of yet! :)

    2. Re:huh? by Tuross · · Score: 1

      Note that all the discussion in the resulting threads has been about web-based, publicly available search engines.

      Don't confuse this with the more generic "search engines", which includes the technology of companies like Autonomy, FAST, Verity, ISYS, and so on.

      Autonomy's products have been doing a bloody good job at natural language search for almost a decade now (and they run on Linux as well as legacy platforms). Have you ever heard of Kenjin? These guys had a better desktop search engine 5 years before the so-called "major players" had anything at all. And of course it's all distributed, modular, scalable, and maps security. If you need any of that.

      --
      Matt
      1. Read Slashdot
      2. ???
      3. Profit
  32. Wait... by nmb3000 · · Score: 1

    ...which Microsoft thought was between them and Google.

    I think it still is pretty much between them (and perhaps Yahoo) as IBM is obviously not actively pursuing this market. At first glance it appears that they wanted to give search engines a swing, and in the end decided not to go after it. However being IBM, instead of burying their research they released it into the public so others can benefit from it.

    While this is good, Microsoft and Google really have nothing to worry about. It's not like Big Blue is starting up its own web search portal.

    --
    "What do you despise? By this are you truly known." --Princess Irulan, Manual of Muad'Dib
    /)
    1. Re:Wait... by Anonymous Coward · · Score: 0

      You mean "However being the NEW IBM"

    2. Re:Wait... by MichaelSmith · · Score: 1
      However being IBM, instead of burying their research they released it into the public so others can benefit from it.

      IBM may yet live to benefit from this project. A new google-like startup will need their own software to start their business, so they won't use it. IBM own the copyright, and have their own capital so they could start their own search engine with the OSS software at a later date.

  33. Chum by overshoot · · Score: 1
    The posting implies that IBM is entering into competition with MS and Google.

    No, the posting (at least tried to) implies that IBM is changing the rules on the search game.

    Chum are the bait that you throw to sharks to get them fighting each other.

    --
    Lacking <sarcasm> tags, /. substitutes moderation as "Troll."
  34. RDF? by Anonymous Coward · · Score: 0


    "However, the technology has not existed to allow software to search out and make sense of these disparate forms of data."

    Surely the technology has been around for a while http://www.w3.org/RDF/? It's just that no-one is using it?

  35. Big Blue Marbles by Doc+Ruby · · Score: 3, Insightful

    So Google and MS will incorporate the "key facts" code into their products. That won't exactly shake up the search engine world. It will (possibly) improve it for everyone, and maybe (if "key facts" works better than their proprietary "key words" functions) even let another engine compete in their category. The latter might shake something up. But, like every other mass human activity, this competition is fought over brand names. Google cleverly established a terrific brand, through careful simplicity and consistency in graphic and info design. This IBM release would merely grant more substance to the existing brands, and some substance to any newly emerging one. Which new brand would have to establish its own competitive value, largely through style.

    IBM's move does have the power to shake up the open/proprietary software jihad underway. If Microsoft used their open code, it would be hard for MS to claim that open source is inherently bad, or proprietary code is inherently superior. Google would demonstrate the same argument, but no one complains about Google's code remaining proprietary, because it mainly runs on their servers, which few people yet demand should be opened to outsiders. These are the kind of subtle strategic moves that let IBM continue to pull the strings of the entire industry. Success that generates more business and flexibility for IBM, in the mixed open/proprietary space it's carving for itself, will also demonstrate another powerful idea. American corporations can achieve market influence through strategic deployment of basic R&D. Not just through proprietary products, but also through manipulation of competitors who adopt open tech they create.

    All in all, this looks like a smart move by IBM. Let's hope 1> this rumor is true; 2> the tech is really good; and 3> we're not already too far gone down the entrenched lines between our corporate jihadis to get the benefit of the mutual cooperation that this tech could enable, to great mutual benefit.

    --
    make install -not war

    1. Re:Big Blue Marbles by I'm+Don+Giovanni · · Score: 0

      "If Microsoft used their open code, it would be hard for MS to claim that open source is inherently bad, or proprietary code is inherently superior. "

      Microsoft already uses open source code, just not GPL code.

      "Google would demonstrate the same argument, but no one complains about Google's code remaining proprietary, because it mainly runs on their servers, which few people yet demand should be opened to outsiders."

      This is how GPL will die a slow death. More and more developers are "releasing" apps as web apps rather than distributed binaries. These devs can "derive" their product from GPL code and make their products available to be used by the public without having to release their code, simply because the apps are run on their server rather than run on the user's computer. That the GPL folks continue to allow for this huge hole in the GPL license (because they don't want to step on Google's toes) shows their huge hypocrisy.

      --
      -- "I never gave these stories much credence." - HAL 9000
    2. Re:Big Blue Marbles by Doc+Ruby · · Score: 2, Interesting

      The evolution of GPL software into embedded apps that interop with other, non-GPL apps, shows that one basic premise of the FSF worldview is wrong: users and programmers actually have different values, not identical ones, at least where getting the source code is concerned. Practically no users, and only a few programmers, have expressed any desire (beyond mere whining) to get the source code for apps with which they only want to interop. So GPL requirements to release new code that hasn't actually changed the GPL code really go too far, in the practical scheme of things.

      However, a new GPL seems appropriate for APIs. Just like there's a different GPL for libraries to which code is linked, with less compulsory release requirements than the GPL on included/derivative code. The APIs of GPL code to which new code isn't even linked, but interops via the APIs, should require reciprocal release of their documentation. That is: call an API, and the calling API side must be as documented as is the called GPL API. As well as the entire callable API of the calling app. With that "GPAL" just as viral as the original GPL: calling a GPL'ed app's API puts the new, calling app also under that GPAL. But not under the original GPL, which would require all the new code to be published. Just because it maybe called a single API, totally disproportionate to the value released by the new coders. Otherwise, the new coders will more likely write their own version of the code they could otherwise just call by API, if the called app weren't GPL'd. That reinvention of the wheel to avoid the GPL is not good for the GPL code, the GPL, or coders - and therefore not for users. We can have the fair exchange, that keeps the innovation flowing, by requiring fair disclosure proportionate to the value derived from the GPL'd code. Or we can have an unfair system that the GPL accelerates into unusability.

      --
      make install -not war

  36. IBM does not innovate. Repeat 5000 times. by Anonymous Coward · · Score: 0

    Call me skeptical, but there are many things that appear on slashdot which are lauded as "FROM IBM" and therefore glorious, while I seldom see that anything coming from IBM is ever that glorious. They are slow and monolithic. Developerworks also isn't IBM, which is an error a lot of posters make -- Developerworks is a site IBM runs that pays people for articles, and usually the information quality is average or marginal at best. IBM research also doesn't succeed in making a lot of great products, hence they need to open source some things of marginal quality here and there to maintain street credit and trump the IBM name.

    Large behemoths like IBM manage to innovate against their nature, and even that happens only occasionally. The machine resists...

    Anything interesting happens on the fringe, never at the big 800 pound gorilla of bureaucracy.

  37. This story has the highest ratio of Trolls... by Anonymous Coward · · Score: 0

    and useless comments to insightful comments I have ever seen. You know, like this one.

  38. will it be good enough by Fr05t · · Score: 0

    to know I'm looking for amateur or anal when I search for 'a'?

  39. Little to do with opponents... by AutopsyReport · · Score: 4, Informative
    From the article... "I don't see any of the major players moving into this area," Arthur Ciccolo, head of search technology at IBM Research, said of how major consumer Internet search companies such as Google, Yahoo Inc. and Microsoft have focused on the public Internet instead of private record data retrieval.

    And from the Slashdot summary... IBM has just tossed a bucket of chum into the whole search showdown, which Microsoft thought was between them and Google.

    No, IBM's technology has little to do with Google, Yahoo or Microsoft's search technology. This isn't a competition until any of the three introduces similar technology. Reading the article's third paragraph would clarify this, and would make the summary a little more accurate, too.

    --

    For he today that sheds his blood with me shall be my brother.

    1. Re:Little to do with opponents... by evilviper · · Score: 1
      No, IBM's technology has little to do with Google, Yahoo or Microsoft's search technology. This isn't a competition until any of the three introduces similar technology.

      Similar private-record search products? Like the google search appliance that has been around for years now?

      http://www.google.com/enterprise/
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  40. Beer wants to be free, too. by Anonymous Coward · · Score: 0

    What, expecting free mod points on a worn out cliche?

    Don't let taps, bottles, and cans hold it back. Let beer flow freely, as in FREEEEEEEDOM!!!

    1. Re:Beer wants to be free, too. by Kloog · · Score: 1

      Seems to work for everyone else. I wouldn't want to rock the boat while navigating these chum-filled waters.

    2. Re:Beer wants to be free, too. by freewaybear · · Score: 1

      I really, really wish beer were free.

      --
      Registered Linux User #404114 [url=http://www.punkoiska.com][img]http://img406.imageshack.us/img406/4379/posbannercf5.g
  41. IBM DB2 extensions... by farrellj · · Score: 1

    About 8 years ago, when I was writing software for OS/2, I ran across an interesting extension that IBM had for its DB2 software, called (I think) the Ultimedia extensions. These would allow you to search photos for a type of object that it understood. So you could tell it to search for all pictures that had a red ball and a tree...and it would return a list of all photos with those two objects. It was really interesting, but I have not heard anything about it since then...

    ttyl
              Farrell

    --
    CAN-CON 2019 - Ottawa's only book oriented Science Fiction Convention! October 18-20, Sheraton Hotel, Ottawa, Canada h
    1. Re:IBM DB2 extensions... by Anonymous Coward · · Score: 0

      That's because the task is too fucking hard.

      Unless maybe it was simply querying metatags.

    2. Re:IBM DB2 extensions... by Anonymous Coward · · Score: 0

      ... It was really interesting, but I have not heard anything about it since then...

      What? You mean your first query is still running?

    3. Re:IBM DB2 extensions... by electrichamster · · Score: 1

      Looks interesting, there's a blurb about it here:
      http://www-306.ibm.com/software/data/umm/umm.html

  42. Why wait for SourceForge? by r_jensen11 · · Score: 4, Informative

    It's available now. As the article says:

    UIMA technology is expected to be made available through open-source software site SourceForge by the end of 2005. The UIMA framework can currently be downloaded free of charge from IBM AlphaWorks at http://www.alphaworks.ibm.com/tech/uima/.

    So, I ask, why wait for it to appear on SF if we can get it now?

    1. Re:Why wait for SourceForge? by Anonymous Coward · · Score: 3, Informative

      um, because it's closed source right now

  43. We might start to see the limits of OSS by tentimestwenty · · Score: 1

    I'm in agreement here. If anyone can see the algorithms, then it's going to be pretty easy to manipulate the results and ruin the efficiency. Perhaps this will be the first example of the limits of OSS due to the necessity for secrecy.

  44. That reminds me... by Anonymous Coward · · Score: 0

    Someday I intend to make a great OS program called "KK" just to throw them for a loop when they try to name the KDE version of it :P

  45. Hey, it was good enough for Shakespeare... by Anonymous Coward · · Score: 0

    ...who famously had Hamlet wonder:

    "Whether 'tis nobler in the mind to suffer
    The slings and arrows of outrageous fortune,
    Or to take arms against a sea of troubles,..."

  46. Key technology for Unstructured search by Anonymous Coward · · Score: 0

    You can download the Unstructured Information Management Architecture SDK from alphaworks and take a good look at how to analyze unstructured information (text, audio, video, images, etc.) to discover, organize, and deliver relevant knowledge.

  47. IBM has so much unpublished advanced research by jigglysnot · · Score: 1

    IBM is pretty crazy when it comes to advanced research in any of its fields. I have heard stories from researchers there that IBM has its own terminology for a lot of technical EE/CS stuff, as they discovered it way before the world did but were so secretive they didn't publish any of it. I wouldn't be surprised if IBM has enough tech in search to seriously knock down Google! This OSS thing comes as a surprise, as it contradicts their secretiveness about their research.

  48. Open Source, but who will be able to run it? by michaeldot · · Score: 1

    The key to search engines, whatever their underlying ranking algorithm, is trawling through the couple of billion pages on the net to generate the data to be searched.

    Obviously most of us simply don't have the bandwidth or the computing power & storage to do that.

    So are IBM treating the search engine source release as a hypothetical interest for people who can't actually make practical use of it, or are they going to give access to their own trawled data?

    If the latter, then this is very significant.

    1. Re:Open Source, but who will be able to run it? by Anonymous Coward · · Score: 0

      You don't know what you're talking about. RTFA.

    2. Re:Open Source, but who will be able to run it? by michaeldot · · Score: 1

      Scrub that, I assumed from the (arguably misleading) "Google vs Microsoft" in the intro that it was search in the web context. RTFA showed it's about corporate data searching, so my "net trawling" comment makes no sense. Sorry. Wishful thinking I guess. Gotta learn not to RTFIntros.

  49. laughable by Anonymous Coward · · Score: 0

    If their search engine is anything like their web site, then MS and Google have nothing to worry about.

  50. I would look forward to this by Mal-2 · · Score: 1

    There have been many times when I have known what something is or does (since I've seen it in action), but not what it is called. If I could search for information on the basis of known facts, rather than just guessing at search terms, I think I would have much quicker success at such searches. I can usually find whatever I needed to know, but it can take weeks if I don't know the words to search for. Sometimes it takes joining mailing lists or asking people personally. Yeah it works, and the current system is immensely better than going to a library to hunt for something, but it can still be improved upon.

    For example, I was looking for a particular type of flute, smaller than a normal flute but larger than a piccolo, but with the same standard keywork and fingering system. I knew such a thing existed, having seen it in use in a flute choir, but I didn't know it was called a "treble G flute". Instead I had to search on what I did know -- it's a flute, and used in a flute choir -- and pick through the truly staggering number of hits myself in hopes of finding what I'm looking for. If I could have automagically narrowed that down with specifications such as "smaller than a concert flute", "larger than a piccolo", "made of metal", and "has Boehm system closed keywork", I would have had very few hits to search through and most of them would have been relevant. Google reduced the whole world down to a haystack to search for that elusive needle. Searching by facts might have reduced that down to a teacup.
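
    A rough sketch of what that kind of fact-based narrowing could look like, assuming some annotator had already extracted structured attributes from documents. The catalogue, the size_rank ordering (1 = smallest) and the names find_by_facts and size_of are all made up for illustration; this is not how UIMA itself queries anything.

```python
# Hypothetical catalogue; size_rank is an illustrative relative ordering only.
INSTRUMENTS = [
    {"name": "piccolo", "family": "flute", "size_rank": 1, "keywork": "Boehm"},
    {"name": "treble G flute", "family": "flute", "size_rank": 2, "keywork": "Boehm"},
    {"name": "concert flute", "family": "flute", "size_rank": 3, "keywork": "Boehm"},
    {"name": "baroque flute", "family": "flute", "size_rank": 3, "keywork": "one-key"},
    {"name": "alto flute", "family": "flute", "size_rank": 4, "keywork": "Boehm"},
]


def size_of(name):
    """Look up the relative size of a known instrument by name."""
    return next(r["size_rank"] for r in INSTRUMENTS if r["name"] == name)


def find_by_facts(records, predicates):
    """Return names of records that satisfy every fact predicate."""
    return [r["name"] for r in records if all(p(r) for p in predicates)]


# The facts from the post, expressed as predicates instead of keywords.
FACTS = [
    lambda r: r["family"] == "flute",
    lambda r: r["size_rank"] < size_of("concert flute"),  # smaller than a concert flute
    lambda r: r["size_rank"] > size_of("piccolo"),        # larger than a piccolo
    lambda r: r["keywork"] == "Boehm",                    # Boehm system keywork
]
MATCHES = find_by_facts(INSTRUMENTS, FACTS)
print(MATCHES)
```

    A keyword search for "flute" would match every row above; the predicates cut it down to the one I actually wanted.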

    Mal-2

    --
    How is the Riemann zeta function like Trump rallies? Both have an endless number of trivial zeros.
  51. Re:LOL - Google is more than an algorithm by pogson · · Score: 2, Interesting
    Google is a search engine farm. It would take a while for anyone to catch up even with a better recipe. If IBM's stuff is FOSS, Google could use it.

    This is good news anyway. Keyword/phrase searching becomes less useful as the universe expands. I have 11000 texts fully indexed with swish-e and I get way too many hits unless I use phrases. If I knew what phrase was in the books I sought, I would not need the search engine.
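
    To show why phrases cut the hit count so much, here is a small sketch of keyword search versus phrase search over a positional index. This is not how swish-e works internally; the corpus and the function names are hypothetical, and the assumption is just that the index records the word positions of each term per document.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map term -> doc id -> list of word positions where the term occurs."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def keyword_search(index, query):
    """Documents containing all query terms, in any order and position."""
    terms = query.lower().split()
    if not terms:
        return set()
    doc_sets = [set(index[t]) if t in index else set() for t in terms]
    return set.intersection(*doc_sets)


def phrase_search(index, phrase):
    """Documents containing the query terms as one consecutive run."""
    terms = phrase.lower().split()
    hits = set()
    for doc_id in keyword_search(index, phrase):
        starts = index[terms[0]][doc_id]
        if any(all(pos + i in index[t][doc_id] for i, t in enumerate(terms))
               for pos in starts):
            hits.add(doc_id)
    return hits


# Hypothetical toy corpus, for illustration only.
DOCS = {
    1: "the open source search engine",
    2: "search for an open position at the engine source",
    3: "open source licenses",
}
INDEX = build_positional_index(DOCS)
print(keyword_search(INDEX, "open source"), phrase_search(INDEX, "open source"))
```

    The catch is the one above: the phrase filter only helps if you already know the exact wording.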

    I love search engines because I cannot figure out how to organize a file cabinet or a hard drive...

    --
    A problem is an opportunity http://mrpogson.com
  52. eBay runs on Sun, not IBM. by NeoBeans · · Score: 2, Informative
    Is that why powerhouses such as Ebay use it as well as other IBM products such as Websphere?

    That was a long time ago in a galaxy far, far, away. eBay now runs on Sun.

  53. Earth Calling Apple by Anonymous Coward · · Score: 0
    Dear Apple Execs & Marketing Droids (& fanbois),

    This is how a commercial enterprise does "Open Source". Please take note!

  54. They keep giving stuff away! by elgee · · Score: 1

    No wonder my IBM stock is tanked.

  55. Don't count on it being of any use. by duffbeer703 · · Score: 1

    If this radical new technology is anything like the new, improved, "Deep Blue" search backing IBM's support pages, it's a real piece of junk, almost like Altavista circa 1998.

    --
    Conformity is the jailer of freedom and enemy of growth. -JFK
    1. Re:Don't count on it being of any use. by cant_get_a_good_nick · · Score: 1

      It isn't. It's new, called UIMA. We eval'ed this under an NDA. It's pretty cool, though what we saw was an SDK more than a package you can install. It is more of an infrastructure, so it can be used to create new engines.

  56. Not Just Sun by Anonymous Coward · · Score: 0

    That's just the hardware (and operating system and JVM). And not necessarily all eBay's hardware. The original poster is correct: eBay uses IBM WebSphere software.

  57. Re:IBM has so much unpublished advanced research by xiaomonkey · · Score: 1

    Interesting....

    I thought IBM tried to patent everything and anything plausibly patentable that came across the desk of someone on their research team.

    If they patent everything, they can be pretty sure that they'll be able to extract some pretty hefty licensing fees from the industry at large. However, if they keep too many things under wraps, while they might gain a competitive advantage for a product that they're bringing to market relatively soon, they risk losing the ability to file for all of the relevant patents. For example, someone in another research lab might simultaneously make similar discoveries and file for some patents. Thus, in the worst case, IBM could be forced to pay heavy licensing fees to the second company, for tech that IBM originally discovered.

    So, I guess, I'd like to know, under what conditions does IBM tend to keep things under wraps and when do they opt for the patent?

  58. Re:IBM has so much unpublished advanced research by joepez · · Score: 1
    This is definitely a surprise for more reasons than one. In 03 I was interning at IBM in their product marketing group for the zSeries (working on a new product) and one of the technologies frequently discussed as a potential showcase for the new mainframe was an un-named search technology leveraging UIMA and natural language. An IBM researcher had developed all of the algorithms in a pretty hush-hush project (never did find out if there were any other sponsors involved). At the time there was a lot of discussion on trying to find a way to monetize the search technology.

    In the trials for the new mainframe they were searching the entire net, but not for your typical search reasons (ex. Searching on an address), but to find relationships and patterns. Evidently they were getting some really interesting results searching on predictive patterns for stocks (finding tell-tale relationships that indicated when something was going to move) or in evaluating government actions. A lot of discussion I sat in on was on how they could use the tech to find patterns across thousands of sources.

    Anyway the net net is they were trying hard to find a way to sell this tech, part of their new efforts to monetize technologies like this (IBM has this great weather predictive technology for micro-cells that still hasn't seen the light of day). Guess they couldn't find a good way to sell it directly, so releasing it this way to the world is pretty interesting. Though it wouldn't surprise me if, when they ran the numbers, they found they'd sell more in hardware and services than the software would ever generate, if it was adopted by other companies.

  59. There can be only two! by Anonymous Coward · · Score: 0

    IBM has just tossed a bucket of chum into the whole search showdown, which Microsoft thought was between them and Google.

    Yahoo says, "I'm nawt dead yet." "It's just a flesh wound."

  60. realsearchengines.org by Foktip · · Score: 1

    if it's open source what's to stop jerks from making it ignore robots.txt? this is gonna help phpbb, and other things that currently have asstastic search capability - but it's gonna be the next big thing for DNS attacks, worms, etc.

    maybe someone'll come up with some sort of SPI-thingy that sniffs out the indexing weasels from the good guys and bloxorz them!

    i call it, realsearchengines.org - the only place to register as a legit search engine indexer, and to report "search-spammers"
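
    for reference, honoring robots.txt is purely voluntary on the crawler's side, which is the whole problem. a minimal sketch of what a well-behaved crawler checks, using Python's standard urllib.robotparser; the robots.txt contents, the bot names and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /
"""


def make_checker(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


CHECKER = make_checker(ROBOTS_TXT)


def may_crawl(agent, url):
    """True if the rules permit this user agent to fetch the URL."""
    return CHECKER.can_fetch(agent, url)


print(may_crawl("PoliteBot", "http://example.com/index.html"))
print(may_crawl("PoliteBot", "http://example.com/private/report.html"))
print(may_crawl("BadBot", "http://example.com/index.html"))
```

    nothing in the file enforces anything; a crawler that skips the may_crawl check just fetches the page anyway, so any blocking has to happen on the server side.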

  61. Just for Reference by AoT · · Score: 4, Funny

    10 tads = 1 few

    10 fews = 1 some

    10 somes = 1 alot

    10 alots = 1 load

    10 loads = 1 buttload

    10 buttloads = 1 assload

    10 assloads = 1 shitload

    10 shitloads = 1 fuckload

    I do not have the book here or I would give the non-metric chart. You know how hard it is to remember how many hogsheads are in an imperial buttload?

    1. Re:Just for Reference by mr_snarf · · Score: 1

      ahha, you just brightened up my day with that one :)

      --
      printf("Goodbye cruel world!\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b");
    2. Re:Just for Reference by Anonymous Coward · · Score: 0

      1 alot != 1 a lot;

    3. Re:Just for Reference by Anonymous Coward · · Score: 0

      I don't use Metric

  62. Not quite a new concept... by Excelsior · · Score: 1

    You almost make it sound as if this is the first OSS search engine out there. Apache Jakarta's Nutch, a subproject of Lucene, has been around for over two years. I haven't done tons of research on the subject, so I'm betting Nutch isn't the only one.

  63. Both? by A+nonymous+Coward · · Score: 1

    Both "both" are not both needed. :-) for you idiot moderators on crack.

  64. Google by Anonymous Coward · · Score: 0

    It's open source, right?

    Google likes open source.

    If this is something really useful, why wouldn't they take it on board? Assuming of course that Google don't have something that whoops UIMA and aren't talking about / releasing it. In which case they could use it to augment their research.

    So it may not be Google vs Microsoft vs IBM.
    It may be (Google & IBM) vs Microsoft.

  65. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  66. I'll stick with splunk by Anonymous Coward · · Score: 0

    The only key facts I need to know around here are about 30,000 events per second on my network. splunk does the job, it's free, and it isn't a Trojan GNU packed with IBM consultants waiting for nightfall.

  67. They added some useful stuff to linux, you know... by TheLittleJetson · · Score: 0

    ...though they stole most of it from SCO.

  68. Re:IBM has so much unpublished advanced research by jigglysnot · · Score: 1

    Yeah, the patent everything method sounds more plausible for money making. I was pretty surprised when I heard the stories too. As an example, apparently the LSB and MSB are backwards at IBM, so like their "MSB" is our "LSB", or something like that. Other things like Dual-rail dynamic domino logic are called something wholly different I heard... I am not so sure as to their conditions for release or kept under wraps. That's an interesting question I'd like to know myself.
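
    On the bit-numbering point, my understanding is that the difference is in how bits are numbered rather than which bit is significant: IBM documentation such as the Power ISA numbers bit 0 as the most significant bit, while most other documentation numbers bit 0 as the least significant. A small sketch of converting between the two conventions; the function names are hypothetical.

```python
def msb0_to_lsb0(bit, width):
    """Convert a bit number where 0 is the MOST significant bit
    into the numbering where 0 is the LEAST significant bit."""
    if not 0 <= bit < width:
        raise ValueError("bit number out of range for this width")
    return width - 1 - bit


def get_bit_msb0(value, bit, width):
    """Read one bit of value using MSB-0 numbering."""
    return (value >> msb0_to_lsb0(bit, width)) & 1


# In an 8-bit value 0b10000000, the set bit is "bit 0" in MSB-0 numbering
# and "bit 7" in LSB-0 numbering.
print(get_bit_msb0(0b10000000, 0, 8), msb0_to_lsb0(0, 8))
```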

  69. DARPA?? by Anonymous Coward · · Score: 0

    This doesn't inspire my confidence... perhaps the first order of the day when going over the lines of code would be to see if the application "phones home"

  70. Ever heard of... by Anonymous Coward · · Score: 0

    I have been working with search engines for some time now, and the 'concept search' that IBM is mentioning is nothing new. Actually, the Cambridge UK-based company Autonomy http://www.autonomy.com/ has been market leader in this field for years. IBM even has some specialists on Autonomy working for them... makes you wonder...

  71. Jaws reference by SethJohnson · · Score: 1

    ....IBM tossed a bucket of chum into the whole search showdown...

    To which Paul Allen responded from the deck of his yacht, "We're going to need a bigger boat!"

    Seth

  72. Proposal to start OSS search organisation by jurt1235 · · Score: 1

    Fellow /. viewers,

    Why not do the following: Several of us have access to sufficient infrastructure (own/lease with diskspace to spare, plus a bandwidth surplus).
    Why do we not combine that in a distributed search environment with mirrored nodes, using this technique of IBM's? The addition of the distributed technology to spider and index the web will be a significant challenge, but the concept is I think pretty appealing. I for one will be willing to "donate" the necessary domain and starting facilities.

    Anyone who is interested, you know how to find me.

    --

    My wife's sketchblog Blob[p]: Gastrono-me
  73. I doubt by Elixon · · Score: 1

    I'm skeptical about their technology. I'm sure that if it were so "good", IBM would try to make some money off it... and if it is not so "good", then let it go on SF.net and take some populist credit for it... and maybe geeks will stick to it and make Google and Search.MSN weaker, not necessarily through quality open-source search engines but through swarms of emerging not-as-good-as-Google specialized search engines...

    Anyway, IBM will benefit by open-sourcing the market segment that its competitors are dependent on...

    --
    Well, I've got to get back to work. When I stop rowing, the slave ship just goes in circles.
  74. Distributed Search Engine by Danger+Stevens · · Score: 1

    Any chance this OSS project could be made into a distributed app that is hosted on thousands of individual web servers in some sort of cooperative?

    Imagine if any group of people could develop a search engine that (through funky DNS or distributed scripting) they could easily host themselves and provide internet searches that have certain intentional biases.

    Like a "Google for gamers" or "Google for crackers" or "Google for linguistics" - but all independently hosted.

    Sounds like "Free as in 'Look, I'm cool like Google'"

    --
    World Changing - News for Humans, Stuff about our planet
  75. How does IBM benefit from this? by theGeekDude · · Score: 1

    Exactly how do the IBM guys manage to make money from this? They just seem to be donating it to biggies like Google and MS. Surely they are not in the mood for some social service unless it benefits them.

    --
    Don't waste your time reading stupid sigs like this.
  76. So, what's the real deal here? by chefren · · Score: 1

    After searching for a whopping 5 minutes and even googling (gasp!) I couldn't find any decent article about what this actually is, just lots of info on how to use it. It looks like there is a new query language, so it might be interesting for query expansion. But how does it extract these key facts from the documents? Does it do real natural language analysis? Does it just guess by looking at the document terms like every other search technology? Or is it just a framework that doesn't really do anything by itself? It sure looks like the latter when skimming http://www.research.ibm.com/journal/sj/433/gotz.html so no revolution yet, sorry.

  77. Bad move on our part by Anonymous Coward · · Score: 0

    I work for IBM. If we're releasing the search engine we use on our internal site, the only thing that we can be hoping is that someone will fix the damn thing.

    Or maybe we're throwing it out prior to licensing Google's algorithms (IBMers can only hope). Most of the hits we get from the internal site in response to a query are useless.

  78. Mixed? by Anonymous Coward · · Score: 0

    I don't know, it seems pretty clear to me: the reference is to shooting fish in a barrel.

  79. Use in Linux Searching? by jambarama · · Score: 1



    What I hope this is used for is the Linux desktop. Searching in Linux sucks. For the most part that is OK: if you can install Linux (and use it), you know where your stuff is. But if Linux expands to the average end user, a search that works would be a great boon. "IBM's version of Google Desktop Search." Google could fill this gap itself, but it hasn't released ANY software for Linux, so once again Big Blue steps up and contributes something useful. I hope it is incorporated.

  80. Outsourcing at IBM by Anonymous Coward · · Score: 0

    How much do you want to bet that this technology did NOT come from IBM's India campus?

  81. A Bucket of Cold Water by bobej1977 · · Score: 1

    This is interesting, but somewhat deceptive. IBM has created a framework, not an actual search engine. The framework is effectively a data layout combined with a processing pipeline and a query engine that emphasizes semantic processing of information rather than strictly textual processing. See IBM's FAQ regarding Annotations.

    You still have to buy the software that will plug into the framework in order to actually process the information, though some open source projects are certain to come along.

    This is interesting stuff, but not as thrilling as the article would suggest. Imagine if Google open-sourced their systems software, except for the part that does the whole PageRank thing.

    --
    The meek shall inherit the earth, in 3 by 6 plots. - Lazarus Long
  82. Re:IBM has so much unpublished advanced research by Anonymous Coward · · Score: 0

    Thus, in the worst case, IBM could be forced to pay heavy licensing fees to the second company, for tech that IBM originally discovered.

    They'd only have to pay licensing fees if they sold a product embodying the patent w/o having independently developed the invention before the patent application was filed. They would perhaps have to deal with lawyers' fees, but they always do.

  83. Author says Google is largest computer company... by mcguyver · · Score: 1

    I like how the author points out Google as being the "world's largest computer company" in the same article as IBM. Apparently having the company name International Business Machines, having $100B in assets and revenues of $100B a year will not trump a hyped-up dot-com company with $3B in assets and revenues of $3B a year. Surely Google is the world leader in search, but when did that become the only function of computers?

  84. Sem Web! by smartdreamer · · Score: 1

    IBM plans to give away key search technologies for data retrieval that use concepts and facts instead of simpler "keyword" searches

    Did I hear semantic web?
  85. Distributed search to be a reality by BenjaminM · · Score: 1

    There's a new, serious effort to make an open source search engine, and I can see there's a lot of interest in this in this thread. You can get a preview of our opening documents at the Openzuka forums, http://openzuka.org/phpBB2/viewforum.php?f=2.

    At the least, I see distributed processing as possible for doing the work of indexing (and analyzing) the internet and passing the results on to a central server.

    Send me your e-mail address or other contact and your level/area of interest, and I'll let you know how things develop! http://bemweb.com/contact/

    --
    benjamin, Agaric