Slashdot Mirror


Nutch: An Open Source Search Engine

Anonymous Coward writes "Someone forwarded me this site working to create an open source search engine called Nutch. In the age of weighted rankings on search engines for profits, there's an obvious need for an unbiased search engine. After all, isn't a search engine supposed to be for finding relevant data, not as an indirect and sometimes slimy method of advertising? Nutch is clearly in its initial stages, but it would certainly get my vote." You can find the project on SF.net, and also read the Business 2.0 article on it.

8 of 291 comments

  1. Biased listings by Champaign · · Score: 4, Insightful
    I think many commercial search engines have learned that biasing themselves toward sites that have paid them is a good way to erode consumer confidence and damage their readership/userbase. Just as newspapers have to at least provide the image of objectivity, the same demands are on search engines.

    I'm quite comfortable with how Google does this (present commercial links clearly marked to the side), and am not convinced a non-commercial (open source) alternative is needed.

  2. Seems pretty pointless by cryptochrome · · Score: 4, Insightful

    Free and open code is good and all... but the one real cost of a search engine is RUNNING it. It requires a far from trivial amount of bandwidth and hardware, and somebody has to pay for all of it. Unless someone comes up with a novel P2P solution (and many are trying) it just won't happen.

    What they should be doing is pressuring the existing search engine companies for some integrity.

    --

    ---If you can't trust a nerd, who can you trust?

  3. Can this work? by jmkaza · · Score: 4, Insightful

    I think the idea is good in principle, but could it actually succeed? Google gets hit with millions of requests each day. They've got hardware that can support thousands of slashdottings a day and a fat pipe to feed all of that info out. That takes a lot of money. Financing an open source project is difficult enough, but financing an open source service such as that would seem next to impossible. Ideas?

    The other major problem would be that, with the ranking criteria being available for all to see, it would be relatively simple to manipulate page rankings.

  4. Search engine game is NOT over by AtariAmarok · · Score: 4, Insightful

    "Google has WON the search engine war, probably forever. Find some other mountain to climb, guys."

    At one time, Oldsmobile won the auto company wars. Where are they now?

    IBM ruled the PC roost. Hmmmm....

    Command-line OS's were king. But now???

    Altavista and infoseek and Lycos were search engine kings at one time. Whither this trio?

    The point is, it is not over.

    --
    Don't blame Durga. I voted for Centauri.
  5. Re:just don't get it by cduffy · · Score: 4, Insightful

    Think about cryptosystems: The whole point about the really good ones is that you can know the algorithm, but still not break it. Granted, pulling that off for a search engine is prone to be much, much harder -- but I *do* believe it's well within the realm of possibility. Ambitious in the extreme? Certainly... but there's something to be said for high-risk-high-reward projects.

  6. Re:Slimey adverts? by Anonymous Coward · · Score: 5, Insightful

    This project is the SOFTWARE to run a search engine. Not a corporation that needs to generate income to justify the resources required to run the search engine.

    Anyone could take this source code and with enough money, challenge Google.com as the top search engine.

    I see this project as a competitor to shrink-wrapped search engines, e.g. the Google appliance or maybe even Folio-based products. Typically corporations have many documents that need to be indexed and searchable to their needs.

    I haven't seen this on the homepage: it doesn't list what content it can index. I hope it can at least index PDFs and popular Office documents. Maybe even media files? And what XML indexed fields? Or external metadata?

  7. Re:Patents. by Feztaa · · Score: 4, Insightful

    I hope the authors of this project do their homework. My impression is that most of the good search and indexing schemes have already been patented, which will make it difficult to release such a project without stepping on someone's toes.

    Hmmm, I just realized something... with patents, you end up stepping on people's toes. Without patents, you get to stand on their shoulders. Which do you think is the better vantage point?

  8. Re:Patents. by AstroDrabb · · Score: 5, Insightful

    Does it matter? There are no innovations. ALL knowledge is based on prior knowledge. Look in any field of study and you will soon learn that advancement is not possible without prior knowledge. What we know about computer science today is thanks to the knowledge gained by those before us. It is this way in EVERY field: Astronomy, Medical Science, Mathematics, etc. Humankind does not grow by leaps and bounds; we grow by incremental improvements. I have not heard of ONE discovery/innovation in which the discoverer/innovator was not educated in prior knowledge. Now the question we need to ask ourselves, and especially the government, is: do we really want the advancement of our society to be hindered by the monetary interests of the greedy?

    --
    If Tyranny and Oppression come to this land,
    it will be in the guise of fighting a foreign enemy. -James Madison