Slashdot Mirror


How Hard Is It To Write Your Own Search Engine?

kha0z writes "Anna Patterson of Stanford University surveys the difficulties of developing or deploying a search engine in an article in ACM Queue magazine. The article covers many issues, from data sources to indexing to ranking. How does Google make it look so easy?"

2 of 23 comments

  1. Search engine != entire web by pauljlucas · Score: 4, Insightful
    Not all search engines are designed or intended for indexing and searching the entire web, and not everybody needs such a search engine. Often, people want to search their stuff: their documents on their local disks, their e-mail, etc.

    While writing a local search engine isn't trivial, it's a lot easier than writing a web search engine since all the scaling issues disappear -- I know: I wrote one.
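    To give a feel for how small the core of a local search tool can be, here is a minimal sketch of an inverted index over a handful of in-memory documents. It is not taken from the commenter's own engine; the names `tokenize`, `build_index`, and `search`, the lowercase alphanumeric tokenizer, and the AND-only query semantics are all assumptions made for illustration. A real tool would also need file walking, format handling, and ranking.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms (an assumed, simplistic rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical local "stuff": two files and one e-mail.
docs = {
    "notes.txt": "Meeting notes about the search project",
    "mail-1": "Re: search engine crawl schedule",
    "todo.txt": "Buy milk",
}
index = build_index(docs)
```

    With this sketch, `search(index, "search crawl")` intersects the posting sets for both terms. Everything fits in memory on one machine, which is the sense in which the scaling issues go away.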

    --
    If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
  2. The crawl is hard, too by blamanj · Score: 4, Insightful

    ...harder than she implies.

    You have to deal with 404s, robots.txt, politeness (don't bring down someone's site by crawling too fast), redirects, and content you can't handle (Flash, JavaScript).

    The list goes on.
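
    The items listed above can be sketched as a few small decision helpers. This is only an illustration using Python's standard library, with no actual network fetching. The user agent string, the two-second delay, the redirect cap, the set of indexable content types, and the names `load_robots`, `Politeness`, and `decide` are all assumptions, and the header lookup assumes a plain dict with canonical key casing.

```python
import time
from urllib.parse import urljoin, urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-crawler"            # hypothetical crawler name
CRAWL_DELAY = 2.0                         # assumed seconds between hits to one host
MAX_REDIRECTS = 5                         # assumed cap on redirect chains
INDEXABLE = ("text/html", "text/plain")   # assumed types this crawler can parse


def load_robots(robots_txt):
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class Politeness:
    """Tracks the last request time per host so fetches can be spaced out."""

    def __init__(self, delay=CRAWL_DELAY, clock=time.monotonic):
        self.delay = delay
        self.clock = clock
        self.last_hit = {}

    def wait_needed(self, url):
        """Seconds to wait before this URL's host may be fetched again."""
        last = self.last_hit.get(urlsplit(url).netloc)
        if last is None:
            return 0.0
        return max(0.0, self.delay - (self.clock() - last))

    def record(self, url):
        """Note that this URL's host was just fetched."""
        self.last_hit[urlsplit(url).netloc] = self.clock()


def decide(url, status, headers, hops=0):
    """Return an (action, detail) pair for one HTTP response."""
    if status in (301, 302, 303, 307, 308):
        location = headers.get("Location")
        if not location:
            return ("drop", "redirect without Location")
        if hops >= MAX_REDIRECTS:
            return ("drop", "redirect limit")
        # Location may be relative, so resolve it against the current URL.
        return ("follow", urljoin(url, location))
    if status in (404, 410):
        return ("drop", str(status))
    if status == 200:
        ctype = headers.get("Content-Type", "").split(";")[0].strip().lower()
        if ctype in INDEXABLE:
            return ("index", ctype)
        return ("skip", ctype)  # e.g. Flash or other content we can't handle
    return ("retry-later", str(status))
```

    A fetch loop would check `can_fetch(USER_AGENT, url)` on the parser returned by `load_robots` before each request, sleep for `wait_needed(url)`, and then act on what `decide` returns. Even this ignores timeouts, duplicate URLs, and pages that only render with JavaScript, which is the point: the list goes on.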