How Hard Is It To Write Your Own Search Engine?
kha0z writes "Anna Patterson, from Stanford University, gives an overview of the difficulties that have to be overcome when developing or implementing a search engine in this article in the ACM Queue Magazine. The article covers many issues, from data sources to indexing to ranking. How does Google make it look so easy?"
Google just makes it look that way.
While writing a local search engine isn't trivial, it's a lot easier than writing a web search engine since all the scaling issues disappear -- I know: I wrote one.
If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
Well, most of the additional services still focus on searching and information. Information is also one of their core goals (I should have said this earlier). There is still nothing like Yahoo Travel or MSNBC.
Froogle is still a search product, but with a focus on shopping.
Groups is mostly still a search product (you can also post, so it's about creating information too). The service has been around for years (I think it's their second big project after web search). If I have a technical question, I often find the answers in Google Groups. Blogger is new, but is similar to Groups in its goals.
Gmail is also largely about search. With search they can place ads in your email.
Actually, I guess you could say that Google is about using good search technology to place highly targeted ads alongside the information.
"Can of worms? The can is open... the worms are everywhere."
...harder than she implies.
You have to deal with 404s, robots.txt, politeness (don't bring down someone's site by crawling too fast), redirects, and content you can't handle (Flash, JavaScript).
The list goes on.
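To make that list concrete, here is a rough Python sketch of the bookkeeping a crawler might do for those cases. It is not taken from the article. The names (load_robots, PolitenessGate, classify_response), the one-second default delay, the redirect limit, and the set of "handled" content types are all assumptions picked for illustration. The only library calls used are urllib.robotparser and urllib.parse from the standard library, and nothing here touches the network.

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Assumption: the indexer only understands these MIME types.
HANDLED_TYPES = {"text/html", "text/plain"}


def load_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PolitenessGate:
    """Tracks, per host, the earliest time the next fetch is allowed."""

    def __init__(self, min_delay: float = 1.0, clock=time.monotonic):
        self.min_delay = min_delay
        self.clock = clock
        self.next_allowed = {}  # host -> clock value

    def wait_time(self, url: str) -> float:
        """Seconds to wait before fetching url (0.0 if it can go now)."""
        host = urlparse(url).netloc
        return max(0.0, self.next_allowed.get(host, 0.0) - self.clock())

    def record_fetch(self, url: str) -> None:
        """Note that url's host was just hit."""
        host = urlparse(url).netloc
        self.next_allowed[host] = self.clock() + self.min_delay


def classify_response(status: int, content_type: str,
                      redirects_so_far: int = 0,
                      max_redirects: int = 5) -> str:
    """Decide what to do with a response: follow, drop, retry_later, index, skip."""
    if status in (301, 302, 303, 307, 308):
        # Follow redirects, but give up on long chains or loops.
        return "follow" if redirects_so_far < max_redirects else "drop"
    if status in (404, 410):
        return "drop"
    if status != 200:
        return "retry_later"
    mime = content_type.split(";")[0].strip().lower()
    return "index" if mime in HANDLED_TYPES else "skip"
```

A crawl loop would check robots.can_fetch(agent, url) first, sleep for gate.wait_time(url), fetch, call gate.record_fetch(url), and then act on classify_response. Even this leaves out plenty (Crawl-delay, per-host robots caching, retry backoff, duplicate detection), which is rather the point.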