New Search Engines
An anonymous reader wrote in to say "It seems that there's a new company out there called Fast Search and Transfer which is competing with Inktomi.
They have a demo online at www.alltheweb.com and their engine seems to be ultra-fast.
News about this is available here.
Try out the demo, it is awesome what these guys have done."
It is fast, but so far I've not had as good luck searching as
with other engines. And the speed is probably largely due to
the sparse HTML. But it's not bad.
As someone else commented, these people wrote the search engine behind ftpsearch.lycos.com, which is fast. There are a few more reasons for the speed, apart from sparse HTML:
This engine asks for no cookies.
The output is not in a table, so you see the results as they arrive in your browser, without having to wait for the whole table (in lynx it makes no difference).
The load is probably not very high yet.
But having seen FTPsearch in action for the last 4-5 years, and having seen it always return results quickly, it wouldn't surprise me if alltheweb stayed fast.
"There are two major products that come out of Berkeley: LSD and UNIX. We don't believe this to be a coincidence."
The only search engine I know that does a good job at this is Google. It is so good at finding relevant sites, I don't care if the response time is occasionally a little slow.
(Google uses a nice algorithm where they gauge the "importance" of a page by how many other sites link to it.)
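To make the link-counting idea concrete, here is a toy sketch in Python. It only does what the comment above describes, counting how many distinct other sites link to each page. Google's actual ranking is more involved than a raw count, and the function names and example hostnames are made up for illustration.

```python
from collections import defaultdict


def inbound_link_counts(links):
    """Count how many distinct other sites link to each page.

    `links` is an iterable of (source, destination) pairs. Self-links and
    repeated links from the same source are not counted twice.
    """
    sources = defaultdict(set)
    for src, dst in links:
        if src != dst:  # a page linking to itself says nothing about importance
            sources[dst].add(src)
    return {page: len(srcs) for page, srcs in sources.items()}


def rank_by_inbound_links(links):
    """Return pages ordered by inbound link count, most linked first."""
    counts = inbound_link_counts(links)
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(counts, key=lambda page: (-counts[page], page))


# Hypothetical link graph, for illustration only.
links = [
    ("a.example", "hub.example"),
    ("b.example", "hub.example"),
    ("c.example", "hub.example"),
    ("a.example", "b.example"),
    ("hub.example", "hub.example"),  # self-link, ignored
    ("a.example", "hub.example"),    # duplicate, counted once
]

print(rank_by_inbound_links(links))
```

A search engine using this idea would combine such a score with ordinary text matching, so that among pages containing the query words, the heavily linked ones come first.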
If a thing is not diminished by being shared, it is not rightly owned if it is only owned & not shared. S. Augustine
The specs on the search engine are available at http://web.fast.no/product/search/det.asp?id=34.
The press release doesn't exactly scream it out, but the search engine is actually just a little bit of software stuck on top of some pretty neat custom hardware. They call their chip the FAST PMC (Pattern Matching Chip), and their server is just your average (well, sort of average) high-end server, with a buttload of those chips stuck on PCI cards.
The specs on the PMCs are available at http://web.fast.no/product/PMC/det.asp?id=52.
FAST claims 100 MB/sec throughput on each chip, and each card has its own RAM (from 8 MB to 2 GB). The chips actually run at 100 MHz each, and even have (slightly limited) support for RegEx matching.
From the specs:
A typical configuration will contain 4 to 8 plug-in cards per search node, and 16 or 32 chips on each card.
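For a rough sense of scale, the figures above can be multiplied out. This is back-of-the-envelope arithmetic only: it assumes every chip sustains the claimed 100 MB/sec at the same time against its card-local RAM, which the specs quoted here do not promise, so treat the result as an upper bound. The function name is made up for illustration.

```python
CHIP_THROUGHPUT_MB_S = 100  # per-chip figure claimed by FAST


def node_throughput_mb_s(cards, chips_per_card, per_chip=CHIP_THROUGHPUT_MB_S):
    """Best-case aggregate scan rate for one search node, in MB/sec.

    Assumes perfect linear scaling across all chips (an assumption,
    not something stated in the specs).
    """
    return cards * chips_per_card * per_chip


low = node_throughput_mb_s(cards=4, chips_per_card=16)   # smallest typical node
high = node_throughput_mb_s(cards=8, chips_per_card=32)  # largest typical node

print(f"chips per node: {4 * 16} to {8 * 32}")
print(f"best-case scan rate: {low} to {high} MB/sec")
```

So a typical node would carry 64 to 256 chips, with a theoretical ceiling of 6,400 to 25,600 MB/sec if everything ran flat out.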
Overall, I'm pretty impressed - putting search capabilities into hardware is a pretty good idea, especially since so much of a modern processor is geared toward things like floating-point calculations, which don't help text searching at all.
Scott Severtson
Software Developer
Auragen Communications
scotty@auragen.com
Scott Severtson
Senior Architect, Digital Measures