Slashdot Mirror


Google's Search Appliance

An anonymous reader noted that Google is working on a Search Engine that you can install behind your corporate firewall for indexing your internal documents. It's a bit thin on information, but it looks like for as little (cough) as $20k, you can have your own google box. Not for everyone obviously ;)

10 of 250 comments

  1. article from C|Net here: by mESSDan · · Score: 4, Informative
    From C|Net.

    It's a little more in-depth than the India Times article.

    --

    -- Dan
  2. Ouch. Try HTDIG. by Kozz · · Score: 3, Informative

    Yes, quite CLEARLY it's only for those who've got some cash to blow. If you've got a modest-sized Intranet site, I would highly recommend htDig. I've installed and configured it in several places and it works like a charm. Best of all, it's GPLed! Sure, it doesn't have all the fancy matching algorithms used by Google, but it does a damned good job nonetheless.

    --
    I only post comments when someone on the internet is wrong.
  3. We're using it here...it rocks! by HRH+King+Lerxst · · Score: 4, Informative

    They just implemented this where I work, and it's a vast improvement over what we had before. It even includes the cache and newsgroup features!!

    Two thumbs up!!

    --
    No one got beat up more often than the mimes of the old west!
  4. Re:Looking for a good internal search engine by pere · · Score: 3, Informative

    Try http://www.mnogosearch.org

    Brilliant search engine. It has parsers for most file formats (you can use pdf2txt to index your PDF files). It even indexes your MP3s if you should happen to have some on your local net.

    Free (at least as in beer) for Unix. Binaries for Windows cost between $99 and $699.

  5. Re:Looking for a good internal search engine by richieb · · Score: 5, Informative
    Try htDig. It does all these things and is free software. I used it on a corporate intranet in the past. Not as good as Google, but you can't argue with the price.

    --
    ...richie - It is a good day to code.
  6. Re:Why Google Can Be So Expensive... by PoiBoy · · Score: 4, Informative

    Actually, I've seen interviews in some business magazines with their CEO. In fact, they are slightly profitable and have been for a few years.

    --
    Sig (appended to the end of comments you post, 120 chars)
  7. Re:Google enters this market at the right time by jeffehobbs · · Score: 3, Informative

    Google searches .doc files.

    http://www.google.com/help/faq_filetypes.html

    1. What file types are returned in a Google search? There are 12 main file types searched by Google in addition to standard web formatted documents in HTML. The most common formats are PDF, PostScript, and Microsoft Office formats:

    Adobe Portable Document Format (pdf)

    Adobe PostScript (ps)

    Lotus 1-2-3 (wk1, wk2, wk3, wk4, wk5, wki, wks, wku)

    Lotus WordPro (lwp)

    MacWrite (mw)

    Microsoft Excel (xls)

    Microsoft PowerPoint (ppt)

    Microsoft Word (doc)

    Microsoft Works (wks, wps, wdb)

    Microsoft Write (wri)

    Rich Text Format (rtf)

    Text (ans, txt)

    ~jeff

  8. Re:Didn't we know this all along? by neonstz · · Score: 3, Informative

    If you had read the entire article, you would know that there are two versions for sale: a small $20k box that can index up to 150,000 documents, and a "millions of millions" version that costs $250k.

    If a large company puts out all the revisions of all their documents it will be quite a lot of documents :). $250k is still quite cheap for something that will index all electronic documents the company has ever produced.

  9. Re:Search engine by gorilla · · Score: 3, Informative
    What a horrible script.

    No taint checking (what happens if 'q' contains ";rm -rf /;"?).

    No warnings.

    No proper formatting of HTML in the output. If the grep matches "", then it's not going to display anything in Netscape. You need to either strip tags, or force tag matches.

  10. Re:Ouch. Try HTDIG. by ghutchis · · Score: 3, Informative

    Actually, saying it doesn't have all the fancy matching algorithms isn't really fair.

    Granted, we can't implement Google's patented things, but that's not to say we don't come close.

    Indexing the text of links to documents? Yes.
    http://www.htdig.org/attrs.html#description_factor

    Keeping track of the weight of links pointing to a document? Yes.
    http://www.htdig.org/attrs.html#backlink_factor

    Probably the big "missing link" is a proximity weighting. Interested? Help is always welcome!

    -Geoff
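
The objections raised in comment 9 (a query parameter that reaches the shell unchecked, and grep output printed as raw HTML) can be illustrated with a minimal sketch. The script being criticized is not shown in this thread, so everything below is an assumption: the function names `search_lines` and `render_results`, the `*.txt` file pattern, and the choice of Python. Python has no taint mode, so instead of taint checking the sketch never invokes a shell at all, and it escapes every piece of text before it goes into the page.

```python
import html
from pathlib import Path


def search_lines(query, root):
    """Return (path, line) pairs whose line contains query, ignoring case.

    The query is only ever compared as a string. It is never handed to a
    shell, so input such as ';rm -rf /;' is just an unusual search term.
    """
    needle = query.lower()
    hits = []
    if not needle:
        return hits  # an empty query matches nothing rather than everything
    # Hypothetical layout: plain-text documents somewhere under root.
    for path in sorted(Path(root).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            if needle in line.lower():
                hits.append((str(path), line))
    return hits


def render_results(query, hits):
    """Build an HTML fragment, escaping the query and every matched line."""
    items = "\n".join(
        f"<li>{html.escape(path)}: {html.escape(line)}</li>"
        for path, line in hits
    )
    return f"<p>Results for {html.escape(query)}</p>\n<ul>\n{items}\n</ul>"
```

Escaping is used here rather than stripping tags, so a matched line that contains markup shows up as visible text instead of being swallowed by the browser. Stripping tags, as the comment also suggests, would be an equally reasonable choice.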