Slashdot Mirror


Google's Search Appliance

An anonymous reader noted that Google is working on a Search Engine that you can install behind your corporate firewall for indexing your internal documents. It's a bit thin on information, but it looks like for as little (cough) as $20k, you can have your own google box. Not for everyone obviously ;)

250 comments

  1. Oh now come on by yobbo · · Score: 5, Funny

    People don't have THAT much pr0n do they?! :)

  2. Possibly very good... by larien · · Score: 5, Interesting
    Certainly I'd see value as a user of a huge corporate intranet. Several times I've wanted to find information on some of our internal pages which, of course, I can't use google.com for because of the firewall. While there is an internal search engine, its results can be less than stellar and I've missed Google.

    Aside from anything else, it gives Google a revenue stream so they can continue to provide their services (web, image and usenet searches) for free; they need to find a valid business model, and hopefully this can contribute.

    1. Re:Possibly very good... by uberman · · Score: 2, Interesting

      My former employer is currently in the process of evaluating a '4-node' 'Google-Box', very neat hardware, essentially a mini-rack about 2.5 feet high, with 4 1U rack servers (presumably a mini linux cluster), storage, and a UPS.

      The selling point for them:

      As a governmental organization, regulations stipulate they must be able to provide online content to the RCMP upon request, so it must be hosted on-site. As I'm sure most corporations have similar guidelines, this could be a big cash cow for google at some point.

      Google's top notch search technology, now on-site? Sign me up!

      uberman

    2. Re:Possibly very good... by sid_vicious · · Score: 2

      Aside from anything else, it gives Google a revenue stream ... they need to find a valid business model ...

      Google's "sponsored links" seem like a valid business model to me. Search on something generic like computers and you'll see pastel links pop up with advertisements. I imagine people pay a nice chunk of change for those.

      --
      If it ain't broke, it doesn't have enough features yet.
    3. Re:Possibly very good... by leviramsey · · Score: 5, Insightful
      Google's "sponsored links" seem like a valid business model to me. Search on something generic like computers [google.com] and you'll see pastel links pop up with advertisements. I imagine people pay a nice chunk of change for those.

      Google runs on two business models: the Sponsored Links model (and the Google Sponsored Links are much more effective than any other online advertising out there) and the sale of search services (to Yahoo!, Washington Post, et al).

      Fact is, Google's already profitable. Why? Because they didn't make the moronic mistakes that the other dot-coms did. Have you seen a Google Super Bowl ad? Have you seen a Google ad anywhere? Exactly. The Google model is, quite simply, you run a lean and mean ship that gets the job done well, and you make money.

    4. Re:Possibly very good... by garcia · · Score: 2

      I don't know why /. doesn't cache them already. It's not like it would be that difficult.

      IIRC /. was one of the few sites I could actually reach during 9/11, it would make a lot more sense for them to implement this themselves to save other sites from the destruction of /. readers :)

      Just my worthless .02

    5. Re:Possibly very good... by Spankophile · · Score: 2

      > the Google Sponsored Links are much more effective than any other online advertising out there

      I guess you have some data to back that up? Why are Google's ads better than others? Because they annoy you less? When's the last time _you_ clicked on a Google sponsor because of their compelling attraction?

      > Fact is, Google's already profitable.

      I guess you know that from their public financial statements, right? (sarcasm) Or maybe because you're on the board? Hmm, didn't think so.

      So, aside from being a Google fan-boy (of which I am one myself), where do you get these wonderfully objective facts?

    6. Re:Possibly very good... by morgus+morphus · · Score: 1

      Actually, I did click on quite a few earlier on while I was checking out potential hosts for a website :)

      Of course you're not going to click on them if you're not going to buy something.

    7. Re:Possibly very good... by SpaceLifeForm · · Score: 1

      I think you missed the point.
      This is to have your own search engine for your intranet, that is not publicly viewable over the Internet. Google's public search engine can't see the documents in question. Many organizations have huge intranets with millions of documents.

      --
      You are being MICROattacked, from various angles, in a SOFT manner.
    8. Re:Possibly very good... by garcia · · Score: 2

      I think you missed my point. Whatever you were talking about has nothing to do w/what I said.

      I want /. to cache all the documents that they post on the main page so that we don't /. effect the sites.

    9. Re:Possibly very good... by ipfwadm · · Score: 1

      I think you missed the original poster's point. Whatever you were talking about had nothing to do w/ what he said. Perhaps you were trying to reply to this comment?

    10. Re:Possibly very good... by silicon_synapse · · Score: 1

      They can't do that because of copyright issues. It would be a good idea to give them and their ISP some advance warning though and offer to cache. Of course then the bean counters might complain about unnecessary bandwidth usage.

    11. Re:Possibly very good... by carlos_benj · · Score: 1

      Seems I've read an article within the last couple of months that touted customer testimonials regarding Google's ads producing more revenue than other internet ads (perhaps as a ratio to expense of the ad because I think the same article talked about the relatively low cost of the Google ads). Either the same article or one within that timeframe mentioned Google's profit picture but I don't recall if they were already profitable or nearing profitability. Of course this is all hearsay from an unnamed source, but maybe it's enough to trigger the more industrious among us to ferret out the details.

      --

      As a matter of fact, I am a lawyer. But I play an actor on TV.

    12. Re:Possibly very good... by jedrek · · Score: 5, Interesting

      Well... we had a 6% click-thru rate on our test run of 10,000, which cost us a whopping $110. I don't think that's too bad.

    13. Re:Possibly very good... by laserjet · · Score: 2

      Exactly. In rebuttal to the thread's parent, Google doesn't need to "find" a valid business model. They have one, and have had one for quite some time. Google is a profitable company (albeit a private one). They make money. If you make money, that is a valid business model.

      --
      Moon Macrosystems. Sun's biggest competitor.
    14. Re:Possibly very good... by jesser · · Score: 3, Insightful

      When's the last time _you_ clicked on a google sponsor because of their compelling attraction?

      Google's ads tend to be relevant to what I'm searching for, so I click on them often.

      Last summer I looked up filk music after seeing something about a "space-themed filk concert featuring Kathy Mar and..." at Stanford the day before the Mars Society convention. I searched for filk, and there was an ad to download some of Kathy Mar's music from mp3.com! I listened to what mp3.com had and then went to the concert. During the concert, I met Kathy and also met the guy who put the ad up.

      Oh, did you mean "What was the last time I bought something through Google adwords"? I haven't yet, but I am now a filk fan and plan to buy Prometheus Music's Space CD when it comes out. (Kathy's CD, which I didn't buy, is also a Prometheus CD.)

      I also ran $50 worth of ads for my non-revenue-generating bookmarklets site because I thought it would be a cool way to give Google money. I don't know how many people run ads without the intent of making money, though.

      --
      The shareholder is always right.
    15. Re:Possibly very good... by Fweeky · · Score: 2

      I think the biggest reason for thinking they're likely to be successful is that they're targeted; if you're looking for something in particular and you get an advert related to it, you're more likely to click on it than you are on $some_random_ignorable_banner.

    16. Re:Possibly very good... by eh2o · · Score: 1

      One simple reason why Google ads are better:

      they are all text... which means that the browser is actually more likely to see them, because they can't be blocked as easily as banners.

    17. Re:Possibly very good... by Jebediah21 · · Score: 2

      I think a single, good commercial could help Google. Likewise a few print ads in magazines (not computer ones tho!) or newspapers would help spread Google to new Internet users.

      As long as the commercial isn't stupid or overly expensive to produce I don't see what harm it could do. It wouldn't kill Google, that's for sure. I think what ended up killing other search engines (like AltaVista, HotBot, etc.) were poor search results and terrible strategies. If people want a portal they will use Google or another one. When people just want to search the net a portal with ad banners and pop-ups makes searching a chore.

      --

      Everytime you look at porn a devil gets their horns.
  3. Google enters this market at the right time by hawaiianshirt · · Score: 4, Insightful

    Everywhere you look, companies are hawking products geared for searching internal documents. Google is making a good move; enter an expanding market as an established leader in searching.

    --
    hawaiianshirt
    1. Re:Google enters this market at the right time by rm-r · · Score: 2, Interesting

      I'm surprised it's taken Google so long to get in on the act; after all, Northern Light got into this just recently as well (can't be bothered to search for the old /. link to the story right now though).

      Three years ago I was involved in implementing a similar box, from Excalibur Technologies, for the company I was working for during my university gap year (it was there that I first started reading /. too ;-) The company was a massive multinational ex-British state owned utility and wanted to be able, amongst other things, to have every single company document on the network and to have a database of all staff and their skillsets, so that as relevant business units were formed managers could place staff already on the books rather than get contractors in. The system sold for several hundred thousand pounds, so there's plenty of money in it even if it's only the big companies who are going to really need this kind of thing.

      Judging from the website Google clearly have some fantastic technology, and they certainly have the reputation, they should do very well.

      --

      J-aims
      --
      Yo, whatever happened to peas? Join T( H)GS
    2. Re:Google enters this market at the right time by uebernewby · · Score: 2, Flamebait

      The only problem with this that I can see is that most internal documents a company would be interested in aren't HTML documents that link to each other. So how are they going to page rank thousands upon thousands of stand alone .DOC files?

      --

      News and bla for computer musicians: http://lomechanik.net/
    3. Re:Google enters this market at the right time by Anonymous Coward · · Score: 1, Informative

      Google already indexes PDF documents, and extracting text from a Word document isn't particularly hard. They could either treat it as a text file, use reverse-engineered file layout information, or license dewording technology from MS.

    4. Re:Google enters this market at the right time by jeffehobbs · · Score: 3, Informative

      Google searches .doc files.

      http://www.google.com/help/faq_filetypes.html

      1. What file types are returned in a Google search? There are 12 main file types searched by Google in addition to standard web formatted documents in HTML. The most common formats are PDF, PostScript, Microsoft Office formats:

      • Adobe Portable Document Format (pdf)
      • Adobe PostScript (ps)
      • Lotus 1-2-3 (wk1, wk2, wk3, wk4, wk5, wki, wks, wku)
      • Lotus WordPro (lwp)
      • MacWrite (mw)
      • Microsoft Excel (xls)
      • Microsoft PowerPoint (ppt)
      • Microsoft Word (doc)
      • Microsoft Works (wks, wps, wdb)
      • Microsoft Write (wri)
      • Rich Text Format (rtf)
      • Text (ans, txt)

      ~jeff

    5. Re:Google enters this market at the right time by uebernewby · · Score: 2

      I know, but I was talking about the page rank feature that makes sure you get relevant results when you do a search, where a site/document gets a higher rank if more sites/documents link to it. .DOC files or PDFs don't link to anything, so I can't see how using Google technology to index a corporate intranet with tons of these files would be more useful than an ordinary "flat" keyword search engine.

      --

      News and bla for computer musicians: http://lomechanik.net/
    6. Re:Google enters this market at the right time by Hallow · · Score: 2, Informative

      You probably haven't used Acrobat or Word for awhile. They both can contain links.

    7. Re:Google enters this market at the right time by uebernewby · · Score: 2

      I know, but the point is, they usually don't.

      --

      News and bla for computer musicians: http://lomechanik.net/
    8. Re:Google enters this market at the right time by laserjet · · Score: 2

      No, I've got Word 2.0c right here and it does not do links. Sorry, buddy. And, I've got the most up to date version of DOS installed too, so you obviously don't know what you are talking about.

      --
      Moon Macrosystems. Sun's biggest competitor.
    9. Re:Google enters this market at the right time by Cato · · Score: 2

      Page ranking mainly works by finding documents that are linked *to*, and therefore are popular, authoritative, or whatever. As long as the intranet sites have links to .DOC files etc, Google should continue to work OK - admittedly, there won't be links between the DOC files, but that's just another reason to convert them to HTML, where they can perhaps be auto-linked based on keywords etc.
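
      To make the "linked *to*" idea concrete, here is a minimal sketch of the simplified, publicly described PageRank iteration. It is an illustration only: nothing in this thread says how the appliance actually ranks documents, and the `pagerank` function, the damping value of 0.85, and the tiny `intranet` link map are all hypothetical. It shows a .DOC file picking up rank purely from HTML pages that link to it, even though the .DOC files link to nothing.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each document to the documents it links to."""
    docs = set(links)
    for targets in links.values():
        docs.update(targets)
    docs = sorted(docs)  # fixed order keeps the arithmetic reproducible
    n = len(docs)
    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        # Every document starts each round with the "random jump" share.
        new_rank = {d: (1.0 - damping) / n for d in docs}
        for d in docs:
            targets = links.get(d, [])
            if targets:
                # A document splits its rank evenly among its outbound links.
                share = damping * rank[d] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A document with no links (e.g. a stand-alone .DOC)
                # spreads its rank evenly over every document.
                share = damping * rank[d] / n
                for t in docs:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical intranet: two HTML pages link to .DOC files that link to nothing.
intranet = {
    "index.html": ["policies.doc", "handbook.doc"],
    "hr.html": ["handbook.doc"],
    "handbook.doc": [],
    "policies.doc": [],
    "orphan.doc": [],
}
ranks = pagerank(intranet)
for doc in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{doc:15s} {ranks[doc]:.3f}")
```

      In this toy graph handbook.doc (two inbound links) outranks policies.doc (one), which outranks orphan.doc (none). A document nobody links to gets only the baseline share, which is the case the other posters are worried about.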

    10. Re:Google enters this market at the right time by uebernewby · · Score: 2

      I understand that. I'm just saying you'll be hard pressed to find any company intranet where most documents worth indexing aren't primarily in .DOC format or somesuch. Converting all those documents to X/HTML and creating meaningful links (!= auto-linked based on keywords) just so you can use Google doesn't make sense.

      --

      News and bla for computer musicians: http://lomechanik.net/
    11. Re:Google enters this market at the right time by shogun · · Score: 2

      dewording technology

      I think we have a new buzzword there.

    12. Re:Google enters this market at the right time by Anonymous Coward · · Score: 0

      On an intranet, Google's search is as powerful as grep.

      The not-so-secret sauce for Google is their page rank algorithm, and this depends on the *very* *rich* *large* collection of internet documents.

      In typical corporate intranets (every one I've seen, across many industries), most of the documents are sitting in an NT filesystem or a document management system and have no links that would help in ranking results.

      Without links, Google = grep.

      --Pat zippy@cs.brandeis.edu

      P.S. Full disclosure - I work for a competitor of Google's.

  4. hmm. by raindog151 · · Score: 5, Funny

    will it also index employee email?

    Searched the intranet for 'herbal viagra'.
    Results 1-10 of about 1,279,500. Search took 0.14 seconds.

    --
    your jesus is another mans xebu. chew on that hypocrites.
  5. Search engine by blibbleblobble · · Score: 1

    Or you could write a 10-line perl script to index the titles of all your documents. Then maybe another 10-line perl script to do searches on it.

    It does sound quite useful actually. If you have any serious amount of information to categorise (spy agencies, perhaps?)
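
    As a rough illustration of the "index the titles, then search the index" idea, here is a minimal sketch. It is in Python rather than Perl, and the `build_title_index` and `search` helpers and the sample paths and titles are hypothetical; it assumes the titles have already been extracted from the documents.

```python
from collections import defaultdict


def build_title_index(titles):
    """Map each lowercase title word to the set of paths whose title contains it."""
    index = defaultdict(set)
    for path, title in titles.items():
        for word in title.lower().split():
            index[word].add(path)
    return index


def search(index, query):
    """Return the paths whose titles contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical documents: path -> title.
titles = {
    "/docs/q3.txt": "Quarterly Sales Report",
    "/docs/hr.txt": "Vacation Policy",
    "/docs/q4.txt": "Quarterly Budget Forecast",
}
index = build_title_index(titles)
print(sorted(search(index, "quarterly budget")))
```

    This only matches whole words in titles, with no ranking at all, so it is closer to a lookup table than to a search engine.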

    1. Re:Search engine by cjsteele · · Score: 2, Funny

      Actually, this is like a 10-line script if you can use `grep`... something like...

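      #!/usr/bin/perl
      use CGI qw(:standard);             # import param() and header()
      $query = param('q');               # search term from the query string
      $document_root = "/home/";
      print header;                      # emit the Content-Type header
      print "<html><body>";
      foreach (`grep -r $query $document_root`)   # one output line per match
      {
      print "<li>$_</li>\n";
      }
      print "</body></html>";
      exit(0);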

      --
      "This above all, to thine own self be true" :x!
    2. Re:Search engine by Anonymous Coward · · Score: 0

      While that might work for your own little site, it just doesn't scale.

    3. Re:Search engine by gorilla · · Score: 3, Informative
      What a horrible script.

      No taint checking. (What happens if 'q' contains ";rm -rf /;"?)

      No warnings.

      No proper formatting of HTML on the output. If the grep matches "", then it's not going to display anything on Netscape. You need to either strip tags, or force tag matches.
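
      For comparison, here is a minimal sketch of the same idea with those two problems addressed. It is in Python rather than Perl and is not a drop-in replacement for the script above; `search_files` and `render_results` are hypothetical names. The query never reaches a shell, so ";rm -rf /;" is just a string to look for, and every path and matched line is HTML-escaped before it is printed.

```python
import html
import os


def search_files(root, query):
    """Yield (path, line) pairs for lines under `root` that contain `query`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for line in handle:
                        if query in line:  # plain substring test, no shell
                            yield path, line.rstrip("\n")
            except OSError:
                continue  # skip files that cannot be read


def render_results(matches):
    """Build an HTML page, escaping file names and matched text."""
    items = [
        "<li>{}: {}</li>".format(html.escape(path), html.escape(line))
        for path, line in matches
    ]
    return "<html><body><ul>\n{}\n</ul></body></html>".format("\n".join(items))
```

      Calling `render_results(search_files("/home/", query))` returns the page as a string. It still scans every file on each request, so the scaling complaint above applies just as much.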

    4. Re:Search engine by cjsteele · · Score: 1

      Indeed, this is not a production script; it fails to check for tainted input, and furthermore makes no provision for malicious intent in the names of the files (a cross-site scripting bug could be to name a file to contain some wild-ass javascript, or something to that effect, thus causing the javascript to be embedded in the output of the script.)

      Please note: this was not meant as a "use this in production, 'cause it's super 31337"; this was a "very basic search engines are indeed this easy." Only an idiot would look at this and say, "he's suggesting we use THAT?!?" Nay, good sir, think again.

      --
      "This above all, to thine own self be true" :x!
  6. Splendid! by johnburton · · Score: 4, Interesting

    I see more of this in the future - if you want a search engine, buy one and put it on the network. If you want a web server, buy one and put it on the network. You want a disk server... Well you get the point.

    As hardware continues to get cheaper and software more expensive as it gets more complex it makes sense to do this rather than trying to configure multiple applications all on the same server.

    And good luck to Google making money on this so they can keep their search engine fast and free of annoying advertisements.

    --
    Sig is taking a break!
    1. Re:Splendid! by seanadams.com · · Score: 2

      That sounds great, until you take a step back and look at all the *crap* that people have tried to sell this way. Most of these products are just cheap PCs running a free UNIX, a little bit of other free software like a web server/router/firewall/sendmail, and maybe a little web config tool to help you set it up. I've seen products like this sold for $30K or more! FYI the shunra is a horrible network simulator product that I evaluated at my last company - we ended up building something way better for $0 plus the cost of a PC, using FreeBSD and DummyNet. Look at all those lame-ass NAS boxes which cost $1500 and up. Why would I want to pay that kind of markup for the simplicity of setup, when the box is so severely crippled compared to a cheap PC? Unfortunately not everyone realizes how easy it is to do this stuff themselves, so there will always be a market for garbage like this.

      Now, there have been a few notable exceptions, and these are only the ones where the value of the software far exceeds that of the hardware needed to run it. This googlebox sounds like one of them. Another PC-based Internet appliance that is almost worth the $$$$ is Cobalt's Qube and Raq products - I wouldn't buy one myself because I know how to set up all that stuff w/o a pretty web UI, but I've heard great things from people who have purchased them.

      It's just too easy to get ripped off buying these appliances.

    2. Re:Splendid! by eggboard · · Score: 1

      Folks, Google is making scads of money selling search service to business. Go to Cisco and other sites, and you see "powered by Google." They make a lot of money off this service because, even as an outsourced service, they can save hundreds of thousands of dollars a year in staff, maintenance, and software licenses and development from in-house search engine deployment.

      The Google in a box service is just an extension of their existing business service. So while it's a great thing, it's another tool. They're already nearly or entirely profitable, apparently, between ads and their business search service.

      --
      Freelance tech journalist for the Economist, MIT Technology Review, Macworld, and others
  7. Search? by qurob · · Score: 1, Funny

    find | grep missingdata

    Ctrl-Esc, F, 'missing data'

    1. Re:Search? by Anonymous Coward · · Score: 0

      Not to quibble, but:

      find ./ -exec grep "missingdata" {} \; -print

    2. Re:Search? by Anonymous Coward · · Score: 0

      Faster alternative:

      Winkey-F 'missingdata'

  8. Looking for a good internal search engine by egburr · · Score: 3, Interesting
    I've been looking (when not otherwise distracted) for a good search engine for my documents on my home network, on a linux server. So far, I haven't found anything I've liked (or that even seemed to work very well).

    I would like to find a search engine that will index:

    • text files
    • html files
    • PDF files
    • names of binary files
    Unfortunately, I am not able to spend much to purchase such a search engine (say $20, not $20K). This would be for my personal use, not for any kind of commercial use, and would not be funded except by my anemic hobby budget.

    Does anybody have any recommendations?

    --

    Edward Burr
    Having a smoking section in a restaurant is like having a peeing section in a swimming pool.
    1. Re:Looking for a good internal search engine by Anonymous Coward · · Score: 0

      I have a suggestion. Upgrade to Windows 2k and use the indexing services.

      Bill.

    2. Re:Looking for a good internal search engine by pere · · Score: 3, Informative

      Try http://www.mnogosearch.org

      Brilliant search engine. It has parsers for most file formats (you can use pdf2txt to index your pdf files). It even indexes your mp3's if you should happen to have some on your local net.

      Free (at least as in beer) for Unix. Binaries for Windows cost between $99 and $699.

    3. Re:Looking for a good internal search engine by richieb · · Score: 5, Informative
      Try htDig. It does all these things and is free software. I used it on a corporate intranet in the past. Not as good as Google, but you can't argue with the price.

      --
      ...richie - It is a good day to code.
    4. Re:Looking for a good internal search engine by NewbieSpaz · · Score: 2, Informative

      try ht://Dig. It's free and works with *nix. Info about pdf indexing is here: http://www.htdig.org/FAQ.html#q4.9
      It's a good solution for a small to medium sized website. If you run Linux, it might be on your install CD's, or might be installed already.

      --
      ------
      Random, useless fact: I type in startx entirely with my left hand.
    5. Re:Looking for a good internal search engine by fliplap · · Score: 2

      You actually gave me a very good idea that I think the community could benefit from, because I'm not positive that what you're looking for is just an internal website search engine. My guess is that you're looking for something to search all documents in all directories (all readable by you, anyway) on your local network.
      I can imagine this wouldn't be a tough task if you created a modified 'locate' command in perl with an updated updatedb script that would check for text files (cat those - store results in SQL database), strip the tags off html docs (SQL those results), pdf2txt your pdf files and just store the names of binaries. Heck, you could even run "strings" on binaries if you were so inclined and store the results.
      Of course this would be much more disk and processor intensive than your typical updatedb, so you might only run it, say, once a month, or once every 2 weeks. But it could be a real life saver. The best thing to do would be to have one SQL server, with a cgi frontend, so you could just go to your webserver on your internal network, type in your query, and the engine would tell you on what machine in what directory you could find the document. I'm actually considering writing this now unless someone else has already done it; please reply if you know of a similar or identical system.
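
      A minimal sketch of the storage and lookup half of that idea, under stated assumptions: it uses Python and an in-memory SQLite database as a stand-in for the perl script and SQL server described above, and the table layout, the `add_document` and `find_documents` helpers, and the sample hosts and paths are all hypothetical. Text extraction (stripping tags, converting PDFs) is assumed to have happened before `add_document` is called.

```python
import sqlite3


def create_index(conn):
    """Create the table that plays the role of the updatedb database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "host TEXT, path TEXT, content TEXT, PRIMARY KEY (host, path))"
    )


def add_document(conn, host, path, content):
    """Store (or refresh) the extracted text of one file."""
    conn.execute(
        "INSERT OR REPLACE INTO documents (host, path, content) VALUES (?, ?, ?)",
        (host, path, content),
    )


def find_documents(conn, term):
    """Return (host, path) pairs whose stored text contains `term`."""
    # Note: % and _ inside `term` act as LIKE wildcards in this sketch.
    rows = conn.execute(
        "SELECT host, path FROM documents WHERE content LIKE ? ORDER BY host, path",
        ("%" + term + "%",),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
create_index(conn)
add_document(conn, "fileserver", "/home/ed/notes.txt", "grocery list and tax notes")
add_document(conn, "laptop", "/home/ed/report.html", "quarterly tax report")
add_document(conn, "laptop", "/usr/bin/ls", "")  # binary: only its name is stored
print(find_documents(conn, "tax"))
```

      A LIKE scan reads every row, so this answers "which machine, which directory" but would need a real full-text index to stay fast on a large collection.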

    6. Re:Looking for a good internal search engine by Anonymous Coward · · Score: 0
    7. Re:Looking for a good internal search engine by ethereal · · Score: 1
      --

      Your right to not believe: Americans United for Separation of Church and

    8. Re:Looking for a good internal search engine by Anonymous Coward · · Score: 0

      mnogosearch

      It's pretty nifty

    9. Re:Looking for a good internal search engine by mvpll · · Score: 1

      I've started to create one, using MySQL, Zope and a browser front end. At present it suits my purposes better to interact with it using Python scripts but it is nice to have "one-click" search results in the browser. I don't think I'm going to get what I ultimately want using the above (heaps of issues, security of course being a main one, and maintaining the current state being another), but it works well for a "proof of concept".

      It also is not likely to become a neat little "install and use" package, but that is one of my goals for it. For example there are tradeoffs between harddisk space and search time depending on how much indexing you do, presenting this in an easy to tune manner would be no small task.

    10. Re:Looking for a good internal search engine by ziad · · Score: 1

      You also might want to look into Lucene, it has very powerful full text search and index capabilities.
      Lucene

    11. Re:Looking for a good internal search engine by ghutchis · · Score: 2, Informative


      "Not as good as Google,"
      OK, fair enough. Have some suggestions for how to improve it? Unlike Google, you can tailor all the search weightings in ht://Dig.

      Either general suggestions like "titles should be weighted more" or parameter changes would be quite welcome.

      It's open source, it's yours. So don't you want to see it improve?

      -Geoff

    12. Re:Looking for a good internal search engine by Anonymous Coward · · Score: 0

      I can't believe no one has mentioned Strangesearch!

    13. Re:Looking for a good internal search engine by Anonymous Coward · · Score: 0

      A not-too-expensive commercial product is ISYS by Odyssey Development. http://www.isys.com.au - lots of file formats, can also do mailbox indexing.
      I find it strange that Google is pitching this as a "new thing". I've been using local text indexing for so long that I've still got a DOS version. And the index goes back to 1989, and I can still find documents.

    14. Re:Looking for a good internal search engine by wdavies · · Score: 2

      Try http://jakarta.apache.org/lucene/ The best of the free ones IMHO - written by a very experienced commercial search engine developer (Doug Cutting, Excite).

      It's a little hard to get going with, but it is in Java, and VERY efficient (a commercial search engine was only about 30% faster).
      Winton

    15. Re:Looking for a good internal search engine by shogun · · Score: 2

      Also, if you run a university website, you can have it indexed and searchable by Google via their University Search feature, a free version of their site search service for educational institutions. Both are really just a site-restricted internet search, but you can use your own templates in the search results, which is cool.

  9. Can it solve the office hide and seek games? by KrunZ · · Score: 0, Offtopic

    If it can find my colleagues I will definitely make my boss buy this product.

  10. Why Google Can Be So Expensive... by BTWR · · Score: 5, Insightful

    Google did exactly what us fanboys all whined and complained for - a company that made a good product (awesome search engine) without selling out (no popup ads). Google offered a free service, built up an enormous following, and now offers its premium service for a premium price, while ensuring its loyal customers continued free services. Forget eBay, Google is an Internet-Success-Story worthy of such praise!

    1. Re:Why Google Can Be So Expensive... by Anonymous Coward · · Score: 0

      most "success stories" don't involve bankruptcy. Google is a private company, so figures aren't available, but it would be reasonable to assume they're not profitable.

    2. Re:Why Google Can Be So Expensive... by Rentar · · Score: 2, Funny
      Forget eBay, Google is an Internet-Success-Story worthy of such praise!

      Oh no! By declaring Google an "Internet-Success-Story" you doomed them! They gonna go bankrupt in 3 month or less!

    3. Re:Why Google Can Be So Expensive... by Anonymous Coward · · Score: 0
    4. Re:Why Google Can Be So Expensive... by Beltza · · Score: 1

      Is it really true that Google is taking its next step on its way to even bigger success? Or do they realize that they cannot continue like this forever (giving excellent stuff away for free)?
      The first step is offering additional services for paying customers. After this initial step, too many companies made the mistake of asking money for their basic services and thereby losing their broad support.
      Let's cross our fingers for Google.

    5. Re:Why Google Can Be So Expensive... by PoiBoy · · Score: 4, Informative

      Actually, I've seen interviews in some business magazines with their CEO. In fact, they are slightly profitable and have been for a few years.

      --
      Sig (appended to the end of comments you post, 120 chars)
    6. Re:Why Google Can Be So Expensive... by silicon_synapse · · Score: 1

      Didn't Google IPO a while back on one of the foreign exchanges? Somewhere in Europe I thought.

    7. Re:Why Google Can Be So Expensive... by carlos_benj · · Score: 1

      Or do they realize that they cannot continue like this forever (giving excellent stuff away for free).

      Your budget must be different than mine. Last time I checked, $20k wasn't even close to free. Or maybe Andersen is your accounting firm....

      --

      As a matter of fact, I am a lawyer. But I play an actor on TV.

    8. Re:Why Google Can Be So Expensive... by sumengen · · Score: 1

      How about cashflow?

  11. $20K Isn't really that much if you consider it. by jellomizer · · Score: 5, Insightful

    The companies that are using the appliance are large corporations with hundreds, perhaps thousands, of computers and millions of files and documents to find. The real question is how much money the company is losing from people who have to redo misplaced documents, or make new ones that are similar to another document that someone else made a while back. In a large corporation, if a thousand people earning $20 an hour each take 1 hour to redo a document or spend time finding it, that makes up for the cost. Also, the more money this gives Google, the better the chance that the search engine stays free and without a ton of annoying advertising.

    --
    If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    1. Re:$20K Isn't really that much if you consider it. by juggler314 · · Score: 0

      There's absolutely nothing wrong with charging for a good product. No matter how you cut it nothing is free to develop. Opensource projects might be given away for free, but they still cost much money in the form of time spent. Hopefully google will provide a trial or some sort of free version for those of us without $20K to spend though.

    2. Re:$20K Isn't really that much if you consider it. by Styros · · Score: 3, Interesting

      If you consider the amount of time needed to create a search engine like Google, you'll see that $20k is very cheap. At my company, IT charges our dept $100/hr, so $20k only gives you 200 man-hours. And, that's cheap! In talking with some of my friends, their IT dept charges almost $500/hr, which would only give you 40 man-hours. I'd much rather pay Google for their search engine than get a product from IT that they threw together in 40-200 man hours.

    3. Re:$20K Isn't really that much if you consider it. by Ian+Wolf · · Score: 2

      It's especially cheap when you look at the cost of some other products like Autonomy. I used to work for a company that paid $50K for it, and kept paying and paying and paying. I believe they spent upwards of $100K before it was implemented to their satisfaction.

      --
      "The words of the prophets are written on the Slashdot walls."
    4. Re:$20K Isn't really that much if you consider it. by Fishstick · · Score: 1

      Yep, this appears to be pretty competitive, at least based on the evaluation we did last year on search engines to license and install on our intranet.

      AltaVista is like $50,000 for up to four processors indexing up to 250,000 documents (yes, they charge you based on how much you use it - $25,000 only lets you index up to 50,000 files).

      Inktomi was cheaper, but didn't have a lot of the features of AVS.

      Google wasn't a choice at the time. I wonder what that $20,000 includes (would be nice if it was for unlimited use within your network).

      --

      There is much cruelty in the universe, John.
      Yeah, we seem to have the tour map.

    5. Re:$20K Isn't really that much if you consider it. by Fishstick · · Score: 2

      Ugh, at the risk of getting modded down for replying to my own note, I did actually *gasp* read the article after this post and found the following:

      The product comes in two versions; one that sells for $20,000 and scales to search up to 150,000 documents and a more powerful version for $250,000, which Google says can scan "millions and millions" of documents.

      But that supposedly includes hardware so it still sounds like a good deal.

      --

      There is much cruelty in the universe, John.
      Yeah, we seem to have the tour map.
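
      For a rough comparison of the tiers mentioned in these two comments, here is a small sketch of price per document at full capacity. The prices are the approximate ones quoted in the thread, not official list prices, and the Google figure reportedly includes hardware, so the comparison is only indicative. The $250,000 tier is left out because no document limit is given for it.

```python
# Price per indexable document at each quoted tier, assuming the full
# document allowance is used. Figures are the ones quoted in this thread.
tiers = {
    "AltaVista, 50,000 docs": (25_000, 50_000),
    "AltaVista, 250,000 docs": (50_000, 250_000),
    "Google, 150,000 docs": (20_000, 150_000),
}


def cost_per_document(price, max_documents):
    """Dollars per document when the tier is filled to capacity."""
    return price / max_documents


per_doc = {name: cost_per_document(*tier) for name, tier in tiers.items()}
for name, cost in per_doc.items():
    print(f"{name}: ${cost:.2f} per document")
```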

    6. Re:$20K Isn't really that much if you consider it. by Zagadka · · Score: 2

      Hopefully google will provide a trial or some sort of free version for those of us without $20K to spend though.

      They already do. If you go to www.google.com they'll let you search the web for free.

    7. Re:$20K Isn't really that much if you consider it. by juggler314 · · Score: 0

      true, now all I have to do is build a tunnel through my firewall and point google.com at it...

    8. Re:$20K Isn't really that much if you consider it. by LoseNotLooseGuy · · Score: 2, Informative

      The real question is how much money is the company loosing from people who have to redo misplaced documents

      I find it difficult to believe that the company would be capable of "letting loose or releasing" money from people--that would be tantamount to theft. However, it is possible that the company would fail to obtain money from these people. The word you were looking for is losing.

      Congratulations! You have been participant #27 in my campaign to rid Slashdot of this error.

      --
      Proudly correcting Slashdot's most irritating linguistic error since 2002.
  12. Advertising?? by DeadVulcan · · Score: 1

    The article says: "Google's core consumer search business is free and is funded largely by advertising."

    But where? Has anyone seen this advertising? It's certainly not in the form of banners... I've always wondered how Google supports itself.

    --
    Accountability on the heads of the powerful.
    Power in the hands of the accountable.
    1. Re:Advertising?? by headchimp · · Score: 0

      Companies pay for the Sponsored links section in google

    2. Re:Advertising?? by djmurdoch · · Score: 2, Informative

      Go to Google, search for "google advertising", and you'll get this page near the top of the search results. Basically, they're selling people "sponsored links".

    3. Re:Advertising?? by juggler314 · · Score: 0

      The advertising they speak of is in the top line responses that always come back (for instance - search for any reviews etc and you'll always get a few links to meta pages devoted to deals). Then there are also the highlighted (and usually more relevant than the top line links) links on the right hand side of the page.

    4. Re:Advertising?? by Anonymous Coward · · Score: 0

      Do a google search for something like "long distance" or "apartment search".

    5. Re:Advertising?? by greenrd · · Score: 1
      It's text ads. Search for "object databases" or "sex". Then look carefully.

    6. Re:Advertising?? by IainHere · · Score: 1

      Try searching for "jobs" - wherever you are in the world, you're guaranteed to get an advert for a work agency just above the search results. If you cannot see it, try searching for "glasses"

    7. Re:Advertising?? by Mr.+Slippery · · Score: 1
      Has anyone seen this advertising? It's certainly not in the form of banners...

      If you search on certain words or phrases, there'll be things labeled "sponsored links". For example, if you search on "advertising", four sponsored links related to the topic show up. Search on "foo bar baz", though, and none show up.

      --
      Tom Swiss | the infamous tms | my blog
      You cannot wash away blood with blood
    8. Re:Advertising?? by DeltaStorm · · Score: 1

      They use text advertisements. Entering a phrase like "domain" or "digital camera" will result in sponsored links above and to the right of the normal search results.

      --
      .sdrawkcab si gis siht
    9. Re:Advertising?? by NickisGod.com · · Score: 2, Insightful

      Well you see, Google has half a brain.

      "Hmm...if somebody's searching for domain registration, let's offer text ads about domain registration. Then, they won't be pissed about downloading goofy banner/javascripts and they may actually click on the ad because it *is* useful."

      Almost makes sense--but then you can't shoot the monkey.

      Seriously though, I've clicked on Google ads numerous times because they're relevant.

  13. a new wave... by cjsteele · · Score: 1

    I think you're going to see a LOT more of this type of 'appliance' in the future... with the ever growing masses of information exploding off the desktops in every office around the country, we are quickly approaching critical mass -- we need better ways of tracking and managing data. I think Scott Adams mocks this as a "Knowledge Management" line of thinking, but it's true.

    I've dealt with this at the last three employers I've been at, and no one seems to have a good solution.

    The CIA is using Northern Lights for their document managing, so is the FBI going to take up the Google-cross?

    --
    "This above all, to thine own self be true" :x!
  14. Please by headchimp · · Score: 0
    let me know when the first company actually pays for this.

    These days companies are downsizing and cutting back. We just sold off a bunch of extra inventory and were told to power down our computers at night and especially the weekend to save electricity.

    Now when I came in this morning, a bunch of furniture has been hauled out with "sold" signs tagged to them.

  15. Uhm so by motox · · Score: 1

    So they are giving out 10k for a contest and getting 20k for every copy they sell... im not sure i want to help them now :)

    1. Re:Uhm so by Anonymous Coward · · Score: 0

      Oh? And you use their service for how much? For free? The results they've spent loads of money collecting. Oh no, we wouldn't want to help people like that, would we?

    2. Re:Uhm so by motox · · Score: 1

      Good point. :)

  16. They use text banners by ColGraff · · Score: 2

    Little lines of text from advertisers. Sweet, huh?

    --
    I'm the stranger...posting to /.
  17. sidebar ads by mikeee · · Score: 2

    Google sometimes has ads in a sidebar on the right or top. These are targeted based on your search, and are thus usually relevant enough not to be annoying (not to mention being ignorable).

    I find it hard to believe the revenue from those is really significant, but who knows; I bet their clickthrough rates are much better than those damn popup ads.

  18. That's old news. by forged · · Score: 0, Redundant

    Companies such as Cisco and the likes with a huge intranet, have been using Google for some time. Use the search engine on their main page to get the idea.

    To me, it was only a matter of time until they port their technology to simpler environments (home users & smaller corporations) for a fraction of the cost.

    (incidentally I searched for porn and still got 4 results back :)

    1. Re:That's old news. by Anonymous Coward · · Score: 0

      umm, that searches their public files(www) which are already indexed and cached @ google anyways. Anyone can do this. This box is for searching files on an intranet, that is cut off from the internet

  19. article from C|Net here: by mESSDan · · Score: 4, Informative
    From C|Net.

    It's a little more in-depth than the India Times article.

    --

    -- Dan
  20. Is this new? by TechnoLust · · Score: 2, Interesting

    Our corporate intranet has an excite search on it, and the intranet is not accessible from the net. I doubt they would have paid $20k for it either. Does anyone else have something like this, because I was under the impression it was common to have an internal search engine?

    --
    "Da ist ein Technölüst in mein Unterpanten!"
    1. Re:Is this new? by coug_ · · Score: 1

      It's not new for companies to have their own search engines. I've had some experience with setting up Excite search engines on web sites. The only thing new here is that it's Google's search engine that's now being sold to corporations.

      As far as I know, the Excite search engine was free. It probably wasn't nearly as good as Google's, and Excite may have had a better non-free engine.

    2. Re:Is this new? by Radical+Rad · · Score: 2

      Novell has something like this too.
      It comes free with Netware 5.1 called Web Search.
      Maybe not as spiffy as Google but is damn fast, and it also has capability of password protecting sensitive search results.

  21. Quote reminds me of an old joke... by Dick+Click · · Score: 0, Offtopic

    Tom: "Can you give me Google in a box?"
    Mary: "Yes, we can."
    Tom: "Well, let him out!"

  22. I go google for googleboxes! by glh · · Score: 1

    Sounds really cool. I think I'd like to get one of these bad boys. I wonder how the details will work from an implementation perspective.. will one have to put all of their documents on one file server, or will it span multiple machines?

    Also, how will it detect relevance? I'm pretty sure right now Google analyzes hyperlinks as part of its relevance algorithm... How will that work with internal documents if they aren't hyperlinked? How useful will this thing actually be? I'm sure they will think of something.

  23. Ouch. Try HTDIG. by Kozz · · Score: 3, Informative

    Yes, quite CLEARLY it's only for those who've got some cash to blow. If you've got a modest-sized Intranet site, I would highly recommend htDig. I've installed and configured it in several places and it works like a charm. Best of all, it's GPLed! Sure, it doesn't have all the fancy matching algorithms used by Google, but it does a damned good job nonetheless.

    --
    I only post comments when someone on the internet is wrong.
  24. Quick Indexing by Mattygfunk · · Score: 2, Insightful

    I could see one of the advantages that this would have is the ability to index pages/emails/whatever very quickly. No need for the wait that accompanies a index request on a web search engine because the spider will be around every hour or less in an intranet.

  25. Corporate search engines by Alomex · · Score: 2, Interesting

    Surprisingly few corporations are willing to spend money indexing their internal document set, as other search engine companies discovered.

    Excite, Altavista, HotBot, Lycos all at one time or another tried to sell to the corporate market with little success. So either things have changed since, or Google management is repeating an old mistake from other companies...

    Moreover, companies such as Verity which specialize in corporate search engines have reported falling revenues as of late...

    1. Re:Corporate search engines by gorilla · · Score: 2

      I think that a heck of a lot of companies know that they don't have much on their internal networks which is actually worth searching.

  26. We're using it here...it rocks! by HRH+King+Lerxst · · Score: 4, Informative

    They just implemented this where I work, it's a vast improvement over what we had before. It even includes the cache and newsgroup features!!

    Two thumbs up!!

    --
    No one got beat up more often than the mimes of the old west!
    1. Re:We're using it here...it rocks! by sam+the+lurker · · Score: 1

      I sooo wish that the Fortune 500 company that I work for had a couple of these.

      Finding anything on the company intranet is next to impossible. The search engine that we have now returns links to documents 2 or 3 years old with URLs on a network that is no longer in service. Arrgg!!

      $20K is an insignificant cost for a large company. Yet I think that (at least where I work and I suspect other large corporations) being able to effectively search the company intranet just doesn't seem all that important to the people making the money spending decisions.

    2. Re:We're using it here...it rocks! by HRH+King+Lerxst · · Score: 1

      Yeah, we were using inktomi, and you couldn't find anything.....things are much better now. I think we have close to 200,000 employees, spread out all over the US and the rest of the world, with lots of data on our intranet.

      It's also nice for internal webmasters, since you can include parameters in your html form that will limit the search to just your host, eliminating the need for me to maintain my own search engine.

      --
      No one got beat up more often than the mimes of the old west!
    3. Re:We're using it here...it rocks! by rainer_d · · Score: 1

      Hehe.
      Do we work at the same company ?
      ;-)
      Our intranet-pages are IE-only and totally overengineered. Nobody really looks at them.
      And I doubt you can index them with a normal "google-bot" - there are no "links" in the
      www-sense because most things come out of a database...
      Just bullshit. And an immense waste of time and money.

      cheers,
      Rainer

      --
      Windows 2000 - from the guys who brought us edlin
    4. Re:We're using it here...it rocks! by selectspec · · Score: 2

      What does it use for storage? Does it have its own drives? Does it talk to a database? Or does it talk to direct attached, SAN or NAS?

      --

      Someone you trust is one of us.

    5. Re:We're using it here...it rocks! by HRH+King+Lerxst · · Score: 1

      Oooh, sorry dunno, not part of that group....not even in the same state as that group.

      --
      No one got beat up more often than the mimes of the old west!
    6. Re:We're using it here...it rocks! by laserjet · · Score: 2

      I like your sig. Made me laugh out loud. Also made me remember what a piece of shit edlin was. :)

      --
      Moon Macrosystems. Sun's biggest competitor.
    7. Re:We're using it here...it rocks! by Error27 · · Score: 2
      500 Gigs of DRAM of course.

    8. Re:We're using it here...it rocks! by Wesley+Felter · · Score: 2

      Given that the specs don't mention any kind of external storage, I'd guess it has internal disks.

  27. Cheaper to beef up... by heretic108 · · Score: 2, Interesting

    ... the ht://dig search engine.

    In this climate of IT layoffs, I reckon it would prove cheaper and better to hire a programmer to take the GPL'ed ht://dig code and hack in some Google-like improvements.

    The major improvement needed is the ability to search on phrases, and to do boolean searches.

    Such a beefed up search/indexing system would not be subject to licensing fees, and would be freely redistributable (say, to other company offices).

    --
    -- In the beginning was the WORD, and the WORD was UNSIGNED, and the main(){} was without form and void...
    1. Re:Cheaper to beef up... by kz45 · · Score: 1

      In this climate of IT layoffs, I reckon it would prove cheaper and better to hire a programmer to take the GPL'ed ht://dig code and hack in some Google-like improvements.

      Only if that programmer is willing to work for less than $20K a year.

    2. Re:Cheaper to beef up... by ghutchis · · Score: 2, Informative

      Nah. Keep in mind that the ht://Dig project has several contributors. A few contributions of code go a long way.

      Keep in mind, though, that ht://Dig already implements many "Google-like" features such as indexing the text of links to documents and keeping track of the backlink count.

      http://www.htdig.org/attrs.html#backlink_factor
      http://www.htdig.org/attrs.html#description_factor

      A proximity weighting would be nice, but there's some work to be done before that.

      -Geoff

  28. Hey, maybe slashdot can get this... by powerlinekid · · Score: 5, Funny

    At least then the search feature would work right and they can finally cache all those sites that we take down.

    --

    can't sleep slashdot will eat me
  29. The GPL (and Go Google!) by base3 · · Score: 4, Insightful
    Google's product selling for $20,000, and being based on Linux, is a good counterweight to the FUD being spread by Microsoft et al that cries "If we write a product that so much as uses one GPL library, we have to GPL it. Waaaaa."

    Unless Google reimplemented their own operating system, or <shudder> ported it to Win2K, they have a very expensive product, that runs on Linux, that is not GPL.

    More power to Google--I'm glad to see them finding a way to make money without trashing their search engine, like happened with the previously good search engines that came before (e.g. Altavista, Lycos).

    --
    One CPU cycle wasted on digital restrictions management is ONE TOO MANY.
    1. Re:The GPL (and Go Google!) by Anonymous Coward · · Score: 0

      The thing is, $20,000 is not truly 'expensive software' - I know a company whose software goes for up to $3 million, and is selling it to large enterprises even in these straitened times. It runs on Unix at present, but it could equally well run on Linux (and probably will at some point). The cost of the hardware and OS is not such a big deal, what matters is whether Linux or some version of Unix is the customer's chosen platform - at these prices, you put the software on what the customer asks for :)

  30. I think this is pointless by gTsiros · · Score: 1

    Why would i need a *search engine* for my internal documents?

    If there is decent hierarchy in how i organize my files i suppose it won't be hard to track down anything without the need for a very heavyweight search engine like google's.

    Say i want to find an mp3. I will look under music/hard_rock/evil_peas/there_it_is.mp3

    maybe it is something i haven't thought of. And my example is very silly, i know.

    --
    Looking for people to chat about multicopters, coding, music. skype: gtsiros
    1. Re:I think this is pointless by Anonymous Coward · · Score: 0

      You must only have your files on a single, small computer. When you get like me, stuff crammed onto 40 servers and about 300 Linux/SunOS machines all over the US and Africa, finding your data gets a bit harder. Our solution is to NFS mount all of the servers and UNIX clients onto a single central server and run updatedb/locate. It only takes about 28 hours to run updatedb ;).
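
      The "mount everything centrally, index once, then search the index" approach described above can be sketched in a few lines. This is only an illustration of the idea, not how updatedb/locate are implemented; `build_index` and `locate` are made-up names, and a temporary directory stands in for the NFS mount points.

```python
import os
import tempfile


def build_index(roots):
    """Walk every mounted root once and return a sorted list of file paths."""
    paths = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def locate(index, needle):
    """Return indexed paths containing the substring, like a bare `locate needle`."""
    return [path for path in index if needle in path]


# Demo on a throwaway directory standing in for the central mount point.
with tempfile.TemporaryDirectory() as mount:
    reports = os.path.join(mount, "server01", "reports")
    os.makedirs(reports)
    open(os.path.join(reports, "q1.txt"), "w").close()
    index = build_index([mount])
    hits = locate(index, "q1")
    print(len(hits))
```

      Searching the prebuilt list is fast; the expensive part is the walk itself, which is why the rebuild takes so long across hundreds of mounted machines. Note that this only matches file names, not file contents.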

  31. I look at it this way.... by penguin_dance · · Score: 1

    If they can make money off their engine, more power to them. Keeping them solvent will keep their public search up and freely running for the rest of us.

    --
    If you've never been modded as "flamebait" or "troll," you've never tried to argue a minority viewpoint here!
  32. Google has betrayed the Open-Source Community by duffbeer703 · · Score: 0, Troll

    If the management of Google were truly enlightened OpenSource evangelists, they would have given away the software for free. Google has built its success on the back of Free Software developers, to the point that it should be called GNU/Google. Using Free and Liberated software to create a commercial monster is offensive and wrong.

    I demand that Google allow Jesse Jackson, ESR and RMS on-site to persuade Google to go GPL and to investigate alleged GPL violations.

    In addition, I call for the formation of the GNUggle project, an entirely Free search engine that runs on GNU/Hurd systems only.

    --
    Conformity is the jailer of freedom and enemy of growth. -JFK
  33. Set your watches by NiftyNews · · Score: 2

    Note the date, gentlemen. If Google is selling wholesale software solutions, the countdown clock to paid searches begins today. I'm betting that in less than a year's time we'll be asked to pay for Google searches. Hopefully by that time someone will have figured out a good system for micropayments.

    Free is wonderful, but free doesn't scale when it comes to indexing the majority of the internet.

    1. Re:Set your watches by HeUnique · · Score: 2

      Look at the posts above - there is a link to a BBC report that said that Google is *already* profitable...

      So, if they're now profitable (actually, for the last 2 quarters), why should they charge money now? Where's the logic?

      Another issue that someone mentioned here - Yes, Alta vista and other companies did try to sell their search engines and have fallen - but google got 2 points:

      1. They're number 1 in search on the net.
      2. Dead easy setup - plug the machines, give IP, and open your browser - from there you just have to setup where to get the data from and let the machines do the job. Nothing more...

      I wish good Luck for google - I always use it (gg: in konqueror)..

      --
      Hetz (Heunique)
    2. Re:Set your watches by VFVTHUNTER · · Score: 2

      Hrmmm. NO.

      Have you used google? They even have a page explaining why their site doesn't have pop-ups (I hadn't realized Yahoo had become such a pop-up pain till I used IE recently - GOD I love Galeon).

      The paid search watch has already gone off, with Yahoo offering premium content at a price. You can put your watch back on GMT now.

      Google are good people. They recognize that the dot com boom is over, and they are pursuing good, honest, value-based business models. Yes they have patents, but they don't patent silly, no-brainer things like the idea of compressing a file before you transmit it. They patent hardcore search algorithms, which they paid a bunch of CS/IT people to develop. You can tell it's not a common-sense patent by the fact that no one has written a mod_google for apache that can rival Google's indexing abilities.

  34. How will page rank work on a corp site? by ajm · · Score: 3, Interesting

    Part of the success of the google technology is based on the page rank system which depends on many people linking to pages and so "ranking" them. On a corporate site you don't have as many separate opinions (i.e. pages managed independently) so perhaps the page rank part of google won't be as successful. OTOH just having fast search of all the docs would be good here :)

    1. Re:How will page rank work on a corp site? by martin-boundary · · Score: 1

      Your average corporate site is probably very hierarchical, with one or more home pages at the top followed by several levels of navigational pages and finally the actual pdf and doc files. So this looks a lot like a tree structure, or the unix filesystem. You can easily tune the pagerank system for such a tree structure, and then the document's importance will be related to how many ways (and how quickly) you can get to it from the corporate homepage. All you do is use the corporate homepage(s) as "seed" pages in the pagerank equation (in google's model, when a websurfer gets "bored", he jumps to a "seed" page). On the internet however, every html document is considered a "seed" page, so the importance measure is a lot more democratic, but requires a lot more linking also.
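
      The seed-page idea above can be sketched as a textbook PageRank power iteration in which the "bored surfer" restarts only at the chosen seed pages. This is an illustration under assumptions, not Google's appliance code: the page names, the damping value of 0.85, and the function name are all made up for the example.

```python
def pagerank(links, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank where the bored surfer jumps only to seed pages.

    links: dict mapping every page to the list of pages it links to.
    seeds: pages the surfer restarts from (e.g. the intranet home page).
    """
    pages = list(links)
    jump = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    rank = dict(jump)  # start the surfer on a seed page
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * jump[p] for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dead end (a PDF with no links): the surfer restarts at a seed.
                for p in pages:
                    new_rank[p] += damping * rank[page] * jump[p]
        rank = new_rank
    return rank


# Hypothetical intranet: a home page, two navigation pages, two documents.
links = {
    "home": ["hr", "eng"],
    "hr": ["handbook.pdf"],
    "eng": ["handbook.pdf", "spec.pdf"],
    "handbook.pdf": [],
    "spec.pdf": [],
}
rank = pagerank(links, seeds={"home"})
for page in sorted(rank, key=rank.get, reverse=True):
    print(f"{page}: {rank[page]:.3f}")
```

      In this toy tree, handbook.pdf is reachable from the home page by two routes and spec.pdf by one, so handbook.pdf ends up with the higher score, which matches the "how many ways you can get to it" intuition in the comment.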

    2. Re:How will page rank work on a corp site? by Anonymous Coward · · Score: 0

      I really think the problem has to do with the fact that there is no feedback. If you look at the pageRank system you find out that the weights are computed iteratively. In short, the weight of one page gets to change if the crawler gets to see the page a second time. So for example you have :
      c:/thuria/big_sat/hughes/hs702.pdf

      if the page rank system were to be applied to this link, then the crawler would get to hs702.pdf but it would not be able to go back up in the tree hence leaving the weight of that page the same it was originally -> defeating the purpose of pagerank.
      So the real question I think Google is facing when selling solutions like this is, what is the feedback mechanism that would allow for pagerank to work? Does anybody have any thoughts on this?

    3. Re:How will page rank work on a corp site? by Anonymous Coward · · Score: 0

      Yes but the "bored surfer" part of this technology is linked to the fact that you have had several iterations of the pagerank algorithm, hence there is no need for further weight definition (it has become accurate enough). The bored web surfer is essentially a mechanism that enables the algorithm to escape from infinite loops.

  35. 150,000 documents isnt that many by digitalsushi · · Score: 1

    It is, but it isnt. I mean I've got... about 16k html files on my one computer. 2 grand to search through them seems like a lot. Then again I'm just a dumb kid with a lot of junk. To a company in the business of information, thats prolly a pretty good deal. Boy, I wish I could make up my mind.

    --
    slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
  36. That's cheap... by Anonymous Coward · · Score: 0

    ...at least compared to AltaVista - and theirs really sucks in comparison. (Speaking as one who has coded to their API)

    I look forward to our license expiring so I can consider a change to google - I can't wait!

  37. Why not? by loraksus · · Score: 2

    It's not that expensive, considering the amount of money a corp wastes every year. If you put it in perspective - it is half of an average worker's yearly salary - and if management thinks it will save that much money over a year. . .
    Companies have private jets so the pres / vp can get wasted while traveling across the country - $20k is nothing.
    Google roxxor! :)

    --
    1q2w3e4r5t6y7u8i9o0pqawsedrftgthyjukilo;p'azsxdcfv gbhnjmk,l.;/
  38. They'll have competition... by Froobly · · Score: 1
    Several years ago, I learned about a context-based search engine that my friend's dad was developing for corporate intranet use. You can find it here, and I think the site describes it far better than I could.

    When I spoke with him, they were wooing some fairly high-profile clients, but I can't rightly say I know where they are right now.

  39. Document management by stinky+wizzleteats · · Score: 4, Interesting

    This has a LOT more business application than appears on the surface. And $20K for such a solution is comparable to paying $50 for Red Hat to run a server.

    Back in my systems integration days, we had very many law firm clients who used document management to organize the truly prodigious quantity of information they had to deal with. Spending $50K on the solution was not unheard of even among small firms. In fact, they usually wound up spending $20K just on third party maintenance utilities to support their document management systems!

  40. Didn't we know this all along? by SplendidIsolatn · · Score: 5, Interesting
    Sorry if this sounds uninformed, but I had always been under the impression that Google's Business Plan was based on the idea of a free public search engine and a commercial private one for companies, which would also offer more and better features.


    Isn't this just confirming what we already knew?


    On top of that, depending on the size of your intranet and how efficient/inefficient indexing already has been, $20K may be a bargain.

    Of course, how many companies are really going to have a use for it? For giggles, let's say the entire Fortune 500. That's 500 * 20K = 10,000 K = 10 Million Dollars US. In the grand scheme of things, that's a lot of money, but not a LOT of money. Perhaps they'll add on pay-per-use functions for even ritzier search features?


    Sigs? We don't need no goddamn sigs!

    --
    sig--we don't need no goddamn sig
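
    The revenue estimate above is easy to reproduce. The sketch below uses the comment's own assumptions (every Fortune 500 company buys exactly one $20,000 box); the function name is made up for illustration.

```python
def total_revenue(customers, unit_price):
    """One-time revenue if every customer buys a single unit."""
    return customers * unit_price


revenue = total_revenue(customers=500, unit_price=20_000)
print(f"${revenue:,}")
```

    Changing either input shows how sensitive the figure is to the customer count and to which version of the box is sold.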
    1. Re:Didn't we know this all along? by travisd · · Score: 3, Insightful

      $20k is just the tip of the iceberg - there's also a good revenue stream to be had in those yearly support contracts for the software.

    2. Re:Didn't we know this all along? by marauder404 · · Score: 1

      Only 500 customers? There's a whole lot more potential than just 500 customers. Any company that has a large number of documents is likely to be interested. You're thinking way too big. Think about all the law firms, research labs, consulting firms, schools, universities, libraries, newspapers, publishers, agencies, accountants, ... The list goes on and on.

    3. Re:Didn't we know this all along? by neonstz · · Score: 3, Informative

      If you read the entire article you would know that there are two versions for sale, one small $20k box which can index up to 150,000 documents, and one "millions and millions" version which costs $250k.

      If a large company puts out all the revisions of all their documents it will be quite a lot of documents :). $250k is still quite cheap for something that will index all electronic documents the company has ever produced.

  41. Like infoseek.... by CDWert · · Score: 3, Interesting

    Years ago Infoseek offered a version of their search engine to index LARGE collections of documents. We had over 500,000. It was around 15k if I remember correctly. Python on a Sparc 20 (20k itself at the time with mem, processors, array and tapes). So we had almost 4k tied up in the whole thing. There was, if I remember correctly, a per site or per page fee in addition over so many documents. I made an error in a config file once and allowed it to traverse links; other than filling the hard drive, quickly, the additional costing we did afterwards to see how much it would be should we decide to keep those docs was hilarious.

    20k isn't bad at all if you're talking some serious indexing. We indexed 5 F500 companies' technical documents at the time, before they were all in house; this was 97-98. It was slick, I often wondered what happened to that software package.

    Anyone know what Google is written in? I decompiled a fair bit of Infoseek's just to see what was what, and because I could :) Indexing LARGE repositories isn't easy and config can be a pain. 20k sounds ok to me. I have YET to see an open source solution that can handle VERY large document sets; there's ASPSeek, but it still has issues, and over about 2.5 million docs I hear it's a dead horse.

    --
    Sig went tro...aahemmm.....fishing........
  42. $20,000??? by Anonymous Coward · · Score: 0

    Okay, the concept is good, but I don't see anyone paying $20k for it. I think we'll see a clone of this on freshmeat.net in about two months.

    1. Re:$20,000??? by silverbax · · Score: 1

      For 20k, you can get a decent search engine, not that junk Google passes off as search functions.

  43. incomplete knowledge by timothy · · Score: 1

    In a company (shorthand here for any organization, whatever its purpose), there could be all kinds of information for which you don't quite know how the categorizations work in every part of the company, or whether someone else has a document you might need ...

    Being able to search for keywords within your organization might find you a lot of useful things. Have we dealt with Client X before? Is there anything on the company mailing list about a problem I'm having with remote access? Do we still have a specific report around? It doesn't mean you can't ask coworkers or send a company wide email looking for things you need, but it offers another first option that puts the time / effort burden on inanimate objects instead of people with better things to do.

    timothy

    --
    jrnl: http://tinyurl.com/c2l8yr / foes: http://tinyurl.com/ckjno5
  44. Rather have a WayBack machine! by Bluedove · · Score: 3, Interesting
    Rather than a google engine to index everything there, i'd rather have a WayBack Machine that allows me to see the variant versions of documents. (that aren't in a revision control system accessible to me)


    Wouldn't it be great for when they say "your code doesn't meet the specification of what the product needs to do" and you can use it to say "let's look to the wayback machine to see when you changed the spec but didn't bother telling me"


    :-)

  45. Wish them all the best by billcopc · · Score: 2

    I know I'm biased (and ignorant), but Google is probably the best general-purpose search engine out there, with truly innovative quality filtering like PageRank(tm) and other very neat tricks. They have been around long enough that even the weakest of minds know Google. If this new retail product is as efficient and clean as their websearch, and well supported, they're going to make a killing! I really hope they find huge success, they've earned it.

    --
    -Billco, Fnarg.com
    1. Re:Wish them all the best by carlos_benj · · Score: 1

      They have been around long enough that even the weakest of minds know Google.

      Au contraire! I introduce people to Google a couple times a month at least! They're not dumb though, just ignorant.

      --

      As a matter of fact, I am a lawyer. But I play an actor on TV.

  46. Controlled vocabulary by SurgeMaster · · Score: 1

    I work for a firm that indexes a scholarly database of research articles in psychology. We use a controlled vocabulary to describe the content of each abstract, which can vastly simplify life (for the users who know how to use it, natch.) Does Google (or anyone else) pursue this sort of strategy?

    --
    "One empirical experiment is worth a thousand expert opinions." -Bill Nye
  47. Will this work well on the intranet? by rfischer · · Score: 1

    I don't know too much about Google's technology, but I thought it used a scheme where web pages having many referring links would score higher in the search results.

    For a corporate intranet, do you have this information? I mean are there people building home pages linking to their favorite corporate policy page?
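
    A minimal sketch of link-based scoring, assuming the textbook power-iteration form of PageRank (not necessarily what Google ships in the appliance). The page names and the tiny intranet graph are made up for illustration; the point is only that a page many others link to, like a shared policy page, floats to the top.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical intranet: three pages link to "policy", two to "phonebook".
intranet = {
    "home": ["policy", "phonebook"],
    "team-a": ["policy"],
    "team-b": ["policy", "phonebook"],
    "policy": [],
    "phonebook": ["home"],
}

ranks = pagerank(intranet)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page:10s} {score:.3f}")
```

    If nobody on the intranet links to anything, every page ends up with the same score, which is exactly the worry raised above.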

  48. Thats all well and good... by Quicksilver31337 · · Score: 1

    but I think I will stick to using Grep and Locate.

    --
    _______
    Death wish, n.:

    The only wish that always comes true, whether or not one wishes it t
  49. Why does google get a slashdot-patent-pass? by victim · · Score: 5, Interesting
    Just curious about people's opinions here. Google gets covered fairly regularly on slashdot. Usually when a company that uses software patents to protect its business from competition comes up on slashdot they get reamed along with the USPTO.

    Slashdot talked about this in 1999 when the patent came up. It's 2+ years later now. Google has mostly crushed the competing search engines because the results of their algorithm are preferred to other algorithms. Their revenue sources are not public, but I believe I read recently that half of their revenue is from advertisements and half from technology licensing.

    So, the point for discussion...

    The world's favorite search engine exists because of its software patent. This patent has caused great harm to the competing search engines. Is this ok because...
    • the software patent system is just fine
    • many software patents are silly, but this one is worthwhile.
    • it is a silly patent, but google is good enough that we forget about that.
    • no one cares how google got where they are. It is just good that they work well.
    • it is not ok.
    1. Re:Why does google get a slashdot-patent-pass? by ostiguy · · Score: 3, Insightful

      Because they don't do evil or annoying things. That isn't a tremendous excuse, but it just works in practice. No intrusive ads, performance is always great for a free service, etc.

      Philosophically, however, I'd imagine that parsing/indexing patents are far more legitimate in many people's eyes, than say, one click purchasing patents.

      ostiguy

    2. Re:Why does google get a slashdot-patent-pass? by ethereal · · Score: 5, Insightful

      I agree with the "many are silly, but this one is worthwhile". Google's approach was non-obvious, innovative, and really advanced the state of the art. It wasn't just another "do what we did before, but with a computer this time" patent.

      I'll admit that it helps that their site is non-painful to use, but that's just gravy. Google's search is so much better that even if their site was a pain, it would still be a worthwhile search tool.

      --

      Your right to not believe: Americans United for Separation of Church and

    3. Re:Why does google get a slashdot-patent-pass? by jallen02 · · Score: 2, Interesting

      I think one could argue that ease of use is part of what makes their results so useful.

      If it was too complex for the average computer user to pull the data they need, I doubt they could stay profitable. Currently it's the best, not only for the results, but for how the end user interacts with their system.

      It's amazing how often the "I'm Feeling Lucky" button gets exactly what you're looking for.

    4. Re:Why does google get a slashdot-patent-pass? by Anonymous Coward · · Score: 0

      Google's algorithm isn't that original - it's based on an algorithm used to rank papers based on the number of citations.

    5. Re:Why does google get a slashdot-patent-pass? by swb · · Score: 4, Interesting

      Because they don't do evil or annoying things. That isn't a tremendous excuse, but it just works in practice. No intrusive ads, performance is always great for a free service, etc.

      Tremendous excuse? I'd say it's a future model for all businesses.

      Forget the tedious absolutism of the neosocialists -- that model will never be implemented anywhere (except at the barrel of a gun), and anyone who won't be happy until they get there will never be satisfied. However, a company that does a good job at what they do and produces something that they can either give away or appear to give away, without doing the annoying, evil, greedy things that other companies do, should be the benchmark.

      For example, Mercedes Benz -- what if they still sold their really expensive cars to rich guys who would pay for them BUT they would also sell a car that went 200,000 miles without major service for $10k?

      I think the list goes on -- subsidize basic, honest products and services with expensive stuff that others are willing and able to pay for. It makes you a saint. I don't see why so many other businesses hold onto the "rape everyone" philosophy.

    6. Re:Why does google get a slashdot-patent-pass? by kz45 · · Score: 2, Insightful

      Just curious about people's opinions here. Google gets covered fairly regularly on slashdot. Usually when a company that uses software patents to protect its business from competition comes up on slashdot they get reamed along with the USPTO

      This is only alright for google, because the average joe slashdot user doesn't have to pay anything to use their services. (proving further that it's all about the "free beer").

      Look at the .gif or .mp3 standards. When the creators asked for a certain amount of money per usage, slashdotters were in an uproar.

    7. Re:Why does google get a slashdot-patent-pass? by nakaduct · · Score: 2

      I contend they would have succeeded with or without the patent. Like the old Altavista, Google has a cohesive picture of what a search engine should (and shouldn't) be.

      The unwashed mass of portal-shopping-news-flowers-and-oh-yeah-searching engines might mimic the ranking scheme, but the vision and interface? I'd be less surprised if the giant pandas solved their endangerment problem by building underwater colonies.

      cheers,
      mike

    8. Re:Why does google get a slashdot-patent-pass? by jesser · · Score: 3, Interesting

      I agree with the "many are silly, but this one is worthwhile". Google's approach was non-obvious, innovative, and really advanced the state of the art.

      Since the "state of the art" advances more quickly in CS than it does in most areas, should we expect Google to place its original patent in the public domain after several years? Or do you think that in several years, someone will invent a completely different algorithm that yields better search results, rendering Google's patent obsolete?

      --
      The shareholder is always right.
    9. Re:Why does google get a slashdot-patent-pass? by jfinke · · Score: 2, Insightful

      It might be about what they patented... They aren't suing AltaVista for having a search engine. When Amazon sued BN it was because they provided a similar feature, not because they copied the code. But, what do I know....

    10. Re:Why does google get a slashdot-patent-pass? by carlos_benj · · Score: 1

      I don't think the consensus has ever been patents==evil. There has been wailing and gnashing of teeth over many patents here, but it is usually the "one-click" or other obvious "innovations" that prompt the backlash (or should that be backslash?) on /.....

      --

      As a matter of fact, I am a lawyer. But I play an actor on TV.

    11. Re:Why does google get a slashdot-patent-pass? by cpeterso · · Score: 1

      If Mercedes Benz also sold $10,000 cars, it would destroy the mystique of owning a Mercedes Benz. They would surely lose sales of their high-end cars. People in the USA drool over their Lexus sedans, but would they do the same if they knew it was just an overpriced Toyota? Toyota/Lexus doesn't even sell Lexuses in Japan; they just call them Toyotas there.

  50. $20k ought to be enough... by Junior+J.+Junior+III · · Score: 2

    to make them profitable. Google does so many things so well, and provides it all free to the world. It's not asking too much, I think, for them to ask companies to foot the bill for something like this if that's what it takes for them to continue to stay in business and keep doing all this neat wonderful free stuff.

    --
    You see? You see? Your stupid minds! Stupid! Stupid!
    1. Re:$20k ought to be enough... by laserjet · · Score: 2

      Google is and has been profitable. They are a private company that makes a profit. I don't know where all this crap comes from about "finally Google can make a profit...". Google is expanding their already successful business...

      --
      Moon Macrosystems. Sun's biggest competitor.
  51. why oh why... by Anonymous Coward · · Score: 0

    Now the Crackers have an easy way to *search* for passwords and confidential docs!

    "Hey dude, find anything?"....."Nothing other than a google search engine"...."Alright!!!, now let them do our work!"

  52. This would be great for my company by Rude+Turnip · · Score: 1

    Unfortunately, we've already made the investment in a SQL 2000 database. I think the Google solution would be better because the SQL database relies on people entering the data correctly (and just entering it, period) to work well. It looks like the Google product would actually search through the documents for you. Right now I keep a pile of old reports on my text to pull out some recyclable material, but Google's search engine would eliminate that need.

    1. Re:This would be great for my company by C.+Mattix · · Score: 1

      I work for a company that does what you are looking for. Check out Maxim-IT.

  53. May not be that great by Codex+The+Sloth · · Score: 1, Insightful

    They have been doing indexing of public intranet sites (like try here) but this is different since it is in the intranet and has to host the hardware.

    Does anyone but me think that this may not work so great? The way that Google works for the web (filtering down way too many hits and ranking them) is quite different from an intranet, where fuzzy searching / regular expressions are a lot more necessary. The Apple Developer Site (link above) uses Google and it stinks!

    --
    I am not a number! I am a man! And don't you ... oh wait, I'm #93427. Ha ha! In your face #93428!
    1. Re:May not be that great by Lars+T. · · Score: 2

      Hmm. If Apple does use Google, why do they and Google get different results?

      --

      Lars T.

      To the guy who modded me down from perfect to terrible Karma - Apple haters still suck

  54. Why? by GreenJeepMan · · Score: 2, Interesting

    Google is a great search engine for the Internet because it ranks pages according to how many other pages link to them. It's very democratic. I don't see how Google behind the firewall would be a viable product. What will it do, rate documents on how many other company documents link to them?

    There are a number of other existing indexing engines that are significantly cheaper and more mature. Google should stick to what it does best. I guess this shows they aren't very profitable and are looking for other sources of revenue.

  55. quite a bit late actually by slashkitty · · Score: 2

    We've already spent way too much just for the software from someone else. Still have yet to launch it though. Google should have done this long ago, as soon as they realized their software works. Well, ok, that's an oversimplification, but still, they worked on these corporate search programs before, and they just weren't up to par.

    --
    -- these are only opinions and they might not be mine.
  56. Kind of OT... by Timmeh · · Score: 1

    But I was wondering, how exactly does Google make money? They serve up so many goddamn pages, and their bandwidth, storage, and power consumption must be through the roof; so how do they pay for it all? This is a good start, but $20k can't go too far @ Google.

    1. Re:Kind of OT... by NDPTAL85 · · Score: 0

      The bulk of their money comes from licensing their engine out to other search engine companies like AOL and Yahoo. Those are their two biggest customers.

      --
      Mac OS X and Windows XP working side by side to fight back the night.
  57. We *seriously* need this. by Moderation+abuser · · Score: 2

    Not kidding. I work for a very large multinational and the corporate search engine is an exercise in frustration. Its purpose in life seems to be to return bizarre and obscure documents as the results of its searches.

    $20k is nothing to shell out[1] for the capabilities that Google has.

    [1] In corporate terms.

    --
    Government of the people, by corporate executives, for corporate profits.
  58. Google, I Want to Give You My Money by vodoolady · · Score: 1

    Yeah, I hope they're already thinking about the personal version, because I've been dreaming of Google on my machine for a long time. An intelligent search beats just about any other kind of infrequent interaction: menus, directory navigation, dialog boxes with lots of little pages on them. I want to hit ctrl-g, type in what I'm interested in, and get the right thing.

  59. Can too by Moderation+abuser · · Score: 3, Insightful

    Finding that vital piece of information can be far more important than $20k, especially to a large organisation.

    --
    Government of the people, by corporate executives, for corporate profits.
    1. Re:Can too by richieb · · Score: 2
      Finding that vital piece of information can be far more important than $20k, especially to a large organisation.

      Very true. However, try convincing the average corporate bean counter. So, instead install "htDig" and actually show that you can make $20K, with a search engine on the intranet. Once the people who use and need it are "hooked", you can proceed to getting Google (after all you should have supported software for "mission critical" functions, and you are much too important to administer htDig :-))

      --
      ...richie - It is a good day to code.
    2. Re:Can too by tonywong · · Score: 1

      hmmm...just like the government's searching on the Enron google box. Whoops, Andersen just shredded the box.

  60. It's called a sponsored link by Vicegrip · · Score: 2

    The first item in your search results. Google matches up what you are searching for with a company offering a compatible service/product.

    This kind of directed advertising is valuable and a good application of their service.

    --
    Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
  61. Open source, right? by Zico · · Score: 3, Interesting

    Right now Google tends to be among the bigger darlings of Slashdot, but will they remain that way if they release this product and it's not Open Source? 'Cause they're nuts if they're planning on charging $20K for it but making it Open Source. Are they traitors to the cause, or is it just another understandable case of "Money talks, bullshit walks" when it comes to Open Source and the Real World?

    1. Re:Open source, right? by carlos_benj · · Score: 2, Interesting

      Right now Google tends to be among the bigger darlings of Slashdot, but will they remain that way if they release this product and it's not Open Source?

      Google gets its kudos because they USE Open Source, not because they ARE an Open Source company. Their current search engine (the one you use at Google.com) is proprietary already. There may very well be some who will bemoan the fact that Google isn't opening their source, but that doesn't mean everyone in the community is of the same mind.

      Yeah, I know this was probably a troll, but I needed to say that.

      --

      As a matter of fact, I am a lawyer. But I play an actor on TV.

  62. ASPseek by AnteTempore · · Score: 1

    I have been using ASPseek for some time. The search results are remarkably similar to Google's. If you want a libre alternative to Google for your own sites, ASPseek is probably the way to go.

  63. "Network drives" by hey · · Score: 1

    Most companies don't have most of their docs
    on internal websites - they are on Network Drives.
    Zillions of "folders" full of ".doc" files - yuck.
    Since it isn't html there are no links. I wonder
    how Google would deal with that.
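
    One possible answer, sketched under the assumption that the text has already been extracted from the ".doc" files: with no links to count, a search engine can fall back on scoring by the words themselves. This is a plain inverted index with TF-IDF weighting, not anything Google has described for the appliance, and the file names and contents are invented.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {document name: term count}."""
    index = defaultdict(dict)
    for name, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][name] = count
    return index


def search(index, n_docs, query):
    """Score documents by summed term count times inverse document frequency."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        # Terms found in fewer documents get a larger weight.
        idf = math.log(1 + n_docs / len(postings))
        for name, count in postings.items():
            scores[name] += count * idf
    # Best score first; ties broken by name for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical text already extracted from files on a network drive.
docs = {
    "q3_report.doc": "Quarterly report for Client X: remote access rollout",
    "vpn_howto.doc": "Remote access troubleshooting: VPN remote login fails",
    "lunch_menu.doc": "Cafeteria menu for the week",
}

index = build_index(docs)
results = search(index, len(docs), "remote access problem")
for name, score in results:
    print(f"{score:.2f}  {name}")
```

    Documents that never mention a query term simply don't show up, which is roughly what grep gives you, but the index is built once and the ordering puts the heaviest matches first.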

  64. Open Source Search Engine... by 3D+Lover · · Score: 1

    I see that no one has mentioned the GPL perl script Perlfect www.perlfect.com. It's a very capable search engine including PDF search. Check it out.

  65. Nobody's perfect by Nicolas+MONNET · · Score: 3, Interesting

    So what, Google isn't a 100% libre-kosher company? Name any of their competitors that is. It's called "lesser of two evils".

    As far as I know, Google has never filed for frivolous "IP" lawsuits, they respect web standards, they provide gratis, decent service, they don't fuck with your browser, and they tell you who paid for word placement as opposed to just putting paying advertisers on top without mention. They also happen to use free software and give it good press.

    1. Re:Nobody's perfect by jesser · · Score: 1

      they don't fuck with your browser

      That's true in general. Google cares about that image, too: when spyware targeted search engines, Google said on its front page that it doesn't use pop-up ads and gave a link to some spyware-removal software. But here's one (minor) case where Google does fuck with your browser:

      1. Search Google for XML.
      2. Ctrl+click (Mozilla) or Shift+click (IE) on one of the banner-shaped text ads.

      Expected: ad opens in a new window.
      Result: ad opens in search window *and* in the new window.


      Admittedly, that's benign in comparison to pop-up ads.

      --
      The shareholder is always right.
    2. Re:Nobody's perfect by leviramsey · · Score: 1

      I see that behavior in Opera 6 for Windows. However, right-clicking and selecting Open in New Window gives proper behavior.

  66. Please let IBM get this!!! by Anonymous Coward · · Score: 0

    For IBM, $20 grand is pocket change, and it is SO needed!!! Anyone who has actually tried to use the 'search' on either the public or internal network knows that the search just doesn't work.

  67. Expensive? Ha! by jonbrewer · · Score: 2

    If anyone thinks $20k is expensive for 150k documents, they haven't bought a search engine recently!

    Check out prices for Inktomi. Of course the more documents you have, the lower the per-document cost, but still they charge $7500 for 10k documents.

    The "average" price of a Verity K2 license is $200k (check this itworld.com link).

    Good content indexing is expensive. Google will be undercutting the competition with this release. $20k really is a bargain.

  68. GoogleBox by JWSmythe · · Score: 1

    Hmmm.. Looks like an interesting concept. If you have an admin with a little time on his hands, which would probably cost you a *LOT* less than $20k, you could set up something else.

    We've been using Namazu to make all of our documents searchable. It's shareware, and does a pretty decent job of it. If we make it public or private is just a matter of who you allow access. :)

    I guess the days of `grep "searchstring" *` are pretty much gone.. :(

    Next thing they're going to tell me is that I should start using something more modern than Pine to read my mail..

    --
    Serious? Seriousness is well above my pay grade.
  69. Another possibility? by partingshot · · Score: 1



    Because the majority of /.'ers are sheep
    and the head sheep are clueless?

    Flame away.

    --
    Anonymous posts are filtered.
  70. Hey, let's get in on it.. by Anonymous Coward · · Score: 0

    With all the thousands of people who view Slashdot daily we can get a few thousand (preferably 20,000) to chip in. And we'll all give each other copies and we can each have a copy. Email cmdrtaco@slashdot.org if you're interested. :)

  71. Another link by Anonymous Coward · · Score: 0

    ZDNet also has the story.

    As for personal reaction, I just wonder whether the option to search emails will be available to everyone, or just a select few. In either case, I don't think I like it very much.

  72. Altavista have had this available for YEARS. by Lord+Hugh+Toppingham · · Score: 1

    Altavista have had something like this available for years. Its pretty good.

  73. It will definitely work great, but ... by Circuit+Breaker · · Score: 2, Insightful

    Google's claim to fame is its ability to rank results properly (something no other search engine ever got right). The rank, if I recall correctly, is _mostly_ based on links from other sites.

    Now, when you're indexing thousands of doc and pdf files on a company network, how many of those link to each other?

    And how many companies have internal newsgroups that can be searched? (No, Exchange shared folders don't count - or can Google index those as well?)

  74. Just use the Windows "Find Fast" feature! by Da+VinMan · · Score: 2

    Like duh!

    *cough*

    (Please think about it before you roast me.)

    --
    Please mod this post only if you think others should/n't read this. I have enough ego^H^H^Hkarma. Thanks!
    1. Re:Just use the Windows "Find Fast" feature! by slakdrgn · · Score: 1

      I don't use fastfind much, but wouldn't this bog down a system, when searching thru 20,000+ PDF/DOC/XLS/TXT files?

  75. google's cheap by sl0ppy · · Score: 2, Insightful

    for $40000, you can get a sun e220, and run altavista's search engine on it. even then, if you want to integrate it, you still need to do 30-40 hours of work to make it all work right.

    having something for $20000 or so is a godsend, especially if it comes with its own hardware (even though its hardware is probably not as nice as an e220)... throw in that they'll probably do the work when it breaks, and this is a no-brainer for anyone needing to index even as few as 25000 pages.

  76. Try SWISH++ by pauljlucas · · Score: 1
    SWISH++ is the fastest freely available search engine. Briefly from the feature list, it natively indexes text, HTML, Unix manual pages (makes much better apropos(1) command replacement), e-mail/news (RFC 822), LaTeX. Through filters of your choice, it indexes PDF, PostScript, M$ Office.

    For high-traffic sites, the search engine can be run as a multithreaded daemon process that listens on either a Unix or TCP socket.

    You could write a filter or a native module to index the names of binary files.

    --
    If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
  77. Ah yes by llamalicious · · Score: 1

    now we know where /. should invest their next 20G's
    Something to replace that poor poor search box at the bottom of every page.
    For chrissake's it's easier to search on Google right now and browse the cache! doh!!

  78. Mod This Up by Kozz · · Score: 2

    Mod this up! Indeed, this is a HORRIBLE script, stupid idea, lame lame lame.

    This would be a great way to introduce a really NASTY security hole into your site by using this script.

    --
    I only post comments when someone on the internet is wrong.
    1. Re:Mod This Up by Shiny+Metal+S. · · Score: 2
      Mod this up! Indeed, this is a HORRIBLE script, stupid idea, lame lame lame.
      People, can't you see that it was a joke? In fact, quite a good one, for anyone who knows anything about Perl and CGI.

      And about ";rm -rf /;" as a query, I hope you don't run your CGI scripts with write privileges to your whole filesystem! Don't get me wrong, I always use taint mode and I always tell people to use it as well. It's just that this example can be quite misleading. If the CGI script can possibly remove the root directory, then you have a much more serious problem than the script itself.
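
      A minimal sketch of the same point, in Python rather than Perl (a swapped-in language, not the script under discussion): the ";rm -rf /;" query is only dangerous when it gets pasted into a shell command line. Passing it as one element of an argument list means no shell ever parses it. The echo_query helper is hypothetical and just echoes the query back through a child process.

```python
import shlex
import subprocess
import sys


def echo_query(query):
    """Hand the query to a child process as a single argument, with no shell."""
    # The child is another Python interpreter that prints its first argument.
    # Because cmd is a list and shell=False (the default), the query is never
    # interpreted as shell syntax, whatever characters it contains.
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", query]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")


hostile = ";rm -rf /;"

# The hostile string comes back as plain text; nothing was executed.
print(echo_query(hostile))

# If a shell really is unavoidable, quote the value before interpolating it.
print(shlex.quote(hostile))
```

      None of this replaces running the script under an account with minimal privileges; it only closes the injection hole.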

      By the way, nice moderation: someone posts a script as an obvious joke -- it's Score:3, Informative. Then, someone says it's a horrible script -- again, it's Score:3, Informative. I wonder if people who moderated this thread, have ever read it, not to say about understanding the subject... Ok, I can understand that someone didn't get the joke and it's not moderated as Funny... But Informative?! That script doesn't even work for God's sake!

      --

      ~shiny
      WILL HACK FOR $$$

  79. ...Tonight on Springer.. by Anonymous Coward · · Score: 0

    When Slashbot's Attack...

    Today's Feature:
    Slashbot Moderators

  80. Six Degrees by gnugnugnu · · Score: 1


    Search engine on your desktop?
    Joel (On Software http://www.joelonsoftware.com ) has mentioned SixDegrees as a potential Google on your Desktop. http://www.creo.com/sixdegrees

  81. Could Save Significant Time, Effort by Merry_B.Buck · · Score: 2

    No more 25-man midnight raids that cart off your entire data center. Now the FBI or BSA can just pick up your search appliance.

    1. Re:Could Save Significant Time, Effort by funky+womble · · Score: 1

      You've not read the cunning plan in Cryptonomicon then...

  82. Corporate Intranet Index Engines? by Nonesuch · · Score: 2
    When I was an Intranet webmaster at Motorola, we used 'FreeWAIS' for Intranet indexing, until Corporate security decided that indexing everything was a security risk :-)

    Not kidding. I work for a very large multinational and the corporate search engine is an exercise in frustration. Its purpose in life seems to be to return bizarre and obscure documents as the results of its searches.

    You actually got results returned from your search server?
    Lucky bastard. Our corporate Intranet search engine usually would just return 'Query Timed out'. Eventually they just took the search boxes off all the web pages.

    I've since built a simple Harvest index for the Intranet.

    It can be very interesting finding all of the 'cobweb' documents on intranet sites. Ancient documents relating to projects and managers long since vanished among other stuff that management would prefer to see forgotten...

    There are some cool features that are unique to Google, but I'm not sure if 'Convert PDF to HTML' and 'highlight search terms' are worth $20K.

  83. Heard of ht://Dig before? Any good? by Nonesuch · · Score: 2
    I've never seen ht://Dig before. Where I've needed search engines, I've deployed Harvest or WAIS.

    Aside from the GNU license and association with SourceForge, I'm not sure what advantages ht://Dig has over the other free/commercial indexing products. Perhaps somebody has a comparison page?

    1. Re:Heard of ht://Dig before? Any good? by ghutchis · · Score: 1

      Don't take my word for it. Check the true expert in comparing search engines at searchtools.com

      Has just about everything on search engine use, including comparisons.

      http://www.searchtools.com/

    2. Re:Heard of ht://Dig before? Any good? by stardeveloper · · Score: 1

      We installed and tested ht://Dig for our website Stardeveloper.com on our local Win2k system along with Cygwin (required to compile and run Linux apps on Windows) and it ran great. The reason we didn't finally install it on the site was that it runs as a CGI app under IIS, so every time a request comes in, IIS loads and unloads it. I think if it was possible for ht://Dig to keep running (possibly as a Windows service), it would have run a lot faster and we would have no problem running it on the production site.

  84. Theoretically, no... by Da+VinMan · · Score: 2

    Actually, it's perfect for searches about that size, and bigger even. When you talk about fast find (at least the later versions), you're actually talking about the Windows Index Server in drag. Index Server is a fairly robust piece of work that allows sites to implement (as a part of Commerce Server, SQL Server, and others) full text searches across the media. Its componentized nature makes it convenient to use from VB/VBS/ASP/other COM capable languages. Not too bad actually...

    The joke was about Fast Find though which, IMO, is the most crufty unfriendly piece of sh*t ever incorporated into MS Office. In Office 95, 97, and 2000 (haven't tried Office XP yet) it's something I systematically eradicate on every machine I see. It's known for firing up its re-indexing while the user is already using the machine, and it's also known for not being controllable by the user (i.e. the user can't tell it when to re-index).

    --
    Please mod this post only if you think others should/n't read this. I have enough ego^H^H^Hkarma. Thanks!
  85. Excuse me.... by NDPTAL85 · · Score: 0

    I'm sorry but what exactly is wrong with software patents? Did I fall asleep and suddenly wake up in a socialist country or something?

    --
    Mac OS X and Windows XP working side by side to fight back the night.
    1. Re:Excuse me.... by victim · · Score: 2

      In the field of software, the USPTO has a track record of granting patents on the obvious. The explanation I've heard is that evaluating the applications is hard so they grant them and let the courts and companies sort it out.

      There is also the issue of patenting mathematics. That is not allowed. Many software patents are really patents on a machine wink wink that happens to produce the same results as a mathematical formula.

      And I can't tell if you woke up in a socialist country or not. I woke up in one that is nominally capitalistic, but more socialist for the lowest castes.

  86. Google Search Appliances by Anonymous Coward · · Score: 0

    Imagine a Beowulf Cluster of THESE!!!

  87. Google is already profitable. by NDPTAL85 · · Score: 0

    Part of their business is licensing out the engine to other companies such as AOL.

    --
    Mac OS X and Windows XP working side by side to fight back the night.
  88. Google for Documents? by rblancarte · · Score: 1

    A bit late to get into the game, isn't it? I mean, there are a number of document management systems already out there. PC Docs, etc. And these are VERY powerful systems. It makes you wonder how good Google's system is going to be.

    And while you cough at the $20k pricetag, that seems about right for what you are looking to do.

    RonB

    --
    It is human nature to take shortcuts in thinking.
  89. Re:Ouch. Try HTDIG. by ghutchis · · Score: 3, Informative

    Actually, saying it doesn't have all the fancy matching algorithms isn't really fair.

    Granted, we can't implement Google's patented things, but that's not to say we don't come close.

    Indexing the text of links to documents? Yes.
    http://www.htdig.org/attrs.html#description_factor

    Keeping track of the weight of links pointing to a document? Yes.
    http://www.htdig.org/attrs.html#backlink_factor

    Probably the big "missing link" is a proximity weighting. Interested? Help is always welcome!
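
    To make "proximity weighting" concrete, here is a minimal sketch in Python. It is not ht://Dig code and not how any particular engine does it; the function names (min_window, proximity_score) and the scoring formula are assumptions chosen only for illustration. The idea is to find the shortest run of words that contains every query term and score the document higher when that run is short. Tokenizing is a plain whitespace split, so punctuation is not handled.

```python
def min_window(tokens, terms):
    """Length in words of the shortest span containing every term, or None."""
    wanted = set(terms)
    last_seen = {}  # term -> most recent position where it occurred
    best = None
    for i, tok in enumerate(tokens):
        if tok in wanted:
            last_seen[tok] = i
            if len(last_seen) == len(wanted):
                # Shortest span ending at i starts at the oldest "last seen" term.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


def proximity_score(text, query):
    """Hypothetical score: 1.0 when the terms are adjacent, lower when spread out."""
    tokens = text.lower().split()
    terms = query.lower().split()
    span = min_window(tokens, terms)
    if span is None:
        return 0.0  # at least one query term is missing (or the query is empty)
    return len(set(terms)) / span


close = proximity_score("google search appliance indexes intranet documents",
                        "search appliance")
far = proximity_score("search the whole intranet with one appliance",
                      "search appliance")
missing = proximity_score("nothing relevant here", "search appliance")
```

    A real engine would presumably blend a factor like this with its other weights rather than use it alone.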

    -Geoff

  90. HTDIG! HTDIG! by fargo007 · · Score: 1, Interesting

    htdig has made me a hero here. Mostly because of its reliability and price.

    It astonishes me how people can sell something that's already free. Canned air will be next.

    - Freddy

  91. This is an exceptional deal!!! by z84976 · · Score: 1

    Google's selling this to corporations for $20,000 per two year license. Our company is hopefully about to buy one... to replace the $250,000 per year Verity product that just doesn't work as well! To be fair, the Verity engine also indexes Lotus Notes and Oracle databases, but apparently Google's about to do that too. Heheh I guess when they add that support the only two differences between Verity and Google will be that Verity costs 20x as much (over time) and... Verity doesn't work very well!

  92. Am I wrong, or... by weird+mehgny · · Score: 1

    ...is it weird that Slashdot doesn't have a specific Google topic yet?

  93. Does PageRank work on internal sites? by Kieckerjan · · Score: 1

    It seems to me that Google's (patent-pending) PageRank algorithm wouldn't be of much help on an intraweb. The link structure of a single website mostly reflects design decisions, and hardly says anything about the popularity/authority/value of a page. And even if it did, it wouldn't be very objective (let's call that "inter-subjective") since the site is probably maintained by a rather small group of people.

    If that is so, why choose Google over a cheaper competitor?
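
    The concern can be illustrated with a small sketch of textbook PageRank (power iteration with a damping factor) in Python. This is only the published idea in its simplest form, not whatever the appliance actually runs, and the four-page intranet below is made up: every page links back to "home" through a site template, which is exactly the kind of design decision described above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share, then receives link shares.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical intranet: the template links every page back to "home".
intranet = {
    "home": ["hr", "it", "sales"],
    "hr": ["home"],
    "it": ["home"],
    "sales": ["home"],
}
ranks = pagerank(intranet)
```

    Here "home" ends up with by far the highest rank and the three department pages tie, which says more about the navigation template than about which page is actually the most useful.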

    --
    Being well balanced is overrated. -- John Carmack
  94. Bad title by felipeal · · Score: 1

    Google's Search Appliance - I thought that would be another of those internet appliance gadgets.
    Think about it: one in the kitchen so I-can't-double-click-mom can get her receipts, another in the garage for AOL-dad's do-it-yourself shop, and so on...

  95. bankrupt by Fissure_FS2 · · Score: 1
    They gonna go bankrupt in 3 month or less!
    not if they follow the links to MAKE MONEY FAST!!!!!!1111oneone
    --
    My life's goal is to get a score of +3!
  96. Mercedes DOES sell cheap cars by RealisticWeb.com · · Score: 2, Informative

    If you have been to Europe you know that Mercedes DOES sell cheap cars. They are like European Fords. You see Mercedes buses, tractors, compacts, everything. They are so common that that's what people think of when they see the symbol, and they can't sell as many sports cars or SUVs. So they export all the high end cars here, where we buy them.

    Point is, I agree that this is a smart Google move. You separate the market, and give people in both places the things that they want. That's why you are never going to see an ad banner on Google trying to get the average surfer to buy their $20k engine.

    --
    Sigs are out of style, so I'm not going to use one...oh wait..
  97. What about ISYS? has been around for years. by Anonymous Coward · · Score: 0

    This is a very neat free-text indexing engine. Not sure about how much it costs though...

  98. Where's The Link? by Peteresch · · Score: 1

    The best way to ensure a high listing on Google is for your page to be linked from lots of pages on other sites.

    How would this model work in an intranet setting? Would it count the number of desktop shortcuts? Yes the algorithm works great in the internet world but is it a universal find-all?
  99. See it in action by J.J. · · Score: 2

    I noticed this last week when searching Cisco's site. The addition of the "powered by Google" snippet in the upper right hand corner of the search results threw me for a loop.

    I haven't noticed much of an improvement in their search results yet - perhaps it takes time to build the link relationships index?

    Cheers,
    J.J.

  100. Looks slick [link to a picture]... by ramakant · · Score: 1

    Don't think I'll be buying one anytime soon, but they sure do look slick. Here is a picture of the beauty.

    Also, here's the press release that was sent out on the googlepress group:

    -------------
    Media Alert
    -------------
    February 11, 2002
    Today, Google announced the availability of the Google Search Appliance, an integrated hardware/software solution that extends the power of Google.com to corporate intranets and web servers. The Google Search Appliance simplifies corporate search for administrators and makes it fast and easy for employees to find the intranet information they need.
    The new product comes in two versions: GB-1001 for departments and medium-size companies with up to 150,000 documents, and the GB-8008 for large corporations with millions of documents. Google Search Appliance features include:
    - Complete solution: both hardware and software
    - Easy install: up and running in less than one hour
    - Simple administration: simple and intuitive browser-based admin console
    - High quality: quickly delivers relevant search results
    - Affordable: pricing starts at $20,000 for two years of support and software updates
    The Google Search Appliance was designed to address the growing demand for simple, cost-effective search solutions within corporations. The Google Search Appliance is based on Google's award-winning search technology and provides a complete solution to companies that need search services to manage data behind the firewall.
    An image of the Google Search Appliance can be found here:
    http://www.google.com/press/images.html.
    Additional product information can be found at:
    www.google.com/appliance.
  101. what about those folks? by davidone · · Score: 1

    what about those folks in Norway? They did just the same thing two years ago and nobody noticed.

  102. RE: searching employee email by Anonymous Coward · · Score: 0
    will it also index employee email?

    That's what I have been requesting from our IT guys and managers for a long time. Of course there are personal and other matters that shouldn't be spidered and indexed. I imagine a kind of email client which has some default setting, but also asks you before sending off an email: should this message be open for corporate search?

    However, since I am working for a large international financial institution where client data privacy is very very important, it's very difficult to get the attention of the management for such ideas.

  103. "Combine" harvester by SgtChaireBourne · · Score: 1
    If you're only doing a small site, then Ht://Dig is probably the way to go.

    For a larger site or for distributed harvesting there is Combine, an old one from 1996. It does text, HTML, and PDF. It's free, but takes a bit of time to set up, and it can even handle metadata (i.e. keywords). There are binaries for Linux and Solaris, but most of it is in Perl.

    It's about to begin some modernization to make it easier to install and operate, perhaps even use MySQL as a backend.

    --
    Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
  104. Comparing Prices by lapointe · · Score: 1
    The 20k$ price is about right depending on how many documents it indexes. Altavista charges about 34k$ CDN for 50,000 documents (a limit we have blown on just one corporate web site). I have heard that other engines are similarly priced.

    This is not including fairly high maintenance fees...

    When you consider that any corporate site could be a window on a huge corporate database of information (literally, each web page could be a database record), you could blow through hundreds of thousands of documents easily.

    For the uber-geeks out there, search engines work nicely with configuration management systems (like Perforce) for searching source code for large projects.

    By the way, my experience with the Altavista product is it is very buggy and unreliable (they re-wrote it in Java, any surprise?), so Google's entry into the field is welcome.

  105. peanuts by Anonymous Coward · · Score: 0

    that is peanuts compared to Thunderstone, although it's a special type of relational database. Thunderstone runs around in the millions and people buy it!

  106. Re:Controlled vocabulary - see Endeca by Anonymous Coward · · Score: 0

    check out Endeca - they use your structured data (in your case, a multi-faceted controlled vocabulary) to help you explore the results of your search on unstructured text. e.g., type "early childhood" in search box, instantly get back thousands of matches-- but placed in a precise context that tells you exactly how to refine (by facets like subjects, medicines, labs, authors, dates, etc., with only categories valid for that unique search shown.) new and really cool technology.

  107. man, that thing's pretty. by AndyChrist · · Score: 1

    Really stylish. It'd probably look great next to a Cobalt.