Google's Search Appliance
An anonymous reader noted that Google is working on a Search Engine
that you can install behind your corporate firewall for indexing
your internal documents. It's a bit thin on information, but it
looks like for as little (cough) as $20k, you can have your own
Google box. Not for everyone, obviously ;)
People don't have THAT much pr0n do they?! :)
Aside from anything else, it gives Google a revenue stream so they can continue to provide their services (web, image and Usenet searches) for free; they need to find a valid business model, and hopefully this can contribute.
Everywhere you look, companies are hawking products geared for searching internal documents. Google is making a good move: entering an expanding market as an established leader in search.
hawaiianshirt
Will it also index employee email?
Searched the intranet for 'herbal viagra'.
Results 1-10 of about 1,279,500. Search took 0.14 seconds.
your jesus is another mans xebu. chew on that hypocrites.
I see more of this in the future - if you want a search engine, buy one and put it on the network. If you want a web server, buy one and put it on the network. You want a disk server... Well you get the point.
As hardware continues to get cheaper, and software gets more expensive as it gets more complex, it makes sense to do this rather than trying to configure multiple applications all on the same server.
And good luck to Google making money on this so they can keep their search engine fast and free of annoying advertisements.
Sig is taking a break!
I would like to find a search engine that will index:
- text files
- html files
- PDF files
- names of binary files
Unfortunately, I am not able to spend much to purchase such a search engine (say $20, not $20K). This would be for my personal use, not for any kind of commercial use, and would not be funded except by my anemic hobby budget. Does anybody have any recommendations?
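For a hobby-sized collection, the wish list above can be roughed out with nothing but the Python standard library. This is a minimal sketch under stated assumptions, not a recommendation of any product: the function names (`build_index`, `search`) and the extension sets are made up for illustration, text and HTML files are indexed by content, and everything else (binaries included) is indexed by file name only. PDF content is deliberately left out, since extracting it would need a third-party tool; PDFs are still findable by name.

```python
import os
import re
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical extension sets; adjust to taste.
TEXT_EXTS = {".txt"}
HTML_EXTS = {".html", ".htm"}
WORD_RE = re.compile(r"[a-z0-9]+")


class _TextExtractor(HTMLParser):
    """Collects the text between tags, ignoring the markup itself."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return WORD_RE.findall(text.lower())


def words_for(path):
    """Return the words to index for one file."""
    name = os.path.basename(path)
    words = tokenize(name)  # every file, binary or not, is findable by name
    ext = os.path.splitext(name)[1].lower()
    if ext in TEXT_EXTS or ext in HTML_EXTS:
        with open(path, encoding="utf-8", errors="replace") as fh:
            content = fh.read()
        if ext in HTML_EXTS:
            parser = _TextExtractor()
            parser.feed(content)
            parser.close()
            content = " ".join(parser.chunks)
        words += tokenize(content)
    return words


def build_index(root):
    """Walk a directory tree and map each word to the files containing it."""
    index = defaultdict(set)
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            for word in words_for(path):
                index[word].add(path)
    return index


def search(index, query):
    """Return the paths that contain every word in the query (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

Usage would be something like `index = build_index("/home/me/docs")` followed by `search(index, "google appliance")`. There is no ranking and no on-disk index here, so it only makes sense for small collections.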
Edward Burr
Having a smoking section in a restaurant is like having a peeing section in a swimming pool.
Google did exactly what we fanboys all whined and complained for: a company that made a good product (awesome search engine) without selling out (no popup ads). Google offered a free service, built up an enormous following, and now offers its premium service for a premium price, while ensuring its loyal customers keep their free services. Forget eBay, Google is an Internet success story worthy of such praise!
The companies that are using the appliance are large corporations with hundreds, perhaps thousands, of computers and millions of files and documents to find. The real question is how much money the company is losing from people who have to redo misplaced documents, or make new ones which are similar to another document that someone else made a while back. In a large corporation, a thousand people working at $20 an hour who each take 1 hour to redo a document or spend time finding it already cost $20,000, so it makes up for the cost. Also, if it gives Google more money, the better the chance the search engine stays free and without a ton of annoying advertising.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
It's a little more in-depth than the India Times article.
-- Dan
Yes, quite CLEARLY it's only for those who've got some cash to blow. If you've got a modest-sized Intranet site, I would highly recommend htDig. I've installed and configured it in several places and it works like a charm. Best of all, it's GPLed! Sure, it doesn't have all the fancy matching algorithms used by Google, but it does a damned good job nonetheless.
I only post comments when someone on the internet is wrong.
They just implemented this where I work; it's a vast improvement over what we had before. It even includes the cache and newsgroup features!!
Two thumbs up!!
No one got beat up more often than the mimes of the old west!
At least then the search feature would work right and they can finally cache all those sites that we take down.
can't sleep slashdot will eat me
Unless Google reimplemented their own operating system, or <shudder> ported it to Win2K, they have a very expensive product, that runs on Linux, that is not GPL.
More power to Google. I'm glad to see them finding a way to make money without trashing their search engine, as happened with the previously good search engines that came before (e.g. Altavista, Lycos).
One CPU cycle wasted on digital restrictions management is ONE TOO MANY.
Part of the success of the Google technology is based on the PageRank system, which depends on many people linking to pages and so "ranking" them. On a corporate site you don't have as many separate opinions (i.e. pages managed independently), so perhaps the PageRank part of Google won't be as successful. OTOH just having fast search of all the docs would be good here :)
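To make the "many people linking" point concrete, here is a minimal sketch of the simplified textbook form of PageRank, computed by power iteration. It is not Google's actual implementation; the function name `pagerank`, the damping value of 0.85 and the tiny example graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> rank, with ranks summing to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: three pages "vote" for c, and c links to a.
web = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this example the page with the most incoming links comes out on top, and the page it links to comes second. On an intranet where most links come from a handful of centrally managed index pages, there are far fewer independent "votes" to separate one document from another, which is the concern raised above.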
development.lombardi.com
This has a LOT more business application than appears on the surface. And $20K for such a solution is comparable to paying $50 for Red Hat to run a server.
Back in my systems integration days, we had very many law firm clients who used document management to organize the truly prodigious quantity of information they had to deal with. Spending $50K on the solution was not unheard of even among small firms. In fact, they usually wound up spending $20K just on third party maintenance utilities to support their document management systems!
Isn't this just confirming what we already knew?
On top of that, depending on the size of your intranet and how efficient/inefficient indexing already has been, $20K may be a bargain.
Of course, how many companies are really going to have a use for it? For giggles, let's say the entire Fortune 500. That's 500 * 20K = 10,000K = 10 million dollars US. In the grand scheme of things, that's a lot of money, but not a LOT of money. Perhaps they'll add on pay-per-use functions for even ritzier search features?
Sigs? We don't need no goddamn sigs!
sig--we don't need no goddamn sig
Years ago Infoseek offered a version of their search engine to index LARGE collections of documents. We had over 500,000. It was around 15k, if I remember correctly. Python on a Sparc 20 (20k itself at the time, with memory, processors, array and tapes). So we had almost 4k tied up in the whole thing. There was, if I remember correctly, a per-site or per-page fee in addition over so many documents. I made an error in a config file once and allowed it to traverse links; other than filling the hard drive quickly, the additional costing we did afterwards, to see how much it would be should we decide to keep those docs, was hilarious.
:) Indexing LARGE repositories isn't easy and config can be a pain. 20k sounds OK to me. I have YET to see an open source solution that can handle VERY large document sets. There is ASPSeek, but it still has issues, and over about 2.5 million docs I hear it's a dead horse.
20k isn't bad at all if you're talking some serious indexing. We indexed the technical documents of 5 F500 companies at the time, before they were all in house; this was 97-98. It was slick. I often wondered what happened to that software package.
Anyone know what Google is written in? I decompiled a fair bit of Infoseek's just to see what was what, and because I could.
Sig went tro...aahemmm.....fishing........
Wouldn't it be great for when they say "your code doesn't meet the specification of what the product needs to do" and you can use it to say "let's look to the wayback machine to see when you changed the spec but didn't bother telling me"
:-)
Demonstrant's Open Source Tools
Slashdot talked about this in 1999 when the patent came up. It's 2+ years later now. Google has mostly crushed the competing search engines because the results of their algorithm are preferred to other algorithms. Their revenue sources are not public, but I believe I read recently that half of their revenue is from advertisements and half from technology licensing.
So, the point for discussion...
The world's favorite search engine exists because of its software patent. This patent has caused great harm to the competing search engines. Is this ok because...
No taint checking (what happens if 'q' contains ";rm -rf /;"?).
No warnings.
No proper formatting of HTML on the output. If the grep matches "", then it's not going to display anything on netscape. You need to either strip tags, or force tag matches.
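The script being criticized isn't shown here, so the following is only a hypothetical stand-in sketching what the fixes listed above might look like in Python. The names (`clean_query`, `grep_lines`, `render_results`) and the whitelist pattern are assumptions: the query is checked against a whitelist before use, the matching is done in-process rather than by handing the query to a shell, and matched lines are escaped before going into the HTML output.

```python
import html
import re

# Hypothetical whitelist: letters, digits, underscore, space, dot and hyphen.
SAFE_QUERY = re.compile(r"[\w .-]{1,64}")


def clean_query(q):
    """Return the query if it looks harmless, otherwise raise ValueError."""
    if not SAFE_QUERY.fullmatch(q):
        raise ValueError("query contains unsupported characters")
    return q


def grep_lines(lines, query):
    """Case-insensitive substring match, done in Python with no shell involved."""
    needle = clean_query(query).lower()
    return [line for line in lines if needle in line.lower()]


def render_results(lines, query):
    """Build an HTML list of matching lines, escaping any markup in them."""
    items = "".join(
        "<li>%s</li>\n" % html.escape(line) for line in grep_lines(lines, query)
    )
    return "<ul>\n%s</ul>\n" % items
```

With this, a query such as `;rm -rf /;` is rejected by `clean_query` before anything is searched, and a matched line containing `<b>` is displayed as literal text instead of being swallowed by the browser as a tag.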
Finding that vital piece of information can be far more important than $20k, especially to a large organisation.
Government of the people, by corporate executives, for corporate profits.
Right now Google tends to be among the bigger darlings of Slashdot, but will they remain that way if they release this product and it's not Open Source? 'Cause they're nuts if they're planning on charging $20K for it but making it Open Source. Are they traitors to the cause, or is it just another understandable case of "Money talks, bullshit walks" when it comes to Open Source and the Real World?
So what, Google isn't a 100% libre-kosher company? Name any of their competitors that is. It's called "lesser of two evils".
As far as I know, Google has never filed frivolous "IP" lawsuits, they respect web standards, they provide gratis, decent service, they don't fuck with your browser, and they tell you who paid for word placement as opposed to just putting paying advertisers on top without mention. They also happen to use free software and give it good press.
Actually, saying it doesn't have all the fancy matching algorithms isn't really fair.
Granted, we can't implement Google's patented things, but that's not to say we don't come close.
Indexing the text of links to documents? Yes.
http://www.htdig.org/attrs.html#description_fac
Keeping track of the weight of links pointing to a document? Yes.
http://www.htdig.org/attrs.html#backlink_factor
Probably the big "missing link" is a proximity weighting. Interested? Help is always welcome!
-Geoff