What Desktop Search Engine For a Shared Volume?
kriston writes 'Searching data on a shared volume is tedious. If I try to use a Windows desktop search engine on a volume with hundreds of gigabytes, the indexing process takes days and the search results are slow and unsatisfying. I'm thinking of an agent that runs on the server, indexes the volume regularly, and talks to the desktop machines running the search interface. How do you integrate your desktop search application with your remote file server without forcing each desktop to index the hundred-gigabyte volume on its own?'
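To make the "index once on the server" idea concrete, here is a minimal sketch in Python. It is not any particular product's design: the function names `build_index` and `search` are made up for illustration, it only handles files readable as plain text, and it assumes the whole index fits in memory. A real agent would also need format-specific text extraction and incremental updates.

```python
import os
import re
from collections import defaultdict

# Hypothetical tokenizer: runs of lowercase letters and digits.
TOKEN = re.compile(r"[a-z0-9]+")


def build_index(root):
    """Walk `root` once and map each word to the set of relative paths containing it."""
    index = defaultdict(set)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file: skip it rather than abort the run
            rel = os.path.relpath(path, root)
            for word in set(TOKEN.findall(text.lower())):
                index[word].add(rel)
    return index


def search(index, query):
    """Return the sorted paths that contain every word in `query`."""
    words = TOKEN.findall(query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)
```

The server would run `build_index` on a schedule and answer `search` calls from the desktops over whatever protocol you like, so each desktop only ever sends a query and receives a list of paths.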
They already have it indexed for you.
You've stumped Slashdot. Bravo!
Ocean is land, covered with water.
Not that I've ever used it before, but it sounds like it does what you want: http://www.google.com/enterprise/search/gsa.html
Here are a few options you might want to consider: 1) Use Office SharePoint Server 2007 to index the share. 2) Upgrade to Windows Server 2008 (or above) and Windows Vista (or above) and use the Federated Search feature: http://trycatch.be/blogs/roggenk/archive/2007/11/05/windows-vista-amp-windows-server-2008-federated-search.aspx
MS does have a solution; it's called Windows Federated Search. Windows 7 with 2008 R2 has it, and there might be a way to do it with Windows Desktop Search 4.0. Here's some info on it: http://geekswithblogs.net/sdorman/archive/2009/05/14/windows-7-federated-search.aspx
Yes, Google's Search Appliance (GSA) could be used; I have seen it used with limited success. The main problem was how to respect access control on documents: either you index them or you don't, and if you index them with GSA, sensitive data may show up in search results. Also, we had a lot of trouble "taming" GSA: its crawling would regularly take down servers that were sized only for light loads.
I would suggest using Alfresco (http://www.alfresco.com/) as a CIFS (Common Internet File System) or WebDAV store for all those documents. This would give you the simplicity of a shared folder and the opportunity to enrich the documents with searchable metadata such as tags. Each folder (or any item, in fact) could have the correct access control, which would be respected by the search engine, Lucene: http://lucene.apache.org/java/docs/
Alfresco comes in both Enterprise and Community editions, and it's very easy to try out -- even our non-techie project manager could install it on his PC within 10 minutes. Try that with Documentum, FileNet or IBM DB2 Content Manager!
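For anyone wondering what "access control respected by the search engine" means in practice, here is a rough Python sketch of filtering results at query time. This is not how Alfresco or Lucene actually implement it; the function name `visible_results`, the `acl` dictionary of folder to allowed groups, and the deny-by-default rule are all assumptions made for illustration.

```python
import posixpath


def visible_results(hits, acl, user_groups):
    """Keep only the hits the user may see.

    `acl` maps a folder path ("" is the share root) to the set of groups
    allowed to read it. A file inherits the entry of its nearest ancestor
    folder; with no entry anywhere up the tree, access is denied.
    """
    allowed = []
    for path in hits:
        folder = posixpath.dirname(path)
        while True:
            if folder in acl:
                if acl[folder] & user_groups:
                    allowed.append(path)
                break  # the nearest ACL entry decides, either way
            if folder in ("", "/"):
                break  # reached the root with no entry: deny by default
            folder = posixpath.dirname(folder)
    return allowed
```

The point is that the index can cover everything while each query only returns what the asking user could open anyway, which avoids the "index it or don't" dilemma.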
Who?
You could just rsync the shared volume to a local drive as frequently as needed and run the search engine on the local copy.
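If you go the rsync route, a small wrapper like the following could be run from a scheduled task. It is only a sketch: the function names and any paths you pass in are placeholders, and it assumes an `rsync` binary is on the PATH and that the desktop has enough local disk for a full copy.

```python
import subprocess


def mirror_command(source, dest):
    """Build an rsync command that makes `dest` an exact copy of `source`."""
    # A trailing slash on the source copies its contents, not the directory itself.
    src = source.rstrip("/") + "/"
    # -a preserves times and permissions; --delete removes files gone from the source.
    return ["rsync", "-a", "--delete", src, dest]


def mirror(source, dest):
    """Run the mirror and raise CalledProcessError if rsync reports failure."""
    subprocess.run(mirror_command(source, dest), check=True)
```

After the first full copy, rsync only transfers files that changed, so the desktop indexer can pick up the differences from the local mirror.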
SharePoint is $$$$. Try Alfresco. Alfresco can look like a file share (it supports SMB, DAV, FTP, etc.). The indexing is built in and does not require a separate SQL Server license.
You mean Alfresco the document management system, not the CMS software. The Community Edition is free but unsupported, and the Enterprise Edition has a free 30-day trial. It looks like it won a government award for document management, which is rare for open source document management software.
Remember, Slashdot does not have a -1 disagree moderation, and no, troll, flamebait, and overrated are not substitutes.