What Desktop Search Engine For a Shared Volume?
kriston writes 'Searching data on a shared volume is tedious. If I try to use a Windows desktop search engine on a volume with hundreds of gigabytes the indexing process takes days and the search results are slow and unsatisfying. I'm thinking of an agent that runs on the server that regularly indexes and talks to the desktop machines running the search interface. How do you integrate your desktop search application with your remote file server without forcing each desktop to index the hundred gigabyte volume on its own?'
They already have it indexed for you.
You've stumped Slashdot. Bravo!
Not that I've ever used it before, but it sounds like it does what you want: http://www.google.com/enterprise/search/gsa.html
How about using a program like Documentum? We generate several thousand technical documents and drawings a month, and use it for all our document management needs.
How about Everything (assuming the server is Windows & NTFS)? Works well for me (quickest desktop search I've found yet), and can either run locally or connect to an ETP server. The site seems to be down right now, but here's the original Lifehacker article where I found it. Incidentally, I never heard of ETP til I started using it. Anyone know if it's an Everything-specific protocol?
"Once in Hawaii I had sex with a 102 year old male turtle. It is difficult to argue that it was consensual." - Steve Ma
Here's a few options you might want to consider: 1) Use Office SharePoint Server 2007 to index the share 2) Upgrade to Windows Server 2008 (or above) and Windows Vista (or above) and use the Federated search feature: http://trycatch.be/blogs/roggenk/archive/2007/11/05/windows-vista-amp-windows-server-2008-federated-search.aspx
I guess it could work, although you can't index the files directly. You have to run a local copy and one on the server as an ETP server. www.voidtools.com, although it seems to be down at the moment, so here's a link to the FAQ on Google's Cache: http://74.125.113.132/search?q=cache:fcYHcEJKH3UJ:www.voidtools.com/faq.php
MS does have a solution; it's called Windows Federated Search. Windows 7 with 2008 R2 has it, and there might be a way to do it with Windows Desktop Search 4.0. Here's some info on it - http://geekswithblogs.net/sdorman/archive/2009/05/14/windows-7-federated-search.aspx
If you have a Windows server, you can tell SharePoint to index the file share. See: http://dotnetmafia.sys-con.com/node/1046930
I use SSH to access my file server. Because I use it as a music server as well, I use X forwarding. As I'm accessing the actual server instead of just mounting fileshares (which I do also), I do the file searches directly on the server. Usually good old locate. Haven't really found anything that beats it yet, but then again I like the CLI. If you're running windows... Sorry.
Yes, Google's Search Appliance (GSA) could be used, I have seen it used with limited success. The main problem was how to respect access control on documents: either you index them or you don't, and if you index them with GSA, sensitive data may show up in search results. Also, we had a lot of trouble "taming" GSA: it would regularly take down servers that were dimensioned for light loads.
I would suggest using Alfresco http://www.alfresco.com/ as a CIFS (Common Internet File System) or WebDav store for all those documents. This would give you the simplicity of a shared folder and the opportunity to enrich the documents with searchable metadata such as tags, etc. Each folder (or any item, in fact) could have the correct access control that would be respected by the search engine, Lucene. http://lucene.apache.org/java/docs/
Alfresco comes in both Enterprise and Community Edition, it's very easy to try out -- even our non-techie project manager could install it on his PC within 10 minutes. Try that with Documentum, FileNet or IBM DB2 Content Manager!
Who?
DTSearch (http://www.dtsearch.com/). Although not free, you can install it on a server, schedule index updates, and have the client use the indexes (provided they are placed on a shared folder of the server).
You could just rsync the shared volume to a local drive as frequently as needed and run the search engine on the local copy.
slocate
Really? I ssh to the fileserver, and then do something like
*ducks*
One way is to set up Microsoft Indexing Service on the server with the shared drive. The MSC console app provides a search capability and one can also use the Indexing Service SDK for client apps.
Basically, you need your desktop search application to look at the index file on the remote file server generated by an instance of the application running on the file server. Technically, incredibly simple but I don't know which application currently available is divided into front and back ends like that. Maybe open source...
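For what it's worth, the front-end/back-end split described above can be sketched in a few lines. This is only an illustration, not how any of the products in this thread work: it indexes file names only, the functions `build_index` and `search_index` and the table layout are made up for the example, and it assumes the index file sits somewhere the desktops can read. SQLite's own documentation warns about file locking on network filesystems, so treating the shared index as read-only (or copying it locally) is probably the safer way to use it.

```python
import os
import sqlite3


def build_index(root, db_path):
    """Back end: run on the file server, walking local disk only."""
    conn = sqlite3.connect(db_path)
    with conn:  # commit everything as one transaction
        conn.execute("DROP TABLE IF EXISTS files")
        conn.execute(
            "CREATE TABLE files ("
            "path TEXT PRIMARY KEY, name TEXT, size INTEGER, mtime REAL)"
        )
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                try:
                    st = os.stat(full)
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                conn.execute(
                    "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                    (full, name, st.st_size, st.st_mtime),
                )
    conn.close()


def search_index(db_path, term):
    """Front end: run on a desktop; reads the index, never the volume."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT path FROM files WHERE name LIKE ? ORDER BY path",
            ("%" + term + "%",),
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

The server would run `build_index` on a schedule and each desktop would call `search_index` against the resulting file, so the hundred-gigabyte walk happens once, on local disk, instead of once per client.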
Use Microsoft Search Server 2008 Express... it's free; all you need is a free server box. Also check out SharePoint Search and FAST enterprise search.
http://www.microsoft.com/enterprisesearch
It really depends on what you are looking for. Are you wanting to index file names or do you want names and contents? For me, I typically know what I'm looking for based on file name, so Locate32 works out great. It's the Windows equivalent to 'slocate'.
Google appliance...
dtsearch - google it
Final Cut Server, CatDB, Mediabeacon and a number of other asset cataloging and management tools could do the job and offload it in a reasonably sensible way.
Autonomy search. Check them out. One of the best in the world. Obviously an enterprise solution and not inexpensive.
So I think I'd start by looking here.
ISYS Search Software (http://www.isys-search.com/) has a variety of enterprise search applications. Web based search interface or a local client depending on your needs.
-i
Wondering if there's anything cross-platform. I'm in the process of setting up an OpenSolaris fileserver (primarily to use ZFS/Raid-Z) and have both linux and windows boxes. It would be great to be able to have an index on each that could be read by a client app or a unified index perhaps.
Yeah, my thought exactly? I wasn't aware that it was a problem searching hundreds of gigabytes on shared volumes. We have a couple of terabytes shared by our Mac servers and I don't think I've had search times longer than ten seconds over a couple of million files.. MS Office files, PDFs, movies, audio, pictures, photographs, text, HTML, source code.. all indexed with metadata and contents.
Even in the days before Spotlight, using AppleShare IP Servers in the 90s, finding stuff on the servers was never an issue. It has always been so fast that I have never even reflected on the fact that it was fast. Maybe I should use some other operating system once in a while to experience what the majority experiences. Or not.. I'd rather stay carefree and productive.
Don't call me when you figure this out.
- Henrik
- when the Shadows descend -
www.blackball.com
They do federated indexing/searching without having to import data. Scans the data where it resides...
The dog. Always helps me sniff out the files.
SharePoint is $$$$. Try Alfresco. Alfresco can look like a file share (it supports SMB, DAV, FTP, etc). The indexing is built in and does not require a separate SQL Server license.
As usual, Apple and Closed Source to the rescue! Is there anything open sores can do that comes even CLOSE?
RDP into the machine, then CTRL-F on the volume, which is now local. Don't bother with indexing service anyway, it just wastes time. Life's faster without it.
I am probably missing something obvious here or misunderstanding the question, however, I am very happy with the search integrated in Windows 7. I have about a terabyte of data across two different volumes, and when I use the regular Windows 7 search I get instant, detailed results.
updatedb
locate [bleh]
It's easy, really.
You mean the Document Management Alfresco and not the CMS software. The Community Edition is free but unsupported, and the Enterprise edition has a free 30 day trial. It looks like it won a government award for document management which is rare for open source document management software.
Docs Open is a commercial document management system but right now their web page doesn't seem to be working. We used it at a law firm I worked at. IIRC it was able to search through the billions of documents that the 300+ lawyers used in their cases.
You don't allow every client to index. There's been several suggestions already, but most enterprises intentionally DISABLE desktop search. It absolutely slaughters the share. It's not a big deal when one user is doing it... but when 5,000 are, the I/O load becomes unsustainable.
If I try to use a Windows desktop search engine
"Earth allows you to find files across a large network of machines and track disk usage in real time. It consists of a daemon that indexes file systems in real time and reports all the changes back to a central database. This can then be queried through a simple, yet powerful, web interface. Think of it like Spotlight or Beagle but operating system independent with a central database for multiple machines with a web application that allows novel ways of exploring your data." http://open.rsp.com.au/projects/earth
While the parent's response is rather snide, it nonetheless highlights an important truth. Granted there was a many-year gap where Spotlight didn't have server integration like good old Classic Mac OS, but it does now as of Leopard.
To be honest I assumed Windows would have the same thing already, given how obvious it is. Why don't they ever copy the good bits? :P
I've tried ducks, but they tend to nibble the occasional one or zero, and they leave an awful mess on the platters when they poop. Try Spotlight instead -- not as cute, but easier on the data, hardware, and the nose.
"First things first, but not necessarily in that order."
- Doctor Who
Seriously. You're probably going to want a separate server (or servers) for this job. You didn't specify what you're indexing, how often, or where; however, I'll make some assumptions and point you towards an enterprise search appliance or product. Many will probably point you to Google Enterprise Search. I've worked with the search functionality within Microsoft SharePoint 2007, and its (ostensibly) free spin-off, Microsoft Search Server. Again, you'll probably need to dedicate some hardware to this. In addition to crawling all the content, the search product will also need to index and present it to the user. This requires a front-end crawling role, a back-end indexing role, and a database to keep all the data in. Dealing with several hundred gigs means you'll want to have separate servers for all 3 (again, basing this off of my knowledge of MS products. YMMV). The nice part is that your users will work through a webpage, and the workstation won't be tied up doing any crunching of its own.
Try starting here: http://www.google.com/enterprise/
or here: http://www.microsoft.com/enterprisesearch/en/us/search-server-express.aspx
Everything is EVERYTHING you could want
1. it is blazingly fast indexing drives
2. it is truly instant searching
3. it can be run in client/server mode
Haven't used it in a couple of years since they went away from their free model, but X1 (www.x1.com) rocked as a desktop search engine. They have federated search and plugins for a variety of server apps (sharepoint, etc).
spotlight doesn't work on my ntfs volumes.
or am i doing something wrong?
http://www.apple.com/server/macosx/features/spotlight-server.html
I wrote a web site/spider to do this for the whole network at uni. It was beautiful C++ all the way. After I left some silly CS people rewrote it in Python/PHP (ugh) here: http://code.google.com/p/trufflepig/
No you're not seeing anything wrong, you were just distracted by him ducking.
(Disclaimer: I work for Extensis)
Portfolio Server can continuously index files on SMB/CIFS (and AFP) volumes using a feature called "AutoSync". Web and Desktop (Windows/Mac) clients then search by folder name, file name, document text, or other metadata. Indexing and thumbnail creation takes place on the server, so clients are relieved of any cataloging workload and metadata is centralized.
http://www.extensis.com/en/products/portfolioserver9/overview.jsp
One of the products in this category will probably meet your needs.
Sharepoint is $$$$ if you want search across portals. If you have a single "portal" (consisting of multiple "webs"), Sharepoint comes with Server 2003 and above and is included in the price. There is a difference between Sharepoint Portal Server and Sharepoint Portal Services. Portal Server serves multiple portals and is not free.
I am just embarking on a project to do exactly what the OP is asking for. Windows Server 2003 has an indexing service you can set up. http://www.windowsnetworking.com/articles_tutorials/Working-With-Windows-Server-2003-Indexing-Service.html It is limited on its own but provides the back-end tools you need.
Combine that with the next article from that site and you have a solution: http://www.windowsnetworking.com/articles_tutorials/Making-Windows-Server-2003-Indexing-Service-Useful.html
This article shows you how to use the Indexing service from an ASP script. The solution I am working on will be done in PHP as it can also link to COM applications. This basically allows you to put a file search tool on your Intranet which is indexed and returns the results very quickly. Best of all, it uses existing software on Windows and doesn't cost any extra.
Since this is a task that benefits from some optimization, there are so many different combinations of file servers/clients out there, and so many use cases to choose from, there are lots of different solutions but not many good ones that will do exactly what you want out of the box.
So, in order to narrow it down, you need to decide exactly what you're looking for. What server are you using? What clients do you need to support? Are you wanting to just search file names, or contents, ownership and modification times as well? Do you need the index to be completely up-to-date, or not? How long can you stand to wait for results?
"I assumed blithely that there were no elves out there in the darkness"
I had this same problem not too long ago - we have a shared documentation tree with tens of thousands of documents that I wanted to index. I tried dozens of search engines in my spare time, most of which were just horrible (Beagle), were a nightmare to install for someone like me who's not a full or even part time admin (Apache SOLR), wouldn't allow cross platform access (lots of Windows ones, obviously), store a complete separate copy of every document (Alfresco, which didn't seem to have an option to ) and especially ones that had trouble indexing pdfs and MS Office docs (which we have a lot of). I'm not the IT guy, and have no budget for this, so Google Appliance was right out.
What I eventually ended up with is Omega on top of Xapian - http://xapian.org/ - it's not too hairy to install, indexes pretty fast, points back to the original files (so it doesn't duplicate everything), and can handle multiple repositories. It will also detect dups and not show them twice, though similar files are treated as completely different (which is probably what you want in the absence of something more sophisticated).
Two downsides: It can't do incremental update (unless that's changed recently) so you have to rebuild the entire index nightly. And the search is really sparse and ugly, which turned off some of my users, but you can rewrite the templates if you want to.
If you can't grep or locate, go fsck yourself ;)
You can use Splunk if your files are heterogeneous. It is fast and easy to set up. It is pretty good for doing relatively advanced searches against tons of data.
http://www.splunk.com/
The Google GSA is quite good, too, and better for non-technical users.
From IBM and Yahoo called OmniFind. It runs on a desktop or server and can index multiple shares... and the basic version is free but offers a lot of functionality.
Although if your business is booming, a GSA is freakin' sweet.
1. uninstall windows
2. install linux
3. run locate
Sounds like a job for Sharepoint!
We use ISYS at work.
And it's danged handy. Does large-scale shared drives very well.
http://www.isys-search.com/
If the share is indexed, Windows Search 4.0 will query the index on the remote server. Here is a link to the admin documentation with more information: http://technet.microsoft.com/en-us/library/cc772446(WS.10).aspx
Novell's QuickFinder (http://www.novell.com/products/openenterpriseserver/quickfinder.html) works well in a Netware or OES2 environment. It even respects file permissions when displaying results.
There are a number of decent options, which one to pick depends on specific requirements not included in the original question. Did the OP even search?
http://en.wikipedia.org/wiki/List_of_enterprise_search_vendors
Seriously? Microsoft offers a free piece of software that runs on the server that does "exactly" what he needs and we are suggesting Sharepoint/Alfresco, grep/locate and Google Search Server?
http://www.microsoft.com/enterprisesearch/en/us/search-server-express.aspx
You don't have to like them, but you should consider that they want to steal the market everywhere enough to give away decent software to get their foot in the door.
GSA is the best way to do enterprise search! It crawls, indexes, and serves data. About the concern one of the guys raised above about access control: it authenticates you against the current authentication server (NTLM, Kerberos, LDAP, etc.) and only shows you results that you are eligible to see. Oh wait, there is one hiccup: licensing sucks. It is based on the number of documents it crawls, and after 2 or 3 years you have to return your appliance and buy a new one!
Are you trying to index all files, or just documents, or what? If you are trying to cheap out on indexing documents, I highly recommend Alfresco
Except then you have another terrible search solution which isn't meant for the amount of data you'd find on a large server. Worse, you have an operating system that is terrible as a server solution.
On the other hand, you could just use a unix/linux distro of your choice, and beagle (http://beagle-project.org) - which is meant for indexing large amounts of data and has many clients some of which can remotely access it.
I use X1 locally to index my hard drive, network drives, email attachments, etc. I love the search-as-you type functionality, unobtrusive indexing, and previews. They have an enterprise version available that might meet your needs. http://www.x1.com
i remembered how to use google...
For Linux/Unix:
1. Mount the CIFS share/s on a linux/unix box.
2. run updatedb
3. use the locate/slocate command to search OR use places->search in gnome?
For Windows:
1. Install cygwin+updatedb+locate/slocate
2. do as above
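If you'd rather not depend on cygwin, the updatedb/locate idea itself is small enough to sketch. This is a rough stand-in and not the real locate database format: `update_db` and `locate` are names invented here, the "database" is just one path per line, and it assumes no file name contains a newline.

```python
import os


def update_db(root, db_file):
    """Rough analogue of updatedb: store one full path per line."""
    count = 0
    with open(db_file, "w", encoding="utf-8", errors="surrogateescape") as out:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                out.write(os.path.join(dirpath, name) + "\n")
                count += 1
    return count


def locate(db_file, fragment, ignore_case=False):
    """Rough analogue of locate: substring match over the stored paths."""
    needle = fragment.lower() if ignore_case else fragment
    with open(db_file, encoding="utf-8", errors="surrogateescape") as fh:
        for line in fh:
            path = line.rstrip("\n")
            haystack = path.lower() if ignore_case else path
            if needle in haystack:
                yield path
```

Point `update_db` at the mounted share from a scheduled job, then something like `list(locate("paths.db", "invoice", ignore_case=True))` answers filename queries without touching the share again.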
Some Other Options:
Femfind: http://femfind.sourceforge.net/
pySMBsearch: http://sourceforge.net/projects/pysmbsearch/
LAN-Crawler: http://code.google.com/p/lan-crawler/
Used this a while back to index all the samba shares at my res college, works great, and is accessible from a browser by other hosts on the LAN.
ffsearch.sourceforge.net
Microsoft has a few solutions you can consider depending on your specific needs.
With Windows XP/2003, Vista/2008, or Windows 7 - you can install Windows Search 4 (not necessary on Win7, but recommended for Vista) on the server side to index the content, and then if you have WS4 (or Win7) on the client, it will automatically query the remote index when you perform searches against that file share.
Alternatively, if you run the free Microsoft Search Server (the Express version is free) which is based on SharePoint, you can index files on the server and then set up a Federated Search connector in Windows 7. Windows 7 supports federating to OpenSearch + RSS/Atom enabled sources, and SharePoint / Search Server support this. On current versions there's a bit of manual work to create the right OpenSearch description file, but it's pretty easy. The upcoming 2010 SharePoint version provides those out of the box (as well as some additional enhancements supported by Windows 7).
I'm actually the developer who built the OpenSearch feature in Windows 7, so if you have questions about the search options in Windows 7, you can visit my blog (brandonlive.com) and/or e-mail me (via my site).
Hope that helps.
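For anyone wondering what the "OpenSearch description file" mentioned above looks like, here is a minimal sketch that generates one. The element names and namespace come from the OpenSearch 1.1 spec; the helper `make_description` and the search URL in the usage note are placeholders made up for this example, not what Search Server actually exposes, so check the real results URL on your own server.

```python
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def make_description(short_name, description, rss_template):
    """Build an OpenSearch description pointing at an RSS results feed.

    rss_template must contain the {searchTerms} placeholder that the
    client substitutes with the user's query.
    """
    ET.register_namespace("", OPENSEARCH_NS)  # emit as the default xmlns
    root = ET.Element("{%s}OpenSearchDescription" % OPENSEARCH_NS)
    ET.SubElement(root, "{%s}ShortName" % OPENSEARCH_NS).text = short_name
    ET.SubElement(root, "{%s}Description" % OPENSEARCH_NS).text = description
    url = ET.SubElement(root, "{%s}Url" % OPENSEARCH_NS)
    url.set("type", "application/rss+xml")
    url.set("template", rss_template)
    return ET.tostring(root, encoding="unicode")
```

Calling `make_description("Team Share", "Search the shared volume", "http://search.example.internal/results?q={searchTerms}&format=rss")` returns the XML as a string; as far as I know Windows 7 expects it saved with an .osdx extension, but treat that as something to verify.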
X1 is by far the most robust and accurate client I've used. It's not free but it's a great piece of software. There's even a server application that will allow you to search throughout your entire organization.
#updatedb
I have used locate32 on my Windows XP machine for indexing the local and shared drives. My shared volumes server runs Linux, so I tried utilizing its locate database. Unfortunately I came to the conclusion that locate32 and the Unix/Linux variant of locate are not compatible at all. However, if you are using a Windows server then you can run locate32 on the server and allow the client machines to access it.
I use xfriend personal (20 US). For your problem you would need xfriend business: http://www.xfriend.de/de/business/loesungen/
Try www.x1.com; I think it will solve all your issues.
My understanding is that you only get the full document text search when the data is backed by a real SQLServer license. The person was looking for a full search solution. This is built into Alfresco.
SQLServer is per CAL even though the app is a web app.
The free Microsoft Search Server Express does exactly that. Plus it is very extensible if you want to write some code for it. Free download at:
http://www.microsoft.com/enterprisesearch
Under linux/unix:
1. mount CIFS share
2. run updatedb
3. run locate/slocate
Under Windows:
1. Do the above with cygwin
Other Options:
Locate32
Femfind
Lan Finder
We use mnogosearch to index about 700GB: search results are on our intranet, but mnogosearch can return XML, I believe, so integrating it into some kind of web service should be easy. PS: access restrictions are a bit of a problem, though. We now have to implement our own search on top of mnogosearch which checks permissions first.
We are the developers of the Zoom search engine.
http://www.wrensoft.com/zoom/
We have spent some time recently looking at the problem of indexing large amounts of data; see
http://www.wrensoft.com/zoom/support/faq_large_sites.html
Many people above have recommended using external appliances, or external hardware. This doesn't make sense in our opinion. Using an external indexer that crawls your files means that 1) You are loading up your network, 2) You are limited to network bandwidth speeds (rather than SATA or SCSI data transfer speeds) 3) You have the overhead of the HTTP protocol.
What makes sense is to run the indexer on the server that is hosting the files and index them directly off the disk. Don't spider them, and don't do it across a network. This can save you many days of indexing time.
But with this much data, I don't think there is any really quick solution. Whatever you decide to do is going to take some setup effort.
IBM OmniFind Yahoo Edition.
No charge (unless you want support). Install it on your shared fileserver, it will index the files on the server and provide a web interface for searching it.
It does need a reasonable server, though, if you have a lot of documents.
SharePoint is $$$$.
It's traditional to use **** to imply expletives. Although I don't really see how just four characters covers "unbelievable half-baked cocksucking pile of shit".
... seconded
Herve S.
Desktop search is by design not fit for indexing large quantities of data over the network. You should consider switching to a dedicated search appliance or software solution.
Examples are:
Google Mini: http://www.google.com/enterprise/search/mini.html
Thunderstone: http://www.thunderstone.com/texis/site/pages
SophiaOne: http://www.sophiaone.com
Microsoft search server: http://www.microsoft.com/enterprisesearch/en/us/search-server-express.aspx
Spotlight is the obvious answer if you have OS X. Not everybody in the world is lucky enough to be in that position; most are stuck on one of the inferior platforms. Your rubbing it in is not helping; it just alienates people who have already been through enough and have it tough.
Autonomy IDOL with mapped security.
Dedicated search is the way to go.
Desktop search is not meant for share indexation. Network load of multiple users will kill you.
Possible solutions are:
Google Search appliance: http://www.google.com/enterprise/search/mini.html
Thunderstone: http://www.thunderstone.com/texis/site/pages
SophiaOne: http://www.sophiaone.com
MS Search server 2008: http://www.microsoft.com/enterprisesearch/en/us/search-server-express.aspx
Slackware is at version 13 which makes it much more advanced than a version 7.
Read how Slackware got to version 13 so quickly at this link:
http://en.wikipedia.org/wiki/Slackware ;-)))
http://www.novell.com/products/openenterpriseserver/quickfinder.html
You should try regain http://regain.sourceforge.net/?lang=en . There is a desktop and a server version. And if you are clever you mix them. You can create your indices with the server version (crawler) and copy them to your desktop computer. So you can search without the need for an application server to run the search frontend.
It's Microsoft's fault.
Or then it's your fault for formatting something with NTFS, and then putting your data there.
Never blame Apple. All hail Apple.
I'm an avid user of Copernic Desktop Search. It indexes my shared network drives. Once the index is created, CDS only updates the index when the computer is idle, which does NOT slow down your computer. There is a free trial on their web site. www.copernic.com
Check it out!
I had a look at some solutions last year, and ran into one hell of a road block; most solutions I had a look at presume that all the information you're indexing should be searchable and/or available to anyone that can reach your search tool's client.
Has anyone had experience with something that will search the indexes for items based on your credentials? (Meaning that if you're not in accounting you can't get results for that data set)
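The usual answer to the credentials question is "security trimming": store who may read each document alongside the index entry and filter hits against the caller's groups at query time. A toy sketch of the idea follows. The record layout, the `allowed_groups` field, and the `search` function are all invented for illustration; a real product would pull the ACLs from the file server rather than hard-code them.

```python
def search(index, term, user_groups):
    """Return paths whose text matches and which the user may see.

    index       -- iterable of dicts with "path", "text", "allowed_groups"
    user_groups -- groups the querying user belongs to
    """
    needle = term.lower()
    groups = set(user_groups)
    return [
        doc["path"]
        for doc in index
        if needle in doc["text"].lower() and groups & doc["allowed_groups"]
    ]


# Hypothetical sample data standing in for a real index.
SAMPLE_INDEX = [
    {"path": "/share/accounting/q3.xls", "text": "Q3 budget figures",
     "allowed_groups": {"accounting"}},
    {"path": "/share/public/budget-howto.txt", "text": "How to file a budget request",
     "allowed_groups": {"accounting", "engineering", "staff"}},
]
```

With this data, someone in `engineering` searching for "budget" only gets the public how-to, while someone in `accounting` gets both. The catch is that the stored ACLs go stale unless the indexer refreshes them whenever permissions change.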
Ever consider hiring a librarian? I've worked at 3 small companies that had one and they were far more profitable than most of their competition because there was someone in charge of organizing the data.
You could use Microsoft Enterprise Search Server Express, which is free (if you have a Windows Server license lying around). It's the same search engine as MOSS without the CMS functionality and it can crawl just about everything either natively or with connectors. You can use MSSQL Express as the database engine, which is also free.
Or you could go completely open source with Apache SOLR, though I hear it's so featureful that it's very difficult to install and configure.
I agree. Ducks are not all they're quacked up to be. You bet I'm posting as Anonymous Coward! :)
I've used this program with some of my users on a large network volume that they needed to do keyword searches on. The initial index build takes time (although not days), but subsequent updates are fast, and searching is super-fast. You can tell it what file types to index, too, which is nice. Everyone I've installed it for loves it, even though we have enterprise content management - they just find this easier to use and faster. http://s3.amazonaws.com/redtree/wilma/en/help/index.html
The Beagle tool on Linux can be used. It also has a web based
interface for running queries over the network. From looking at the
configuration screens, it appears to be able to cascade to other
systems to send queries to beagle services there also.
Dogs have a great sense of smell for finding things, so put a Beagle
to work databasing your files for instant access in the future.
I've used beagle for a long time, and I even added the Beagle plugin
for firefox so I can find content on web pages I've visited.
Thanks to an open source tool such as Beagle, I don't have to trust
proprietary code that has the potential of giving away my secrets
through hidden back doors.
-Joe
Again you are incorrect on several counts.
The whole point of Federated Search is that the server is NOT running Windows Search. Instead it can be running SharePoint / Search Server, FAST, or virtually any other indexing solution. Lots of indexing solutions can output RSS or Atom over HTTP, or can be easily extended to do so. Then you get the same local file experience users expect from Explorer, but with whatever remote index you want.
Also, Windows 7 and Server 2008 R2 do not run WS4, they have a newer version of the indexer that *is* significantly faster and scales significantly better.
Federated Search is a Windows 7 feature and did not exist in Vista.
Vista can query a remote Vista or Windows Search 4 index for a file share, but that's separate from the feature we call Federated Search, where Windows 7 can federate queries to OpenSearch enabled sources such as SharePoint, Search Server, FAST, etc.
If you're able to get a hold of it, IBM OmniFind Yahoo Edition would do the trick. Unfortunately Yahoo pulled the plug when they went into bed with Microsoft (Bing). I'm using it on a local intranet, and it works great. If you have a deep wallet, you can always look into the commercial version IBM offers, but it is really nothing but a packaged version of Apache Lucene.
Go to BlackBall, Inc.'s website and download a free trial of SearchIn 7.0:
http://www.blackball.com/b/nav
Best Indexing tool I have ever used, for all types of data anywhere on a network, and the new version is cross platform.
And I'm not just saying that because I helped develop it. Haha, but seriously, it does everything you are talking about, and it outperforms other indexing and searching tools on the market.
It's so good, Toshiba even branded it under their name for distribution with their MFPs, for indexing documents that are scanned in from paper.
Check it out there too; it's the same program, just different branding:
http://copiers.toshiba.com/usa/software/document-management/e-bridge-research.html
Use something like the community version of KnowledgeTree. Searches are done on the server, not on individual desktops. I run it in combination with the open source Amahi server project and really like it.
The upcoming Solr 1.4 release (http://lucene.apache.org/solr) includes the Tika document parser. You can throw Word, PDF, etc. files at it, and it will index them for you. Solr is a web service; a few hundred lines of PHP would be enough to use Solr to index your shared volume, and about that much again for a web-based search tool.
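For what it's worth, here is roughly what the indexing half could look like, sketched in Python rather than PHP. It assumes Solr's extracting request handler is reachable at the stock example address `http://localhost:8983/solr/update/extract` and that the file path is an acceptable unique `id`; both are assumptions about your setup, and `extract_url` and `index_tree` are names invented for this sketch.

```python
import os
import urllib.parse
import urllib.request

# Assumed location of the extracting handler; adjust for your install.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def extract_url(path, base=SOLR_EXTRACT_URL, commit=False):
    """Build the upload URL for one file, using its path as the document id."""
    params = {"literal.id": path}
    if commit:
        params["commit"] = "true"
    return base + "?" + urllib.parse.urlencode(params)


def index_tree(root, base=SOLR_EXTRACT_URL):
    """Post every file under `root` to Solr; commit along with the last upload."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    for i, path in enumerate(paths):
        is_last = i == len(paths) - 1
        with open(path, "rb") as fh:
            body = fh.read()  # fine for a sketch; very large files need streaming
        req = urllib.request.Request(
            extract_url(path, base, commit=is_last),
            data=body,  # supplying data makes this a POST
            headers={"Content-Type": "application/octet-stream"},
        )
        urllib.request.urlopen(req).close()
    return len(paths)
```

Run something like `index_tree("/srv/share")` from a scheduled job on the file server itself, so the documents never cross the network just to be indexed.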
A cron job that does a find with file and maybe grep should be a good start. Then you can get clever with command line tools for pulling meta data out of images, videos, audio files, documents, etc.
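A bare-bones version of that idea, as a sketch: walk the share once on the server, store names in a SQLite file, and let clients query that file instead of crawling the volume themselves. This is Python rather than find/grep, it covers file names only (no content or metadata), and the table layout and function names are invented for the example.

```python
import os
import sqlite3


def build_index(root, db_path):
    """Walk `root` and record each file's path, name, size and mtime."""
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            rows.append((full, name, st.st_size, st.st_mtime))
    conn = sqlite3.connect(db_path)
    # Rebuild from scratch on every run; simple, though not incremental.
    conn.execute("DROP TABLE IF EXISTS files")
    conn.execute(
        "CREATE TABLE files (path TEXT PRIMARY KEY, name TEXT, size INTEGER, mtime REAL)"
    )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    conn.close()
    return len(rows)


def search(db_path, text):
    """Return sorted paths whose file name contains `text`, ignoring ASCII case."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "SELECT path FROM files WHERE instr(lower(name), lower(?)) > 0 ORDER BY path",
            (text,),
        )
        return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()
```

Schedule `build_index` from cron on the server and point the desktops at the resulting database file (or put a small web page in front of `search`); keep the database outside the indexed tree so it does not index itself.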
"Open sores". Freudian slip?
This is impossible to do with any product now, but you'll be able to do this with Windows Server N+1 as long as all your clients are also Windows N+1, but only if you also buy Microsoft Windows Office SharePoint Enterprise Search Server X. You could try to do it with current technology, but you'd have to remove one of your arbitrarily self-imposed restrictions. I could tell you which one, but I'd be reducing my chance that anything I'm saying is relevant if I guessed wrong.
Full text search in Alfresco uses Lucene. Or at least it did when I deployed it on Debian with PostgreSQL.
Depending on your needs you may find RSP's Earth to be a viable solution. http://open.rsp.com.au/projects/earth/ It's cross-platform and has a web interface (though you could build a desktop one easily enough if you wanted to). Cheers, Alan.
I think you are being a bit too vague on what kind of files you are searching for/indexing. We have 5TB of storage that's 70% full, that includes quite a few smaller files/code files, and I've never had an issue with "find -name ". Of course I rarely use that because our files are organized and we have naming standards. Do you have a shared volume full of random files all mashed together? Are you in a college dorm and someone grabbed an external drive and shared it out for everyone to just throw porn and music and movies into and you want to index it so you can find Beastie Boys songs and XXX Blond Anal videos just that much faster?
Have a look at the X-Friend product line from the German company x-dot (http://www.x-friend.de).
They offer an enterprise search solution which seems to do everything you want.
They support a ton of file formats.
They even honor any ACL set on the files, so a user only sees results for files he is allowed to read.
You can also aggregate the search results over several distinct servers, so the end user only interfaces with his local client over a browser interface and then transparently can search over several servers at once.
Disclaimer: I have not tested the enterprise search product, but I am a happy user of their desktop search product.
Sorry I'm late, but you might want to look into Everything.
http://www.voidtools.com/
fnord
I recommend the Outlook add-in Lookeen for searching shared volumes in combination with Outlook. It has a shared index feature, which will be very helpful!
FYI - Mac OS X is UNIX. Certified and all. It's a fantastic server platform.
http://beast.bio.ed.ac.uk/BEAGLE_on_Mac_OS_X
I'm guessing you've never used Spotlight extensively. It works great on large network shares.
Whoops, wrong link:
"The Beagle [1] developers decided to fill this search gap using Apple’s MacOS X search function as basic material."
http://www.linux-magazine.com/w3/issue/58/Beagle_Search_Tool.pdf
There's a free version of the expensive OmniFind product by IBM available with a little bit of Yahoo branding (which can be removed). You can't really integrate it easily with local search but I would assume that your users know whether they want local or shared documents. If you need security/permissions around people being allowed to access certain documents you're going to need to buy something. Otherwise try: http://omnifind.ibm.yahoo.net/
Does anyone know how I can submit my own question?
Clucene and Postgres are combined, sure you could manage clucene and clucene, via libferris?
http://www.linuxjournal.com/article/9298
Surely some clever sort can whip up a quick GUI for you...
SharePoint is $$$$. Try Alfresco. Alfresco can look like a file share (it supports SMB, DAV, FTP, etc.). The indexing is built in and does not require a separate SQL Server license.
Windows SharePoint Services (WSS) is free with any copy of Windows 2003 Server R2 or above. It can also use SQL Express, which is also free.