How To Manage Hundreds of Thousands of Documents?
ajmcello78 writes "We're a mid-sized aerospace company with over a hundred thousand documents stored out on our Samba servers that also need to be accessed from our satellite offices. We have a VPN set up for the remote sites and use the Samba net use command to map the remote shares. It's becoming quite a mess, sometimes quite slow, and there is really no naming or numbering convention in place for the files and directories. We end up with mixed casing, all uppercase, all lowercase, dashes and ampersands in the file names, and there are literally hundreds of directories to sort through before you can find the document you are looking for. Does anybody know of a good system or method to manage all these documents, and also make them available to our satellite offices?"
Isn't this the sort of thing that a Google Search Appliance would be helpful for? Then you don't need to know the exact filename, just some specific information that can identify the file. This certainly solved my problem with having thousands of emails.
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
and there is really no naming or numbering convention in place for the files and directories.
I think you already know the answer.
"linux is just DOS with a UNIX like syntax" -- Galactic Dominator (944134)
If they're going to consider Hummingbird, they need to be ready to cough up the dollars to get an *EXPERIENCED* Hummingbird administrator. If not, the product will be set up, but basic search functionality will be hosed because of some of the same issues in the original problem description (arising from differences in how the documents' properties sheets are populated). If done well, it can be fantastic. If not, users will hate it and do everything possible to avoid it (including installing their own NAS devices).
I use irony whenever I can, but my shirts are still wrinkled...
Or some other corporate content management system
Here's to the crazy ones
Most print companies like Xerox have their own proprietary document management tools you can buy, and a bunch of CRM and ERP solutions (like OpenERP - it's free AND Open Source) provide some good simple document searching and indexing tools.
Really it comes down to how complex you want searching to be. Are there specific keys in the document you could index by? Do you require the full-text search capabilities of a Google Search Appliance?
A really good solution I've come across for some clients in Edmonton is called MetalTrace, by Trace Applications. Don't let the name fool you about the specificity: software like this can scan, index, and even read barcodes on all sorts of documents, then let people search for them via the web. Their "killer app" has multiple user-defined document types with multiple search fields, which, combined with some back-filing (digital and scanning), really saved the day.
Do your research on "document management", though, and see what product best fits your needs. It's a really well-established field, so reinventing the wheel is a little masochistic... not that there's anything wrong with that. ;)
-Matt
--- Need web hosting?
Why should you give SharePoint a chance? Even if it works well, it is proprietary and you are locked in.
I'm gonna say nothing beats a proper folder structure and naming convention. I'd also recommend using svn. Also spend some time developing some macros to assist in the creation/saving/retrieval of said documents from the repository. Maybe create some standard templates too... just my 2 cents!
Hire a document manager / clerk person who will create order. Your engineers won't.
Boing boing boing....
Or better yet, talk to people who've done it before. I mean, seriously, there have been organizations managing hundreds of thousands of documents since the Roman era. It's nothing new.
I know I'm gonna get hit for blurting out the Microsoft Solution but...give SharePoint a shot...
Just avoid the wiki functionality like the plague. It completely sucks.
Since your organization probably has Windows clients, you can only long for something as nice as Mac OS X Spotlight Server.
Google Search Appliance is definitely what you want.
If you have a mid-sized company, you definitely don't have a surplus of highly talented systems administrators lying about to run one of the document management systems that others here are likely to suggest. Be very careful going down the document management server path. It's far, far more work than you think it will be, and more than the vendor will tell you it is. Not simply more work for you, but for your IT staff and your users, too.
The Google Search Appliance, by contrast, is "fire and forget". Plug it in. Turn it on. Patch it when Google suggests you do so. That's about it.
If you mod me down, I shall become more powerful than you could possibly imagine.
What you say:
Why should you give SharePoint a chance? Even if it works well, it is proprietary and you are locked in.
What you mean:
Regardless of how perfect a solution might be for you, if it doesn't conform to MY personal ideological viewpoint, it shouldn't be given a chance.
God I hate people like you.
--AC
Skilled consultants are great but without training employees you'll keep on paying big $ for consultants whenever there's a change to make. Let the consultant show how and let the employees do the work. BTW: We have 3000+ users (all happy) on their system and no consultant.
Views expressed do not necessarily reflect those of the author.
The weird characters could easily be taken care of by something like Ant Renamer (even supports RegEx). Just replace the weird ones with an underscore or some other suitable character.
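If you'd rather script it than use a GUI renamer, here is a rough Python sketch of the same regex idea. The convention it applies (lowercase, with every run of anything other than letters and digits turned into a single underscore) is just an assumption for illustration, and `normalize_name` and `plan_renames` are made-up helper names, not part of any tool mentioned in this thread. It only builds a list of proposed renames and does not touch any files.

```python
import re
from pathlib import Path

# Assumed convention (not from the original post): lowercase names, with any
# run of characters other than a-z and 0-9 replaced by one underscore.
_UNSAFE = re.compile(r"[^a-z0-9]+")


def normalize_name(name: str) -> str:
    """Return a cleaned-up version of a single file name."""
    p = Path(name)
    stem = _UNSAFE.sub("_", p.stem.lower()).strip("_")
    suffix = p.suffix.lower()          # keep the extension, just lowercase it
    return (stem or "unnamed") + suffix


def plan_renames(root):
    """Dry run: list (old_path, new_path) pairs for files that would change.

    Nothing is renamed here. Two different files can map to the same
    cleaned name, so check the plan for collisions before acting on it.
    """
    plan = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            new_name = normalize_name(path.name)
            if new_name != path.name:
                plan.append((path, path.with_name(new_name)))
    return plan


if __name__ == "__main__":
    for old, new in plan_renames("."):
        print(f"{old} -> {new}")
```

Review the printed plan first, and only then feed the pairs to `old.rename(new)`. On a live Samba share you would also want to do this when nobody has the files open.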
There is a whole profession dedicated to this, and there is a major in college specifically designed to assist in organizing documents into meaningful collections.
I suggest your company look at hiring a library sciences major, since this is what they do.
who prays for Satan? Who in 18 centuries has had the humanity to pray for the 1 sinner that needed it most? ~Mark Twain