Slashdot Mirror


How To Manage Hundreds of Thousands of Documents?

ajmcello78 writes "We're a mid-sized aerospace company with over a hundred thousand documents stored out on our Samba servers that also need to be accessed from our satellite offices. We have a VPN set up for the remote sites and use the Samba net use command to map the remote shares. It's becoming quite a mess, sometimes quite slow, and there is really no naming or numbering convention in place for the files and directories. We end up with mixed casing, all uppercase, all lowercase, dashes and ampersands in the file names, and there are literally hundreds of directories to sort through before you can find the document you are looking for. Does anybody know of a good system or method to manage all these documents, and also make them available to our satellite offices?"

6 of 438 comments

  1. Answered your own question by Sir_Lewk · · Score: 5, Insightful

    and there is really no naming or numbering convention in place for the files and directories.

    I think you already know the answer.

    --
    "linux is just DOS with a UNIX like syntax" -- Galactic Dominator (944134)
    1. Re:Answered your own question by CorporateSuit · · Score: 5, Funny

      No kidding, men are practically born with this instinct.

      The most basic is dividing the images up according to hair color or the number of girls appearing in each photo. Then you usually divide them up between hardcore and softcore, type of performance, fetish, etc. For your favorites, you can keep a folder in the home directory, of course. I know this guy works for an aerospace company, but keeping track of 500,000+ files isn't rocket science! We've all been able to do that since the advent of the 200GB harddrive.

      --
      I am the richest astronaut ever to win the superbowl.
  2. There is a right way. by mrmeval · · Score: 5, Informative

    http://en.wikipedia.org/wiki/Document_management_system

    For that level of documentation you need to have a staff and get it properly indexed. You need a high-level librarian. This would be someone with a master's degree at minimum in library science and at least a bachelor's in information technology. They will not come cheap and they are a long-term investment. The software is available, but it is not trivial. Hiring a large number of people to recategorize and tag all the documents for the length of time that takes is also an expense, but worth it. Once it's all in place, maintaining it gets much easier.

    I've seen a system developed for Raytheon. They took all the old compartmentalized data Hughes had and put every scrap of paper through a scanner. It was exceptionally well done. It would display electronic files and give the location of the hard copy. Classified documents were in some cases indexed but were hard copy only afaik. There were some documents that were hard copy only; those were usually ones with an NDA or other restriction on making electronic copies. It had everything mentioned wrt versioning and such. Documents spanned decades with hundreds of revisions, and you could pull up and view any revision. Depending on how recent the document was and what type it was, you could view a change log. Older scanned ones did not have that unless they'd been important enough to reenter as modern documents, which meant OCR or manual transcription. Some schematics were reentered into the system in a modern format. The effort was worth it. Having that data is the only way some devices or parts could be made or repaired.


    --
    I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
  3. Re:Google wave by theNetImp · · Score: 5, Funny

    monks work for free, they just need food and enlightenment, and if you get lucky they fast and then only need the enlightenment aspect.

  4. Re:Google to the rescue? by liquidsin · · Score: 5, Insightful

    use your users, if you can. i'm just talking out my ass here, but i'd think it a not-too-difficult matter to add some sort of user input form along the lines of "hey, now that you've found the document you need, does the name fit the new naming scheme? if not, why not rename it so it fits!". this is assuming you can trust your userbase not to be asshats and to be able to follow the naming protocol.
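A rough sketch of the check such a form could run, assuming a made-up convention (lowercase words joined by underscores, plus a lowercase extension). The `SCHEME` pattern and the `fits_scheme` and `suggest_name` helpers are hypothetical, not taken from any existing tool, and a real scheme would likely also encode things like document numbers and revisions:

```python
import re
from pathlib import PurePath

# Hypothetical convention: lowercase alphanumeric words joined by single
# underscores, followed by a lowercase extension (e.g. "wing_spar_rev2.pdf").
SCHEME = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*\.[a-z0-9]+")


def fits_scheme(filename: str) -> bool:
    """Return True if the bare file name already follows the convention."""
    return SCHEME.fullmatch(filename) is not None


def suggest_name(filename: str) -> str:
    """Propose a conforming name the user can accept or edit."""
    path = PurePath(filename)
    # Lowercase the stem and spell out ampersands before collapsing separators.
    stem = path.stem.lower().replace("&", " and ")
    # Turn every run of spaces, dashes, and other punctuation into one underscore.
    stem = re.sub(r"[^a-z0-9]+", "_", stem).strip("_")
    return f"{stem}{path.suffix.lower()}"
```

The form would only prompt when `fits_scheme` returns False, and would prefill the rename box with `suggest_name(...)` so the user just has to confirm or tweak it.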

    --
    do not read this line twice.
  5. Re:Google wave by Anarchduke · · Score: 5, Insightful

    There is a whole profession dedicated to this, and there is a major in college specifically designed to assist in organizing documents into meaningful collections.

    I suggest your company look at hiring a library sciences major, since this is what they do.

    --
    who prays for Satan? Who in 18 centuries has had the humanity to pray for the 1 sinner that needed it most? ~Mark Twain