Slashdot Mirror


File Organization — How Do You Do It In 2011?

siddesu writes "After 30 years of being around computers, I have, like everyone else, amassed a huge number of files in a huge number of formats about a huge number of topics. And it isn't only me: the family now has a ton of data that they want managed and easily accessible. Keeping all that information in order has always been a pain, but it has gotten harder as storage has increased and people, files and sizes have multiplied. What do you folks use to keep your odd terabyte of document, picture, video and code files organized (that is, relatively uniformly tagged, versioned, searchable and ultimately findable), without 50 duplicates over your 50 devices and without typing arcane commands in a terminal window? I found this discussion from 2003 and this tangentially relevant post from 2006. How have things changed for you in 2011? And how satisfied is your extended family with the solution you have unleashed upon them?"

16 of 356 comments

  1. Directories by Anrego · · Score: 4, Informative

    .. seriously.. they still work for me.

    I’ve got a 12TB file server (~6TB filled). It’s arranged as follows:

    documents/
    incoming_downloads/ (before you ask.. yes.. _legit_ downloads)
    media/
    media/video/
    media/video/movies/
    media/video/tv_shows/
    media/video/tv_shows/some_tv_show/
    media/video/standup
    media/video/etc..
    media/music/
    media/images/
    media/images/various_subfolders/
    code/
    virtual_machines/
    tmp/
    backup_links/
    backups/

    That’s always been enough for me. Never got into all this tagging/meta data stuff. If there’s anything I’d ever want to search on... I put it in the file name. Indexed every night via slocate.
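
    The "keywords in the file name, indexed nightly" approach can be sketched roughly as below. This is only an illustration of the idea, not how slocate is implemented; the function names and the one-path-per-line index format are assumptions made for the example.

```python
import os


def build_index(root, index_path):
    """Walk `root` and write one file path per line to `index_path`."""
    count = 0
    # surrogateescape lets odd, non-UTF-8 file names round-trip through the index
    with open(index_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                out.write(os.path.join(dirpath, name) + "\n")
                count += 1
    return count


def search_index(index_path, *keywords):
    """Return indexed paths whose file name contains every keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    hits = []
    with open(index_path, encoding="utf-8", errors="surrogateescape") as f:
        for line in f:
            path = line.rstrip("\n")
            name = os.path.basename(path).lower()
            if all(k in name for k in wanted):
                hits.append(path)
    return hits
```

    Running build_index from a nightly job and search_index on demand keeps searches fast because only the index file is read, not the disk tree. File names containing newlines would confuse this simple format.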

    backup_links is part of my hacked together backup system.

    The thing is RAID 6, set up so two drives can fail without loss of data. I see this as adequate “backup” for stuff that is replaceable (the large portion of my media is rips of DVDs I own... so although it would be a huge pain in the ass to re-rip them all... it’s not impossible). Stuff that is irreplaceable, I back up to separate hard drives (via hot swap trays).

    I leave one backup drive plugged into the machine, and keep the other elsewhere. I periodically swap these drives. I have a script that just rsyncs the files and directories pointed to in backup_links (the irreplaceable ones) to the currently plugged in drive (and yes I verified that I’m not getting a backup of my links ;p). This way I always have one drive that has a pretty recent backup (runs nightly), and one drive that has at most a month or so old backup if the plugged in one fails for some reason.
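
    A script along those lines might look like the sketch below. It is not Anrego's actual script: the function names and the /mnt/backup_drive destination used in the note are hypothetical, and it assumes backup_links contains symlinks to the irreplaceable files and directories. Resolving each link first is what ensures the real data, rather than the links themselves, gets copied.

```python
import os
import subprocess


def resolve_backup_sources(links_dir):
    """Return the real paths that the symlinks in `links_dir` point to."""
    sources = []
    for entry in sorted(os.listdir(links_dir)):
        link = os.path.join(links_dir, entry)
        if not os.path.islink(link):
            continue  # ignore anything that is not a symlink
        target = os.path.realpath(link)
        if os.path.exists(target):  # skip dangling links
            sources.append(target)
    return sources


def build_rsync_command(sources, dest):
    """Build the rsync argument list; -a (archive) recurses and keeps times and permissions."""
    return ["rsync", "-a", *sources, dest.rstrip("/") + "/"]


def run_backup(links_dir, dest, dry_run=True):
    """Back up everything linked from `links_dir`; with dry_run, only return the command."""
    sources = resolve_backup_sources(links_dir)
    if not sources:
        return None  # nothing to back up
    cmd = build_rsync_command(sources, dest)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

    Calling run_backup("backup_links", "/mnt/backup_drive") returns the command it would run, which is a cheap way to check the source list before letting a nightly job call it with dry_run=False.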

    backups is backed up files from other machines.

    Keeping everything in one place helps with the organization I think. Most of the other machines on this network are basically just OS installs. All the real files are on the file server. My desktop runs off a small SSD, which is not even half filled.

    1. Re:Directories by RuiFerreira · · Score: 5, Interesting

      I basically use the same structure as you but I have an extra directory called "attic" where in practice I end up putting everything.

    2. Re:Directories by Ponder Stibions · · Score: 3, Funny

      I find these useful, but for family stuff I can't recommend a simple hard drive crash enough. They will suddenly know where copies of everything important are, and it'll come down to only a few gigabytes....

  2. No Porn? by Anonymous Coward · · Score: 4, Funny

    I think you left a directory out. ;)

    1. Re:No Porn? by Anrego · · Score: 4, Funny

      media/video/etc..

      I figured it didn't even need to be said ;p

  3. Ultrafast search and metadata filesystem by Twinbee · · Score: 5, Interesting

    I have recently found an incredibly fast search tool called Everything. We're talking about Google-like searching where the results pop up as you type. It must be something on the order of a fifth of a second for my 1.5 million files. This kind of technology should be widespread - it makes searches actually *pleasant* to do. Anyway, thanks to Everything, I worry less now about where I store my files, and I also try to pack keywords into the filename.

    Anyway, this kind of program is just a glimpse of what a future OS would look like. Imagine a system where everything is stored in tags and where folders become obsolete or used far less often. What you have then is a database or metadata file-system. The relatively new Haiku OS uses such a system, and I wrote about the massive advantages on this old page:
    http://www.skytopia.com/project/articles/filesystem.html

    Honestly, we'll all be better off the sooner we switch.

    --
    Why OpalCalc is the best Windows calc
    1. Re:Ultrafast search and metadata filesystem by timeOday · · Score: 5, Interesting

      Imagine a system where everything is stored in tags and where folders become obsolete or used far less often.

      It bothers me when people think tags are fundamentally different from folders (directories) in the first place. I'm going to re-introduce directories as "hierarchical tags" and blow everybody's mind.

      Maybe it's because people think of directory membership as exclusive? But it isn't. You can link a file into as many directories as you like with the 'ln' command. If that hasn't caught on, and if Windows Folders don't even really support that, it's because most people just don't bother... and the same is/will be true of tags by any other name.
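
      As a rough illustration of the point, the sketch below does from Python what `ln original directory/` does, using os.link. The helper name link_into is made up for the example, and hard links only work for regular files within a single filesystem.

```python
import os


def link_into(original, directory, name=None):
    """Hard-link `original` into `directory`, like `ln original directory/`."""
    os.makedirs(directory, exist_ok=True)
    dest = os.path.join(directory, name or os.path.basename(original))
    os.link(original, dest)  # new directory entry for the same file, no data copied
    return dest
```

      After linking one photo into, say, a family/ and a vacation/ directory, os.stat(photo).st_nlink reports 3 and an edit made through any of the names is visible through the others, which is the "one file, several tags" behaviour.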

    2. Re:Ultrafast search and metadata filesystem by lennier · · Score: 3

      You can link a file into as many directories as you like with the 'ln' command.

      Well, sort of. You can create a hardlink or symlink in the POSIX model easily enough, certainly. But the link is only one way - you can't easily find, given a file, where all its links are. So they can tend to get caught up in the bit-rot. And there's enough of a stigma around symlinking - let alone hard-linking - that very few tools can be relied on to support it in all cases.

      A true tagged or non-exclusive-directory filesystem would, I assume, have proper two-way linking between a file and its links, so you could query a file and get a list of its tags/locations. And all the tools, without exception, would fully support it. This would include things like copying a 'folder' to removable media - you would need to standardise what it means. You can't just copy the links and you can't just turn the links into unlinked files.
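
      To make the one-way problem concrete: with ordinary hard links, about the only way to list a file's other names is to scan the tree and compare device and inode numbers, as in the sketch below. The function name is hypothetical, and this is a workaround on an existing filesystem, not the two-way linking described above.

```python
import os


def find_hard_links(target, search_root):
    """Return every path under `search_root` that is the same file as `target`."""
    wanted = os.stat(target)
    key = (wanted.st_dev, wanted.st_ino)  # identifies the file itself, not its name
    matches = []
    for dirpath, _dirnames, filenames in os.walk(search_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable
            if (st.st_dev, st.st_ino) == key:
                matches.append(path)
    return sorted(matches)
```

      The cost is a full walk of search_root per query, which is exactly the kind of thing a filesystem with real back-references would avoid.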

      What you could do, perhaps, is store all the originals (including folders) in a single universal folder as a globally-unique identifier (it can't be just system-unique, because what if you copy a file to someone else's machine?), then make the other folders on a system contain only hardlinks, and have the file-and-folder copy algorithm copy both a subset of the originals folder and all the appropriate tag folders...
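
      A very small sketch of that layout, for files only, might look like this. Using the SHA-256 of the content as the globally-unique identifier is an assumption made for the example (the identifier would change whenever the file is edited, which a real design would have to deal with), and the function and directory names are made up.

```python
import hashlib
import os
import shutil


def store_original(src, originals_dir):
    """Copy `src` into `originals_dir`, named by the SHA-256 of its content."""
    digest = hashlib.sha256()
    with open(src, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    os.makedirs(originals_dir, exist_ok=True)
    stored = os.path.join(originals_dir, digest.hexdigest())
    if not os.path.exists(stored):  # identical content is stored only once
        shutil.copy2(src, stored)
    return stored


def tag(stored, tags_root, tag_name, display_name):
    """Expose a stored original as tags_root/tag_name/display_name via a hard link."""
    tag_dir = os.path.join(tags_root, tag_name)
    os.makedirs(tag_dir, exist_ok=True)
    link = os.path.join(tag_dir, display_name)
    if not os.path.exists(link):
        os.link(stored, link)
    return link
```

      Copying a "folder" to another machine would then mean copying the tag directory plus the originals it links to, which is the part this sketch leaves out.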

      It gets messy, is what happens, because things like disk drives fundamentally have a notion of containment (my file is either on this disk or it's not, it doesn't help if it's 'virtually somewhere out there in the cloud' once I've pulled the network plug) while tags don't. I'm sure we could solve these problems, but they need to be solved correctly and with mathematical rigor at the lowest layer of the filesystem. I don't see any serious attempts to do that in any of the tagged filesystem approaches I've seen yet.

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
  4. Keep the irreplaceable stuff in a separate tree by traindirector · · Score: 3, Insightful

    I also still use a similar directory structure, but I've made one change in the past few years that makes it much easier to manage: I keep the special, personal, irreplaceable stuff in a separate hierarchy.

    This negates the need for something like a backup_links directory, and makes it much easier to just share the "normal" media directory with everyone/thing on my home network and then handle permissions on the personal stuff with more granularity. It's also much easier when I know I'm looking for a photo I've taken or a document I've made that it'll be in the personal hierarchy under those categories rather than the main ones.

    It's a small change, but keeping a separation between stuff I've made and the easily replaceable stuff I've acquired has gone a long way to making my personal data and treasures more secure--both from loss and accidental sharing.

  5. Tags are useless for personal organization by icemaze · · Score: 3, Interesting

    Who has the time to hand-pick all the relevant tags for every file they download? Yeah, me neither.
    Finding time to put things in their own directory, and not dumping them all in "downloads", is a great accomplishment.

    However finding a meaningful, hierarchical structure is non-trivial. I'm still working on it.

  6. google desktop (RIP) by meeotch · · Score: 4, Interesting

    I had great success with Google Desktop Search (on windoze) for a while. It would index my mail, files, and web history (if instructed to) - and the best part was hitting one key to get an instant, minimalist search box with auto-preview. From there, you could jump straight to what you were looking for, or open a further page to narrow the search.

    Sadly, it doesn't work with Thunderbird 3.0, and Google doesn't appear to care, or even to be supporting it anymore. So now I'm on a hodgepodge of GDS, Windows built-in search, and the sucky T-bird search bar.

    I honestly can't believe that nobody has duplicated this Spotlight-esque functionality yet. I realize there are other desktop search options, but none of the ones I've come across have that one-key mini search that goes away as easily as it is called up. For an operation that I'm performing dozens of times daily, that's pretty crucial. It even replaced the file browser for me - much easier to call up the GDS box & type a couple letters than to grab the mouse and drill down into some directory structure - even if I know exactly where I'm going.

  7. Re:Learn to delete by Hatta · · Score: 4, Insightful

    Do you need all those installation files for 10 year old shareware?

    Sure do. In fact I just installed StuffIt Deluxe on an SE/30 last weekend.

    Do you really need gigabytes of movies you will never watch again? A music collection so big that your playlist is months in length? Irrelevant TV shows?

    The bigger the collection, the more fun shuffle is.

    More ebooks than you can possibly read?

    You never know which one you'll need to refer to.

    --
    Give me Classic Slashdot or give me death!
  8. A word for "lifestreams" and against livelink by rbrander · · Score: 3, Insightful

    I'm pretty much a "have a lot of structured directories" guy myself; I don't see your complaint about rising file sizes, or even total number of files. They've pretty much increased linearly in number while the speed of the Linux "locate" command has gone up exponentially with Moore's Law. It's the other way around from management trouble - with TB hard drives, I have so much space I leave around TV shows and other media files I'll likely never watch again, "just in case".

    At work, the search problems are harder, because I've got quite the multi-tasking job where I may spend just minutes on some problem, then be asked for an update months later, totally skeptical that I ever addressed the issue. And my favourite file-management with that is the most insane-sounding of all: one big directory. I sort it by date and rely on the fact that I take time to write out helpful file names like "downtown_condition_assessment_newmall_4_ernie.xlsx" (not actually that long, I use abbrevs in RL). Only files that have a whole lot of subject-matter friends get their own subdirectory; lonely "one-off" files go in the Big Pile.

    The "sort the directory by date" uses the theory behind "lifestreams" promoted by Eric Freeman and David Gelernter at Yale. It really is the best thing I've found (same 30 years) to stimulate the memory - seeing the names of other things you did at the same time; you can actually sense yourself getting close to the file as you remember, "Oh yeah, I worked on that in the spring".
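
    The "one big directory, sorted by date" view is easy to reproduce; the sketch below lists a directory's files newest first. It is only a date-sorted listing in the spirit of the idea, not the Lifestreams system itself, and the function names are invented for the example.

```python
import os
import time


def lifestream(directory):
    """Return (mtime, name) for each regular file in `directory`, newest first."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.is_file():  # subdirectories are left out of the stream
                entries.append((entry.stat().st_mtime, entry.name))
    return sorted(entries, reverse=True)


def format_stream(entries):
    """Render each entry as 'YYYY-MM-DD  name' so neighbours in time sit together."""
    return [
        time.strftime("%Y-%m-%d", time.localtime(mtime)) + "  " + name
        for mtime, name in entries
    ]
```

    Printing format_stream(lifestream(".")) gives the kind of listing where a descriptive file name shows up next to whatever else was worked on that week.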

    An additional word of Fear & Loathing for "document management systems" like LiveLink by Formark. Required to use this by work (shared directories are strictly for 'short-term' storage), it's awful. Terribly slow, the search function approaches useless, and it's hard (and slow, did I mention slow) to even re-sort a directory (sorry, that's a 'filter down' in Livelink's vocab) by name or date or whatever. After promising that photos would be displayed with thumbnails by the great new Version 4 for two years, it came, broke some stuff that was working, and did not provide thumbnails - all media files are unsearchable in any way. I suspect for long-term archiving, putting documents in a database would have advantages, but for active business usage, it's been crippling.

  9. Everything on PC, Spotlight on Mac by Xian97 · · Score: 4, Interesting

    Everything is what I use on the PC to quickly find any file I am looking for.

    On the Mac I use Spotlight.

    While it would be nice to be completely organized, these tools let me find my files anywhere they are located on my PC. I try to keep things organized into folders, but I am always falling behind so these are what I can use in the interim.

  10. Infinite monkeys technique... by fahrbot-bot · · Score: 3, Funny

    Applying the Infinite Monkey Theorem I put everything into one folder, assigning each file a pseudo-random name. Although there's only one of me, in time, I'm confident that a pattern will emerge...

    --
    It must have been something you assimilated. . . .
  11. Nemo Documents by daserver · · Score: 3, Interesting

    If you are on Windows you might want to give Nemo Documents a try. It gives a time based view and allows one to use tags. Disclaimer: author posting ;-)