Organizing and Analyzing Mounds of Research Text?
Andrew Green asks: "Four years ago, I stopped working on my Master's thesis in Social Anthropology. Now, I'm getting back to it again, and I find there's a _lot_ of text to deal with. I have a 350-page field diary, a dozen transcripts of recorded interviews, lists of books and articles, extensive notes about the books and articles, and the books and articles themselves in dead-tree or electronic format. I want to organize _all_ of it. I need to keep track of all the different text files on my computer (most are in MS formats, though I now use GNU/Linux), which ideally means keeping personalized sets of metadata about each file, linking files to other files and to entries in bibliography lists, and having some sort of version control. I'd also like to be able to do a free-text search on all texts on my local hard drive. And, most important of all, I need to build a hierarchical list of topics that are relevant to my thesis and relate specific sections of all these texts (not just whole files) to different topics. Any ideas? I know there are proprietary solutions out there, but I don't want to use them. What free applications can best deal with some or all of my needs? Would I be better off building something myself?"
Dude. Quit procrastinating and write the damn thing... Your teachers have no clue who you are, and it will be a surprise when you dump said "thesis" on their desks.
Blarf.
Well, it's not exactly open source, or even software, but for your one-time purpose you might consider just categorizing your data into sections and putting it into one or more 3-ring binders, with subdividers with little tabs on them for easy lookup.
After that you will have a great start on digitizing the whole thing if you still feel like it later (presumably after you've actually gone ahead and written your paper).
Seriously, why take all the time to digitize this stuff when you're only going to use it once in your life as a reference? When and if you get to publish it later, you can spend the time inputting it, or maybe even hire someone to do it for you.
A fool throws a stone into a well and a thousand sages can not remove it.
Spend the time you'd otherwise spend scanning and acquiring software on just reading the stuff. Underline, put stickies in to mark places, make up notecards, etc.
This ISN'T too much data to put in your head, and in your head is where it needs to be. Soak in it. Give your right brain a chance to have a crack at it. Then one day you'll wake up from a sound sleep at 3 a.m. and something will suddenly be clear that hadn't been clear before.
Admittedly, it will then take you about thirty minutes to find the supporting documentation (fifth pile from the left, half an inch down, with a lime-colored Post-it on it; and that article from the journal with the buff cover). But if you spend the time computerizing, you'll be able to find everything, yet have no idea what you need to find.
Computers do have their uses in "augmenting" human intelligence (Engelbart's term) but if you can't master the data you've collected yourself on a deep, intimate, _topographic_ level, computerizing it probably isn't going to help that much.
Just my $0.02.