Slashdot Mirror


Organizing and Analyzing Mounds of Research Text?

Andrew Green asks: "Four years ago, I stopped working on my Master's thesis in Social Anthropology. Now, I'm getting back to it again, and I find there's a _lot_ of text to deal with. I have a 350-page field diary, a dozen transcripts of recorded interviews, lists of books and articles, extensive notes about the books and articles, and the books and articles themselves in dead-tree or electronic format. I want to organize _all_ of it. I need to keep track of all the different text files on my computer (most are in MS formats, though I now use GNU/Linux), which ideally means keeping personalized sets of metadata about each file, linking files to other files and to entries in bibliography lists, and having some sort of version control. I'd also like to be able to do a free-text search on all texts on my local hard drive. And, most important of all, I need to build a hierarchical list of topics that are relevant to my thesis and relate specific sections of all these texts (not just whole files) to different topics. Any ideas? I know there are proprietary solutions out there, but I don't want to use them. What free applications can best deal with some or all of my needs? Would I be better off building something myself?"

8 of 28 comments

  1. the open source app for your needs: by Tumbleweed · · Score: 3, Funny

    pico

  2. Dude. by dynoman7 · · Score: 3, Insightful

    Four years ago, I stopped working on my Master's thesis in Social Anthropology. Now, I'm getting back to it again, and I find there's a _lot_ of text to deal with. I have a 350-page field diary, a dozen transcripts of recorded interviews, lists of books and articles, extensive notes about the books and articles, and the books and articles themselves in dead-tree or electronic format. I want to organize _all_ of it. I need to keep track of all the different text files on my computer (most are in MS formats, though I now use GNU/Linux), which ideally means keeping personalized sets of metadata about each file, linking files to other files and to entries in bibliography lists, and having some sort of version control.

    Dude. Quit procrastinating and write the damn thing...your teachers have no clue who you are and it will be a surprise when you dump said "thesis" on their desks.

    --
    Blarf.
    1. Re:Dude. by tengwar · · Score: 3, Insightful
      'Fraid I've got to agree with parent. If you know the stuff, a paper index is enough. If you don't, you've just got to read it until you've internalised it before you can draw any valid conclusions.

      If you insist on linking the documents, use plain hand-written HTML - I've done it before while getting into a subject, but don't expect to need it after the first couple of weeks.

  3. wait, FREE solutions? by Anonymous Coward · · Score: 3, Informative

    well, one I've played with is:

    DEVONthink

    It's really cool and great for exactly what you describe: storing a bunch of loosely-connected information that you need to search and cluster into categories.

    You just add your text and it will automatically classify it using semantic analysis.

    But alas, it is for Mac only and is not Free.

    If anybody knows about anything like this for Linux, and Free, I'd love to hear about it....

  4. InfoSelect by falsification · · Score: 3, Funny
    As an advocate of free software wherever possible, I can confidently recommend that you format your hard drive, install Windows, and then install InfoSelect. You will thank me later.

    http://www.miclog.com/

    (Or keep Linux and try to run InfoSelect with WINE. I don't know if that would work.)

  5. Try TWiki by Demosthenex · · Score: 4, Informative

    TWiki has many of the features you mentioned, and is a web-based app that you could publish with later. ;]

    http://www.twiki.org/

    keeping personalized sets of metadata about each file, linking files to other files and to entries in bibliography lists, and having some sort of version control.

    Each TWiki page can have custom searchable metadata in forms. Pages are linked to other pages by WikiWords. Version control and access lists are on every page.

    I'd also like to be able to do a free-text search ... I need to build a hierarchical list of topics that are relevant to my thesis and relate specific sections of all these texts

    Text searching is integrated. You can arrange TWiki pages into hierarchies by giving each page a parent topic, and there is automatic cross-referencing.
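
    To make the "hierarchy of topics pointing at sections of text" idea concrete, here is a minimal Python sketch. It is not how TWiki stores anything; the class names (Topic, Excerpt), the file names and the line ranges are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(frozen=True)
class Excerpt:
    """A section of a text file, given as an inclusive line range."""
    path: str
    first_line: int
    last_line: int


@dataclass(eq=False)  # compare topics by identity; avoids parent/child recursion
class Topic:
    """A node in the topic hierarchy, holding the excerpts filed under it."""
    name: str
    parent: Optional["Topic"] = field(default=None, repr=False)
    children: List["Topic"] = field(default_factory=list)
    excerpts: List[Excerpt] = field(default_factory=list)

    def add_child(self, name: str) -> "Topic":
        """Create a subtopic, attach it to this topic, and return it."""
        child = Topic(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Return the names from the root down to this topic, joined by '/'."""
        parts = []
        node: Optional[Topic] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

    def all_excerpts(self) -> Iterator[Excerpt]:
        """Yield excerpts filed under this topic and all of its subtopics."""
        yield from self.excerpts
        for child in self.children:
            yield from child.all_excerpts()


# Hypothetical usage: file names and line numbers are invented.
thesis = Topic("Thesis")
kinship = thesis.add_child("Kinship")
kinship.excerpts.append(Excerpt("field_diary.txt", 120, 145))
thesis.excerpts.append(Excerpt("interview_03.txt", 10, 32))
```

    Asking the root for all_excerpts() walks the whole tree, so one section of the field diary can be filed under a narrow subtopic and still turn up when you look at the broader one.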

    TWiki uses a pure text format with some simple markup, *bold* for example. HTML can be used as well.

    I'd suggest you check it out. No database required.

    Demo

  6. Nothing will think for you by RGRistroph · · Score: 3, Informative
    Nothing will think for you. But if you are the type of person who studies by accumulating a pile of books, reading random pages and then looking up the interesting terms in the indexes of several other books, then you may be able to do that with electronic documents.

    Here are some links to indexing and searching software. There is a lot of stuff oriented towards providing search functionality on web pages, but you may want something that just searches your local drive.

    • MG (it is not necessary to buy the book just to use it).
    • DesktopDig; nice graphical interface, I had trouble installing it.
    • CLucene, a C++ version of Lucene. Stay away from Lucene, it's in Java.
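
    For a sense of what these indexers do under the hood, here is a rough Python sketch of an inverted index over a directory of plain-text files. It is not how any of the tools above work internally. The function names (tokenize, build_index, search) are invented for this example, and it assumes your files are already plain text, so the MS-format documents would need converting first.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, Set

# Words are runs of ASCII letters and digits; a deliberately crude assumption.
WORD = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> Set[str]:
    """Return the set of lower-cased words found in the text."""
    return set(WORD.findall(text.lower()))


def build_index(root: Path, pattern: str = "*.txt") -> Dict[str, Set[Path]]:
    """Map each word to the set of files under root that contain it."""
    index: Dict[str, Set[Path]] = defaultdict(set)
    for path in root.rglob(pattern):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for word in tokenize(text):
            index[word].add(path)
    return index


def search(index: Dict[str, Set[Path]], query: str) -> Set[Path]:
    """Return the files that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # .get() avoids adding empty entries to the defaultdict for unknown words.
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)
```

    You would build the index once with something like build_index(Path("thesis_notes")), where the directory name is just a placeholder, and then call search(index, "kinship village") as often as you like. A real indexer adds stemming, ranking and an on-disk index, which is why the tools above are worth a look before rolling your own.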

  7. 3 ring binder anyone? by foniksonik · · Score: 4, Insightful

    Well, it's not exactly open source or even software, but for your one-time purpose you might consider just categorizing your data into sections and putting it into one or more 3-ring binders with subdividers with little tabs on them for easy lookup.

    After that you will have a great start on digitizing the whole thing if you still feel like it later (presumably after you've actually gone ahead and written your paper).

    Seriously, why take all the time to digitize this stuff when you're only going to use it once in your life as reference? If and when you get to publish it later, you can spend the time inputting it or maybe even hire someone to do it for you.

    --
    A fool throws a stone into a well and a thousand sages can not remove it.