How Do You Keep Track of Your Web-Based Research?
time961 asks: "I use the Web extensively to research a wide variety of topics (weird, huh?). However, much of the time I end up printing out web pages and filing them on paper, because that's the easiest way I know to say 'OK, that was interesting, I'll hold on to it until I actually do something about this topic'. Often, I'll run across something that seems relevant to a long-term project or interest and just want to grab it without even reading the details. Paper is OK for reading, browsing, and scribbling, but it's hard to search, it's heavy, and it's wasteful (and I yearn for a day when browsers can reliably print what's on the screen, instead of cutting it off at the margin because some designer doesn't understand layout!). How do others deal with organizing the results of browsing?"
"Bookmarks and histories aren't the answer: they're not very good for searching, the UI isn't very good for, say, adding notes, and they don't work offline. Also, stale URLs are a huge problem. A key advantage of paper is that it doesn't randomly fade out in a few days (or decades), so a good solution would have to keep copies, not just references. I imagine something like a Firefox plug-in with a 'Remember This' button and some options for category, keywords, annotations, etc., but I'll bet there are more creative approaches, too."
First off, install a good PDF printer.
You can find all the features in a nice list at the official homepage, with tons of pretty screenshots. There's even a 50-page manual (PDF) created by Andrew Giles-Peters.
Even though development has seemingly halted since December 2005, it's still one of the most well-rounded extensions for Firefox I've come across.
Perfect is the enemy of done.
wget is probably one of my favourite Linux command-line tools. All I need to do is run wget -r http://www.doodahdoo.com/ and it saves a directory called www.doodahdoo.com with all the pages in it, as well as the images, and any embedded video and such. This is very handy, not only for getting a huge number of files (say, from my HTTP backup server), but also for getting entire sites that I might have a use for in the future.
At the moment, I have on the order of 10 GB just of websites, radio clips, and what have you that I have used for previous research. Not only that, but I can also maintain a simple directory structure and never have to worry whether that "Firefox plugin" will still be compatible with version 4.765.
Another neat feature is that you can specify just a particular file (www.whatever.com/pic.jpg), all the files with a particular extension (*.jpg), or only the files in a given directory. You can also use it to spider all the links on a site, up to a depth limit. Be kind and don't do this too often, though, as I am sure it eats a lot of bandwidth.
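To make those variations concrete, here is a minimal Python sketch that only builds the wget command lines and prints them; it does not run anything. The helper name wget_command and its parameters are made up for illustration, and the URLs are the placeholder ones from this post. The flags themselves (-r, -l, -A, -np, -w) are standard wget options, but check your own version's man page before relying on them.

```python
import shlex


def wget_command(url, recursive=False, depth=None, accept=None,
                 no_parent=False, wait_seconds=None):
    """Build a wget argument list (hypothetical helper; nothing is executed)."""
    args = ["wget"]
    if recursive:
        args.append("-r")                  # follow links recursively
    if depth is not None:
        args += ["-l", str(depth)]         # limit how deep the recursion goes
    if accept:
        args += ["-A", ",".join(accept)]   # keep only files with these suffixes
    if no_parent:
        args.append("-np")                 # stay inside the starting directory
    if wait_seconds is not None:
        args += ["-w", str(wait_seconds)]  # pause between requests, to be kind
    args.append(url)
    return args


# Placeholder URLs taken from the post; the /docs/ path is an assumption.
examples = {
    "one file": wget_command("http://www.whatever.com/pic.jpg"),
    "only *.jpg": wget_command("http://www.whatever.com/", recursive=True,
                               accept=["jpg"], wait_seconds=1),
    "one directory": wget_command("http://www.whatever.com/docs/",
                                  recursive=True, no_parent=True,
                                  wait_seconds=1),
    "limited spider": wget_command("http://www.whatever.com/", recursive=True,
                                   depth=2, wait_seconds=1),
}

if __name__ == "__main__":
    for label, args in examples.items():
        print(label + ": " + " ".join(shlex.quote(a) for a in args))
```

Building the list separately from running it makes it easy to eyeball the command first; passing the same list to subprocess.run would execute it.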
The last (and greatest) thing is that everything remains in a well-known and easily editable format.
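Because the saved copies are just ordinary files on disk, searching them takes very little code. The following is a rough Python sketch, not anything wget provides: the function name search_archive, the list of suffixes, and the assumption that pages are readable as UTF-8 text are all mine.

```python
import sys
from pathlib import Path


def search_archive(root, keyword, suffixes=(".html", ".htm", ".txt")):
    """Return sorted paths under root whose text contains keyword.

    Hypothetical helper: the match is a case-insensitive substring test,
    and only files with the given suffixes are read.
    """
    needle = keyword.lower()
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # Undecodable bytes are skipped rather than treated as an error.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        for hit in search_archive(sys.argv[1], sys.argv[2]):
            print(hit)
    else:
        print("usage: search_archive.py ARCHIVE_DIR KEYWORD")
```

This matches against the raw HTML, so a keyword can also hit tag or attribute text; stripping markup first would be a natural refinement.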
Alternatively, I have also used a MediaWiki setup so that I could jot down notes for classes or other interesting things, but this required substantially more overhead than wget.
PDFCreator is a free, open-source PDF printer: http://www.pdfforge.org/products/pdfcreator
// MD_Update(&m,buf,j);
Maybe something in Firefox one day that'll tell you that you're bookmarking something again?
Ask and ye shall receive!
http://bookmarkdd.mozdev.org/
Or the Mozilla Add-ons page for it:
https://addons.mozilla.org/en-US/firefox/addon/15
XenoPhage
Technological Musings