Document Management For Research With Annotation?
msimm writes "I'm currently looking for a document management system for personal and research-related use. Having looked at Alfresco and KnowledgeTree along with a slew of similar open source document management systems they seem to have a common set of features including version control, archiving, document permission/ownership and search/indexing. What I'd like, in order to help me manage my own continually growing collection of pdf/doc/odf/rtf/txt files, would be something that allowed me to view and annotate documents (and possibly collaborate/share notes) without requiring me to download, edit and re-upload each document. Obviously there are plenty of capable document management systems out there, so I really suspect I've simply missed something and am hoping someone can point me to a better way to index, search, collaborate and keep and share notes on the ever increasing glut of useful information I seem to use and collect."
Nothing much more to say here... I have found EndNote very useful.
If you want a low-tech approach, just install a wiki. MediaWiki is full-featured, while MoinMoin is easy to install and configure (no separate database needed). I haven't used any others.
Just whip one up in Rails and post a 20-minute screencast. Done.
Collaborate, in my opinion, implies that there is some advanced messaging going on in the background. And the persistence of that messaging (whether on a centralized server or via some P2P/Client routing protocol) is not only complex but often needs to be specific to what you want to collaborate about. Let's look at annotations. Where are they stored? How am I notified if you add an annotation to my document? How do I track my annotations? How do I share my annotations? Where is that stored? Etc. The questions raised are endless.
A coworker implemented a basic Ruby service for this where I work, and I have to say that before he started he didn't find any open source alternatives that came anywhere near what we needed. Ruby made it pretty easy (a 1 or 2 person job), with most of the effort going into JavaScript and DOM coding to get the interface right. Then we just had a RESTful service for storing the annotations, and from there we'll keep adding features like messaging, e-mail alerts, etc. for the users when we get time. Yes, I'm aware that if I open sourced this you could help me out with that, but I'm sorry, my employer is not on that boat (yet).
For your reference, even plain document management is a hard thing to find in open source; we've talked about it time and time again.
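As a rough illustration of the storage, tracking, sharing and notification questions raised above, here is a minimal sketch in Python. It is not the Ruby service described in this comment; the class names, fields and the in-memory store are all hypothetical, and a real system would persist the records behind something like a RESTful endpoint.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Annotation:
    # Hypothetical record layout: one note attached to one spot in one document.
    doc_id: str
    author: str
    page: int
    quote: str
    comment: str
    shared_with: set = field(default_factory=set)
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class AnnotationStore:
    """In-memory stand-in for whatever backend actually persists annotations."""

    def __init__(self):
        self._items = []
        self._owners = {}  # doc_id -> name of the document owner

    def register_document(self, doc_id, owner):
        self._owners[doc_id] = owner

    def add(self, annotation):
        """Store the annotation and return the list of users to notify."""
        self._items.append(annotation)
        owner = self._owners.get(annotation.doc_id)
        if owner and owner != annotation.author:
            return [owner]
        return []

    def by_author(self, author):
        """How do I track my annotations?"""
        return [a for a in self._items if a.author == author]

    def visible_to(self, user, doc_id):
        """How do I share my annotations? Authors and invited users see them."""
        return [
            a for a in self._items
            if a.doc_id == doc_id and (a.author == user or user in a.shared_with)
        ]
```

Even this toy version has to decide who owns a document, who gets notified and who may read a note, which is why the general problem gets complicated quickly.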
My work here is dung.
..please be nice...
Does JabRef suit your purpose? http://jabref.sourceforge.net/
Would Google's Wave work for you? It's real time, centralized, and browser based. I say privacy concerns aside, because the protocol is available, and people could build their own servers (such as http://code.google.com/p/pygowave-server/)...
If a man isn't willing to take some risk for his opinions, either his opinions are no good or he's no good
I've been searching for something similar for a while but can't really find anything to fit the bill.
What I'm looking for is a system that will allow you to highlight a particular quote in a PDF and attach a comment to it. When I finish my review I would like to have all my comments organized in a tabular format. The table should have the quote, page number (ideally also chapter and paragraph but this is asking too much) and my comment.
This way I can attach my comment sheet to the top of the document and inspect it quickly without even having to open the actual document. These review sheets are also "portable" because they can be shared, and anyone with no infrastructure could still identify the comment and quote.
Adobe Acrobat does only half of this: you can highlight and comment, but you cannot control the format of the export (CSV or Excel would be cool).
Does anyone know a system that does this?
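For what it's worth, once the highlights are out of the PDF, producing the review sheet itself is the easy part. Below is a minimal Python sketch that assumes the annotations have already been extracted into dicts with the hypothetical keys `page`, `quote` and `comment`; the extraction step, which is the hard part, is not shown.

```python
import csv
import io


def review_sheet(annotations):
    """Return a CSV review sheet (page, quote, comment), sorted by page number."""
    rows = sorted(annotations, key=lambda a: a["page"])
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["page", "quote", "comment"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sheet = review_sheet([
        {"page": 7, "quote": "a later passage", "comment": "check this figure"},
        {"page": 2, "quote": "an early passage, with a comma", "comment": "key claim"},
    ])
    print(sheet)
```

The resulting text opens directly in Excel or any spreadsheet, and can be shared without the original document.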
I use Papers. It does not do everything you want, but it is a nice management tool. It is still growing in features, and the support staff is very responsive. (They provided me, same day, a new NIB file that allowed me to use it on my small hackintoshed Dell Mini 9 screen.)
The link is here: http://mekentosj.com/papers/
Otherwise, EndNote works well. I know many who use it. There are a few others out there as well.
Good luck with it.
Zotero may well be what you're looking for. Much better and more open source than EndNote (mentioned above).
Hi, I currently use KnowledgeTree CE. When I want to add a note, a link or a comment to one of my documents (without editing it), I add a discussion (in the Document actions portlet) to that document.
Zotero might be worth a look. It's a Firefox plugin (open source), mainly designed for keeping track of a collection of academic literature. It allows you to organize the papers in folders, tag, annotate, and share the papers and annotations with others, all easily available in the Firefox GUI. You can export lists of references to Word/OpenOffice/TeX when writing papers, and they can be autoformatted to a wide range of citation styles.
It works really well (with full-text search) for storing web pages/pdfs. I don't know how well it works for .odt etc. Even if your purpose is not that of the typical university researcher it might be useful. For instance, recently I've liked using it for storing job ads, and my corresponding applications.
If you already have it installed, iTunes may be a simple solution.
http://lifehacker.com/software/pdf/geek-to-live--organize-your-pdf-library-with-itunes-240447.php
Rich And Stupid is not so bad as Working For Rich And Stupid.
About a year ago I needed a piece of software that matches your requirements. I wanted to be able to do my research from anywhere and keep track of notes and annotations in a very simple but searchable way.
Zotero is the closest thing. It's not perfect, far from it, but none of the competition came even close.
Zotero is a Firefox plugin that allows you to link or store information, be it webpages, pdf's or anything else you may see online. It's possible to group & tag your documents in various ways and there are various options for taking notes and adding annotations.
All of it is stored online so you don't need to carry anything with you. Just install the firefox plugin, enter your credentials and off you go.
Wikindx3 is a full-fledged bibliographic database that can manage *any* type of document, and permits annotations. As an added bonus, you can export the biblio info in any number of formats (including my favorite, .bib for LaTeX).
I've had good success with OpenDocMan as well, but I'm not sure if that application permits annotation (at least I've never used that feature set).
my favorite: http://www.citavi.com/en/index.html
Or Zotero, or JabRef. There are plenty of academic reference/document managers, and even more comparisons of them.
www.officelive.com
It is free and allows up to 5GB of data. Easy collaboration by sharing individual workspaces.
Built on Sharepoint technologies, worth a look.
Cheers
HS
Zotero is brilliant. I could go on about how I use it every day at work and it makes everything a hell of a lot easier, but instead, just check it out.
Versioning of documents it doesn't do - but that's what Mercurial is for I guess.
Depends largely on what you're annotating and why. You might want to check out Mendeley http://www.mendeley.com/ which has been pretty great for just managing documents. If you're more interested in annotating and learning a lot about what and why you're annotating, you might want to look into the field of mixed methods research, such as EthnoNotes.com http://www.ethnonotes.com/
http://www.mendeley.com/
Desktop client that syncs with online account. Keeps track of all kinds of documents. I use it for research (at university). For me, it works great for keeping track of journal articles, you can add them by title, DOI, arXiv, and it will look up all the details automatically. Entries can be linked to files, PDFs, etc. You can also just add PDFs and it will usually pull out the metadata (can also monitor folders for new documents to track).
And of course, it syncs, so it's with you everywhere.
Microsoft Office SharePoint includes the capabilities you mentioned (version control, archiving, document permission/ownership and search/indexing) and is on par price-wise with KnowledgeTree (though not free). They also have a hosted model, SharePoint Online.
The capabilities you list actually needing--index, search, collaborate and keep and share notes--might be better fit by Microsoft OneNote. It doesn't do version control and document permission/ownership, but it does what you described doing. At my place of business, there are two categories of people: those who love OneNote and those who haven't tried it.
For a basic, low-tech solution I'd suggest TagTeam (http://www.andrew-quinney.com/tagteam.html). It's a basic file tagging utility that makes use of filesystem metadata (PC and Mac), so any changes you make to a given file are immediately visible to others with access to the same file. It also includes a powerful searching language.
Have you tried DSpace? It's quite powerful and free. They have a good tour video here: http://www.dspace.org/about-dspace/DSpace-Video.html
Check out Confluence [http://www.atlassian.com/software/confluence/]- wiki that allows for attachments, etc. - easy to search and use, and comments, version history
Alfresco does everything you require. Why are you looking into other solutions? With Alfresco you can keep track of comments on documents or have a conversation with others regarding the document which is archived along with the document. Alfresco Share allows you to view documents with a flash front end so that you never have to download the documents into Word, Excel or Adobe Reader.
I work in a biotech startup with 12 people total. We have several thousand pdfs, mostly of scientific publications downloaded from places like pubmed, along with some .ppts and .docs and other files.
We use EndNote, a program from the behemoth in this area, Thomson Reuters, which owns most of the software in this space.
see http://thomsonreuters.com/products_services/science/science_products/a-z/procite
Based on what I have seen, there is a huge need for software that meets our needs; the Thomson products are very expensive and awful, a classic case of crappy software with a lot of marketing.
Programs like EndNote were created back in the 90s, for DOS machines, and they still look and feel like it once you get past the pretty home page GUI that Thomson has added on.
If anyone out there is serious about making a product to compete, give me a holler.
I wonder if Google Wave would be worth looking at for this...?
I use a git repository containing a BibTeX file that tells me where the documents are, with an annote field holding my notes on each one. The documents themselves also go into the git repository. If I need to annotate one on the page so I don't forget something about it, I use Xournal. Then I push everything to the git repository.
It implies that people update the repository, which in my opinion is not really a problem.
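A small Python sketch of what one entry in such a BibTeX file could look like. `annote` is a standard BibTeX field; using a `file` field for the document's location inside the repository is an assumption here (the comment above does not say which field is used), and the helper name and sample values are made up.

```python
def bibtex_entry(key, title, author, year, path, annote):
    """Format one BibTeX @article entry carrying a file location and a note."""
    fields = {
        "title": title,
        "author": author,
        "year": year,
        "file": path,      # assumed field name for where the document lives
        "annote": annote,  # free-text notes about the document
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@article{{{key},\n{body}\n}}"


if __name__ == "__main__":
    print(bibtex_entry(
        "doe2009",
        "An Example Paper",
        "Doe, Jane",
        "2009",
        "papers/doe2009.pdf",
        "Section 3 is the part relevant to my project.",
    ))
```

Because the entry is plain text, `git diff` and `git log` show exactly when a note changed and who changed it.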
Try Evernote. It lets you mark up everything and make notes on images, documents and everything else. It is multiplatform too.
http://www.evernote.com/
Good luck.
It does everything you want. The drawback is, it is not free software.
http://mendeley.com/
Adobe has some solutions for this problem space built on top of their LiveCycle platform. The Review Comment and Approval solution is all about managing document centric collaboration
http://www.adobe.com/devnet/livecycle/solutions/review_buildingblock.html
I was researching the same thing the other day. I came across a scribd demo where they are associating comments with individual pages and bookmarks with the entire document. So when you click on a bookmark, the viewer takes you to the relevant page. Each time you scroll to a new page, it displays the relevant comment for the page.
Of course their demo has hardcoded bookmarks and comments, but their data structures are clear, the code is readable, and it takes little imagination on how to provide a dynamic PHP back end to make the situation dynamic and persistent. I was able to graft it onto wordpress+Pods without too much difficulty (though I didn't do a good enough job to release into public).
I also was able to somewhat hack a similar thing for the google PDF reader, but because their API is still closed source I chose scribd -- even though it's a bigger pain to upload to scribd, at least I know that their API is supported.
I would still love to have "on document" annotation though, similar to what Acrobat or similar would give you, and I was hoping that since Google was drawing one character at a time that I could pull this off. But again, since it was all unsupported, I walked away from it and kept the suboptimal scribd version.
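The data model described above (comments tied to individual pages, bookmarks tied to the whole document) is simple enough to sketch. The following Python is not Scribd's code or API; it is a hypothetical in-memory model, and a real version would load and save this state through a backend.

```python
from collections import defaultdict


class DocumentNotes:
    """Per-page comments plus document-level bookmarks that point at pages."""

    def __init__(self):
        self.comments = defaultdict(list)  # page number -> list of comment texts
        self.bookmarks = {}                # bookmark label -> page number

    def add_comment(self, page, text):
        self.comments[page].append(text)

    def add_bookmark(self, label, page):
        self.bookmarks[label] = page

    def on_page(self, page):
        """Comments to display when the viewer scrolls to this page."""
        return list(self.comments.get(page, []))

    def jump(self, label):
        """Clicking a bookmark: return the target page and its comments."""
        page = self.bookmarks[label]
        return page, self.on_page(page)


if __name__ == "__main__":
    notes = DocumentNotes()
    notes.add_comment(4, "Methods section starts here")
    notes.add_bookmark("methods", 4)
    print(notes.jump("methods"))
```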
You're looking for a reference management system, not a document management system. (although, they might not deal with all of the stuff that you mentioned that a document management system will)
Zotero should work for a single person, but if you're trying to do this for an office, you might want to take a look at Aigaion.
If you want to look at others to see what best fits your needs, see:
http://en.wikipedia.org/wiki/Comparison_of_reference_management_software
And, if you still can't find anything, try asking on the Code4Lib mailing list, as you might need one of the 'integrated' library solutions.
Build it, and they will come^Hplain.
This is exactly the area I've been feeling pain for years, and recently have been working to address. My key innovations are around interface / visualization methods, automation, and collaboration. Please email me at sdw@lig.net if you have a wish / idea list, pointers to interesting related ideas / technology, or want to be a beta tester.
Stephen D. Williams
I use OneNote. Generally, I really like the concept behind the program. However, it has one fatal flaw for an academic environment: its poor ability to handle PDFs. Given that OneNote is a Microsoft program, I have little hope this flaw will ever be fixed.
Currently, I insert pdf's into OneNote as print outs. This makes OneNote deal with each page as a separate image. While the images can be viewed, it is impossible to attach notes or highlights to an image. Markups can be placed over the image of the page, but nothing stops subsequent edits from moving the page image while the markups remain stationary.
So, I am quite interested in finding a better solution and have been reading these replies with interest. It is strange that this category of software, which seems so natural for a computer, has lagged so far behind.
Someone else mentioned Zotero, which looks really good, and I'm meaning to try it once I've cleared out overdue projects.
What I have used for quite some time, with great results, is the Firefox extension called Scrapbook. Just select the HTML you want to keep from a web page, and you're nearly done. https://addons.mozilla.org/en-US/firefox/addon/427
I know I will probably get yelled at for posting this but Windows SharePoint Services 3.0 is free for download and does everything you need. You edit documents directly within the Microsoft Office suite. If you have MS Office this might be an option for you. If you are using OpenOffice, then probably not.
--TR
On a Mac, bibdesk wins hands down. It'll store and sort, search, use external editors, etc. Open source, and uses bibtex. http://bibdesk.sourceforge.net/
If you use a Mac and are in a LaTeX-centric field, I find BibDesk (http://bibdesk.sourceforge.net/) really great for managing reference PDFs, and I use CVS or SVN if I really want to manage a document I'm working on. There's no annotation in BibDesk, but you can record notes and it generates BibTeX for you.
m0nstr42.blogspot.com
I use BibDesk on the Mac, and I like it. Specifically, I like that it organizes all my PDFs into folders and stores all the data in a BibTeX file. The only problem I have with it is that it stores the paths as Mac OS X aliases, so instead of getting a nice pathname you get a hash that is 1500+ characters long. I'd really like a way to convert those back to paths so I could migrate in the future if I need to.
I used Mendeley for about 10 minutes, but I was impressed. It looked really good. It's cross-platform and web based. The only reason I'm not using it is that I had already started with BibDesk, and it just wasn't quite worth converting over (again, the pathname issues), but I'd recommend it.
Anything that doesn't support BibTeX is simply a non-starter.
By the standards of many countries outside of the USA, Obama is not a centrist at all - he is well to the right of centrist.
If libertarians are so opposed to effective government, why don't they all move to Somalia?
On OS X, check out Papers (already mentioned in response to OP), Skim (free) for awesomely marking up and notating PDFs, and DEVONthink Pro (optionally, the DEVONthink Pro Office version for OCR and added functionality). I don't think these will provide you with all the functionality you mentioned (version control, for instance). However, Skim and Papers play nicely together.
JabRef? http://jabref.sourceforge.net/ It's open source and cross-platform (Java). I use it to manage about 1500 articles and related academic texts in a mix of pdf, odt, and doc. You can add notes about files. It can operate standalone or can be connected to a shared MySQL database (to allow sharing of the files, their cites, and any notes you add). The one thing it can't do directly is annotate the original documents, but you could presumably annotate them using something else before putting them back in the database. It also allows saving of metadata to PDFs, which I think can be used to save your notes about the file to the PDF metadata. Not so useful for non-PDF documents though. Finally, it pipes directly to LaTeX and has a good plugin for OpenOffice, so if you use OpenOffice or LaTeX you're in business.
I'm not sure Cuba and Venezuela are great examples of where Obama is to the right... Thanks for playing however!
"I say we take off, nuke the site from orbit. It's the only way to be sure."
Sente (my favorite. From my point of view the one with most features)
http://www.thirdstreetsoftware.com/site/introduction.html
Bookends (great one - my main complaint is the interface)
www.sonnysoftware.com/
Papers (more limited compared to the others, but with the most appealing interface)
only Mac...
AbiWord can adequately handle almost all of the documents you want to manage (PDF support is getting better but could still use improvement), and you can mark them up and collaborate via abicollab.net. The best part about AbiWord is that it is portable to a large number of platforms, including handheld devices (Maemo), portableapps.com (for win32 on USB), Mac, and most *-nix as a package.
http://abicollab.net/
http://abisource.com/
That just shows how fucked up they are.
Using such stalwarts of The Left as Stalin, Hitler, Mussolini, Castro, Che Guevara, Chavez, or even Clement Attlee to show that Obama is on the right is silly.
Plus, the lack of democracy in the EU (see Constitution votes, and re-votes), the social decay of England, the nightly car burnings in Paris, the retreat from Freedom of Speech in Canada and the Netherlands, the finances of Greece, etc. etc. are not things to be emulated or to be proud of.
Just because nobody seems to have mentioned it: Okular, the default KDE 4 document viewer, allows you to annotate any doc it opens, stick post-its on things and add marker highlighting. Presumably this also links in with the document search system, which includes tagging via the Dolphin file manager. The bit it doesn't appear to do is collaboration, but it certainly seems that they're taking it in that direction, if indeed it isn't just a feature I've not found. I don't really know much about it.
And doesn't require you to download the document either. You can also put documents through a workflow if you need.
How about UpLib, at http://uplib.parc.com/? It's open source and designed explicitly for this purpose. It can store and index PDF, Word, Web pages, photos, email, etc., and supports extensive annotation and cross-linking capabilities. Automatically full-text indexes everything, and uses Lucene for searching. It runs fine on a laptop.
Nuxeo
So far it's one of the best I've tried, and it does a pretty great job of extracting all the reference/author data. As a desktop application, for my purposes at least, it seems just about perfect. My only current quibbles (only an hour or so into use) are 1) the way its search handles multiple matches within a document (hint: it doesn't) and 2) the way it displays matched documents (matches aren't highlighted and must be manually paged/scrolled to).
Those 2 points are kind of important issues for an indexing/search/research tool, but overall I'm still really impressed with the project, and features like the folder watch (rather than manually importing new documents) definitely add value.
Of course it's pretty slick too, which is always nice.
Quack, quack.
Onfolio was/is the best software for this, hands down. But M$ bought it and killed it. I'm still using a version I installed in 2007 and have tons of research in it, and I'm nervously awaiting the day when it no longer works with my browser and I have to dump all of my data as an .xml file and figure out what the hell to do with it. So I guess I have nothing to offer except to say: FUCK YOU MICROSOFT.
He's talking about Sweden, France, hell, Britain - the Tories are left of Obama on some issues, and Labor is well to the left.
I have been using Sente. It allows you to sync one library with three copies of Sente on different computers. It also allows some copies to have restricted access, so you can share your libraries with friends. It has a way to annotate within a PDF or to annotate the record.
Try http://abicollab.net/
Clicking "open" on the doc automatically loads the doc into AbiWord, which you can annotate as you like. Clicking "save" in AbiWord sends it back to http://abicollab.net/
You can easily share and collaborate in real time. You can tag your docs, share them amongst groups of people, etc.
I guess it's not quite what you're after, but its collaboration features might make it work better than you expect.
EMC has a great product, but it may be overkill. ApplicationXTender, in its most basic form, can do everything you are looking for. It is not cheap, but it can handle any document type (although the built-in viewer works only for PDF, TIFF, and MS Office documents, to my knowledge). ApplicationXTender can also integrate with any ODMA-compatible application to allow new documents to be indexed and stored within the document management system. It can integrate with many other applications using an Integration module. For example, you could index information from an invoice and send it directly into an accounting package as a new accounts payable item to be processed. That is really powerful if combined with a good scanning package such as Kofax, which can automatically recognize a document, OCR the index fields it is set up for, calculate its level of confidence, and either release the document automatically or prompt a user to verify what it has read, based on the confidence level. The client can run as a desktop application or as a web application.
Annotations can be added without checking a document out. Actually editing an editable document requires checking it out. Annotations can be as simple as highlighting something or as detailed as drawings and text. With the press of a button, annotations appear or disappear.
The database back end can be MS SQL (Express or full-blown), MySQL, Oracle, as well as others. Some add-ons require full-blown MS SQL, but the basic system can run on quite a large number of platforms. The software is modular, so each add-on service can be put on any supported server. The license server (the most important part of the system, since nothing works without it) seems to only run on Windows 2003 Server at this point, but 2008 support is supposed to be out in a couple of months. I don't think they ever plan to release it for Linux.
The Web Server (if desired) requires IIS. EMC supports SUSE as a web client, so it should work on any Linux as long as you have Firefox (I did a quick web search and did find some Linux users having trouble trying to access something using Opera and Firefox, but the error indicated an old version of ApplicationXTender).
ApplicationXTender is HIPAA compliant and has a full audit trail, so any changes to anything are logged. I don't let my users change index fields, and since only TIFF documents are currently being stored, a user can only add annotations, which do not affect the original document. I really have no use for the logs, but I can definitely see where they could be useful. Office documents are stored as revisions, so you can always revert to an earlier version and know who did what. Additional add-ons can allow full-text indexing (requires SQL Server) and document workflow.
I only use the basic system and use another software package (Kofax) to scan and automatically index documents, which is very nice. Kofax can check what it reads against a user-defined list, which improves accuracy for certain fields (such as a client's name). It can also use a database query to fill in information to be indexed, which is great for accounting records (look up a vendor ID and fill in the full name, for example). I also use another package (PlanetPress) to capture print streams, index the documents and send them directly into the system. Currently I am only using the system for accounting records and some rather static historical files. I plan to add the workflow module, but it is expensive and I have to develop my strategy for capturing mail as it comes into our office. We get tons of it for a company of our size: it takes one person about 2-3 hours to open and sort on an average day, and some days it can take all day when we get certain types of monthly bills. The number of different documents that come in also makes this difficult to automate. A large portion of our business is record keeping and bookkeeping for our clients, and the payoff on the system is very evident. What used to take hours of manual labor to find and put together can now be done in a matter of seconds.
Emacs!
So, I've been using LabMeeting and BibDesk on my Macs. But since I've got a Droid now, I'm still trying to figure out the best way to read all my papers on that device, and probably an iPad when it comes out. Since Flash is not supported, LabMeeting won't work. Obviously, BibDesk is not supported right now. In fact, I'm guessing most of the sites/programs mentioned in this discussion won't work. So? The only solution I can think of so far to store all my PDF's "in the cloud" is Google Docs or Dropbox. Dropbox might be the winner here, especially since you can get the revision history service. It's not free, but it's cheap enough to be worth it.
If potential document annotation s/w creators read this: please build in a feature that simulates the ubiquitous yellow marker for PDF documents. It would have made my life much easier while studying for the SCJP.
I've installed a combination of Refbase (http://www.refbase.net/index.php/Web_Reference_Database) and Drupal for our biological lab. One of the nice features of refbase is its ability to generate a permanent link to a reference and to store a pdf of it. That link can be pasted into a book in Drupal, so there is easy access to the actual reference. The book format with revisions and comments lends itself very well to group research and discussion. So far I have not found anything else that is as flexible and easy to use.
Insofar as every form of politics and government can be charted on a one-dimensional spectrum, I believe fascism is at least as often equated with the Right as with the Left.
docmeeting.com provides contextual annotations for Web pages, Microsoft Office and PDF files... Using a web-browser, you can very quickly store and annotate relevant web pages as you find them while browsing the Internet. The pages/documents are processed, stored on a centralized web server and can be labeled with "tags" for easy search and retrieval. All documents are enabled for instant collaboration via "in context" comments, directly within the Web browser. This simple access to annotation facilitates collaboration within workgroups. Added comments build value by providing an additional level of relevant information to the stored documents that can't be provided by automated search engines. I have designed this system, it's 12 years old and very unique. It's being used by big corporations (e.g. intranet HR departments) but free for the public.
As a member of the esteemed (please infer sarcasm) faculty of academia myself, I have been employing Drupal [drupal.org] to address this same issue. In my case, I have shifted toward converting my text-based content to plain text and simple HTML, housed in the database as individual webpage posts. This affords better flexibility with what I can do with the content. Also, there are several file management options in Drupal for media and PDF files. However, converting my text documents to web content resolves the issue you are referring to with download > edit > re-upload of each document. I realize it is essentially the same process when you edit an online post, but the process is a lot less cumbersome and can be done from a computer anywhere without downloading special document software to edit the document. What I like best about using Drupal for this is the flexibility to manage permissions, structure topics, tag, and reorganize content, among other things. I wouldn't rate it as user-friendly in this regard, but it definitely has some powerful capabilities that I am taking advantage of. Modules in Drupal allow for wiki-style editing, revision history, discussions, etc. You can also build a referencing and bibliography system associated with the content. I am also beginning to address a peer-review system, and modules have been developed to focus on a workflow review process.
From what I've heard http://www.agorum.com/ is what you're looking for.
Do they have a page in English? I couldn't see one.
When you use condescending statements like, 'Thanks for playing however!' you should really make sure all your ducks are in a row, or you end up looking the fool. Seriously, have you ever traveled outside the US? Read anything about other countries? We are, quite simply, the most right wing country in the entire world.
Other countries have actual socialists. They make our rabid left wingers look like Reagan worshiping free market ideologues. Other country's centrists are our far leftists. No other country has anything like our right wing. At least not as a serious political party. Sure, there are fascist groups everywhere, but only in America are they taken seriously by even a tiny fraction of the populace.
- None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
I'm afraid not, but the program should be multilingual, and "downloads" should not be too hard on English readers.
To add a small note to the thread, we've recently launched OQUMA, a Document Management service. It's like Alfresco or Nuxeo, but we are an online service (like KnowledgeTree).
We provide a business specific service to support Quality Management or Environmental Management.
Also, we have an open methodology (under Creative Commons), published in our Wiki. So you can just grab the documents, change, and upload them to create a management system with version control, archiving, document permission/ownership and search/indexing.
Thanks, Anibal
I don't know if this is quite what you're looking for, but I found Evernote (http://s.evernote.com/windows, with more in their blog post at http://s.evernote.com/win35blog) very useful when I was doing my Open University course. Good luck. mm