Ask Slashdot: Best PDF Handling Library?
New submitter Fotis Georgatos (3006465) writes: I recently engaged in a conversation about handling PDF documents for a range of needs, such as creation, manipulation, merging, text extraction and searching, digital signing, etc. A couple of potential picks popped up (PDFBox, iText), given the Java experience of the other fellows. And then comes the reality of choosing software as a long-term knowledge investment! Ideally, we would like to combine these features:
- open source, with a community following; the kind of stuff Slashdotters would prefer
- tidy software architecture; simple things should remain simple
- an open API usable from many languages (say, Python & Java)
- clear licensing status that does not preclude future commercial use
- serious multilingual & font support
- rich PDF-handling features that do not limit use for invoicing, e-commerce, reports & data mining
- digital signing should not conflict with other features
I'd like to poll the collective wisdom of the Slashdot crowd about which PDF-related libraries, if any, they have written software with that keep them happy for *all* the above reasons. And if not happy on all counts, what do they think is the best bet for learning one piece of software in the area, with great reusability across different circumstances and little need for extra hacks? I'd really like to hear the smoked-out war stories. It is easy to obtain a list of such libraries, yet tricky to understand whether people have had success with them!
Make this a requirement too: support for producing PDF/A.
sudo apt-get install wkhtmltopdf
wkhtmltopdf www.google.com google.com.pdf
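If you want to drive that same command from a script (the submitter mentioned Python), a minimal sketch could look like the following. It assumes the wkhtmltopdf binary is already installed and on PATH; the function names build_command and html_to_pdf are made up for illustration and are not part of any library.

```python
import shutil
import subprocess


def build_command(source, output_pdf):
    """Return the argument list for: wkhtmltopdf <source> <output_pdf>."""
    # Hypothetical helper: it only mirrors the two-argument call shown above.
    return ["wkhtmltopdf", source, output_pdf]


def html_to_pdf(source, output_pdf):
    """Run wkhtmltopdf and return the output path.

    Raises FileNotFoundError if the binary is not on PATH, and
    subprocess.CalledProcessError if wkhtmltopdf exits with an error.
    """
    if shutil.which("wkhtmltopdf") is None:
        raise FileNotFoundError("wkhtmltopdf not found on PATH")
    subprocess.run(build_command(source, output_pdf), check=True)
    return output_pdf


# Example call, equivalent to the shell line above:
# html_to_pdf("www.google.com", "google.com.pdf")
```

Passing the arguments as a list instead of a single shell string avoids quoting problems when the URL or file name contains spaces or special characters.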
Yes, the Flying Saucer Java library. It is one of the best XHTML-to-PDF conversion tools.
I'm using a non-free, but source-provided library called Clib-PDF. It's a pretty nice library with a pretty easy API, and even has PHP bindings (so it must've been a viable mainstream choice at one point). But somehow the company (or was it just a single guy) disappeared years ago. Luckily, we paid for and got the source, and I've been able to keep using it (and even fixing things in the source) without any ongoing support. So not quite open source, but not quite the disaster of discontinued closed source.
I suspect that the author of this library sold it to one of the commercial companies who proceeded to shut down a viable competitor. But who knows...