Ask Slashdot: How Do You Automatically Sanitize PDF Email Attachments?
First time accepted submitter supachupa writes "It seems that over the past couple of years spear phishing has become very convincing, and it is becoming more and more likely that someone (including myself) will accidentally click on a PDF attachment with malicious JavaScript embedded. It would be impossible to block PDFs, as they are required for business. We do disable JavaScript in Adobe Reader, but I would sleep a lot better knowing the code is removed completely. I have looked high and low but could not find a cheap out-of-the-box solution or a 'how to' guide for automatically neutralizing PDFs by stripping out the JavaScript. The closest thing I could find is using pdf2ps and then reversing the process with ps2pdf. Does anyone know of a solution for this that is not too complex, works preferably at the SMTP relay, and can work with zipped PDFs as well? Or does anyone have some common-sense advice for dealing with this so that once it's in place, no further action is required by myself or by users?"
As far as I know, Foxit Reader strips out any JavaScript. The PDF readers in Chrome and Firefox also should do the same.
The way I'd do it is to create a dummy printer driver that just writes to a file. Print the PDF to the dummy printer, which in turn creates a new PDF without all the junk.
Modifying a document can, for example, change its legal standing.
A solution that modifies the PDF viewer is much better than one that alters the document. That means not using Adobe. Pity the company refuses to build a version that doesn't run JavaScript in the first place.
Great little app for just such issues.
If you rasterize and re-encapsulate your users' PDF attachments, your users will hate you and work around your "stupid filter that breaks pdf attachments". You are better off blocking all PDF attachments by email. You'll save yourself a ton of work, and your users can skip the frustration of mangled attachments and go directly to working around your filter.
Signed PDFs can be read in any reader, but the signature will still be validated (if the reader is not defective). Encrypted PDFs will not even be readable if they are not encrypted to you. Password-protected PDFs may require the password to be readable, let alone printable or changeable.
In other words, PDFs are not designed for wanton modification. Some of them can be modified, but others cannot. This means that you cannot build a reliable method for converting suspect PDFs into safe PDFs.
In the install tree find the file JSByteCodeWin.bin and rename it. Works for me.
I did not know it was possible to detect JavaScript in a PDF, and I think this is possibly a better approach than a full rewrite (btw, I found this Python script: http://blog.didierstevens.com/programs/pdf-tools/). So instead of rewriting every PDF, you just delete any PDF attachment that is found to contain JavaScript. I assume this will then not break any legitimate PDFs that have comments, forms, etc.? It will need testing, I guess.
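To illustrate the detection idea, here is a minimal sketch of a keyword scan in the same spirit as the script linked above. It is not that script: the list of names, the function names, and the decision to flag only /JS and /JavaScript are assumptions made for illustration. It looks only at the raw file bytes, so names hidden inside compressed object streams or encrypted files will be missed, and stray binary data can cause false positives.

```python
import re

# Name tokens that commonly accompany active content. This list is an
# assumption for illustration and is not exhaustive.
SUSPICIOUS_NAMES = (b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/Launch")

# A PDF name is "/" followed by characters that are neither whitespace
# nor delimiters.
_NAME = re.compile(rb"/[^\x00\t\n\x0c\r ()<>\[\]{}/%]+")
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_name(raw: bytes) -> bytes:
    """Undo #xx escapes, so /J#61vaScript is seen as /JavaScript."""
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def suspicious_names(pdf_bytes: bytes) -> dict:
    """Count each suspicious name found in the raw (uncompressed) bytes."""
    counts = {name: 0 for name in SUSPICIOUS_NAMES}
    for match in _NAME.finditer(pdf_bytes):
        name = decode_name(match.group(0))
        if name in counts:
            counts[name] += 1
    return counts


def has_javascript(pdf_bytes: bytes) -> bool:
    """True if the scan sees a /JS or /JavaScript name."""
    counts = suspicious_names(pdf_bytes)
    return counts[b"/JS"] > 0 or counts[b"/JavaScript"] > 0
```

Because whole names are compared after unescaping, a name such as /JSomething is not counted as /JS, while the hex-escaped spelling /J#61vaScript is.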
The mail relay can then be configured to detect and delete any PDFs that contain JavaScript and let everything else through (including encrypted PDFs, which are more likely to be legit than not). Once again, this is not the only protection against this malicious code, just one facet. I found some recent exploits that don't need JavaScript at all, so it seems the safest approach, and the one most likely to make you hated, is to rewrite the PDF completely or not allow PDFs at all.
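A rough sketch of what the relay-side step could look like, using only the Python standard library. The function names are hypothetical, the detector is passed in as a callable (any check that takes PDF bytes and returns true or false), and hooking this into a particular SMTP relay is left out. It also looks inside ZIP attachments, as the original question asks. It assumes the message was parsed with `email.policy.default` so that `iter_attachments()` is available, and it sets no size limits, which a real relay would need to guard against oversized archives.

```python
import io
import zipfile
from email.message import EmailMessage


def pdf_payloads(part):
    """Yield (name, bytes) for a PDF attachment, or for PDFs inside a ZIP."""
    name = (part.get_filename() or "").lower()
    data = part.get_payload(decode=True) or b""
    if name.endswith(".pdf") or data.startswith(b"%PDF"):
        yield name, data
    elif name.endswith(".zip") and zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for member in archive.namelist():
                if member.lower().endswith(".pdf"):
                    yield member, archive.read(member)


def drop_flagged_attachments(msg, is_suspicious):
    """Remove attachments whose PDF content is flagged by is_suspicious.

    Returns the filenames of the removed attachments. Anything the
    detector does not flag (including PDFs it cannot read) is kept.
    """
    flagged = [
        part for part in msg.iter_attachments()
        if any(is_suspicious(data) for _name, data in pdf_payloads(part))
    ]
    payload = msg.get_payload()  # the live list of parts for multipart mail
    for part in flagged:
        payload.remove(part)
    return [part.get_filename() for part in flagged]


if __name__ == "__main__":
    # Build a small test message: one flagged PDF and one clean PDF.
    demo = EmailMessage()
    demo["Subject"] = "demo"
    demo.set_content("see attached")
    demo.add_attachment(b"%PDF-1.4 /JS (x)", maintype="application",
                        subtype="pdf", filename="bad.pdf")
    demo.add_attachment(b"%PDF-1.4 plain", maintype="application",
                        subtype="pdf", filename="good.pdf")
    # A deliberately crude detector, used here only for the demo.
    print(drop_flagged_attachments(demo, lambda pdf: b"/JS" in pdf))
```

A whole ZIP is dropped if any PDF inside it is flagged, which is the simplest policy; repacking the archive without the offending member would be friendlier but is more work.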
I use Ghostscript when attempting to compress a "bloated" PDF (such as generated by Xsane). The input is a PDF, output is a PDF:
# Use ghostscript to re-write the PDF
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=new.pdf old.pdf
Also handy to combine multiple PDFs into a single document, or copy out certain pages from a PDF:
# Combine PDFs
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=combined.pdf 01.pdf 02.pdf 03.pdf
# Copy pages 3 & 4 from an existing PDF
gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dFirstPage=3 -dLastPage=4 -sOutputFile=new.pdf current.pdf
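For anyone who wants to run the first re-write command above from a mail filter script, here is a minimal Python wrapper. The function names and the 60-second timeout are arbitrary choices, and -dSAFER (a standard Ghostscript option that restricts file access) has been added to the command as a precaution that was not in the original. Whether the re-written file is actually free of active content is not something this sketch checks; the output should be verified separately, and signed or encrypted PDFs will not survive the round trip intact.

```python
import subprocess
from pathlib import Path


def gs_rewrite_command(src, dst, gs_binary="gs"):
    """Build the Ghostscript command line for a pdfwrite round trip."""
    return [
        gs_binary, "-dBATCH", "-dNOPAUSE", "-q",
        "-dSAFER",                 # added here; not in the original command
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={dst}",
        str(src),
    ]


def rewrite_pdf(src, dst, gs_binary="gs", timeout=60):
    """Run Ghostscript; return True if it exited cleanly and wrote dst."""
    try:
        result = subprocess.run(
            gs_rewrite_command(src, dst, gs_binary),
            capture_output=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Ghostscript missing, not executable, or stuck on a hostile file.
        return False
    return result.returncode == 0 and Path(dst).is_file()
```

A caller would typically write the output to a temporary file and swap it in for the attachment only when `rewrite_pdf` returns True, falling back to quarantining the original otherwise.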