Slashdot Mirror


Ask Slashdot: How Do You Automatically Sanitize PDF Email Attachments?

First time accepted submitter supachupa writes "Over the past couple of years spear phishing has become very convincing, and it is more and more likely that someone (myself included) will accidentally click on a PDF attachment with malicious JavaScript embedded. Blocking PDFs outright is not an option, as they are required for business. We do disable JavaScript in Adobe Reader, but I would sleep a lot better knowing the code had been removed completely. I have looked high and low but could not find a cheap out-of-the-box solution or a 'how to' guide for automatically neutralizing PDFs by stripping out the JavaScript. The closest thing I could find is converting with pdf2ps and then reversing the process with ps2pdf. Does anyone know of a solution that is not too complex, preferably works at the SMTP relay, and can handle zipped PDFs as well? Or does anyone have some common-sense advice for dealing with this so that, once it's in place, no further action is required by me or by users?"

21 of 238 comments

  1. Foxit Reader? by Anonymous Coward · · Score: 5, Informative

    As far as I know, Foxit Reader strips out any JavaScript. The PDF readers in Chrome and Firefox also should do the same.

    1. Re:Foxit Reader? by fuzzyfuzzyfungus · · Score: 5, Insightful

      That isn't really 'sanitizing', though: It's certainly good that you practice safe text on your computer; but if you are the mailserver guy, and may or may not have as much control as you'd like over the users and their filthy, weatherbug-encrusted, systems, you want to modify the file such that it no longer contains a potential payload, not merely use a reader that doesn't execute payloads.

    2. Re:Foxit Reader? by Mashdar · · Score: 4, Informative

      I run a Ghostscript shell script to print a PDF as a new PDF:

      gs -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=NEW_FILE.pdf -dBATCH OLD_FILE_1.pdf OLD_FILE_2.pdf

      In this case OLD_FILE_1.pdf and OLD_FILE_2.pdf will be combined into NEW_FILE.pdf. AFAIK this strips JavaScript.
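For anyone wanting to call that Ghostscript rewrite from a filter script, a minimal Python sketch might look like the following. The function names `build_gs_command` and `rewrite_pdf` are hypothetical, the `-dSAFER` and `-q` switches and the 60-second timeout are additions that are not part of the command above, and nothing here verifies that the output is actually free of JavaScript (the comment itself only says "AFAIK").

```python
import shutil
import subprocess
from pathlib import Path


def build_gs_command(inputs, output, gs_binary="gs"):
    """Return the argv list for rewriting one or more PDFs with pdfwrite."""
    return [
        gs_binary,
        "-dBATCH",            # exit when the input files are done
        "-dNOPAUSE",          # do not prompt between pages
        "-dSAFER",            # assumption: restrict file access while processing
        "-q",                 # suppress the startup banner
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={output}",
        *[str(p) for p in inputs],
    ]


def rewrite_pdf(inputs, output, timeout=60):
    """Run Ghostscript; raise if it is missing, fails, or takes too long."""
    if shutil.which("gs") is None:
        raise FileNotFoundError("Ghostscript ('gs') not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status,
    # so a malformed PDF is rejected rather than passed through.
    subprocess.run(build_gs_command(inputs, output), check=True, timeout=timeout)
    return Path(output)
```

Passing the arguments as a list (rather than a shell string) means attacker-controlled attachment names are never interpreted by a shell.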

  2. Print to PDF by digitalhermit · · Score: 4, Informative

    The way I'd do it is to create a dummy printer driver that just writes to a file. Print the PDF to the dummy printer, which in turn creates a new PDF without all the junk.

    1. Re:Print to PDF by Kludge · · Score: 4, Informative

      Like
      lpr -P Cups-PDF file.pdf

    2. Re:Print to PDF by DJ+Jones · · Score: 5, Interesting

      Sadly, a lot of PDF printers will retain JavaScript code even if you print the file and reassemble it into a PDF. The problem lies in the fact that Adobe allows JavaScript to be embedded inside image objects and compressed blocks of PDF binary. It's not as simple as opening the file and stripping out anything that starts with a known marker. Code can be fired on almost any user event and it can be attached to almost any high-level object. It's not impossible to create a scrubber, but it's a lot more complicated than you might think.

      I spent the better part of a week attempting to create a PDF scrubber at my office for this same reason. We had become the victim of highly targeted attacks using PDFs. I wrote a scrubber in PHP using an open-source PDF parser and a series of regular expressions to strip out any JavaScript. At the end of the day, I came very close to a working solution, but I ran into issues with encrypted PDFs.

      The project was shelved in favor of making users open all external PDFs on a virtual server that was hardened and re-imaged every evening to prevent any malicious code from running rampant. That's the simplest solution.
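The point about compressed blocks can be illustrated with a small Python sketch. It builds a hypothetical PDF object fragment (not a complete, valid PDF) whose JavaScript action sits inside a Flate-compressed stream, then compares a plain byte search with one that inflates streams first. The helper names are made up for the illustration, and a real scrubber would also have to deal with other stream filters, object streams, and encryption, which this sketch ignores.

```python
import re
import zlib

# Hypothetical content for illustration: an action dictionary carrying JavaScript.
hidden = b"<< /S /JavaScript /JS (app.alert('hi');) >>"

# Wrap it in a Flate-compressed stream, as a PDF writer might.
compressed = zlib.compress(hidden)
fragment = (
    b"5 0 obj\n<< /Length " + str(len(compressed)).encode()
    + b" /Filter /FlateDecode >>\nstream\n"
    + compressed
    + b"\nendstream\nendobj\n"
)

STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def naive_scan(data):
    """Plain byte search, roughly what a simple regex scrubber would do."""
    return b"/JavaScript" in data or b"/JS" in data


def scan_with_inflate(data):
    """Also try to inflate every stream body before searching it."""
    if naive_scan(data):
        return True
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates trailing bytes such as the final newline.
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data, or a filter this sketch does not handle
        if naive_scan(inflated):
            return True
    return False


print("naive scan finds JavaScript:  ", naive_scan(fragment))
print("inflating scan finds JavaScript:", scan_with_inflate(fragment))
```

The plain search misses the keyword because only the compressed bytes appear in the file, which is one reason a regex-only scrubber is harder to get right than it first appears.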

    3. Re:Print to PDF by nicolastheadept · · Score: 4, Funny

      Converting to JPEG? You're a terrible human being.

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

  3. Be careful modifying documents by Anonymous Coward · · Score: 5, Informative

    Modifying a document can, for example, change its legal standing.

    A solution that modifies the PDF viewer is much better than one that alters the document. That means not using Adobe. Pity the company refuses to build a version that doesn't do Javascript in the first place.

    1. Re:Be careful modifying documents by macbeth66 · · Score: 4, Informative

      I believe that for a PDF document to be a legal document, it needs to be in PDF/A format. This format prohibits the use of executable code, such as JavaScript.

    2. Re:Be careful modifying documents by godrik · · Score: 4, Insightful

      Where does this belief come from? Why would there be any format requirement on these things? The requirement would need to be in the law or in a court judgment. Is the law going to be that precise over electronic communications? (Not trying to bitch, just really wondering)

    3. Re:Be careful modifying documents by Aaron+B+Lingwood · · Score: 4, Informative

      I believe that for a PDF document to be a legal document, it needs to be in PDF/A format.

      Where does this belief come from?

      Many states have legislation regarding the font, margins and paper sizes used for some legal documents.

      US courts, archivists and many case management / COPS systems only accept documents in PDF/A.

      --
      [Rent This Space]

  4. Sumatra PDF by shellster_dude · · Score: 5, Insightful

    Check out SumatraPDF (http://blog.kowalczyk.info/software/sumatrapdf/free-pdf-reader.html). It's super fast and does not support JavaScript or ActionScript in PDFs. I use it exclusively now.

  5. javascript? by sjames · · Score: 4, Insightful

    Why in the world is JavaScript included in PDF documents? PDF is already a Forth-like programming language and environment.

  6. You don't by PNutts · · Score: 5, Interesting

    At some point you trust technology and also reinforce proper user behavior. I hate catch-phrases but your e-mail hygiene should have layers of protection (defense in depth). Assuming that the message got through IP reputation filters, SPAM analysis, malware scans, and was delivered to your user, you rely on desktop protection and cross your fingers that nobody opens it.

    We have SMTP appliances from Axway, and we used to stop all executable attachments and deliver a notification telling the user to call the help desk and request a release. Times changed and we don't do that any more. However, you could annotate the message to remind users not to open an attachment if they don't know who it's from, don't know what it is, or weren't expecting it. Some will anyway. We also used to hold certain attachments for four hours, until the virus definitions (and the other defenses) had received a couple of updates, and then reprocess the message.

    If you do try to roll your own, be aware that everyone and their dog creates PDF files with varying degrees of success and we had certain PDF files that caused services to fail on our gateway while they tried to scan and process them. You didn't mention the volume but make sure your solution scales well.

  7. Re:Why are you doing this? by tftp · · Score: 4, Informative

    Signed PDFs can be read in any reader, but the signature will be still validated (if the reader is not defective.) Encrypted PDFs will not be even readable if they are not encrypted to you. Password-protected PDFs may require the password to be readable, let alone printable or changeable.

    In other words, PDFs are not designed for wanton modification. Some of them can be modified, but others cannot. This means that you cannot build a reliable method for converting suspect PDFs into safe PDFs.

  8. Re: Just block PDFs with javascript by Anonymous Coward · · Score: 5, Funny

    That link is to a PDF! How do I know it's not a trap? Oh, the dichotomy :-(

  9. You Don't by SuperCharlie · · Score: 5, Interesting

    For a long time, I thought like you, that it was my duty to ward off and protect the "children". After a while, you realize 2 things.

    First, it is most likely your duty to inform and educate. Do that. Do it well, do it loud, and do it as often as you can. When someone eventually opens up one of those attachments, it will get around, and peer pressure will make everyone else gun-shy. After a user or two of mine got bitten by an attachment, despite my repeated warnings about these things, I ended up with people at my desk occasionally asking, "Can you come look at this? It just looks funny." It was all about the peer pressure and not wanting to be That Guy who clicked the stupid link.

    Second, and I hate to say it, this is what we do, and this is job security. You can't save em all Hasselhoff, if ya did, there would be nothing left to do..

  10. Other options not always an option by fermat1313 · · Score: 4, Insightful

    Lots of people here saying "Don't use Adobe" and suggesting alternatives. Reality is, for many of us, we deal with complex PDF forms and applications that integrate directly with Adobe Acrobat. In my business (CPA firm) we use lots of applications, and most of them are highly vertical with often just one realistic competitor that can function adequately for a firm our size. Many of our apps integrate directly with Acrobat (and Office) so not using Acrobat simply isn't a choice we can make.

    So how do we deal with Adobe Acrobat? As some pointed out earlier, defense in depth. Spam filters, multiple virus scans, and our two most important measures: End users don't have admin on their computers and Adobe is one of our "High Priority" upgrade applications. Updates must be pushed out within one day of being released.

    BTW, the other High Priority apps are Java and Flash, again, both required by our software. With Acrobat, they make up my "Axis of Evil" of insecure software.

  11. acrobat reader sanitized 100% by jjohn_h · · Score: 5, Informative

    In the install tree find the file JSByteCodeWin.bin and rename it. Works for me.
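For several machines, that rename could be scripted. The Python sketch below is only an illustration: the function name `disable_reader_js`, the `.disabled` suffix, and the idea of passing the install directory as an argument are assumptions, and the actual install path varies by Reader version. A later Reader update may also put the file back.

```python
from pathlib import Path


def disable_reader_js(install_root, filename="JSByteCodeWin.bin", suffix=".disabled"):
    """Rename every copy of `filename` under `install_root`; return the new paths."""
    renamed = []
    # Collect matches first so renaming does not disturb the directory walk.
    for path in list(Path(install_root).rglob(filename)):
        target = path.with_name(path.name + suffix)
        path.rename(target)
        renamed.append(target)
    return renamed
```

Renaming rather than deleting keeps the change easy to undo if an application turns out to depend on the file.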

  12. Summary by supachupa · · Score: 4, Informative

    So the vast majority of people are recommending to ditch Adobe Acrobat, which is not where I was wanting to focus the discussion, but I appreciate your advice. I do agree that using something like Sumatra would be a good part of a defense-in-depth approach, but that approach does not protect your organisation from inadvertently sending out an infected PDF to another organisation.

    I did not know it was possible to detect JavaScript in a PDF, and I think this is possibly a better approach than a full rewrite (BTW, I found this Python script: http://blog.didierstevens.com/programs/pdf-tools/ ). So instead of rewriting every PDF, you just delete any PDF attachments in which JavaScript is detected. I assume this will not break any legitimate PDFs that have comments or forms, etc.? It will need testing, I guess.

    The mail relay can then be configured to detect and delete any JavaScript-containing PDFs and allow everything else through (including encrypted PDFs, which are more likely to be legit than not). Once again, this is not the only protection against this malicious code, just one facet. I found some recent exploits that don't need JavaScript at all, so it seems the safest approach, and the one most likely to make you hated, is to rewrite the PDF completely or not allow PDFs at all.
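A rough Python sketch of that detect-and-drop idea, including PDFs inside ZIP attachments, might look like this. It is not the linked script and does not reproduce its logic; it is a simple keyword heuristic using only the standard library, and all function names are hypothetical. It assumes the relay can hand a raw message to a filter, only looks for `/JS` and `/JavaScript` names in the raw bytes (so keywords hidden in compressed or encrypted data are missed), and does nothing about oversized or deeply nested archives.

```python
import io
import re
import zipfile
from email import message_from_bytes, policy

_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")
_JS_NAME = re.compile(rb"/(?:JS|JavaScript)(?![A-Za-z0-9])")


def pdf_mentions_javascript(data):
    """Heuristic: True if a /JS or /JavaScript name appears in the raw bytes."""
    # PDF names may hex-escape characters (e.g. /J#61vaScript), so undo that first.
    normalized = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)
    return _JS_NAME.search(normalized) is not None


def iter_pdfs(filename, payload):
    """Yield (name, bytes) for a PDF attachment or for PDFs inside a ZIP."""
    lower = filename.lower()
    if lower.endswith(".pdf"):
        yield filename, payload
    elif lower.endswith(".zip"):
        try:
            with zipfile.ZipFile(io.BytesIO(payload)) as archive:
                for info in archive.infolist():
                    if info.filename.lower().endswith(".pdf"):
                        yield f"{filename}!{info.filename}", archive.read(info)
        except (zipfile.BadZipFile, RuntimeError):
            # Corrupt or password-protected archive: left to other policy.
            return


def flagged_attachments(raw_message):
    """Return the names of PDF attachments that mention JavaScript."""
    message = message_from_bytes(raw_message, policy=policy.default)
    flagged = []
    for part in message.iter_attachments():
        name = part.get_filename() or ""
        payload = part.get_payload(decode=True) or b""
        for pdf_name, pdf_bytes in iter_pdfs(name, payload):
            if pdf_mentions_javascript(pdf_bytes):
                flagged.append(pdf_name)
    return flagged
```

A relay hook could then quarantine or strip any message for which `flagged_attachments` returns a non-empty list. As the summary says, this would need testing against legitimate PDFs with forms or comments before being trusted.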

  13. Ghostscript by nullchar · · Score: 4, Informative

    I use Ghostscript when attempting to compress a "bloated" PDF (such as generated by Xsane). The input is a PDF, output is a PDF:

    # Use ghostscript to re-write the PDF
    gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=new.pdf old.pdf

    Also handy to combine multiple PDFs into a single document, or copy out certain pages from a PDF:

    # Combine PDFs
    gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=combined.pdf 01.pdf 02.pdf 03.pdf

    # Copy pages 3 & 4 from an existing PDF
    gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dFirstPage=3 -dLastPage=4 -sOutputFile=new.pdf current.pdf