Slashdot Mirror


Ask Slashdot: Automated Verification For Uploaded Files?

VernonNemitz writes: There are a lot of ways for hackers to abuse a web site, but it seems to me that one of them is receiving less attention than it deserves: the simple uploading of a malware file that has an innocent file-name extension. I'm looking for a simple file-type verification program that the site could automatically run on each uploaded file, to test whether it is actually the type of file that its file-name extension claims it is. That way, if it ever gets double-clicked, we can be assured it won't hijack the system or worse. At the moment I'm only interested in testing .png files, but I'm sure plenty of web site operators would want to be able to test other file types. A quick Googling indicates the existence of a validator project under the OWASP umbrella, but is it the best choice, and what other choices are there?

14 of 74 comments

  1. Would be easier to check if potentially harmful by alzoron · · Score: 2

    It would be simpler to just check if it's executable in some way and then, if it has a file extension that doesn't match, throw up a red flag.

    1. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 5, Informative

      this is pretty easy in *nix:

      $ file lobotomy.png
      lobotomy.png: PNG image data, 298 x 300, 8-bit/color RGB, non-interlaced

      $ file jetpack.png
      jetpack.png: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped
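

For anyone who wants the same idea without shelling out, here is a minimal Python sketch of a magic-number check that flags an extension/content mismatch. The signature table is deliberately tiny, and the names `SIGNATURES`, `EXPECTED_KIND`, `sniff_kind` and `extension_matches_content` are made up for illustration; the real `file` utility consults a far larger magic database.

```python
import os

# Leading-byte signatures for a few common formats (not an exhaustive list).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\x7fELF", "elf-executable"),
    (b"MZ", "windows-executable"),
    (b"#!", "script"),
]

# Which sniffed kind is acceptable for which file-name extension (assumed policy).
EXPECTED_KIND = {".png": "png"}


def sniff_kind(data: bytes) -> str:
    """Return a coarse type label based only on the first bytes of the file."""
    for magic, kind in SIGNATURES:
        if data.startswith(magic):
            return kind
    return "unknown"


def extension_matches_content(filename: str, data: bytes) -> bool:
    """True only if the extension is one we know and the magic bytes agree."""
    ext = os.path.splitext(filename)[1].lower()
    expected = EXPECTED_KIND.get(ext)
    return expected is not None and sniff_kind(data) == expected
```

With this, a file named jetpack.png whose first bytes are the ELF magic is rejected, while one starting with the PNG signature passes. Like `file`, it only looks at the start of the data.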

    2. Re:Would be easier to check if potentially harmful by gstoddart · · Score: 4, Insightful

      this is pretty easy in *nix:

      $ file lobotomy.png
      lobotomy.png: PNG image data, 298 x 300, 8-bit/color RGB, non-interlaced

      $ file jetpack.png
      jetpack.png: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

      This bears pointing out.

      UNIX systems have used "magic" for decades, and try to identify based on the actual file contents instead of its name.

      And then Microsoft came along, decided the extension was magic and reliable, and then also decided to hide well known extensions (which created new problems).

      Relying on the file name has pretty much always been a terrible way of dealing with this, because it became exactly how malware targeted people: with well-known extensions hidden, a file named something.gif.exe showed up without the .exe part, and people thought it was a .gif.

      Trusting a file name for an operating system to take action has pretty much always been a terrible idea. But, historically, Microsoft has been more focused on dumbing down the system than making it more secure.

      --
      Lost at C:>. Found at C.
    3. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 3, Informative

      It's worth noting that this is just a heuristic. A pretty good heuristic for most cases, but a heuristic nonetheless. A file can be a valid-looking PNG and still be malicious. (Heck, it can be valid and still malicious.)

      As far as validity is concerned, if you want to go further than file magic checks, you can parse the uploaded file as the expected type. For example, opening it with a library or utility intended for working with those files.

      A simple PNG check with ImageMagick:
      $ convert png:rot66.png info:-
      rot66.png PNG 116x128 116x128+0+0 8-bit sRGB 15.5KB 0.000u 0:00.069
      [Return status 0]

      $ convert png:rote66.png info:-
      convert: Expected 8192 bytes; found 434 bytes `rote2.png' @ warning/png.c/MagickPNGWarningHandler/1831.
      convert: Read Exception `rote2.png' @ error/png.c/MagickPNGErrorHandler/1805.
      convert: corrupt image `rote2.png' @ error/png.c/ReadPNGImage/4106.
      convert: no images defined `info:-' @ error/convert.c/ConvertImageCommand/3187.
      [Return status 1]

      The file utility thinks rote66.png is a PNG, and it's sorta correct. It has a PNG header at least.

      Again, being able to load a file as a particular type doesn't mean it's not malicious. That rot66.png file in the example above has a chunk of PHP in it. Using another vulnerability to get it included in a PHP script gives some binary junk and the output of phpinfo when it's run. Parsing and resaving can help stem _some_ of this, but it's not without headaches and caveats of its own.
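

As a rough illustration of going one step past the magic check, the Python sketch below walks the PNG chunk list and verifies each chunk's length and CRC, that IHDR comes first, and that nothing follows IEND. It is a structural check only and an assumption of mine about what "parse it as the expected type" could look like; it does not decode pixels, and a file that passes can still carry a payload inside a well-formed chunk. `png_structure_ok` is a hypothetical name.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_structure_ok(data: bytes) -> bool:
    """Check signature, chunk lengths and CRCs, IHDR first, IEND last."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    pos = len(PNG_SIGNATURE)
    chunk_types = []
    while pos < len(data):
        if pos + 8 > len(data):
            return False  # truncated chunk header
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4  # header + payload + CRC
        if end > len(data):
            return False  # chunk claims more bytes than the file holds
        payload = data[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        if zlib.crc32(ctype + payload) != stored_crc:
            return False  # CRC covers the chunk type plus its payload
        chunk_types.append(ctype)
        pos = end
        if ctype == b"IEND":
            break
    return (
        bool(chunk_types)
        and chunk_types[0] == b"IHDR"
        and chunk_types[-1] == b"IEND"
        and b"IDAT" in chunk_types
        and pos == len(data)  # reject trailing bytes after IEND
    )
```

A file with a PNG header but a short body, like the one `convert` complains about above, fails the length check here as well.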

    4. Re:Would be easier to check if potentially harmful by bjwest · · Score: 2

      UNIX systems have used "magic" for decades, and try to identify based on the actual file contents instead of its name.

      That sounds terrible.

      And then Microsoft came along, decided the extension was magic and reliable

      Better to use the file extension than it is to load and execute dancingbunny.png.

      No, UNIX systems don't arbitrarily execute just any file. dancingbunny.exe would not run unless its executable bit is set, except on Windows where the .exe makes it executable, and if that's hidden - well, you have the nightmare that exists due to it. Of course, if you're just clicking icons on your desktop you're in deep shit with either OS.

      --

      --- Keep the choice with the user..
  2. fileutils by i.r.id10t · · Score: 3, Insightful

    Well, if you are running on a Linux or Unix/BSD host, you can use the "file" utility.

    Of course, that means that you need to have shell_exec() or exec() or whatever your programming language of choice uses for running shell commands, and the other security dangers/issues involved with allowing that type of stuff.
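

One way to reduce (not remove) the shell-related risk is to skip the shell entirely and pass an argument list. The sketch below uses Python rather than PHP purely for illustration; `sniff_mime_with_file` is a hypothetical helper, and the 5-second timeout is an arbitrary assumption. It does nothing about bugs inside `file` itself.

```python
import shutil
import subprocess


def sniff_mime_with_file(path: str, timeout_s: float = 5.0):
    """Ask file(1) for a MIME type without going through a shell.

    Returns None if the utility is missing, fails, or times out.
    """
    exe = shutil.which("file")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            # Argument list, so the path is never interpreted by a shell;
            # "--" keeps a path starting with "-" from being read as an option.
            [exe, "--brief", "--mime-type", "--", path],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

The caller would then compare the returned string against an allow-list such as "image/png" and treat None as a rejection.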

    What may be best/easiest/safest would be to NOT allow direct HTTP access to the uploaded files, but rather use a wrapper script that would send appropriate headers to make the browser treat the file as "application/octet-stream" (sent with "Content-Disposition: attachment"), or whatever content type will force a "save as" dialog instead of opening with a plugin, auto opening with a local application, etc.
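

A sketch of the headers such a wrapper might send, in Python for illustration. The choice of `application/octet-stream` plus `Content-Disposition: attachment` and `X-Content-Type-Options: nosniff` is my assumption about a reasonable set, not something the comment above specifies; `download_headers` is a made-up name.

```python
import re


def download_headers(stored_name: str, size: int):
    """Build response headers that ask the browser to save, not render."""
    # Keep only a conservative character set in the suggested file name so
    # nothing user-supplied can break out of the quoted header value.
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "_", stored_name) or "download"
    return [
        ("Content-Type", "application/octet-stream"),
        ("Content-Disposition", f'attachment; filename="{safe_name}"'),
        ("X-Content-Type-Options", "nosniff"),
        ("Content-Length", str(size)),
    ]
```

The wrapper would emit these before streaming the file body from a directory the web server does not serve directly.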

    --
    Don't blame me, I voted for Kodos
    1. Re:fileutils by zarr · · Score: 2

      Even if you manage to invoke file in a safe manner, you probably shouldn't. The file utility isn't immune to security issues either. A quick google found at least 3 different CVEs from 2014 alone. Don't expose stuff that wasn't designed with a hostile Internet in mind to a hostile Internet. Anyway, if file says it's a png file, it doesn't mean it's a _safe_ png file.

      A paranoid (or sensible, depending on how juicy a target you are) way to handle it is to isolate the thing that verifies the file in some kind of sandbox, either a container or full VM with no access to anything. Pass the file to it and accept nothing back except raw pixel data. On the outside you re-encode it as a .png and pass that along to your users. Afterwards, assume the sandbox is full of nasties. Nuke it from orbit.

  3. PHP by darkain · · Score: 2

    In PHP, simply run something like the following against the file and see if you get a valid result back

    http://php.net/manual/en/funct...
    http://php.net/manual/en/funct...

  4. The file command? by mlts · · Score: 2

    The file command does exactly this. Type in "file foo" and it will tell you what it is.

    No need to add any additional software to the Linux box.

  5. Not the right question. by Anonymous Coward · · Score: 3, Insightful

    test it to see if it is actually the type of file that its file-name extension claims it is.

    There are various ways to make "hybrid" files which are multiple types. Graphics files which are also archives, etc. What you really want to do is normalize the files to the type they're supposed to be. PNGs are a good candidate for this because PNG is lossless, so you can decode the image and re-encode it without losing information.

  6. Won't work by Bogtha · · Score: 3, Insightful

    test it to see if it is actually the type of file that its file-name extension claims it is.

    This won't work because a file can be a valid file in multiple formats at once and it can also be an invalid file that is nevertheless interpreted as a valid file as well.

    Take for example, a plain-text file. Harmless, right? Nope. It can also be a valid HTML file containing executable JavaScript. Or an XML file containing a billion laughs attack.

    Or take media type sniffing. Some browsers bend over backwards to interpret crap as HTML even when labelled otherwise by the Content-Type HTTP header. So one attack is to stuff enough HTML into PNG metadata to confuse a browser that doesn't follow the standards into thinking that it's HTML. This is a valid PNG file and anything that checks to see if it's really a PNG file will tell you that much. But it's still not safe.
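

One partial mitigation for the metadata case is to drop every ancillary chunk and anything after IEND before storing the file. The Python sketch below does only that; it is a lighter technique than a full decode and re-encode, it can change rendering (transparency and gamma chunks are ancillary too), and it does nothing about data hidden in the pixel stream. `strip_ancillary_chunks` and `CRITICAL_CHUNKS` are names I made up.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
CRITICAL_CHUNKS = {b"IHDR", b"PLTE", b"IDAT", b"IEND"}


def strip_ancillary_chunks(data: bytes) -> bytes:
    """Return a copy of the PNG holding only critical chunks, or raise ValueError."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG signature")
    out = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):  # 12 = length + type + CRC of an empty chunk
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated chunk")
        body = data[pos + 4:end - 4]  # chunk type + payload
        if zlib.crc32(body) != struct.unpack(">I", data[end - 4:end])[0]:
            raise ValueError("bad chunk CRC")
        if ctype in CRITICAL_CHUNKS:
            out.append(data[pos:end])  # text and other metadata chunks are skipped
        pos = end
        if ctype == b"IEND":
            return b"".join(out)  # bytes after IEND are dropped
    raise ValueError("no IEND chunk found")
```

Serving the result with an explicit image/png type and a nosniff header still matters, since this only removes one hiding place.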

    --
    Bogtha Bogtha Bogtha
  7. yes, but directory traversal and buffer dos, so. . by raymorris · · Score: 3, Informative

    This is on the right track, because as others have said, just because it's valid png doesn't mean it's not also valid PHP and Javascript. I just pulled a file like that off a server yesterday.

    HOWEVER, -all- of the "download.php" scripts I've ever looked at have at least two of the same three vulnerabilities: directory traversal (protecting against it is harder than it looks), remote URL opening (allow_url_fopen), and memory depletion from failing to disable the output buffer before reading and writing chunks of the file.
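

To make two of those concrete, here is a hedged Python sketch (not PHP, so the output-buffer detail does not carry over directly) of a containment check against traversal and of reading the file in fixed-size chunks. `resolve_upload_path` and `iter_file_chunks` are hypothetical names, and the 64 KiB chunk size is an arbitrary assumption. The remote-URL issue is not covered.

```python
import os


def resolve_upload_path(base_dir: str, requested_name: str) -> str:
    """Map a requested name to a path inside base_dir, or raise ValueError."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, requested_name))
    # realpath collapses ".." and symlinks; commonpath then confirms containment.
    if os.path.commonpath([base, candidate]) != base or candidate == base:
        raise ValueError("requested path escapes the upload directory")
    return candidate


def iter_file_chunks(path: str, chunk_size: int = 64 * 1024):
    """Yield the file in fixed-size pieces so it is never held in memory whole."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Requests such as "../etc/passwd" or an absolute path resolve outside the base directory and are refused before any file is opened.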

    A better, safer, higher performance option is to RemoveHandler PHP and RemoveHandler cgi-script in the designated upload directory, which should be the only directory that's writable.

    A further problem this solves: since the directory is writeable, the designated upload script that checks the files is probably NOT the only mechanism for putting files there. Imperfections in other scripts will allow bad guys to upload any file they want to the world-writeable directory*. Therefore, use httpd.conf to ensure that any scripts in that directory cannot run.

    * Instead of making it -explicitly- world writeable, you can use SuExec, which effectively makes the ENTIRE SITE world-writeable. This is extremely stupid.

  8. Unix 'file' is not sufficient by Techmeology · · Score: 5, Insightful

    Sadly, Unix's 'file' utility is not sufficient for security purposes. Generally, file only checks for magic numbers near the beginning of the file. Many file formats remain valid, even with prepended data. For example, Python programs with several source files can be archived into a single zip file and still be executed, but you can stick a shebang onto the beginning, and still have Python (or most zip programs) recognise the archive as a zip file. There's a good video on YouTube about this kind of thing: https://www.youtube.com/watch?... tl;dr: This is security. It goes wrong in amusing and unobvious ways.
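

The zip case is easy to reproduce with Python's standard library. In this small sketch (function names are mine), a check that only looks at the first four bytes stops recognising the archive once a shebang line is prepended, while the `zipfile` module, which locates the central directory from the end of the data, still opens it.

```python
import io
import zipfile

ZIP_LOCAL_HEADER_MAGIC = b"PK\x03\x04"


def starts_with_zip_magic(data: bytes) -> bool:
    """The kind of check that only looks at the first few bytes."""
    return data.startswith(ZIP_LOCAL_HEADER_MAGIC)


def opens_as_zip(data: bytes) -> bool:
    """Ask Python's zipfile module, which finds the directory at the end."""
    return zipfile.is_zipfile(io.BytesIO(data))


def build_demo_archive() -> bytes:
    """Create a small in-memory zip with a single member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("__main__.py", "print('hello')\n")
    return buffer.getvalue()


plain = build_demo_archive()
prefixed = b"#!/usr/bin/env python3\n" + plain  # shebang stuck on the front
```

Here `starts_with_zip_magic(prefixed)` is false while `opens_as_zip(prefixed)` is true, which is exactly the gap between a leading-bytes check and what a real consumer of the format accepts.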

    --
    Excuse for why is your room always messy?
  9. Re:Valid images can contain scripts by Zaiff Urgulbunger · · Score: 2

    ^ this is really really important!

    But it could be even worse depending on your server configuration. I believe (but I haven't tested) that some Apache configurations can result in unknown file extensions being ignored. So if someone uploads a file named say "myhack.php.foobar" and it is placed in a publicly accessible directory, Apache will ignore the "foobar" extension because it doesn't recognise it, and then decide it's a PHP file, and execute it.

    Also check out Apache content negotiation (and mod_mime while you're at it); there you'll see that index.html.en and index.en.html could both evaluate as index.html, and that file naming could potentially be abused in a similar way.

    The parent post describes how PHP (or any script for that matter) _could_ be injected, but doesn't completely show how it could be executed. The above gives some ideas how that might work.

    You _could_ just test that the file name ends with (.png) and Apache _should_ serve it as "image/png". But that's not secure enough for my liking, so my recommendations are:
    1. Don't allow users to define their own file names, or if you do, massively restrict the format to alphanumerics and a single dot followed by a png|jpg|gif extension.
    2. Set the directory where uploaded files are stored to NOT execute any scripts, so even if everything else fails and somehow a script gets in there, it still can't be executed.
    3. Consider not keeping uploaded files in publicly accessible directories. Instead, use a script as a proxy to read those files and serve them with a specific mime type. Thus Apache won't try to execute them and you can be certain what mime-types are being served.
    4. Be super careful when the file is uploaded that you don't move it into a public directory BEFORE you validate it, otherwise there might be a brief window to try to execute it.
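

The first recommendation above can be sketched in a few lines of Python. The 64-character cap and the helper names `is_acceptable_name` and `server_chosen_name` are assumptions for illustration; the allowed extensions are the ones listed in the recommendation.

```python
import re
import secrets

# Alphanumerics, then exactly one dot, then an allowed extension.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9]{1,64}\.(png|jpg|gif)")


def is_acceptable_name(filename: str) -> bool:
    """True only for names like 'cat42.png'; rejects extra dots and paths."""
    # fullmatch, unlike a "$" anchor, does not tolerate a trailing newline.
    return ALLOWED_NAME.fullmatch(filename) is not None


def server_chosen_name(extension: str) -> str:
    """Ignore the client's name entirely and generate one on the server."""
    if extension not in ("png", "jpg", "gif"):
        raise ValueError("extension not allowed")
    return f"{secrets.token_hex(16)}.{extension}"
```

Generating the name server-side sidesteps the multi-extension trick (myhack.php.foobar) altogether, since nothing the user typed ends up in the stored file name.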

    And lastly, don't leave anything to chance. This is a really risky area that a lot of people screw up! Never be complacent. Always revisit it. Don't rely on server configuration to be correct because it's too easy to set things up, then move/rebuild a server, and then find you're vulnerable. You need multiple layers of defence.

    I have a question for anyone who knows: why doesn't Apache demand that PHP scripts have their execute bit set? Because it seems to me that would help quite a bit.