Slashdot Mirror


Ask Slashdot: Automated Verification For Uploaded Files?

VernonNemitz writes: There are a lot of ways for hackers to abuse a web site, but it seems to me that one of them is receiving less attention than it deserves. This is the simple uploading of a malware file, that has an innocent file-name extension. I'm looking for a simple file-type verification program that the site could automatically run, on each uploaded file, to test it to see if it is actually the type of file that its file-name extension claims it is. That way, if it ever gets double-clicked, we can be assured it won't hijack the system or worse. At the moment I'm only interested in testing .png files, but I'm sure plenty of web site operators would want to be able to test other file types. A quick Googling indicates the existence of a validator project under the OWASP umbrella, but is it the best choice, and what other choices are there?

74 comments

  1. Would be easier to check if potentially harmful by alzoron · · Score: 2

    It would be simpler to just check whether it's executable in some way and, if its file extension doesn't match, throw up a red flag.

    1. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 5, Informative

      this is pretty easy in *nix:

      $ file lobotomy.png
      lobotomy.png: PNG image data, 298 x 300, 8-bit/color RGB, non-interlaced

      $ file jetpack.png
      jetpack.png: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

    2. Re:Would be easier to check if potentially harmful by gstoddart · · Score: 4, Insightful

      this is pretty easy in *nix:

      $ file lobotomy.png
      lobotomy.png: PNG image data, 298 x 300, 8-bit/color RGB, non-interlaced

      $ file jetpack.png
      jetpack.png: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

      This bears pointing out.

      UNIX systems have used "magic" for decades, and try to identify based on the actual file contents instead of its name.

      And then Microsoft came along, decided the extension was magic and reliable, and then also decided to hide well known extensions (which created new problems).

      Relying on the file name has pretty much always been a terrible way of dealing with this. Because it became exactly how things targeted people -- because calling .gif.exe hid the .exe part, and people thought it was a .gif.

      Trusting a file name for an operating system to take action has pretty much always been a terrible idea. But, historically, Microsoft has been more focused on dumbing down the system than making it more secure.

      --
      Lost at C:>. Found at C.
    3. Re:Would be easier to check if potentially harmful by Coren22 · · Score: 1

      Wow, did you piss off someone? That AC seems to have it in for you.

      --
      APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
    4. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 0

      You didn't use enough profanity, you imposter.

    5. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 0

      Better to consider it an executable if the executable bit is set.

    6. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 0

      That sounds terrible.

      Hardly, although it has been vulnerable to its own security issues in the past. A combination of the two isn't a bad start, but as soon as you start running a program you didn't write yourself over a file you run the risk of attack.

      OWASP have some good stuff, I'm too lazy to look this particular one up, but if I had to guess the appropriate methods would be:

      1. Check extension (exists, and is valid, confirming no nulls in string)
      2. Manually parse first N bytes from file and compare against protocol signatures (PNG is an easy and obvious one)
      3. Search for executable strings in the file (to *try* and probably fail to defeat Punk Ode and its cousins)
      4. Then run magic across the file

      Four steps, and still four possible ways to infect or overrun the code trying to protect against uploads.

    7. Re:Would be easier to check if potentially harmful by sunderland56 · · Score: 1

      Methinks you don't understand how 'magic' works. It is a tag, embedded inside the file, that identifies the file type.

      Since it is inside the file, it is far harder to change than a simple rename operation.

    8. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 3, Informative

      It's worth noting that this is just a heuristic. A pretty good heuristic for most cases, but a heuristic nonetheless. A file can be a valid-looking PNG and still be malicious. (Heck, it can be valid and still malicious.)

      As far as validity is concerned, if you want to go further than file magic checks, you can parse the uploaded file as the expected type. For example, opening it with a library or utility intended for working with those files.

      A simple PNG check with image magick:
      $ convert png:rot66.png info:-
      rot66.png PNG 116x128 116x128+0+0 8-bit sRGB 15.5KB 0.000u 0:00.069
      [Return status 0]

      $ convert png:rote66.png info:-
      convert: Expected 8192 bytes; found 434 bytes `rote2.png' @ warning/png.c/MagickPNGWarningHandler/1831.
      convert: Read Exception `rote2.png' @ error/png.c/MagickPNGErrorHandler/1805.
      convert: corrupt image `rote2.png' @ error/png.c/ReadPNGImage/4106.
      convert: no images defined `info:-' @ error/convert.c/ConvertImageCommand/3187.
      [Return status 1]

      The file utility thinks rote66.png is a PNG, and it's sorta correct. It has a PNG header at least.

      Again, being able to load a file as a particular type doesn't mean it's not malicious. That rot66.png file in the example above has a chunk of PHP in it. Using another vulnerability to get it included in a PHP script gives some binary junk and the output of phpinfo when it's run. Parsing and resaving can help stem _some_ of this, but it's not without headaches and caveats of its own.
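
A rough sketch of the "parse it as the expected type" idea, using only the Python standard library instead of ImageMagick (a swap made here for illustration, not something the poster used). The function names are hypothetical. It walks the PNG chunk layout (4-byte length, 4-byte type, data, CRC-32 over type plus data) and rejects truncated files, bad checksums, and bytes after IEND. It does not decode pixels, and a file that passes can still carry a payload in a text chunk, as the rot66.png example shows.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def parse_png_chunks(data: bytes) -> list:
    """Return [(chunk_type, chunk_data), ...] or raise ValueError on structural damage."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    chunks = []
    pos = 8
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated chunk header")
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4  # header + data + CRC
        if end > len(data):
            raise ValueError("truncated chunk body")
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:end])
        # The CRC covers the chunk type and the chunk data, not the length field.
        if zlib.crc32(ctype + body) != crc:
            raise ValueError("bad CRC in chunk %r" % ctype)
        chunks.append((ctype, body))
        pos = end
        if ctype == b"IEND":
            break
    if pos != len(data):
        raise ValueError("trailing data after IEND")
    return chunks


def is_well_formed_png(data: bytes) -> bool:
    """Container-level check only: IHDR first, at least one IDAT, IEND last."""
    try:
        chunks = parse_png_chunks(data)
    except ValueError:
        return False
    types = [ctype for ctype, _ in chunks]
    return (
        len(types) >= 3
        and types[0] == b"IHDR"
        and len(chunks[0][1]) == 13
        and b"IDAT" in types
        and types[-1] == b"IEND"
    )
```

Running this in the same isolated, low-privilege place you would run any other parser still seems wise; it only narrows what gets through.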

    9. Re:Would be easier to check if potentially harmful by bjwest · · Score: 2

      UNIX systems have used "magic" for decades, and try to identify based on the actual file contents instead of its name.

      That sounds terrible.

      And then Microsoft came along, decided the extension was magic and reliable

      Better to use the file extension than it is to load and execute dancingbunny.png.

      No, UNIX systems don't arbitrarily execute just any file. dancingbunny.exe would not run unless its executable bit is set, except on Windows where the .exe makes it executable, and if that's hidden - well, you have the nightmare that exists due to it. Of course, if you're just clicking icons on your desktop you're in deep shit with either OS.

      --

      --- Keep the choice with the user..
    10. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 0

      An improvement to the above method:


      $ convert rote66.png rote66_converted.png
      $ mv rote66_converted.png rote66.png

      If there exists a well-defined conversion such as an image file -> bitmap -> image file, running such a conversion will strip any possible exploit code inserted into the format.

      Of course, this will not work if the file format by nature is exploitable, such as full-feature XML (see XXE).

    11. Re:Would be easier to check if potentially harmful by dave420 · · Score: 1

      It is nothing of the sort. It iterates through a list of definitions looking for specific values at specific places and assumes a file which matches those to be of the type identified by the definition. They are (supposedly) chosen to be values which can't be changed without breaking the usage of the file, so editing them renders the file useless for its intended purpose. For example, replacing the first two bytes of a Windows executable will cause it to stop being identified as a Windows executable, but then will also stop it from being a Windows executable, as those two bytes are required for execution.

    12. Re:Would be easier to check if potentially harmful by dave420 · · Score: 1

      It gets problematic when the intended library is vulnerable to the malicious payload. If libpng, for example, was broken and decided to arbitrarily execute a malicious payload hidden within the PNG's otherwise-valid data, your approach is just what the attacker wants - get this file parsed by the vulnerable library.

      To combat this, it's a pretty common method for web uploads to be sent directly to a virtual machine which runs a locked-down OS which can be periodically reset, which performs a raft of tests on the accepted file types. As most file uploads are of limited types (i.e. you won't be uploading a tax return file to YouTube) it limits the amount of work required to ensure the file is clean (by thoroughly scanning it).

    13. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 0

      file had a number of security issues, e.g. CVE-2014-9653, CVE-2014-8116, CVE-2014-8117, CVE-2014-9620, so it is not necessarily a good idea to run file on untrusted files.

    14. Re:Would be easier to check if potentially harmful by Anonymous Coward · · Score: 0

      *** SMACK*** is the sound of dave420 going down eating his words getting bitchslapped by apk http://slashdot.org/comments.p...

  2. fileutils by i.r.id10t · · Score: 3, Insightful

    Well, if you are running on a Linux or Unix/BSD host, you can use the "file" utility.

    Of course, that means that you need to have shell_exec() or exec() or whatever your programming language of choice uses for running shell commands, and the other security dangers/issues involved with allowing that type of stuff.

    What may be best/easiest/safest would be to NOT allow direct HTTP access to the uploaded files, but rather use a wrapper script that would send appropriate headers to make the browser believe that the file is of the type "x-application/unknown" or whatever content type that will force a "save as" dialog instead of opening with a plugin, auto opening with a local application, etc.
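
A minimal sketch of what such a wrapper script might send, written as a Python helper that only builds the header list. The function name is hypothetical. `application/octet-stream` is used as the generic "unknown binary" type in place of the type named above, and the `X-Content-Type-Options: nosniff` header is an addition of this sketch rather than part of the suggestion.

```python
def download_headers(stored_name: str, size: int) -> list:
    """Build response headers that ask the browser to save a file rather than render it."""
    # Keep a conservative ASCII character set so the name cannot break out of
    # the quoted header value or smuggle in extra header lines.
    safe = "".join(
        c for c in stored_name if c.isascii() and (c.isalnum() or c in "._-")
    ) or "download"
    return [
        ("Content-Type", "application/octet-stream"),
        ("Content-Disposition", 'attachment; filename="%s"' % safe),
        ("X-Content-Type-Options", "nosniff"),
        ("Content-Length", str(size)),
    ]
```

The list-of-tuples shape matches what WSGI's `start_response` expects, but the same four headers could be emitted from any language.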

    --
    Don't blame me, I voted for Kodos
    1. Re:fileutils by Anonymous Coward · · Score: 0

      For an added benefit, you can get an API key for virustotal and check that for the status of files as well.

      One way around the exec permission issue for a server is to save all uploads in the same directory with useful names (could be /var with noexec set). Your processing script can just grab the files from there and do whatever with them and touch the results in a different directory. The webserver just checks for the result in /var/myserver/uploadresults/hash. Finally, another benefit is that if the webserver or processing script gets owned, finely tuned permissions and different owners limit the damage that can be done to the server or the files.

    2. Re:fileutils by Anonymous Coward · · Score: 0

      Upload to a non accessible folder with exec turned off.

      Cron job to check for files, use file to check for what it is.

      Depending on the type of file forward then reverse an action to it to 'clean' it.

      If it says it's a zip, unzip it with Linux's unzip then rezip it. Use ImageMagick to clean image files.

      Move the new file to an accessible location.

    3. Re:fileutils by Anonymous Coward · · Score: 0

      "file" utility uses libmagic underneath. You may consider linking your app/script with libmagic.a or libmagic.so directly.

    4. Re:fileutils by zarr · · Score: 2

      Even if you manage to invoke file in a safe manner, you probably shouldn't. The file utility isn't immune to security issues either. A quick google found at least 3 different CVEs from 2014 alone. Don't expose stuff that wasn't designed with a hostile Internet in mind to a hostile Internet. Anyway, if file says it's a png file, it doesn't mean it's a _safe_ png file.

      A paranoid (or sensible, depending on how juicy a target you are) way to handle it is to isolate the thing that verifies the file in some kind of sandbox, either a container or full VM with no access to anything. Pass the file to it and accept nothing back except raw pixel data. On the outside you re-encode it as a .png and pass that along to your users. Afterwards, assume the sandbox is full of nasties. Nuke it from orbit.

    5. Re:fileutils by CanadianMacFan · · Score: 1

      Why would the browser be opening a save as dialog for uploading a file? The question didn't say that the uploaded files were going to be accessible on the website. Many sites don't bother checking images if they are just going to be displaying them on the site.

      I would have the files upload to a designated directory. Once the upload was complete and the user doesn't need to interact with them right away then a message saying that the transfer was successful would be displayed. One of the last jobs of the upload process would be to signal a waiting process (or it could just wait for the process to notice the file that was uploaded) to verify the new file. Verification would check that the file is what it claims to be and also runs an anti-virus program along with anything else needed. If everything is okay then the file is moved into the safe file repository and any database updates done. If there's a problem the file is deleted and an empty file with that file name is created in another directory to indicate an error happened. An error code could be placed in this file. If the user is waiting for the result the page would have been making calls to check these directories. Otherwise another small process would be run to clean up the errors directory which would notify users that their uploads failed.

    6. Re:fileutils by i.r.id10t · · Score: 1

      TFS mentions "double clicking a file" - I took that to mean that someone downloads the file from the server and double clicks it on their local machine, or someone is browsing the directory on the server itself (as their local machine) and opens files...

      --
      Don't blame me, I voted for Kodos
    7. Re:fileutils by CanadianMacFan · · Score: 1

      And if someone is browsing the directory on the server, or the files are copied somewhere else for them to do that, then there wouldn't be a save as dialog brought up in the browser. So don't go bringing up TFS with me when you answer your own questioning why there wouldn't be a dialog.

  3. file(1) by Anonymous Coward · · Score: 0

    man 1 file

  4. PHP by darkain · · Score: 2

    In PHP, simply run something like the following against the file and see if you get a valid result back

    http://php.net/manual/en/funct...
    http://php.net/manual/en/funct...

    1. Re:PHP by Anonymous Coward · · Score: 0

      I know nothing about proposition but definitive statement.

      Thanks for the valuable contribution to the conversation dude.

    2. Re:PHP by Anonymous Coward · · Score: 1

      Also

      http://php.net/manual/en/ref.fileinfo.php

    3. Re: PHP by Anonymous Coward · · Score: 0

      Did you look at the links? One was a function to return image size. That's most definitely abusable if used as a "security" protocol.

  5. Magic by Anonymous Coward · · Score: 0

    Just check the first 3 bytes of the file to be what it should be. If you're just worried about png, then check that it's PNG. If you're looking for malicious PNGs, it's a different ball game: get a virus scanner. If you want other types, then get an API for checking the magic number.

    This should be on Stack Overflow, not Slashdot.

    1. Re:Magic by Anonymous Coward · · Score: 0

      The first 3 bytes of a PNG aren't "PNG". The first byte is something else (hex 89) then bytes 2-4 are "PNG".
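
For reference, the full PNG signature is eight bytes: 0x89, the ASCII letters "PNG", then CR LF, 0x1A, LF. A minimal sketch of the header check in Python follows (the function name is hypothetical). As others in the thread point out, this only says the file starts like a PNG, not that it is valid or safe.

```python
# The 8-byte PNG signature: 0x89 'P' 'N' 'G' CR LF 0x1A LF
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def has_png_signature(data: bytes) -> bool:
    """Return True if data begins with the 8-byte PNG signature."""
    return data[:8] == PNG_SIGNATURE
```

Reading just the first eight bytes of the upload is enough for this check, so it can run before the whole file is stored.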

  6. The file command? by mlts · · Score: 2

    The file command does exactly this. Type in "file foo", it will tell you what it is.

    No need to add any additional software to the Linux box.

    1. Re:The file command? by Anonymous Coward · · Score: 0

      For shell, yes. It's been working for the last decades just fine.
      If running PHP, for example, I usually call getimagesize(), which normally parses and returns an array with image info (width, height, type, and some others), returning false in case it can't parse it.
      This way you can check if it's a valid image, and save it somewhere (ideally with a controlled filename).
      Other languages should have similar ways of dealing with it.

  7. Who needs this? by Anonymous Coward · · Score: 0

    Anybody still using systems that run stuff found on a webpage? Wouldn't such hopeless systems die out from the damage?

  8. RTFM by mi · · Score: 1

    libmagic(3) and file(1). Plus, if you need to tune them, magic(5).

    --
    In Soviet Washington the swamp drains you.
  9. Who doubleclicks files anymore? by Anonymous Coward · · Score: 0

    This sounds like a bizarre use case, or at worst, a college project that someone can't figure out.

  10. Not the right question. by Anonymous Coward · · Score: 3, Insightful

    test it to see if it is actually the type of file that its file-name extension claims it is.

    There are various ways to make "hybrid" files which are multiple types. Graphics files which are also archives, etc. What you really want to do is normalize the files to the type they're supposed to be. PNGs are a good candidate for this because PNG is lossless, so you can decode the image and re-encode it without losing information.

    1. Re:Not the right question. by Wycliffe · · Score: 1

      test it to see if it is actually the type of file that its file-name extension claims it is.

      There are various ways to make "hybrid" files which are multiple types. Graphics files which are also archives, etc. What you really want to do is normalize the files to the type they're supposed to be. PNGs are a good candidate for this because PNG is lossless, so you can decode the image and re-encode it without losing information.

      This is exactly what we did on a production site. We wanted to support several different document types but wanted everything uniform, so we use a locked-down version of openoffice headless that converts everything (doc, txt, png, spreadsheets, etc.) to pdf format. In our case pdf format made the most sense because 99% of the stuff that was supposed to be uploaded was documents, and by using openoffice we automatically can support anything that openoffice does. You still have to worry about viruses that affect openoffice, but even if that's the case, it's fairly limited because it's unlikely to make it through the conversion process, so it doesn't affect our end users. If they do by some miracle manage to infect our conversion server, there is very little on it of any value and they would have to jump through several more hoops to get to somewhere useful.

  11. Valid images can contain scripts by Anonymous Coward · · Score: 0

    Image files can contain metadata. Metadata can contain PHP tags.

    Append a path to an image, and a suitably (mis-)configured server will treat the image file as a PHP script. e.g. if the image is available at .../path/to/image.png

    and you fetch the URL .../path/to/image.png/foo.php

    then the embedded PHP will be executed.
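
One partial mitigation for the metadata case is to drop every PNG chunk that is not needed to reconstruct the pixels. The sketch below is an illustration under stated assumptions, not a full decode and re-encode: the function name is hypothetical, only the four critical chunk types are kept, and anything after IEND is discarded. Stripping ancillary chunks such as tRNS or gAMA can change how the image renders, and a payload hidden inside the IDAT data itself would survive this.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
CRITICAL_CHUNKS = {b"IHDR", b"PLTE", b"IDAT", b"IEND"}


def strip_ancillary_chunks(data: bytes) -> bytes:
    """Return a copy of a PNG that keeps only critical chunks and nothing after IEND."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    kept = [PNG_SIGNATURE]
    pos = 8
    while True:
        # Every chunk needs at least 12 bytes: length, type, and CRC.
        if pos + 12 > len(data):
            raise ValueError("truncated chunk or missing IEND")
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated chunk")
        (crc,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the type and data fields of the chunk.
        if zlib.crc32(data[pos + 4:end - 4]) != crc:
            raise ValueError("bad CRC in chunk %r" % ctype)
        if ctype in CRITICAL_CHUNKS:
            kept.append(data[pos:end])
        if ctype == b"IEND":
            return b"".join(kept)
        pos = end
```

Text chunks (tEXt, zTXt, iTXt) and any other ancillary data are simply not copied to the output.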

    1. Re:Valid images can contain scripts by Zaiff+Urgulbunger · · Score: 2
      ^ this is really really important!

      But it could be even worse depending on your server configuration. I believe (but I haven't tested) that some Apache configurations can result in unknown file extensions being ignored. So if someone uploads a file named say "myhack.php.foobar" and it is placed in a publicly accessible directory, Apache will ignore the "foobar" extension because it doesn't recognise it, and then decide it's a PHP file, and execute it.

      Also check out Apache content negotiation (and mod_mime while you're at it), and there you'll see that index.html.en and index.en.html could both evaluate as index.html, so you can see a similar way file naming could potentially be abused.

      The parent post describes how PHP (or any script for that matter) _could_ be injected, but doesn't completely show how it could be executed. The above gives some ideas how that might work.

      You _could_ just test that the file name ends with (.png) and Apache _should_ serve it as "image/png". But that's not secure enough for my liking, so my recommendations are:
      • 1. Don't allow users to define their own file names, or if you do, massively restrict the format to alphanumerics and a single dot png|jpg|gif extension.
      • 2. Set the directory where uploaded files are stored to NOT execute any scripts, so even if everything else fails and somehow a script gets in there, it still can't be executed.
      • 3. Consider not keeping uploaded files in publicly accessible directories. Instead, use a script as a proxy to read those files and serve them with a specific mime type. Thus Apache won't try to execute them and you can be certain what mime-types are being served
      • 4. Be super careful when the file is uploaded that you don't move it into a public directory BEFORE you validate it otherwise there might be a brief window to try to execute it.
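
Recommendation 1 above could be sketched as an allow-list check like the following. The function name, the 64-character cap, and the choice to accept only lowercase extensions are assumptions made for illustration.

```python
import re

# Alphanumerics, exactly one dot, and one of three lowercase image extensions.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9]{1,64}\.(png|jpg|gif)")


def is_acceptable_upload_name(name: str) -> bool:
    """Accept only names made of alphanumerics plus a single allowed extension."""
    return _ALLOWED_NAME.fullmatch(name) is not None
```

Because the whole string has to match, names such as "myhack.php.foobar" or "image.png/foo.php" are rejected along with anything containing slashes, NUL bytes, or a second dot.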

      And lastly, don't leave anything to chance. This is a really risky area that a lot of people screw up! Never be complacent. Always revisit it. Don't rely on server configuration to be correct because it's too easy to set things up, then move/rebuild a server, and then find you're vulnerable. You need multiple layers of defence.

      I have a question for anyone who knows - why doesn't Apache demand that PHP scripts have their execute bit set? Because it seems to me that would help quite a bit.

  12. Won't work by Bogtha · · Score: 3, Insightful

    test it to see if it is actually the type of file that its file-name extension claims it is.

    This won't work because a file can be a valid file in multiple formats at once and it can also be an invalid file that is nevertheless interpreted as a valid file as well.

    Take for example, a plain-text file. Harmless, right? Nope. It can also be a valid HTML file containing executable JavaScript. Or an XML file containing a billion laughs attack.

    Or take media type sniffing. Some browsers bend over backwards to interpret crap as HTML even when labelled otherwise by the Content-Type HTTP header. So one attack is to stuff enough HTML into PNG metadata to confuse a browser that doesn't follow the standards into thinking that it's HTML. This is a valid PNG file and anything that checks to see if it's really a PNG file will tell you that much. But it's still not safe.

    --
    Bogtha Bogtha Bogtha
    1. Re:Won't work by Tablizer · · Score: 1

      That's what I was thinking also. One could hide a sinister executable inside an image file, for example. It might look like modern art when projected as an image, but still be a "valid" image from the computer's perspective. The file (or parts of) can be a valid EXE and a valid image at the same time.

      The trick is not to allow "running" a given file in the wrong application on both client and the server. For example, a text file with a script in it is only a text file with a script in it if one views it in a UI widget that displays text as text: the user sees the text of the script.

      But if malware or a sneaky user tricks a system into executing that text file (or portions of) in an interpreter (say Perl or JavaScript), then bad things can happen. Same with the image file example given earlier: one may trick a system into "running" the image file (or portions of) as an executable via clever tricks.

      That's not always easy to prevent, as the breach can be anywhere along the processing and distribution chain. But at least make sure the file extension is consistent with intended use. For example, if a user is only allowed to upload images, then make sure only image extensions are allowed (JPEG, GIF, PNG, etc.)

      You may want to also ban more than one period in the file name because different subsystems may interpret them differently. For example, "foo.jpeg.txt" may be interpreted as an image by one application and a text file in another. You don't want that risk so that it's better to play on the safe side and ban multiple periods.

      Diff apps/levels also interpret URL/URI encoding differently such that "hiding" a period in encoding perhaps also should be banned. (There may be a standard way to interpret such, but you cannot bet on each app following standards.)

    2. Re:Won't work by Anonymous Coward · · Score: 0

      There was a nice talk about this at 31C3:
      https://media.ccc.de/v/31c3_-_5930_-_en_-_saal_6_-_201412291400_-_funky_file_formats_-_ange_albertini

  13. file by Anonymous Coward · · Score: 0

    http://man.cx/file

  14. I like you! by Anonymous Coward · · Score: 0

    Modern app appers know that only apps can app apps, so if everything is uploaded as .app, then apps will app apps while apping other apps!

    Apps!

    I just want to piss up your nose and shit all over your face and smear it into your skin.

  15. The easy way by Kjella · · Score: 1

    If I recall correctly, you have the file in memory before you save it to disk. Check if the first bytes are 0x89504E470D0A1A0A and it should be "close enough". I'm not sure if anyone has compiled a list of headers and file extensions, but it seems a little overkill.

    --
    Live today, because you never know what tomorrow brings
    1. Re:The easy way by John+Bokma · · Score: 1

      Uhm, yes, that compilation is used by the file *nix command.

  16. Bobby Tables strikes again by Anonymous Coward · · Score: 1

    And what if there's a semicolon or another interesting character in the filename ?

    1. Re:Bobby Tables strikes again by Anonymous Coward · · Score: 0

      This was solved long ago in various FTP servers. Strip all characters from the input file name that you don't want.

  17. Don't do it by guruevi · · Score: 1

    Don't accept foreign input and put it out as your own (on your web page). It's just a disaster waiting to happen. Misconfigurations or bugs could happen at any point.

    What you do is you take the input and verify that's the input you're expecting. Not just a PDF file or a PNG file but make sure you only accept PDF/PNG and then parse it and rewrite it in a way that takes out any and all foreign input. You're expecting text, only parse text, images, only parse images and parse anything within a jail with limited permissions. If the file is 'broken' or contains any scripts or anything else (it doesn't parse well enough) reject it.

    There are all sorts of methods (called magic) for determining file types, but they only take a look at the first few bytes and return a result based on a table. You could easily fool those things, and they don't check whether the files are valid or not. Additionally, check for viruses.

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  18. Re:Make everything .app! by Anonymous Coward · · Score: 0

    Yo dawg, is that you?

  19. yes, but directory traversal and buffer dos, so. . by raymorris · · Score: 3, Informative

    This is on the right track, because as others have said, just because it's valid png doesn't mean it's not also valid PHP and Javascript. I just pulled a file like that off a server yesterday.

    HOWEVER, -all- of the "download.php" scripts I've ever looked at have at least two of the same three vulnerabilities. The three are directory traversal (protection from it is harder than it looks), fopen_url, and memory depletion from failing to disable the output buffer before reading and writing chunks of the file.

    A better, safer, higher performance option is to RemoveHandler PHP and RemoveHandler cgi-script in the designated upload directory, which should be the only directory that's writeable.

    A further problem this solves is since the directory is writeable, the designated upload script which checks the files probably is NOT the only mechanism to put files there. Imperfections in other scripts will allow bad guys to upload any file they want, to the world-writeable directory* . Therefore, use httpd.conf to ensure that any scripts in that directory can not run.

    * Instead of making it -explicitly- world writeable, you can instead use SuExec, which effectively makes the ENTIRE SITE world-writeable. This is extremely stupid.
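
For anyone who still ends up writing a download wrapper, here is a rough Python sketch of the directory traversal check mentioned above. The function name is hypothetical. It resolves the requested name against the upload directory, follows symlinks, and refuses anything that lands outside that directory. It covers only that one of the three problems.

```python
import os


def resolve_inside(base_dir: str, requested: str) -> str:
    """Return the real path of `requested` under base_dir, or raise ValueError if it escapes."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before we compare.
    target = os.path.realpath(os.path.join(base, requested))
    if target == base or os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the upload directory")
    return target
```

Comparing with `os.path.commonpath` rather than a plain string prefix avoids treating "/srv/uploads-evil" as if it were inside "/srv/uploads".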

  20. Unix 'file' is not sufficient by Techmeology · · Score: 5, Insightful

    Sadly Unix's 'file' utility is not sufficient for security purposes. Generally, file only checks for magic numbers near the beginning of the file. Many file formats remain valid, even with prepended data. For example, Python programs with several source files can be archived into a single zip file and still be executed, but you can stick a shebang onto the beginning, and still have Python (or most zip programs) recognise the archive as a zip file. There's a good video on youtube about this kind of thing: https://www.youtube.com/watch?... tl;dr: This is security. It goes wrong in amusing and unobvious ways.
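
The prepended-data trick is easy to reproduce with Python's standard `zipfile` module. In this sketch (helper names are hypothetical) a shebang line is glued onto the front of a small archive: a check of the leading bytes no longer sees a zip, yet the zip reader, which locates the archive from the end of the file, still opens it.

```python
import io
import zipfile


def build_prefixed_zip(prefix: bytes) -> bytes:
    """Create a small in-memory zip archive and prepend arbitrary bytes to it."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("__main__.py", "print('hello')\n")
    return prefix + buf.getvalue()


def starts_with_zip_magic(data: bytes) -> bool:
    """Look only at the leading bytes, as a simple magic-number test would."""
    return data[:4] == b"PK\x03\x04"


def opens_as_zip(data: bytes) -> bool:
    """Ask the zipfile module, which searches for the directory at the end of the file."""
    return zipfile.is_zipfile(io.BytesIO(data))


blob = build_prefixed_zip(b"#!/usr/bin/env python3\n")
print(starts_with_zip_magic(blob), opens_as_zip(blob))
```

The two answers disagree for the same bytes, which is the gap between "what does the header say" and "what will a consumer of this file do with it".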

    --
    Excuse for why is your room always messy?
  21. look out for sql injection as well. by Joe_Dragon · · Score: 1

    look out for sql injection as well.

  22. Wat by Anonymous Coward · · Score: 0

    This doesn't even make sense. In what universe does double-clicking an executable file with a PNG extension cause the file to be executed?

  23. Reverse Proxy by chill · · Score: 1

    Try a reverse proxy with a malware scanning component.

    Or subscribe to the premium service for Virus Total and use the API to check all uploads to your server.

    --
    Learning HOW to think is more important than learning WHAT to think.
  24. Zip of death by OzPeter · · Score: 1

    Zip of death

    Is it a zip file? Yes
    Is it dangerous? Yes

    So how do you test for this without opening the file in a virtual environment and seeing what happens?

    I have a feeling that testing for malicious files is akin to solving the halting problem
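
A partial answer that avoids extracting anything is to read the sizes the archive declares in its central directory and refuse implausible ones. This is only a heuristic sketch: the function name and both thresholds are assumptions, declared sizes can be forged, and nested archives are not examined, so a real extractor should still cap the bytes it actually writes.

```python
import io
import zipfile


def zip_looks_like_bomb(
    data: bytes,
    max_total: int = 100 * 1024 * 1024,
    max_ratio: float = 100.0,
) -> bool:
    """Inspect declared sizes in the central directory without extracting anything."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
    total = sum(info.file_size for info in infos)        # declared uncompressed bytes
    packed = sum(info.compress_size for info in infos)   # bytes stored in the archive
    if total > max_total:
        return True
    return packed > 0 and total / packed > max_ratio
```

It will not decide the general question, but it catches the classic case of a few kilobytes that claim to expand into gigabytes.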

    --
    I am Slashdot. Are you Slashdot as well?
  25. Nope by sexconker · · Score: 1

    There's no way to determine what type a file really is. File types are designated in the Windows world by extensions (the .jpg in bigdick.jpg), but applications and other OSes use actual file information (typically the first few / few dozen bytes) of the file to determine what to do with it.

    This typically involves some specific byte sequence, or "magic number", which alerts the OS/program to start trying to read a particular type of header, or tells it the file is big/little endian.

    However, ANY file can contain those strings, and I've run into cases where Office docs have contained the magic numbers for JPG (or was it PNG) and shit got all fucked up. The best you can really do is trust the file extension / mimetype after your virus scanner says it's okay. Then you can TRY to process the file as what it claims to be and handle failures gracefully. If you want to be nice you can try to scan the files for those magic byte sequences, but I gave up on doing that because it's a fucking pain. I just bail out and tell the user to upload working shit, not my problem.

  26. Just check the file headers by etinin · · Score: 1

    When you're talking about PNG, if you're looking to avoid malicious files, you can just check the header.
    The first eight bytes are always the following decimal values:
    137 80 78 71 13 10 26 10

    Things get more tricky when you're talking about an exploitable file type, in which additional validation is required, but for most purposes, if the file being broken won't ruin the application, this is fine.
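
    As a minimal sketch of that check (the function name is made up for illustration), the eight values above can be compared against the start of the upload. It only shows that the file begins like a PNG, not that the rest of it is well formed.

```python
# The eight decimal values listed above, i.e. b"\x89PNG\r\n\x1a\n"
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def has_png_signature(data: bytes) -> bool:
    """Return True if the data starts with the 8-byte PNG signature."""
    return data[:8] == PNG_SIGNATURE
```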

    --
    "I decided I could write something better than everything out there in two weeks. And I was right." - Linus Torvalds
  27. About extensions by etinin · · Score: 1

    In addition to the above method, I simply ignore the original filename (and save it somewhere) and rename the file to a random UUID+the auto detected extension (for images you only need a couple of headers, for example).
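
    Roughly, that renaming step could look like the sketch below. The signature table and the function name are assumptions for illustration, and only a few image formats are covered; storing the original filename is left out.

```python
import uuid
from typing import Optional

# Hypothetical table mapping leading bytes to the extension to store under
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xff\xd8\xff": ".jpg",
    b"GIF87a": ".gif",
    b"GIF89a": ".gif",
}


def stored_name(data: bytes) -> Optional[str]:
    """Return a random UUID-based name with the detected extension, or None."""
    for magic, ext in SIGNATURES.items():
        if data.startswith(magic):
            # The uploader's filename is never used for the stored file
            return uuid.uuid4().hex + ext
    return None  # unknown type: reject the upload
```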

    --
    "I decided I could write something better than everything out there in two weeks. And I was right." - Linus Torvalds
  28. Automated Air Gap by Anonymous Coward · · Score: 0

    If the rate and resolution of .png's isn't too great I've found an elegant solution is to display the image on one cheap system's monitor (think Raspberry Pi 1), with a camera on a 2nd system taking an image whenever the screen isn't blank and performing OCR on text below for file name and any other metadata. You can wrap the whole thing in cardboard so light saturation isn't an issue. There is a history of image related exploits that have nothing to do with if the file is executable or not that this provides a guard against.

    1. Re:Automated Air Gap by Zaiff+Urgulbunger · · Score: 1

      You need to print it out, have the paper drop into a fax machine, fax to email, and then use that. It's the ONLY way to be sure you've stripped the meta-data!!!! For serious!!!

  29. Easy by JustAnotherOldGuy · · Score: 1

    For image files just convert it to another format at the highest possible resolution and then back again. Maybe an executable could survive that, but I have yet to see one get through (and yes, I've tried it with some infected and/or bogus files).

    And yes, I fully admit that it's a sleazy trick but it seems to work pretty well.

    For other file types, I dunno.

    --
    Just cruising through this digital world at 33 1/3 rpm...
  30. Lots of layers to consider by Chuck+Chunder · · Score: 1

    There are several layers here that make a solution quite "interesting". On the one hand you are trying to protect your users by avoiding serving them bad content. On the other hand you want to protect your service. Protecting your users means doing more work on the uploaded content which increases your own attack surface.

    Personally if we are just talking about PNGs then I think that one of the safest things for your clients/customers would be to not serve the file as uploaded, but to serve a file that is the result of a successful render->save process (which might get you a bonus improvement of allowing you to optimise the image). That way you should end up serving a valid image without any dodgy stuff someone may have tried to sneak through. Of course there have been plenty of vulnerabilities in image handling over the years. So reprocessing the images does come with its own risk that might suggest its own mitigations (eg doing it on a separate untrusted server that doesn't have access to anything interesting).

    There might be third party services you could use, but of course that opens up its own questions in terms of trust, security and availability.

    --
    Boffoonery - downloadable Comedy Benefit for Bletchley Park
  31. Router firmware by AHuxley · · Score: 1

    Read up on the efforts some router and modem brands go to in order to protect their firmware, such as updates over the life of a product line.
    Signed checksum, private key, verified public key systems.

    --
    Domestic spying is now "Benign Information Gathering"
  32. Re:yes, but directory traversal and buffer dos, so by mcrbids · · Score: 1

    HOWEVER, -all- of the "download.php" scripts I've ever looked at have at least two of the same three vulnerabilities.

    1) Protection from directory traversal is harder than it looks,

    2) fopen_url, and

    3) memory depletion from failing to disable the output buffer before reading and writing chunks of the file.

    I'm a PHP dev, and the first two are relatively straightforward to prevent. EG: Check that basename($file) == realpath(Basename($file)) kind of stuff. But #3 is interesting to me; how would the following cause any problem?

    $fp = fopen($hugefile, 'r');
    // Read and echo the file in pieces of at most 1023 bytes
    while (($line = fgets($fp, 1024)) !== false)
          echo $line;
    fclose($fp);

    In this case, the buffered output will be spooled to Apache/end user as it fills. Or did you mean OOM errors from trying to load a 2 GB file into RAM?
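
    On item 1, the usual shape of a traversal check is to resolve the requested path first and only then compare it against the download directory. Here is a minimal sketch of that idea in Python rather than PHP; the function name and directory layout are hypothetical, and it assumes POSIX-style paths.

```python
import os
from typing import Optional


def resolve_download(base_dir: str, requested: str) -> Optional[str]:
    """Return the resolved path if it stays inside base_dir, else None."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." components and follows symlinks before comparing
    candidate = os.path.realpath(os.path.join(base, requested))
    if os.path.commonpath([base, candidate]) != base:
        return None
    return candidate
```

    Comparing resolved paths, instead of searching the input for "../", also catches absolute paths and symlinks that point outside the directory.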

    --
    I have no problem with your religion until you decide it's reason to deprive others of the truth.
  33. ob_flush() and flush(), Content-Length, x-sendfile by raymorris · · Score: 1

    You need to flush() and ob_flush() after each echo, or PHP will buffer ~ the entire thing in RAM. When a bad guy hits it, he'll have it buffer 100,000 copies in RAM.

    You'll also need to send Content-Length header manually in the PHP, otherwise the header can't be set without buffering the whole file. Compression and encoding can bite you here, so disable compression. Of course you've kinda broken resume, if someone loses their connection halfway through the download. OR ...

    Check out X-Sendfile. That's an all-around better approach. Content-length, compression, partials, HEAD - all of that is already taken care of. If you use an older version of Apache, it will need to be installed as a module.

    As to #2, fopen_url - there are a shit ton of ways that gets exploited, so really the "right" answer, IMHO, is to make sure it's disabled, then double check the input anyway.

  34. Re:yes, but directory traversal and buffer dos, so by wallyhall · · Score: 1

    This is on the right track, because as others have said, just because it's valid png doesn't mean it's not also valid PHP and Javascript. I just pulled a file like that off a server yesterday.

    Yeah. Some would probably argue it's overkill; and of course it opens a potential new exploit (if ImageMagick or the GD library or whatever you use has a serious flaw) - but for the really paranoid applications I've worked on, I generate a new image from the old one, using a trusted library. I figure that converting whatever is "valid image format data" into plain RGB(a) and back to image format data again will get rid of anything seriously nasty.

    --
    I think therefore I am... a Linux geek.
  35. Re: Would be easier to check if potentially harmfu by Guy+Smiley · · Score: 1

    For PNG files specifically, there is a "pngcheck" utility that parses the file and verifies the contents are valid.

    If you want to go a step further, you can use "pngcrush" to parse and repack/compress the file and strip out any extra data chunks that are not required to display the image. That should strip out any malicious or malformed content, and can be run on a sandbox that is not directly accessible, so if there is a compromise of pngcrush or pngcheck the effects can be isolated.
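
    To make the chunk-stripping idea concrete, here is a rough sketch that walks the PNG chunk list, verifies each CRC, and keeps only the four critical chunk types. It is not how pngcrush or pngcheck are implemented, the function name is invented for illustration, and it does not decode the image data itself.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
CRITICAL_CHUNKS = {b"IHDR", b"PLTE", b"IDAT", b"IEND"}


def strip_ancillary_chunks(data: bytes) -> bytes:
    """Return a copy of the PNG that contains only critical chunks.

    Raises ValueError on a bad signature, a truncated chunk, or a CRC
    mismatch. Anything after the IEND chunk is dropped.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("missing PNG signature")
    out = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while True:
        if pos + 8 > len(data):
            raise ValueError("truncated chunk header")
        # Each chunk is: 4-byte big-endian length, 4-byte type, body, 4-byte CRC
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4
        if end > len(data):
            raise ValueError("truncated chunk body")
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the chunk type and body, not the length field
        if zlib.crc32(ctype + body) != crc:
            raise ValueError("CRC mismatch in chunk %r" % ctype)
        if ctype in CRITICAL_CHUNKS:
            out.append(data[pos:end])
        pos = end
        if ctype == b"IEND":
            return b"".join(out)
```

    Note that dropping every ancillary chunk also removes tRNS (transparency) and colour-space information, so the image may render differently; a real deployment would probably keep a short allow-list of those.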