Slashdot Mirror


Ask Slashdot: Automated Verification For Uploaded Files?

VernonNemitz writes: There are a lot of ways for hackers to abuse a web site, but it seems to me that one of them is receiving less attention than it deserves: the simple uploading of a malware file that has an innocent file-name extension. I'm looking for a simple file-type verification program that the site could run automatically on each uploaded file, to test whether it is actually the type of file that its file-name extension claims it is. That way, if it ever gets double-clicked, we can be assured it won't hijack the system or worse. At the moment I'm only interested in testing .png files, but I'm sure plenty of web site operators would want to be able to test other file types. A quick Googling indicates the existence of a validator project under the OWASP umbrella, but is it the best choice, and what other choices are there?

5 of 74 comments

  1. fileutils by i.r.id10t · · Score: 3, Insightful

    Well, if you are running on a Linux or Unix/BSD host, you can use the "file" utility.

    Of course, that means you need shell_exec() or exec(), or whatever your programming language of choice uses for running shell commands, and you take on the security dangers that come with allowing that kind of thing.
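
    A minimal sketch of running "file" without going through a shell, written in Python as an assumption (the comment names PHP-style shell_exec()/exec()). Passing an argument list means the uploaded file's name is never interpreted as shell syntax, and "--" keeps a name that starts with "-" from being read as an option. The function names are hypothetical, and the sketch presumes a "file" binary that supports --brief and --mime-type.

```python
import subprocess


def build_file_command(path: str) -> list[str]:
    """Argument vector for `file`; no shell is involved, so the path is never parsed as shell syntax."""
    # "--" ends option parsing, so a path such as "-evil.png" is treated as a file name.
    return ["file", "--brief", "--mime-type", "--", path]


def detect_mime_type(path: str) -> str:
    """Run `file` on an uploaded file and return the MIME type it reports."""
    result = subprocess.run(
        build_file_command(path),
        capture_output=True,
        text=True,
        check=True,   # raise CalledProcessError if `file` exits non-zero
        timeout=5,    # do not let a pathological upload stall the request
    )
    return result.stdout.strip()
```

    The caller would compare the returned string with the type the extension implies (for a genuine PNG, "file" normally reports image/png) and reject the upload on a mismatch.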

    What may be best/easiest/safest would be to NOT allow direct HTTP access to the uploaded files, but rather use a wrapper script that would send appropriate headers to make the browser believe that the file is of the type "x-application/unknown" or whatever content type that will force a "save as" dialog instead of opening with a plugin, auto opening with a local application, etc.
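
    A rough sketch of the headers such a wrapper script might send, in Python. Two things here are assumptions rather than part of the comment: the sketch uses "application/octet-stream" with "Content-Disposition: attachment" in place of the "x-application/unknown" type mentioned above, and it adds "X-Content-Type-Options: nosniff" to ask browsers not to guess a type from the content. The function name download_headers is hypothetical.

```python
import re


def download_headers(original_name: str) -> dict[str, str]:
    """Response headers that ask the browser to save the file rather than render it."""
    # Keep only a conservative set of characters so the name cannot break out of the header.
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "_", original_name) or "download"
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": f'attachment; filename="{safe_name}"',
        "X-Content-Type-Options": "nosniff",
    }
```

    The wrapper would then stream the stored bytes with these headers, whatever extension the uploader chose.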

    --
    Don't blame me, I voted for Kodos
  2. Not the right question. by Anonymous Coward · · Score: 3, Insightful

    test it to see if it is actually the type of file that its file-name extension claims it is.

    There are various ways to make "hybrid" files which are multiple types. Graphics files which are also archives, etc. What you really want to do is normalize the files to the type they're supposed to be. PNGs are a good candidate for this because PNG is lossless, so you can decode the image and re-encode it without losing information.

  3. Won't work by Bogtha · · Score: 3, Insightful

    test it to see if it is actually the type of file that its file-name extension claims it is.

    This won't work, because a file can be valid in multiple formats at once, and an invalid file can nevertheless be interpreted as a valid one.

    Take, for example, a plain-text file. Harmless, right? Nope. It can also be a valid HTML file containing executable JavaScript. Or an XML file containing a billion laughs attack.

    Or take media type sniffing. Some browsers bend over backwards to interpret crap as HTML even when labelled otherwise by the Content-Type HTTP header. So one attack is to stuff enough HTML into PNG metadata to confuse a browser that doesn't follow the standards into thinking that it's HTML. This is a valid PNG file and anything that checks to see if it's really a PNG file will tell you that much. But it's still not safe.
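
    The point can be illustrated with a small Python sketch that builds a structurally valid 1x1 PNG whose tEXt metadata chunk carries HTML. All function names are hypothetical, and the markup scan at the end is deliberately crude: it shows that the bytes are present, and is not offered as a defence.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def build_png_with_comment(comment: bytes) -> bytes:
    """A 1x1 grayscale PNG whose tEXt metadata chunk carries an arbitrary comment."""
    # width, height, bit depth, colour type, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"tEXt", b"Comment\x00" + comment)
        + png_chunk(b"IDAT", zlib.compress(b"\x00\x00"))  # filter byte + one black pixel
        + png_chunk(b"IEND", b"")
    )


def passes_magic_check(data: bytes) -> bool:
    """The kind of test a signature-based checker performs."""
    return data.startswith(PNG_SIGNATURE)


def contains_markup(data: bytes) -> bool:
    """Crude scan for HTML-looking bytes anywhere in the file (illustrative only)."""
    lowered = data.lower()
    return any(tag in lowered for tag in (b"<html", b"<script", b"<body"))
```

    A file from build_png_with_comment(b"<html><script>...</script></html>") passes passes_magic_check while contains_markup also reports True, which is exactly the situation described: really a PNG, still not safe to hand to a sniffing browser.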

    --
    Bogtha Bogtha Bogtha
  4. Re:Would be easier to check if potentially harmful by gstoddart · · Score: 4, Insightful

    this is pretty easy in *nix:

    $ file lobotomy.png
    lobotomy.png: PNG image data, 298 x 300, 8-bit/color RGB, non-interlaced

    $ file jetpack.png
    jetpack.png: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

    This bears pointing out.

    UNIX systems have used "magic" for decades, and try to identify based on the actual file contents instead of its name.
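
    The core of the "magic" approach fits in a few lines of Python. This is a toy sketch with a handful of well-known signatures, nowhere near the database that "file" ships with; the table and function names are hypothetical. As other comments in this thread point out, a match only says what the file starts with.

```python
from typing import Optional

# A few well-known leading signatures; real magic databases hold thousands of rules.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpg"),
    (b"PK\x03\x04", "zip"),
    (b"\x7fELF", "elf"),
    (b"MZ", "exe"),
]

# Extensions that name the same format under a different spelling.
EXTENSION_ALIASES = {"jpeg": "jpg"}


def sniff_type(head: bytes) -> Optional[str]:
    """Guess a type from the first bytes of a file, ignoring its name entirely."""
    for magic, kind in MAGIC_SIGNATURES:
        if head.startswith(magic):
            return kind
    return None


def extension_matches_content(filename: str, head: bytes) -> bool:
    """True only if the extension agrees with what the leading bytes look like."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    extension = EXTENSION_ALIASES.get(extension, extension)
    return sniff_type(head) == extension
```

    Fed the first bytes of the two files above, extension_matches_content would accept lobotomy.png and reject jetpack.png, whose content starts with the ELF signature.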

    And then Microsoft came along, decided the extension was magic and reliable, and then also decided to hide well-known extensions (which created new problems).

    Relying on the file name has pretty much always been a terrible way of dealing with this, because it became exactly how malware targeted people: with well-known extensions hidden, a file named something.gif.exe showed up as something.gif, and people thought it was a .gif.

    Having an operating system trust a file name to decide what action to take has pretty much always been a terrible idea. But, historically, Microsoft has been more focused on dumbing down the system than making it more secure.

    --
    Lost at C:>. Found at C.
  5. Unix 'file' is not sufficient by Techmeology · · Score: 5, Insightful

    Sadly, Unix's 'file' utility is not sufficient for security purposes. Generally, file only checks for magic numbers near the beginning of the file, and many file formats remain valid even with data prepended. For example, a Python program with several source files can be archived into a single zip file and still be executed; you can stick a shebang onto the beginning and still have Python (or most zip programs) recognise the archive as a zip file, because zip readers locate the archive's directory from the end of the file rather than the start. There's a good video on YouTube about this kind of thing: https://www.youtube.com/watch?... tl;dr: This is security. It goes wrong in amusing and unobvious ways.
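
    The prepended-shebang trick can be reproduced with Python's standard zipfile module. This is a small sketch with hypothetical helper names; it builds everything in memory and executes nothing.

```python
import io
import zipfile


def build_zip(files: dict[str, str]) -> bytes:
    """Create an in-memory zip archive from a mapping of name -> text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def prepend_shebang(zip_bytes: bytes, shebang: bytes = b"#!/usr/bin/env python3\n") -> bytes:
    """Put a script header in front of the archive without touching the zip data."""
    return shebang + zip_bytes


plain = build_zip({"__main__.py": "print('hello from inside the archive')\n"})
polyglot = prepend_shebang(plain)

# A check on the leading bytes no longer sees the zip signature...
starts_like_zip = polyglot.startswith(b"PK\x03\x04")

# ...but zip readers find the directory from the end of the file, so it still opens.
still_a_zip = zipfile.is_zipfile(io.BytesIO(polyglot))
with zipfile.ZipFile(io.BytesIO(polyglot)) as archive:
    names = archive.namelist()
    main_source = archive.read("__main__.py").decode()
```

    Here starts_like_zip is False while still_a_zip is True, so a tool that judges only by the first bytes and a tool that actually opens the archive disagree about the same upload.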

    --
    Excuse for why is your room always messy?