Slashdot Mirror


Deserialization Issues Also Affect .NET, Not Just Java (bleepingcomputer.com)

"The .NET ecosystem is affected by a similar flaw that has wreaked havoc among Java apps and developers in 2016," reports BleepingComputer. An anonymous reader writes: The issue at hand is in how some .NET libraries deserialize JSON or XML data, doing it in a total unsecured way, but also how developers handle deserialization operations when working with libraries that offer optional secure systems to prevent deserialized data from accessing and running certain methods automatically. The issue is similar to a flaw known as Mad Gadget (or Java Apocalypse) that came to light in 2015 and 2016. The flaw rocked the Java ecosystem in 2016, as it affected the Java Commons Collection and 70 other Java libraries, and was even used to compromise PayPal's servers.

Organizations such as Apache, Oracle, Cisco, Red Hat, Jenkins, VMware, IBM, Intel, Adobe, HP, and SolarWinds all issued security patches to fix their products. The Java deserialization flaw was so dangerous that Google engineers banded together in their free time to repair open-source Java libraries and limit the flaw's reach, patching over 2,600 projects. Now a similar issue has been discovered in .NET. This research has been presented at the Black Hat and DEF CON security conferences. On page 5 [of this PDF], researchers included reviews for all the .NET and Java apps they analyzed, pointing out which ones are safe and how developers should use them to avoid deserialization attacks when working with JSON data.

99 of 187 comments

  1. Simpler solution by BarbaraHudson · · Score: 1, Insightful

    Just don't use JSON or XML. You can thank me later.

    --
    "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    1. Re:Simpler solution by vadim_t · · Score: 1

      So what do you recommend instead?

    2. Re:Simpler solution by hord · · Score: 5, Interesting

      JSON or YAML are probably both fine. XML is simply wasteful and unnecessary. Personally I think we should be using something like s-expressions (lisp-like). People hate them because of the parens but every other encoding has as many negative points in different ways. The advantage is that the syntax is far simpler to understand and parse leading to safer software. Some might say that having an "executable" format is bad but I'd point to bugs like this as being proof that even "text" formats are just executables in disguise. The Lisp creed is "data is code" and I've come to agree.
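To illustrate the "simpler to parse" claim, here is a minimal Python sketch of an s-expression reader. It is an assumption-laden toy, not any standard format: atoms are left as plain strings, there is no support for quoted strings or comments, and the function names are made up for this example. Nothing in the input is ever evaluated.

```python
def tokenize(text):
    """Split an s-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return one nested list or atom."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # discard the matching ")"
        return items
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    return token  # atoms stay plain strings; nothing is executed


def loads(text):
    """Parse exactly one s-expression and reject trailing input."""
    tokens = tokenize(text)
    result = parse(tokens)
    if tokens:
        raise ValueError("trailing tokens after expression")
    return result


record = loads("(user (id 42) (name alice))")
```

The whole grammar fits in one recursive function, which is the point being made; whether that translates into safer software still depends on what the application does with the parsed lists.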

    3. Re:Simpler solution by vadim_t · · Score: 1

      That doesn't really answer the question I asked.

    4. Re: Simpler solution by alvinrod · · Score: 3, Funny

      XML is like violence. If it's not solving your problem, you're not using enough.

    5. Re: Simpler solution by Anonymous Coward · · Score: 1

      ASN.1

    6. Re:Simpler solution by Applehu+Akbar · · Score: 2

      Serialization without using one of these standards is going back to the bad old days of proprietary silos. You must work for Sony.

    7. Re:Simpler solution by Tablizer · · Score: 1

      web apps. Especially ones that need so many 3rd party libraries that they can never be secure.

      But PHB's want their shiny dancy UI/UX toys or they won't pay you.

    8. Re:Simpler solution by Sloppy · · Score: 1

      I understand why you'd recommend against JSON since it was originally intended to be an expression (and some fuckwits would eval() it) rather than really intended to do quite the same thing as, say, Python's pickles. But what's the beef with XML?
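The eval() point can be sketched in Python, used here only as a stand-in for the JavaScript case the comment has in mind. The payload string is a made-up example: an evaluator would run it as an expression, while a real JSON parser rejects it because it is not data.

```python
import json

# An expression, not JSON. eval(untrusted) would execute the call.
untrusted = '__import__("os").getcwd()'

try:
    json.loads(untrusted)
    accepted = True
except json.JSONDecodeError:
    accepted = False  # the parser refuses anything that is not plain data

# Well-formed JSON comes back as ordinary dicts, lists, and strings.
data = json.loads('{"name": "Sloppy", "tags": ["a", "b"]}')
```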

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
    9. Re:Simpler solution by angel'o'sphere · · Score: 4, Informative

      The serialization format has nothing to do with the deserialization vulnerabilities.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    10. Re: Simpler solution by skids · · Score: 1

      (mandatory missing sarcasm tag warning)

      Not that many developers would base a decision on an AC slashdot post, but...

    11. Re: Simpler solution by Anonymous Coward · · Score: 1

      YAML. XML without pretending to solve a bunch of problems it doesn't solve.

    12. Re:Simpler solution by hey! · · Score: 1

      Yep. Gin up your own solution with the exact same security flaws.

      I don't care how smart you are; everyone else is collectively smarter than you are. From a security standpoint you want to use popular frameworks that take security seriously and respond to the inevitable exploits promptly. Doing things in an idiosyncratic way is not protection because (a) systems can be probed using black-box methods like fuzzing and (b) chances are your way of doing it has been used thousands of times before.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    13. Re: Simpler solution by WaffleMonster · · Score: 1

      I've never understood why PUT things/1/subthing/52 is somehow better than POST /api?thing=1&subthing=52. And the second one works without over complicated mod rewrite rules (though you could certainly add a very simple one to decouple your filesystem from your app).

      It's because the Internet meme machine and lemming followers continually confuse exceptionally poor implementations of useful concepts for progress.

      Selling point of REST (via HTTP) was simplicity + reuse. Having objectively failed to deliver on both counts vs. coherently designed HTTP APIs, REST is a nonstarter to even consider at this point. Nobody wants to deal with it.

    14. Re:Simpler solution by hey! · · Score: 1

      Libraries are neither here nor there. This is 2017, not the 1970s. To build the kind of apps people want today to run on the platforms they use, you're using a framework, and it's going to be huge and complex.

      Now sure, we still use libraries. And sure, if you are talking about a small, simple library that will never handle information from a source you don't trust, by all means gin up your own if that's easier for you. But if you're gluing a javascript browser app to a server back end, if you're not using JSON (or a mature alternative like ASN.1), you're going to end up recreating a substantial subset of it, not as well.

      Now as for "provable" -- that's laughable when it comes to security, because security is a non-functional requirement domain. It's the behavior of the system in *abnormal* situations that's of concern; situations you haven't imagined yet. It doesn't matter how smart you are, you just don't have time to dream up all the things people might do.

      Anyone can convince themselves that they're a security genius, because anyone can gin up security measures they *themselves* can't break.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    15. Re: Simpler solution by BarbaraHudson · · Score: 1

      Au contraire, I did not say "do nothing". So STFU until you learn how to read, svp.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    16. Re:Simpler solution by BarbaraHudson · · Score: 1

      ... says the anonymous coward. Next you'll be spouting nonsense about how the W3C did the world a huge solid with emojis.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    17. Re:Simpler solution by BarbaraHudson · · Score: 1

      Maybe because the question doesn't really pose a question. Define the problem you're trying to solve; without that, how would you make any recommendations? Crystal ball? Mind reading? Time machine?

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    18. Re:Simpler solution by phantomfive · · Score: 1

      Your comment is prescient and face-palm worthy.......because it is clear and succinct.

      Face-palm worthy because a few years ago, a lot of these bugs were found in XML Java deserializers. A lot of people said, "Don't use XML! It's insecure!" then went off to write the same frameworks, but using JSON instead. They ended up with all the same bugs.

      I guess next people will rewrite them in YAML or binary.....nah, binary is scary, you never know what people could put in there!

      --
      "First they came for the slanderers and i said nothing."
    19. Re:Simpler solution by BarbaraHudson · · Score: 1

      What the hell are you talking about? Can you be any more off-topic?

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    20. Re:Simpler solution by BarbaraHudson · · Score: 1

      Nonsense. There are plenty of ways to store data and transmit it that aren't proprietary. It's not like this is unique to xml and json.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    21. Re:Simpler solution by BarbaraHudson · · Score: 1

      So go work somewhere else. It's not like you have to work for an idiot. There are at least 50 ways.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    22. Re:Simpler solution by BarbaraHudson · · Score: 1

      You can close your eyes. Won't change the fact that xml was a bad idea adopted by the w3c "because". Same as emojis.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    23. Re:Simpler solution by BarbaraHudson · · Score: 1

      Seriously? It's not like there weren't plenty of ways to store data that were far less verbose, more self-documenting, and took up less space and cpu both to create and search through.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    24. Re:Simpler solution by BarbaraHudson · · Score: 1

      Where is this "default binary format" you speak of? It sure wasn't the default for anyone with any brains who was storing and searching data before xml or json.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    25. Re:Simpler solution by Tony+Isaac · · Score: 1

      I remember life before XML or JSON. It wasn't pretty. I've reverse-engineered the .doc and .xls file formats. It was a time when everybody made up their own file formats, and there were no libraries to help you read and write those formats. No, thank you, I'll live with the potential serialization issues.

    26. Re:Simpler solution by BarbaraHudson · · Score: 1

      Obviously not that retarded, because the old solutions worked, and it was far quicker to implement and debug than "a few days".

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    27. Re:Simpler solution by BarbaraHudson · · Score: 1

      And there's your problem - you or your user was using a shitty format. This is a long-solved issue. Even plain text or SDF or tab-delimited or fixed field width are quick and easy to implement, and variable-field-width can also be made self-documenting with just a bit of work. All are far easier to implement than xml or json, and if it's become corrupted, you'll usually be able to see exactly where pretty quickly and recover everything else.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    28. Re: Simpler solution by PmanAce · · Score: 1

      So what would you do instead of XML or json?

      --
      Tired of my customary (Score:1)
    29. Re:Simpler solution by Tablizer · · Score: 1

      Leaving idiots doesn't scale.

    30. Re: Simpler solution by Anonymous Coward · · Score: 2, Funny

      You haven't used XML until you had to decode base64 encoded xml documents stored in xml attributes of a different xml document.

    31. Re:Simpler solution by Bongo · · Score: 1

      > I don't care how smart you are; everyone else is collectively smarter than you are.

      Provably not true. If you want a high-quality library with clean interfaces, make sure it is the work of a single smart person.

      As the first statement is oversimplified to make a point, maybe there's a better way to write it (given I am now part of the "everyone else who is smarter"):

      How many skilled people hours have already been spent on project x which were focussed on solving the quality issues, compared to the hours I can spend on it now? And are my own skilled hours following a very similar approach to the one they used, or am I consciously or accidentally pursuing a different approach which may lead to a different, perhaps better outcome? And am I holding this as an open question where there is no predefined "right" answer?

    32. Re:Simpler solution by vadim_t · · Score: 1

      The problem is exchanging tabular and hierarchical data structures, containing arbitrary values.

      So for instance the simplest of such structures is a table of id, state, name, description. The description field can of course contain arbitrary characters including quotes, commas and newlines.

      Sometimes there's metadata for the table. For instance think of the results of a mysql query: You want a table of the results, but there's also a list of the datatypes of each column, plus the time it took to answer the query.

      A more complex one is a directory tree structure, with the usual metadata.

      Of course in both cases there's the requirement to be able to add additional fields later, while retaining backwards compatibility.
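As a rough illustration of those requirements, here is a Python sketch that carries such a table as JSON. The layout (the "columns", "types", "elapsed_ms", and "rows" keys) is a hypothetical structure invented for this example, not a MySQL wire format.

```python
import json

# Hypothetical result set: the key names and values are illustrative only.
result = {
    "columns": ["id", "state", "name", "description"],
    "types": ["int", "string", "string", "string"],
    "elapsed_ms": 12,
    "rows": [
        [1, "open", "widget", 'He said "hi", then left.\nSecond line.'],
    ],
}

wire = json.dumps(result)    # quotes, commas, and newlines are escaped for us
decoded = json.loads(wire)

# A reader that only knows "columns" and "rows" simply ignores keys added later,
# which is one way to keep backwards compatibility.
rows = [dict(zip(decoded["columns"], row)) for row in decoded["rows"]]
```

The description field survives the round trip unchanged, and the serialized text contains no raw newline that could be mistaken for a record boundary.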

    33. Re:Simpler solution by vadim_t · · Score: 1

      The parent is talking about .doc and .xls formats. These are absolutely not suitable for something as simple as tab or fixed field formats. They can contain arbitrary data like embedded images and videos. They have a very complex markup system. They have features like versioning, scripts, and oodles of metadata. They have to deal with arbitrary data of arbitrary length. They can attach arbitrary amounts of parameters to some piece of text. .doc and similar is one of the few cases where XML is actually not overkill for the task, because XML was made to solve precisely that sort of problem in the first place.

    34. Re: Simpler solution by Dog-Cow · · Score: 1

      In context, we are talking about data that is executed as code on purpose, you worthless bag of shit.

    35. Re:Simpler solution by Dog-Cow · · Score: 1

      If someone ripped your brain out and placed it on your dinner plate, no one would notice. That's how completely useless and unattractive both you and your cooking is.

    36. Re:Simpler solution by Dog-Cow · · Score: 1

      You are such a stupid shit that it's amazing that you can even string together coherent phrases. Or is someone ghost-writing for you?

    37. Re:Simpler solution by DarkOx · · Score: 1

      I will agree that XML is highly overprescribed. It is however useful in situations that do require heterogeneous systems to exchange complex and potentially changing data structures where changes cannot be 100% coordinated.

      That said, XML is often hobbled for security reasons such that applications don't actually process DTDs etc. If you are doing those things you are giving up a lot of the flexibility while keeping most of the complexity. You probably should be asking from a design perspective whether XML is really the correct solution.

      I don't understand your objections to JSON. It's easy to parse safely, and for smallish data structures it's easy enough for humans to understand. Virtually every ecosystem has tools for JSON parsing, so you won't be left reinventing the wheel for some integration task. TOML and YAML are good for more complex structures without introducing the complexity of XML. They might not be as widely supported, but even if you had to develop your own parser, neither would be difficult.

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    38. Re:Simpler solution by bad-badtz-maru · · Score: 1

      Sure, you can use some crufty protocol like X12 EDI, which will help you understand the benefits of XML.

    39. Re: Simpler solution by bad-badtz-maru · · Score: 1

      And signed, don't forget that the inner document is signed to truly enable misery. See IBM Datapower appliance for that joy.

    40. Re:Simpler solution by angel'o'sphere · · Score: 1

      Bugs in XML deserialization don't allow for arbitrary code execution.
      Neither does JSON or YAML.

      So, what exactly would be the attack vectors (in a VM) via text only (de)serialization?
      I mean: buffer overflows, putting code on the stack or changing return addresses for JSRs obviously are impossible.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    41. Re:Simpler solution by phantomfive · · Score: 1

      The main thing is when you deserialize an object that has a constructor that does something (or a setter or a getter that does something). Since there are many objects of this type in the Java/C# standard library, an attacker can send a serialized copy of one of these objects, and send it over the wire. The deserializer will happily deserialize it.
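The same class of problem can be shown with Python's pickle module, used here as an analogy rather than as the Java or .NET mechanism itself. The Gadget class is hypothetical, and the callable it names is deliberately harmless; an attacker would pick something with real side effects.

```python
import pickle


class Gadget:
    """Hypothetical payload class; only its __reduce__ hook matters."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call len(<this string>)".
        # The sender chooses both the callable and its arguments.
        return (len, ("attacker-chosen argument",))


payload = pickle.dumps(Gadget())   # the bytes that travel over the wire
restored = pickle.loads(payload)   # the receiver believes it is only reading data
```

After loading, `restored` is the return value of the call, not a Gadget instance: the deserializer ran a function picked by whoever produced the bytes.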

      Buffer overflows are kind of rare these days. Because of things like ASLR, they are hard to exploit. It's mainly about logic bugs of various types.

      --
      "First they came for the slanderers and i said nothing."
    42. Re:Simpler solution by Gr8Apes · · Score: 1

      Your problem is incorrectly proposed. XML is ALWAYS a bad solution, at least for communication transports. It's a fine solution for markup, which is the realm it grew out of. I disagree with the json statement, however. json is fine for a sensibly specced communications protocol. I personally think that if your communications protocols can't be easily done as json, then it better be a closed distributed monolithic system, because otherwise you're in for a world of hurt.

      --
      The cesspool just got a check and balance.
    43. Re:Simpler solution by angel'o'sphere · · Score: 1

      And that is exactly the reason why 'standard' deserialization of objects in Java/JVM uses neither ctors nor setters.
      No idea about .Net

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    44. Re:Simpler solution by phantomfive · · Score: 1

      Which one is the standard deserializing library in Java?

      --
      "First they came for the slanderers and i said nothing."
    45. Re:Simpler solution by Gr8Apes · · Score: 1

      Perhaps the better answer is to not try to instantiate an arbitrary object directly.
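One way to act on that advice, sketched in Python: pickle lets a subclass override `Unpickler.find_class`, so the receiver can refuse every type that is not on an explicit allowlist. The allowlist contents and the `safe_loads` name are assumptions made for this example.

```python
import io
import pickle
from collections import OrderedDict

# Assumed allowlist: only this type may be looked up by name during loading.
ALLOWED = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a class or function by name.
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(f"{module}.{name} is not allowed")
        return super().find_class(module, name)


def safe_loads(data):
    """Deserialize bytes, rejecting any global not on the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


plain = safe_loads(pickle.dumps({"a": [1, 2]}))         # built-in containers
ordered = safe_loads(pickle.dumps(OrderedDict(a=1)))    # explicitly allowed type

try:
    safe_loads(pickle.dumps(len))                       # arbitrary callable
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

An allowlist narrows the attack surface but does not make a general-purpose object deserializer safe for untrusted input; it only limits which constructors the sender can reach.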

      --
      The cesspool just got a check and balance.
    46. Re:Simpler solution by vadim_t · · Score: 1

      I agree that XML is usually insanely overkill for most purposes. Still, there are worse choices than insanely overkill, such as trying to shoehorn a complicated hierarchical data structure into something like TSV, CSV or a fixed length format, as BarbaraHudson seems to be proposing in this discussion.

    47. Re:Simpler solution by Gr8Apes · · Score: 1

      > I agree that XML is usually insanely overkill for most purposes. Still, there are worse choices than insanely overkill,

      CORBA comes to mind, or EDI, both of which suck hugely for different reasons. If I never have to see either one again it will be too soon.

      The real point for a heterogeneous environment is that you need to look at the basic units you have in common across all players, and then design with those limitations in mind. One of the first and major stumbling blocks for most is that the data representation may vary across the components, and some may have a concept radically different than even the minimum required by the system as a whole.

      --
      The cesspool just got a check and balance.
    48. Re:Simpler solution by angel'o'sphere · · Score: 1

      The built-in ObjectOutputStream and ObjectInputStream.

      They allow serialized objects to either implement java.io.Serializable or java.io.Externalizable

      https://docs.oracle.com/javase...
      https://docs.oracle.com/javase...

      ( Why google finds the 7 version and not the 8 as first hits is beyond me :D )

      The vulnerability comes from the option to override "readObject()". Serialized data objects usually contain the classes as well. So when you read them, you also read and link the code, and hence use the supplied "readObject()" method.

      However, the vulnerability in the Apache Commons libraries was a different one (I don't remember right now how exactly): they exploited a bug in the library, so you could send "code" without really sending a classfile.

      https://blogs.apache.org/found...

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    49. Re:Simpler solution by Tony+Isaac · · Score: 1

      In fact, the new docx and xlsx formats are implemented in XML.

      There are many data sets that don't work well as CSV. Anything, for example, that has one-to-many relationships such as customer order history with names, addresses, billing info, etc., doesn't work well as CSV. That's the whole point of XML / JSON--you can easily store and retrieve data sets that are more complex than a spreadsheet. And that is just about everything.

    50. Re: Simpler solution by BarbaraHudson · · Score: 1

      Depends on the problem to be solved. There's no such thing as a one size fits all solution, unless it's "one size fits all - badly"

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    51. Re:Simpler solution by BarbaraHudson · · Score: 1

      Go look at the ASCII table. You have 4 control characters that are used expressly for schemes for transmitting data that can be composed of fields of variable length (FS - 0x1c - the file separator, GS - 0x1d - the group separator, RS - 0x1e - the record separator, and US - 0x1f - the unit separator). You can construct tables made of variable or fixed length units ended with US (today we'd call them fields). Each row of fields would be ended with an RS - the record separator. A table or collection of rows would be ended with GS - the group separator. And you can stuff multiple "tables" into one "file", delimited with the FS.

      There's no reason why multiple "files" can't be stored in a single physical file, so you could have multiple databases, each with multiple tables, that you can write to a single stream when you want to serialize, and read from a single stream when you want to unserialize.

      You can also adopt a "standard" that makes the first row of every table the field names, and the second row the data types.

      Or whatever else you want. This has been around since 7-bit ASCII and it worked just fine. Or you could just store a bunch of text files (such as email) linearly, separated by whichever control character you wanted. It's the same situation as xml - you still have to define your data format - but there's a lot less tag soup.

      Also, if you use fixed-width units or fields, it's trivial to do a bsearch to find an individual row in a "table" if it's sorted on that field. Unlike Microsoft's "binary xml."
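A minimal Python sketch of the scheme described above, under two simplifying assumptions: the control characters are used as separators between items rather than as terminators after each one, and field values never contain the four control characters themselves. The `dumps` and `loads` names are invented for this example.

```python
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"  # file, group, record, unit


def dumps(files):
    """files -> groups (tables) -> records (rows) -> units (fields)."""
    return FS.join(
        GS.join(RS.join(US.join(row) for row in table) for table in file)
        for file in files
    )


def loads(text):
    """Inverse of dumps: split on each separator from outermost to innermost."""
    return [
        [[row.split(US) for row in table.split(RS)] for table in file.split(GS)]
        for file in text.split(FS)
    ]


# One "file" with two tables; row 1 holds field names, row 2 holds data types.
database = [
    [
        [["id", "name"], ["int", "str"], ["1", "alice"], ["2", "bob"]],
        [["key"], ["str"], ["enabled"]],
    ]
]
stream = dumps(database)
restored = loads(stream)
```

Everything comes back as strings, so the type row is only a hint that the reader has to apply itself, and nesting deeper than four levels has no dedicated separator.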

      Why reinvent the wheel?

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    52. Re:Simpler solution by BarbaraHudson · · Score: 1

      Then just send them as one big binary blob with a list of offsets, sizes, and file names to the beginning of each separate file as a virtual header using a tab between each offset, size, and file name, followed by a cr. Be a hell of a lot more compact, and extraction of an individual file is as simple as an lseek to the offset, read the size and filename and read(size) number of bytes. Modify as needed, and you can store ANYTHING pretty much in its original form. XML is not needed. Same as emojis. The world would be a better place if neither existed.
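Here is one way that container could look in Python. Several details are assumptions added for the sketch: the header ends with an empty entry, offsets are counted from the first payload byte rather than from the start of the blob, and file names must not contain a tab or carriage return.

```python
import io


def pack(files):
    """files: dict of name -> bytes. Returns header followed by raw payload."""
    header_lines, payload, offset = [], bytearray(), 0
    for name, data in files.items():
        header_lines.append(f"{offset}\t{len(data)}\t{name}\r")
        payload += data
        offset += len(data)
    # A lone "\r" (an empty entry) marks the end of the header.
    return "".join(header_lines).encode("utf-8") + b"\r" + bytes(payload)


def read_index(stream):
    """Parse the header; return ({name: (offset, size)}, payload_start)."""
    index = {}
    while True:
        line = bytearray()
        while (ch := stream.read(1)) not in (b"\r", b""):
            line += ch
        if not line:
            return index, stream.tell()
        offset, size, name = line.decode("utf-8").split("\t", 2)
        index[name] = (int(offset), int(size))


def extract(stream, name):
    """Seek straight to one member and read exactly its bytes."""
    stream.seek(0)
    index, base = read_index(stream)
    offset, size = index[name]
    stream.seek(base + offset)
    return stream.read(size)


blob = pack({"a.txt": b"hello", "b.bin": b"\x00\r\t\xff"})
member = extract(io.BytesIO(blob), "b.bin")
```

Because members are located by offset and size, the payload can hold any bytes, including tabs and carriage returns, without escaping.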

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    53. Re:Simpler solution by BarbaraHudson · · Score: 1

      Arbitrary data serialization was solved back before the PC was invented. See the following ASCII control codes

      0x1c - FS - File separator The file separator FS is an interesting control code, as it gives us insight in the way that computer technology was organized in the sixties. We are now used to random access media like RAM and magnetic disks, but when the ASCII standard was defined, most data was serial. I am not only talking about serial communications, but also about serial storage like punch cards, paper tape and magnetic tapes. In such a situation it is clearly efficient to have a single control code to signal the separation of two files. The FS was defined for this purpose.

      0x1d - GS - Group separator Data storage was one of the main reasons for some control codes to get in the ASCII definition. Databases are most of the time setup with tables, containing records. All records in one table have the same type, but records of different tables can be different. The group separator GS is defined to separate tables in a serial data storage system. Note that the word table wasn't used at that moment and the ASCII people called it a group.

      0x1e - RS - Record separator Within a group (or table) the records are separated with RS or record separator.

      0x1f - US - Unit separator The smallest data items to be stored in a database are called units in the ASCII definition. We would call them field now. The unit separator separates these fields in a serial data storage environment.

      If any of these control characters appear in your data, escape them as you serialize the data, the same as in c. It's not that complicated, and it works well with multiple databases, each with their own collection of tables, and each individual table having its own set of records, and each record having its own set of fields.

      And the fields don't have to be fixed size.
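The escaping step might look like the following Python sketch. The choice of ESC (0x1b) as the escape character and the letter codes F, G, R, U are assumptions made for this example; the comment only says to escape "the same as in c".

```python
ESC = "\x1b"
US = "\x1f"

# Assumed mapping: each separator becomes ESC plus a letter; ESC doubles itself.
CODES = {"\x1c": "F", "\x1d": "G", "\x1e": "R", "\x1f": "U", ESC: ESC}
DECODES = {code: ch for ch, code in CODES.items()}


def escape(field):
    """Replace separators (and ESC) so the field is safe to join."""
    return "".join(ESC + CODES[ch] if ch in CODES else ch for ch in field)


def unescape(field):
    """Reverse escape(); reject unknown or truncated escape sequences."""
    out, chars = [], iter(field)
    for ch in chars:
        if ch == ESC:
            code = next(chars, None)
            if code not in DECODES:
                raise ValueError("bad escape sequence")
            out.append(DECODES[code])
        else:
            out.append(ch)
    return "".join(out)


fields = ["plain", "has\x1fseparator", "has\x1bescape"]
record = US.join(escape(f) for f in fields)          # safe to split on US now
decoded = [unescape(f) for f in record.split(US)]
```

The cost is that no value can be copied into the stream verbatim; every field has to pass through `escape` on the way in and `unescape` on the way out.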

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    54. Re:Simpler solution by BarbaraHudson · · Score: 1

      No more stupid than the folks who solved serialization back in the 60s using 0x1c, 0x1d, 0x1e, and 0x1f to store multiple databases each with their own tables in a single file. And certainly not stupid enough to throw out a solution that was simple and worked for a piece of shit just because it's trendy.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    55. Re:Simpler solution by BarbaraHudson · · Score: 1

      This problem was solved in the 60s. See my comments here and here.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    56. Re:Simpler solution by Tony+Isaac · · Score: 1

      There is nothing that is inherently more secure about ASCII control codes over XML or JSON. And it's inherently less human-readable. There's a reason the world has moved on past ASCII control codes!

    57. Re:Simpler solution by vadim_t · · Score: 1

      First of all, all those meanings of those ASCII symbols are long obsolete and forgotten. Just because there's a character called "file separator" doesn't mean anybody uses it for any purpose, except perhaps some fossilized piece of software for dealing with tape drives from the 60s.

      Second, such an approach would suffer from various problems. For instance you obviously can't use such characters without escape sequences, which means you can't just stick a file in between file separator characters, you've got to process and escape it first, which is inconvenient as hell.

      Third, it's generally an inconvenient and annoying way of doing things. Besides escape sequences such a system means you don't know how much memory to allocate in advance, which is annoying to deal with.

      Fourth, it's not really suitable for dealing with hierarchical structures and nested pieces of information.

      Fifth, it's all a binary mess that's absolutely not self-documenting in any manner, not editable by hand, not humanly understandable without extensive skill with the very specific format, fragile, lacking in validation and error reporting without doing extensive work on that, the list goes on.

      Sixth, characteristics like extensibility and backwards compatibility are often very tricky in such formats, and make them full of bizarre hacks for those reasons.

      Seventh, all of this stuff requires a lot of specific parsing code that's boring to write and very error prone.

      And really, for what gain? If you want something easy and self-documenting, there's JSON. Key-value pairs in plain text are readable and editable by humans, the structure of the data is visible with the unaided eye, it's easy to skip past uninteresting sections, and it's often possible to do useful work with zero documentation.

      XML has some features that are well fit for various complicated needs. For instance namespaces allow nesting one document inside another, while DTDs provide a formal definition of what's supposed to go where. Now isn't that neat -- you can write a formal specification that any user, in any language, can check a document against without writing thousands of lines of validation logic by hand. XPath allows queries against a document -- if you just want to, say, extract the title of a document, you don't even need to write code that walks through the structure of it.
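The title-extraction case can be sketched with Python's standard library, which supports a limited subset of XPath. The document and its element names are made up for this example, and DTD validation is not shown because `xml.etree.ElementTree` does not provide it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are illustrative only.
doc = """<article>
  <head><title>Deserialization in .NET</title></head>
  <body><section><title>Background</title></section></body>
</article>"""

root = ET.fromstring(doc)

# One path expression instead of hand-written tree walking.
title = root.findtext("./head/title")

# ".//" searches at any depth below the current element.
section_titles = [el.text for el in root.findall(".//section/title")]
```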

      XML may be somewhat ugly to the eye and has features that aren't needed for most purposes, but if you need the complication it's far better than some binary mess.

      Things like JSON and XML were created because binary formats were an enormous pain in the ass, and because for some uses, the benefits greatly outweigh the inefficiency. In modern times, programmer time is far more expensive than memory, CPU or disk space, so turning a document from a compact 1K of binary data into 32K of human readable text is extremely often a very good tradeoff.

  2. Real Developers never Deserialize into objects by zifn4b · · Score: 4, Informative

    Real developers use an XML or JSON reader instead of using direct deserialization. Trust me I've built systems both ways and deserialization directly into objects is no bueno. You end up with enough problems with version compatibility alone to negate the benefits. There are performance issues as well.
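A rough Python sketch of the reader-based approach: parse the text into plain values first, then copy only the expected fields into an object by hand. The Order class, its fields, and the sample payload are hypothetical; the "$type" key stands in for the kind of type hint some JSON libraries honor when configured to do so.

```python
import json
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    customer: str
    quantity: int = 1  # default lets older payloads without this field still load


def order_from_json(text):
    """Read JSON as plain data, validate it, then build the object explicitly."""
    raw = json.loads(text)  # yields only dict, list, str, int, float, bool, None
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")
    order_id = raw.get("order_id")
    customer = raw.get("customer")
    quantity = raw.get("quantity", 1)
    if type(order_id) is not int or type(quantity) is not int:
        raise ValueError("order_id and quantity must be integers")
    if not isinstance(customer, str):
        raise ValueError("customer must be a string")
    # Unknown keys, including any type hints, are never looked at.
    return Order(order_id, customer, quantity)


order = order_from_json(
    '{"order_id": 7, "customer": "bob", "$type": "System.Diagnostics.Process"}'
)
```

The payload never gets to choose which class is constructed, and adding or dropping a field is a one-line change in the mapping function rather than a versioning problem in the serialized form.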

    --
    We'll make great pets
    1. Re:Real Developers never Deserialize into objects by edxwelch · · Score: 1

      You are 100% correct.
      Unfortunately, going by the number of projects affected by the bug, it seems that most programmers are not "real programmers".

    2. Re: Real Developers never Deserialize into objects by Sloppy · · Score: 1

      Parser loop.

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
    3. Re:Real Developers never Deserialize into objects by phantomfive · · Score: 2

      Trust me I've built systems both ways and deserialization directly into objects is no bueno.

      Yeah, running an auto-deserializer on untrusted data is basically guaranteed to be a security flaw. The NSA and FSB will pwn you at that point, along with anyone else who wants to (just ask PayPal).

      --
      "First they came for the slanderers and i said nothing."
    4. Re:Real Developers never Deserialize into objects by bad-badtz-maru · · Score: 1

      Absolutely correct. Any additional development overhead or memory use is acceptable in return for the gained compatibility, reliability and security.

    5. Re:Real Developers never Deserialize into objects by phantomfive · · Score: 1

      But can't you pretty much avoid this issue by means of predefining the allowed structure(s) for the data? If the deserialized/serialized data does not match the predefined schema, it's discarded as invalid.

      The deserializer reconstructs the objects by calling the constructor, or setters and getters. The setters and getters have logic bugs that allow arbitrary code execution.
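
      To make the setter point concrete, here is a deliberately naive Python sketch of a generic deserializer. Everything in it is hypothetical (the "$type" key, the KNOWN_TYPES table, the TempFile class); it imitates the pattern under discussion, not any real library. Because the payload picks the class and the property names, it also picks which setter code runs.

```python
import json

SIDE_EFFECTS = []  # stands in for real effects such as deleting a file


class Greeting:
    def __init__(self):
        self.text = ""


class TempFile:
    """Hypothetical class whose setter does work beyond storing a value."""

    def __init__(self):
        self._path = None

    @property
    def path(self):
        return self._path

    @path.setter
    def path(self, value):
        # A real class might open, truncate or delete the file here.
        SIDE_EFFECTS.append(f"touched {value}")
        self._path = value


# Imitates a large class library: every loaded type is reachable by name.
KNOWN_TYPES = {cls.__name__: cls for cls in (Greeting, TempFile)}


def naive_deserialize(text):
    """Build whatever type the *input* asks for and call its setters."""
    doc = json.loads(text)
    obj = KNOWN_TYPES[doc.pop("$type")]()
    for key, value in doc.items():
        setattr(obj, key, value)  # runs property setters chosen by the sender
    return obj
```

      An application that only ever meant to receive a Greeting will still build a TempFile and run its setter if the sender asks for one, e.g. naive_deserialize('{"$type": "TempFile", "path": "/tmp/x"}').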

      I just think that the standard serialization/deserialization libraries out there have been likely created by programmers a lot smarter than me,

      No, you're definitely wrong here. If you spend a few weekends, you could probably make one of these yourself. Then getting people to use it is a matter of marketing and such.

      --
      "First they came for the slanderers and i said nothing."
  3. Not a .NET problem by Anonymous Coward · · Score: 1

    This is a programming problem that can happen anywhere. No language is immune. No project is automatically secure from exploits, or able to patch a framework universally for all deployments.

    Java and .NET will always have security issues, along with literally every other programming language. Anyone shocked, surprised, upset, or hostile to that concept is in the wrong profession.

    Assume everything is compromised. Assume nothing is secure. Design around that assumption and you will survive.

    1. Re: Not a .NET problem by peppepz · · Score: 3, Insightful

      The title is sensationalistic. Even the original bug the author talks about, calling it repeatedly a "Java" bug, was actually a bug in the Apache Commons Collections library, not in the platform, and it could only be triggered if a server using the library allowed customers to provide serialized data for itself to deserialize, which is severely wrong in the first place (it's akin to eval()-ing client-provided text).

    2. Re:Not a .NET problem by Tablizer · · Score: 2

      Assume everything is compromised. Assume nothing is secure. Design around that assumption and you will survive.

      But you won't be able to compete with shortcut takers. They will look more productive than you. The penalty for shortcut taking is not just large enough, I hate to say. I'm just the messenger.

    3. Re:Not a .NET problem by Tablizer · · Score: 1

      Correction: "just not large enough...".

      Lexdysia

  4. It's a trap! by Anonymous Coward · · Score: 3, Interesting

    Completely agree. We used .NET binary serialization/deserialization because it was such a quick way to get things up and running...with like two lines of code. The fact that the serialized objects were about 10x bigger than they needed to be was not a problem.

    It turns out the namespaces are included in the serialized data, so the moment we did an ounce of lightweight refactoring we broke it. It took us less than a day to write our own serializer, but an extra three days of combined manpower to get a format converter built and deployed.

    We would have saved all that if we had just started with our own serializer.

    Prefab solutions to simple problems are a deal with the devil. Beware!
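
    Python's pickle behaves in a similar way, which makes for a quick illustration of the namespace problem described above (an analogy only, not the .NET binary serializer). The helper name is made up for the example. The serialized bytes carry the module and class name of the object, so moving or renaming the class breaks previously written data.

```python
import pickle
from fractions import Fraction


def embedded_type_name(obj):
    """Return (module_found, class_found) for the object's pickle bytes."""
    data = pickle.dumps(obj)
    cls = type(obj)
    return (cls.__module__.encode() in data,
            cls.__qualname__.encode() in data)


# The bytes for a Fraction name both the "fractions" module and the class.
module_found, class_found = embedded_type_name(Fraction(1, 2))
```

    A hand-written format that stores only the numerator and denominator would not care what the class is called or where it lives.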

  5. Yeah, serialization can be annoying. by RyanFenton · · Score: 2

    I'm kind of surprised this hasn't already built into a more prominent issue over time.

    Performance issues I can stomach - there's going to be some unavoidable parsing logic no matter how you go at translating from runtime to storage or network logic - but instead, large swaths of objects just get ignored in major libraries. Unity's default serializer, for instance, can't serialize dictionaries and many other objects - which is a major oversight.

    Google actually has provided some rather nice tools to help with this - I tend to use their 'Protocol buffer' libraries for their rather nice serialization options. This doesn't address security on its own - nothing does completely, but designing careful locked signal processing and independent cross-checking steps can help a lot. Well-salted encryption alone won't really save you.

    My pet peeve with protocol buffers is the need to give everything an index number, with no real auto-numbering for rapid design - I can see the logical need, to be able to rely on that order for processing - it's just an extra babysitting step that gets me sometimes. For what it does, it's still the best I've found to be consistent between diverse projects and still leaving room for decent security.

    Ryan Fenton

    1. Re:Yeah, serialization can be annoying. by hord · · Score: 1

      I've looked at protocol buffers but everything I've ever read about people actually using them in production says they are a nightmare over time because they are binary. Supposedly the object versioning alleviates some of this but I think people were complaining about how to deal with mandatory fields over time. I can't remember but I suspect this plus JavaScript being in the browser is what makes JSON so prevalent. I have no idea why XML is used. I can't even think of a single advantage it has over anything. I guess I should just be glad ASN.1 didn't become more popular.

  6. Free time =/= 20% time by PRMan · · Score: 1

    Google engineers banded together in their 20% time.

    --
    Peter predicted that you would "deliberately forget" creation 2000 years ago...
  7. Re:Prevent data by bugs2squash · · Score: 1

    I presume that the code performing the marshaling or unmarshaling of objects tries to be overly generic and so treats its input as if it were a mini scripting language of its own, i.e. it crawls through the input and handles what it finds by having the input dictate what methods to call to get the job done. I'm sure it can be done safely, but it's probably easy to err on the side of "clever generic" code that is exploitable.

    --
    Nullius in verba
  8. Re: Real Developers never Deserialize into object by Anonymous Coward · · Score: 1

    Walking a data tree is 1st year CS level work. If you're spending half your efforts on it then you're either vastly short on resources or your coders suck donkey balls.

  9. Re:Prevent data by hord · · Score: 1

    Data by itself doesn't do much. The way to think about data is that it is being fed into a machine that is doing stuff. That means I can program the internal state of that machine using data. Normally we just call this "processing" but bugs like this illustrate that you have to be very careful with how you handle state. Even for "simple" formats that are just "text" like JSON, XML, YAML, and everything else. Image (binary) formats are also not immune as there have been browser attacks using bugs in common image libraries. If your data is used as input, it is executing passively.

  10. Re:Bottom-line... apk by hord · · Score: 2

    Personally I think "exposing" objects is the problem. Your border should be a mailbox that exchanges messages and those messages should be inspected carefully before internal delivery. I have no idea why people want to dump a class they wrote onto a live internet service and just hope that it dumps data into the correct table somewhere. They dragged the "security api" icon onto the project space so it's secure.

  11. Agree. by Anonymous Coward · · Score: 2, Insightful

    It appears that the market is flooded with developers who can write scripts but not algorithms. They believe that something like parsing JSON is really hard and complicated, that any home-grown solution to doing that will be extremely buggy and slow, all because they themselves haven't taken the mental step-up.

    Of course, this mental step-up used to be a standard part of a CS degree. College students would be writing code that does this sort of thing as homework. This has changed, and I have seen the change in the candidates we interview. I ask them questions about their courses in algorithms and what they did, and they say things like "we learned what the foundational algorithms are and how to compare their performance." Did you actually write a merge sort? "No, there's no need because every major language has that sort of thing built in."

    So, there's the rub. They paid good money for a degree that glossed over the most important bits. Naturally, they feel completely justified in their beliefs that stringing third-party solutions together is the best way to write code.

    And a whole new crop of these scripters hits the job market every year, more than we have seen in a decade. Colleges have been lowering the bar due to higher interest among students that aren't really cut out for it, that in turn due to successful social engineering on the part of the tech giants.

    At least, that's my hypothesis.

    1. Re:Agree. by Joviex · · Score: 4, Insightful

      I ask them questions about their courses in algorithms and what they did, and they say things like "we learned what the foundational algorithms are and how to compare their performance." Did you actually write a merge sort? "No, there's no need because every major language has that sort of thing built in."

      Consider me a cultist follower of your hypothesis. 20 years in CS, the last 10 I have seen it take a sharp dive. The only explanation I have is the explosion over 15 years ago in OSS and that what you espouse is true: Everyone thinks they can develop or engineer, because the code is tied up in nice little solution blocks.

      Need a sort algo? Just codeproject.com
      Need some bi-directional comm between remotes? Just github.com...etc....

      The number of people I have turned away in the first two days of testing, who could not even write a simple priority Q... it's more than disheartening.

      These are the "developers" who are supposed to code my future? Fuck me! I'll be working till I die.

    2. Re:Agree. by hawkinspeter · · Score: 2

      Although knowing how to write algorithms is a very fundamental part of programming (and they SHOULD cover that in a CS degree), I'd agree with the young developers in not writing their own implementations in the real world. Writing good algorithms is hard and I'd prefer the expert developers to put the most useful algorithms into libraries rather than the less experienced developers who are straight out of college.

      I reckon most coding jobs only really involve manipulating/displaying data from databases and having a nice GUI and you don't really need to be expert with algorithms for that.

      --
      You're a temporary arrangement of matter sliding towards oblivion in a cold, uncaring universe
    3. Re:Agree. by Joviex · · Score: 1

      If you are testing on problems which were solved 20 years ago - maybe you are using the wrong tests? Would you test using punch cards given the choice?

      Bro, if you think that Priority Qs are "old", you are the problem.

      Stop trying to use things like "outdated" to mean "I don't know how".

      Grow some balls, don't be an A.C. and make ignorant claims about things you obviously don't even know are in use (hint: PQs are just heaps) when they are the literal foundation of modern caches and heaps.

      Thanks for making my point.

    4. Re:Agree. by hawkinspeter · · Score: 1

      Your comments about the "CS light" programmers are exactly why I wouldn't trust a typical programmer to write a top quality algorithm.

      What to call these "CS light" people - I prefer the term "code monkey".

      --
      You're a temporary arrangement of matter sliding towards oblivion in a cold, uncaring universe
    5. Re:Agree. by Joviex · · Score: 1

      So you want someone to write a round-robin, lockless thread pool on a whiteboard in 30 minutes?

      No, I want someone to KNOW HOW, logically, to solve THAT PROBLEM.

      And I said DAYS, not hours. Reading comprehension is an obvious skill that has gone down with this "new" education paradigm.

    6. Re:Agree. by zifn4b · · Score: 1

      Consider me a cultist follower of your hypothesis. 20 years in CS, the last 10 I have seen it take a sharp dive. The only explanation I have is the explosion over 15 years ago in OSS and that what you espouse is true: Everyone thinks they can develop or engineer, because the code is tied up in nice little solution blocks.

      Our education system is broken. Not many developers have Computer Science degrees because that's actually a hard degree to achieve. A lot of them have some type of Computer Science lite degree like Information Technology or something like that. I don't see it getting any better. Insisting it should is unfortunately wishful thinking at this point. Some people value the field of Computer SCIENCE. Some people are just in it for the money.

      --
      We'll make great pets
  12. JSON does not have code-execution ability by Anonymous Coward · · Score: 5, Insightful

    JSON only defines a bunch of basic data types. It defines no ability to run anything. These bugs are in the (de)serialization layer above it, which uses JSON as a transport and extends the meaning of the stored data so that it can deserialize higher-level objects.

    JSON or XML are not the problem here. The same problem could happen if you serialized to CSV or TXT or anything else for that matter.
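
    A small Python sketch of that distinction, using only the standard json module. The "$type" key and its value are a made-up example of the kind of type hint a serialization layer might add; to the JSON parser it is just another string.

```python
import json

SCALARS = (str, int, float, bool, type(None))


def only_basic_types(value):
    """True if a parsed value contains nothing but JSON's own data types."""
    if isinstance(value, dict):
        return all(isinstance(key, str) and only_basic_types(item)
                   for key, item in value.items())
    if isinstance(value, list):
        return all(only_basic_types(item) for item in value)
    return isinstance(value, SCALARS)


# Hypothetical payload carrying a type hint meant for a higher layer.
TEXT = '{"$type": "System.Diagnostics.Process", "args": ["calc.exe"], "n": 1.5, "ok": null}'
parsed = json.loads(TEXT)
```

    Parsing yields a dictionary of strings, numbers, lists and nulls. Nothing happens with the type hint unless a layer above the parser decides to act on it.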

    1. Re:JSON does not have code-execution ability by Tablizer · · Score: 1

      It's probably a problem with "generic" reconstruction of objects based on data. If the data is used to (re) construct objects, then some objects can potentially have behavior because that's how objects are defined. If the data is "clever" enough, it may end up constructing objects you don't want.

      It's probably better to parse out to low-level "scalar" values and hand-code the part that stuffs them into objects or databases rather than let a parser actually build objects or object trees itself.

    2. Re:JSON does not have code-execution ability by phantomfive · · Score: 1

      It's probably better to parse out to low-level "scalar" values and hand-code the part that stuffs them into objects or databases rather than let a parser actually build objects or object trees itself.

      This is exactly right. Because the data is untrusted, you need to verify it anyway, and adding parsing code to that usually doesn't add much overhead (it can often be the same code).

      In the DEF CON talk they made a strong case that these generic de-serialization libraries are extremely difficult if not impossible to use securely. They were just grabbing at low-hanging fruit: as soon as you've imported these libraries, you're compromised. They didn't even discuss ways that the libraries might be used incorrectly.

      Say no to generic deserializers on untrusted data.

      --
      "First they came for the slanderers and i said nothing."
    3. Re:JSON does not have code-execution ability by pjt33 · · Score: 1

      It's probably better to parse out to low-level "scalar" values and hand-code the part that stuffs them into objects or databases rather than let a parser actually build objects or object trees itself.

      If you're dealing with enough different datatypes then it might be a big development and maintenance saving to have a generic object builder in your deserialiser. The key is to make it so that you whitelist the datatypes it will deserialise.
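
      A minimal sketch of the whitelisting idea, using Python's pickle as a stand-in for a generic object builder. It follows the pattern of overriding Unpickler.find_class; the ALLOWED table and the choice of Fraction as the one permitted type are assumptions made for the example.

```python
import datetime
import io
import pickle
from fractions import Fraction

# Hypothetical allow-list: the only (module, name) pairs this app will rebuild.
ALLOWED = {("fractions", "Fraction"): Fraction}


class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every type or function the payload names.
        try:
            return ALLOWED[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                f"{module}.{name} is not on the allow-list") from None


def safe_loads(data):
    """Deserialize, refusing any type that is not explicitly allowed."""
    return AllowListUnpickler(io.BytesIO(data)).load()


accepted = safe_loads(pickle.dumps(Fraction(3, 4)))  # on the list
blocked_payload = pickle.dumps(datetime.date(2017, 8, 1))  # not on the list
```

      Calling safe_loads(blocked_payload) raises pickle.UnpicklingError. Every type nested inside an allowed object goes through the same check, so the list has to cover the whole object graph you intend to accept.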

    4. Re:JSON does not have code-execution ability by Tablizer · · Score: 1

      I see a problem with white-listing. Objects are often part of a bigger ecosystem. You may have to white-list sub-sets of objects to do it right, making it non-trivial to guarantee you didn't leave a current or future hole.

      You are right that it might be a big saving to have auto-object generation, but at a risk.

  13. I don't get it by Hentes · · Score: 1

    Can someone explain what the problem is here? Serialized objects are just code, and if you're running untrusted code you've got bigger problems than bugs in your serialization libraries.

    1. Re:I don't get it by PhrostyMcByte · · Score: 3, Informative

      General rule of thumb as always... a vague security announcement is never as big a deal as its title makes it out to be.

      There really isn't much of a problem. Reading TFA, a few vulnerabilities have been discovered in a couple applications and libraries. None of these were part of .NET, and no systemic issues in how people code for .NET have been found.

  14. Re: Prevent data by phantomfive · · Score: 1

    These frameworks will de-serialize any object. Send them a Process object in .NET, and the framework will deserialize it into something that can fork a new process. The APIs in Java and .NET are so huge, that it is extremely difficult to filter out every kind of object that might cause problems (some frameworks try......and fail).

    There is no 'pure' data here, the purpose of these frameworks is to deserialize into objects, and objects by definition are functions combined with data.
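
    Python's pickle shows the same class of problem in a few lines, so here is a harmless sketch (an analogy, not the .NET or Java frameworks from the talk). The Gadget class is hypothetical; it serializes as an instruction to call print, where a hostile sender would name something that starts a process.

```python
import contextlib
import io
import pickle


class Gadget:
    """Hypothetical payload: pickles as 'call print(...)', not as data."""

    def __reduce__(self):
        # A hostile sender would name something like os.system here.
        return (print, ("code ran during load",))


def load_and_capture(data):
    """Unpickle data and return (result, whatever was printed meanwhile)."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        result = pickle.loads(data)
    return result, out.getvalue()


payload = pickle.dumps(Gadget())  # creating the payload runs nothing
```

    Loading the payload calls print before the receiving code has had any chance to look at the result, and what comes back is not even a Gadget.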

    --
    "First they came for the slanderers and i said nothing."
  15. Re:Prevent data by skids · · Score: 1

    In this case it happens when "Object Oriented" is taken too literally. People think of data as inert. People think of "Objects" as inert. So they figure translating between data and objects is just transforming one inert thing into another.

    But "objects" are not inert in almost any dynamic language. They are quite active, with instantiation methods, etc., and some are quite dangerous. One has to adjust one's paradigm when learning OO programming from a procedural background.

  16. Never heard of those libraries by PmanAce · · Score: 1

    They aren't part of .NET itself, just third party libraries.

    --
    Tired of my customary (Score:1)
  17. Re: Prevent data by phantomfive · · Score: 1

    Please clarify instead of posting cryptic pointless posts.

    --
    "First they came for the slanderers and i said nothing."
  18. JSON.NET is not vulnerable by default by Tony+Isaac · · Score: 1

    As stated in the linked document, for JSON.NET to be vulnerable, you have to explicitly set an option making it less secure.

    As with encryption and security libraries, you are better off using well-established libraries like JSON.NET than rolling your own. A solo developer, or corporate team, just doesn't have the resources or time to work out all the security vulnerabilities, as can be done with a dedicated library.

  19. Re: Is Rust vulnerable? I'd expect not. by blackpaw · · Score: 1

    He'll be telling us Rust is webscale next

  20. DOS attacks on .NET and Java by jens.dietrich · · Score: 1

    This is not surprising! We recently discovered some "billion-laughs"-style DOS attacks that exploit vulnerabilities in Java, and ported some of them to .NET and Ruby. Details here: http://drops.dagstuhl.de/opus/... (paper, there is also an artefact to run attacks in a VM), and the source code is here: https://bitbucket.org/jensdiet... . We did have some problems porting this from Java to .NET but managed eventually. Interestingly, some of these problems were caused by a bug in .NET: a broken contract between equals and hashcode (see https://github.com/dotnet/core...).

  21. Re: Real Developers never Deserialize into object by zifn4b · · Score: 1

    Walking a data tree is 1st year CS level work. If you're spending half your efforts on it then you're either vastly short on resources or your coders suck donkey balls.

    Correct, and then 2nd year CS work is red/black trees and other advanced algorithms. The whole time you're learning this, you're taking increasingly higher levels of mathematics. You see, the problem I run into is that many people couldn't make it to the 2nd year in Computer Science and went into Information Technology or some other "Computer Science Lite" field of study. These are the majority of people in the field now, and they really lack the ability to understand the difference between different implementations logically and mathematically. Computer Science is a hard degree. It's called Computer _SCIENCE_ for a reason.

    --
    We'll make great pets
  22. Re: Prevent data by YoungManKlaus · · Score: 1

    > will deserialize it into something that can fork a new process

    Only if you tell it "hey please put this into this insanely insecure class that will fork a process". Serialize your shit into stupid DTOs and you are dandy. That has nothing to do with the API surface.