Slashdot Mirror


Oracle Calls Java Serialization 'A Horrible Mistake', Plans to Dump It (infoworld.com)

An anonymous reader quotes InfoWorld: Oracle plans to drop from Java its serialization feature that has been a thorn in the side when it comes to security. Also known as Java object serialization, the feature is used for encoding objects into streams of bytes... Removing serialization is a long-term goal and is part of Project Amber, which is focused on productivity-oriented Java language features, says Mark Reinhold, chief architect of the Java platform group at Oracle.

To replace the current serialization technology, a small serialization framework would be placed in the platform once records, the Java version of data classes, are supported. The framework could support a graph of records, and developers could plug in a serialization engine of their choice, supporting formats such as JSON or XML, enabling serialization of records in a safe way. But Reinhold cannot yet say which release of Java will have the records capability. Serialization was a "horrible mistake" made in 1997, Reinhold says. He estimates that at least a third -- maybe even half -- of Java vulnerabilities have involved serialization. Serialization overall is brittle but holds the appeal of being easy to use in simple use cases, Reinhold says.

10 of 198 comments (clear)

  1. Was very obvious back then by gweihir · · Score: 4, Insightful

    But the Java fanatics just put in more and more features, regardless of whether sane languages had them or not.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    1. Re:Was very obvious back then by TheRealMindChild · · Score: 4, Funny

      If XML isn't the solution to your problem, you aren't using enough

      --

      "When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
    2. Re: Was very obvious back then by Z00L00K · · Score: 4, Insightful

      The disadvantage with xml is that it creates a lot of overhead, which could be a problem in embedded applications and large scale solutions.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  2. Re:Object serialization is dangerous. by goose-incarnated · · Score: 4, Interesting

    Regardless of language, object serialization is a dangerous idea. While it may seem like a nice idea at first, loading objects from unverified mutable data is an invitation for someone to tinker with that data.

    Okay then, smartypants, what do you propose for persisting fields of an object? Anything you propose is, by definition, "serialisation". The only alternative to serialisation is non-persistent objects.

    (TBH, I kinda like the thought of signed serialiased blobs)

    --
    I'm a minority race. Save your vitriol for white people.
  3. The reason why it is dangerous by Wookie+Monster · · Score: 5, Interesting
    I'm concerned that someone might hear "object serialization is bad, but JSON is good" and make the same mistakes that were made with Java object serialization. Java object serialization is bad for the following reasons:

    1. No validation. You might have a nicely designed object, well tested, and has all sorts of validation checks to ensure that the internal state is never broken. Java object serialization bypasses all validation, permitting an attacker to construct a malformed object. Exactly how that would cause a problem requires a bit more work on the attacker's part, by studying how the application reacts to the malformed object. Adding validation is supported with Java serialization, but its not used by default. The designers favored simplicity over safety. Does switching to JSON magically fix the validation problem? Nope.

    2. Loading of classes that you didn't expect to load. If I expect to receive a serialized list of strings, there's nothing to prevent an attacker to providing a list of any kind of object instead, due to type erasure. The application might fail to process the list because of a ClassCastException, but the potential damage is done. Java serialization /does/ support filtering out classes that aren't expected, but this is off by default. You need to define the blacklist yourself. Why is loading other classes a problem? See the next reasons:

    3. Custom code during deserialization, which is actually necessary for performing your own validation. You can define your own code which runs when the object is deserialized, and the code can do pretty much anything. An attacker might be able to trick the code (using malformed input) into doing something harmful.

    4. Additional classes on the classpath. Even if all of your code is well behaved, and has proper validation checks, and proper custom code, you're still vulnerable because additional classes exist that you're not aware of. You had no idea that there's this class 'Q' which has broken custom code, because Q was sucked in as a dependency of something else. That popular open source library you're using might be exposing your application to attack, and you didn't even know it.

    For anyone designing an object serialization mechanism, always consider the tradeoffs when trying to make the system easier to use. Always use whitelists for trusted code instead of blacklists. Always construct objects using the object's public API. Favor the use of standard representations (maps, lists, tuples) instead of supporting full-blown customization. A little bit of friction can be a good thing.

    1. Re:The reason why it is dangerous by Jeremi · · Score: 4, Insightful

      If I'm following you correctly, the problem isn't serialization per se but rather the fact that the deserialization is being done by the Java runtime (which has no way to validate the resulting objects against the application's requirements, since its deserialization code is application-independent, and also has the power to instantiate any kind of object, even those that are totally irrelevant to the task at hand), rather than by the application itself.

      A user-supplied deserialization-routine, OTOH, has at least a chance of being secure in the face of invalid source data, since it can check to make sure that its constraints are correctly satisfied and reject the data if they aren't.

      Of course, avoiding making every application developer write his own application-specific serialization/deserialization routines was largely the point of this Java feature, but in hindsight it appears that was a bad decision.

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
    2. Re:The reason why it is dangerous by Anonymous Coward · · Score: 5, Informative

      If I'm following you correctly, the problem isn't serialization per se but rather the fact that the deserialization is being done by the Java runtime (which has no way to validate the resulting objects against the application's requirements, since its deserialization code is application-independent, and also has the power to instantiate any kind of object, even those that are totally irrelevant to the task at hand), rather than by the application itself.

      Java deserialization is magic. By which I mean it behaves in several ways that user code pretty much can't.

      The default system effectively loads a binary blob off the input stream and then creates each object without calling a constructor*. You can't just not call a constructor in Java, but Java deserialization does. All the fields are set by magic, by which I mean it ignores getters and setters and whatever access level might be on the fields. Any field marked as "not serialized" (transient) is left with default values - but those may not be the default values you think! If you write private transient int foo = 3; then foo won't be serialized, and when the object is deserialized, it will instead be ... 0. Because 0 is the default for ints.

      How does Java deserialization know if it's loading the right fields for a given object? Well, it's magic, but not that magic - you're supposed to let it know by setting the serialization ID for the class. And how do you do that? By declaring a static long serialVersionUID, and making sure you update it whenever your class structure changes. Don't do that and the deserialization logic might not notice that the structure doesn't quite match. No, you can't just have it autogenerate one - if not set, the serialization/deserialization code will create one, but it may be dependent on compiler and randomly break across identical code bases. Surprise!

      But in any case, the serialization system is magic. How do you write a custom serializer/deserializer? By creating the private methods writeObject(ObjectOutputStream) and readObject(ObjectOutputStream). Because the serializer is magic, it can access these private methods. (Note that readObject(ObjectOutputStream) gets called on a magically created object that has never had a constructor called on it, so all fields will have their default values! How does that work with final fields? Well... the short answer is "like shit." The longer answer is that the default deserializer just ignores the final modifier (which you can't do in generic code), and that if you want to do the same, there's some reflection magic or non-standard APIs you can do.)

      So anyway, there's a basic overview of how Java serialization defies expectations and basically guarantees that anyone writing code that involves serialization will do it wrong.

      * This is false. What it really does is go up the object hierarchy and look for the first parent class that does not declare itself serializable and calls its default no-args constructor. But that means that your class that you declared serializable therefore, by definition, does not get its constructor called. Surprise!

  4. It won't change anything by pestilence669 · · Score: 4, Insightful

    Serialization isn't inherently bad. It's bad practices and misuse, which won't change. It'll just be replaced by many developers with XML, JSON, Protobuf, YAML, or other. Then, someone will inevitably sprinkle on some reflection or code generation, and you've almost done a 360... but with a lot more code and even more that could go wrong. I don't agree that adding more training wheels and/or removing features is always the best way to fix bad developer habits.

  5. Re: Object serialization is dangerous. by K.+S.+Kyosuke · · Score: 5, Funny

    Thank Go weaning me off ruby's eval().

    That's because Google's motto is "Do no eval".

    --
    Ezekiel 23:20
  6. Re:I don't get it by angel'o'sphere · · Score: 4, Informative

    Java is in so far unique as when you use build in serialization, you also serialize the class files.
    There are two "marker interfaces" to make a Java class serializable: Serializable and Externalizable.

    In casse of the first one, the Java Framework/VM uses reflection to serialize and deserialize objects.
    In case of the second one, you are required to implement the methods writeExternal() and readExternal().

    As the class files are in the serialized data stream, a program reading "untrusted" serialized data might also load classes aka code from that stream. If that code implements Externalizable and thus has an "unknonwn foreign" method readExternal(), the deserialization framework will call that unknown/untrusted method readExternal() which means: you run code coming from outside, which can do what ever it wants besides reading the object from the object stream.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.