Oracle Calls Java Serialization 'A Horrible Mistake', Plans to Dump It (infoworld.com)
An anonymous reader quotes InfoWorld:
Oracle plans to drop from Java its serialization feature that has been a thorn in the side when it comes to security. Also known as Java object serialization, the feature is used for encoding objects into streams of bytes... Removing serialization is a long-term goal and is part of Project Amber, which is focused on productivity-oriented Java language features, says Mark Reinhold, chief architect of the Java platform group at Oracle.
To replace the current serialization technology, a small serialization framework would be placed in the platform once records, the Java version of data classes, are supported. The framework could support a graph of records, and developers could plug in a serialization engine of their choice, supporting formats such as JSON or XML, enabling serialization of records in a safe way. But Reinhold cannot yet say which release of Java will have the records capability. Serialization was a "horrible mistake" made in 1997, Reinhold says. He estimates that at least a third -- maybe even half -- of Java vulnerabilities have involved serialization. Serialization overall is brittle but holds the appeal of being easy to use in simple use cases, Reinhold says.
To replace the current serialization technology, a small serialization framework would be placed in the platform once records, the Java version of data classes, are supported. The framework could support a graph of records, and developers could plug in a serialization engine of their choice, supporting formats such as JSON or XML, enabling serialization of records in a safe way. But Reinhold cannot yet say which release of Java will have the records capability. Serialization was a "horrible mistake" made in 1997, Reinhold says. He estimates that at least a third -- maybe even half -- of Java vulnerabilities have involved serialization. Serialization overall is brittle but holds the appeal of being easy to use in simple use cases, Reinhold says.
But the Java fanatics just put in more and more features, regardless of whether sane languages had them or not.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Regardless of language, object serialization is a dangerous idea. While it may seem like a nice idea at first, loading objects from unverified mutable data is an invitation for someone to tinker with that data.
Okay then, smartypants, what do you propose for persisting fields of an object? Anything you propose is, by definition, "serialisation". The only alternative to serialisation is non-persistent objects.
(TBH, I kinda like the thought of signed serialiased blobs)
I'm a minority race. Save your vitriol for white people.
1. No validation. You might have a nicely designed object, well tested, and has all sorts of validation checks to ensure that the internal state is never broken. Java object serialization bypasses all validation, permitting an attacker to construct a malformed object. Exactly how that would cause a problem requires a bit more work on the attacker's part, by studying how the application reacts to the malformed object. Adding validation is supported with Java serialization, but its not used by default. The designers favored simplicity over safety. Does switching to JSON magically fix the validation problem? Nope.
2. Loading of classes that you didn't expect to load. If I expect to receive a serialized list of strings, there's nothing to prevent an attacker to providing a list of any kind of object instead, due to type erasure. The application might fail to process the list because of a ClassCastException, but the potential damage is done. Java serialization /does/ support filtering out classes that aren't expected, but this is off by default. You need to define the blacklist yourself. Why is loading other classes a problem? See the next reasons:
3. Custom code during deserialization, which is actually necessary for performing your own validation. You can define your own code which runs when the object is deserialized, and the code can do pretty much anything. An attacker might be able to trick the code (using malformed input) into doing something harmful.
4. Additional classes on the classpath. Even if all of your code is well behaved, and has proper validation checks, and proper custom code, you're still vulnerable because additional classes exist that you're not aware of. You had no idea that there's this class 'Q' which has broken custom code, because Q was sucked in as a dependency of something else. That popular open source library you're using might be exposing your application to attack, and you didn't even know it.
For anyone designing an object serialization mechanism, always consider the tradeoffs when trying to make the system easier to use. Always use whitelists for trusted code instead of blacklists. Always construct objects using the object's public API. Favor the use of standard representations (maps, lists, tuples) instead of supporting full-blown customization. A little bit of friction can be a good thing.
Serialization isn't inherently bad. It's bad practices and misuse, which won't change. It'll just be replaced by many developers with XML, JSON, Protobuf, YAML, or other. Then, someone will inevitably sprinkle on some reflection or code generation, and you've almost done a 360... but with a lot more code and even more that could go wrong. I don't agree that adding more training wheels and/or removing features is always the best way to fix bad developer habits.
Thank Go weaning me off ruby's eval().
That's because Google's motto is "Do no eval".
Ezekiel 23:20
Java is in so far unique as when you use build in serialization, you also serialize the class files.
There are two "marker interfaces" to make a Java class serializable: Serializable and Externalizable.
In casse of the first one, the Java Framework/VM uses reflection to serialize and deserialize objects.
In case of the second one, you are required to implement the methods writeExternal() and readExternal().
As the class files are in the serialized data stream, a program reading "untrusted" serialized data might also load classes aka code from that stream. If that code implements Externalizable and thus has an "unknonwn foreign" method readExternal(), the deserialization framework will call that unknown/untrusted method readExternal() which means: you run code coming from outside, which can do what ever it wants besides reading the object from the object stream.
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.