XML Library Flaw — Sun, Apache, GNOME Affected
bednarz writes with this excerpt from Network World:
"Vulnerabilities discovered in XML libraries from Sun, the Apache Software Foundation, the Python Software Foundation and the GNOME Project could result in successful denial-of-service attacks on applications built with them, according to Codenomicon. The security vendor found flaws in XML parsers that made it fairly easy to cause a DoS attack, corruption of data, and delivery of a malicious payload using XML-based content. Codenomicon has shared its findings with industry and the open source groups, and a number of recommendations and patches for the XML-related vulnerabilities are expected to be made available Wednesday. In addition, a general security advisory is expected to be published by the Computer Emergency Response Team in Finland (CERT-FI)."
Seems to me that ASCII delimited protocols always have these types of issues. It's quite easy to write fuzzers for human readable protocols compared to binary encoded protocols. Too bad these developers don't know how to write good unit tests... This could have been avoided.
I suggest switching to a diet of hay, for the higher protein content.
Nerd rage is the funniest rage.
You'll probably get tagged 'troll' for that, but I'll bite.
It's not that open source is not susceptible to these things (all software is). But with open source, these things are usually found more quickly, and are generally patched/fixed more quickly. I don't have statistics to support a statement that critical errors like this happen less often with open source, but I would have no trouble believing that.
Open source is usually more transparent about the problem, too. Many closed source vendors hide these things, so you never know you're vulnerable and thus can't adjust for it.
Ad luna, Alicia! Ad luna!
Anyone who builds their own Firefox knows that Python is required to build it (not to run it, just to build it). I have python-2.6.2 installed, so does this mean that after Python patches this flaw I have to re-roll every app that depends on Python, whether just at build time or at runtime too? Yowza! That does not bode well; looks like I've got my work cut out for me...
Politics is Treachery, Religion is Brainwashing
There doesn't seem to be much of an article behind this summary. Just some fluff about malicious input and the fact that XML is widely used. Would be interesting to see examples of the malicious XML and an explanation of how the vulnerabilities work.
"Welcome to our world. We are the wasted youth. And we are the future too." Yes, I know these are stupid lyrics.
Someone will undoubtedly say that the bug being found was part of the process, since it's open source and that means the source is auditable by anybody. Reality: it was discovered by the maker of a fuzzing tool. Fuzzing is the process of sending garbage into software to see if it breaks... it works quite well and generally doesn't require the source code.
Also, fuzzing discovers DoSes. But many DoS attacks turn into vulnerabilities in the hands of a skilled hacker, and it's generally not safe to assume that a DoS is unexploitable without extensive code analysis.
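To make the fuzzing idea concrete, here is a minimal sketch in Python. It is not Codenomicon's tool or method (the article gives no such details); it only overwrites a few random bytes in a made-up seed document and feeds the result to the standard library's xml.etree.ElementTree. The seed document and the names mutate and fuzz are illustrative assumptions.

```python
import random
import xml.etree.ElementTree as ET

# Made-up seed document; a real fuzzer would start from many valid samples.
SEED_DOC = b'<order id="42"><item qty="2">widget</item></order>'


def mutate(data: bytes, rng: random.Random, flips: int = 3) -> bytes:
    """Return a copy of data with a few randomly chosen bytes overwritten."""
    buf = bytearray(data)
    for _ in range(flips):
        pos = rng.randrange(len(buf))
        buf[pos] = rng.randrange(256)
    return bytes(buf)


def fuzz(rounds: int = 500, seed: int = 0) -> dict:
    """Feed mutated documents to the parser and tally what happens."""
    rng = random.Random(seed)
    outcomes = {"parsed": 0, "rejected": 0, "unexpected": 0}
    for _ in range(rounds):
        sample = mutate(SEED_DOC, rng)
        try:
            ET.fromstring(sample)
            outcomes["parsed"] += 1
        except ET.ParseError:
            # A clean rejection is the behaviour we want for garbage input.
            outcomes["rejected"] += 1
        except Exception:
            # Anything else is worth a closer look by a human.
            outcomes["unexpected"] += 1
    return outcomes


if __name__ == "__main__":
    print(fuzz())
```

Note that nothing here needs the parser's source. A pure-Python harness like this cannot observe a hard crash or a hang in native code; a real fuzzer runs the target in a separate process and watches for those too.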
Except CSV isn't a standard. While the general idea is similar, the details differ greatly from parser to parser. Do you need a trailing comma on the line? Do you allow leading or trailing space on an entry? Since most generators use slightly different conventions, parsers need to be significantly more complex. And CSV is far more limited in scope. I think of CSV as the scripting language to XML's high level OO VM language. Neither is a particularly efficient format, but they're both easier to work with than the alternative (binary coded data), and they're each good for different things. CSV works well for simple data structures, just like scripting languages are appropriate for small utility programs, while XML is good for complex, rigidly defined structures, just like a high level OO language is more appropriate to large projects where maintainability is a concern.
$_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
8/10
Think "I wonder if these vulnerabilities could have been found earlier if the code was open source."
Given that perfection is impossible... I believe that the open source process is much less likely to lead to problems such as this. When it goes well. The problem is that being open source defines so little about how the work is actually getting done. I could create an open source math library consisting of: add(a, b){ return a*b; } , make it open source, post it on source-forge and everything, and then simply refuse outside contribution. This is not really a problem for the community since anyone can fork and fix, but if I happen to be a software giant that has integrated my own faulty code into half of my products...
To the parent and grandparent posts: plus, with the source available you can check for anything malicious in the code before the binary is built, or rebuild with different parameters to make it more secure against flaws, either by leaving out or changing the flawed parts or by adding your own patches to further harden the app against vulnerabilities.
To the grandparent only: if you don't see the advantages of Open Source software to all users, be it commercial or personal, then you are not a user yourself and are just a corporate type with the corportista mindset. I've got news for you: money is not everything, and people will go out of their way to get your greedy little paws out of their pockets.
Title = XML Library Flaw -- Sun, Apache, GNOME Affected
1st Line of Summary = Sun, the Apache Software Foundation, the Python Software Foundation and the GNOME Project
To the grandparent only: if you don't see the advantages of Open Source software to all users, be it commercial or personal, then you are not a user yourself and are just a corporate type with the corportista mindset. I've got news for you: money is not everything, and people will go out of their way to get your greedy little paws out of their pockets.
So, if I need Photoshop as part of my job to feed my family, I'm just a corporate type with the corportista mindset and I should either switch to Gimp and pull my hair and lose time and clients or let my family starve?
Whatever happened to using the right tool for the job, instead of letting zealotry take over?
This space for rent.
it works quite well and generally doesn't require the source code.
But here, since it's open source, we don't have to rely on coders in a white tower to patch the code directly or someone to hack an intermediate patch. We can start looking right away.
well, I'd like json and bencode for that matter.
The solution is clear to me. I would stop using XML.
could result in successful denial-of-service attacks
Ah yes, but could it result in successful denial-of-cellphone-service?
Since MS is closed source, it wouldn't be fixed for months on end like open source is. That's the only difference. See? It works both ways, neither is really helpful.
Check out my lame java blog at www.javachopshop.com
CSV FTW.
What happens when your data contains \r or \n characters? (I know Oracle's sqlldr / external tables at least will reject that row, and I don't believe they recognize any escape sequence for this.) What happens if the data has commas in it, and the .csv was generated by something that doesn't add quotes?
What do you do if your data is more complicated than a simple table?
Way to beat that strawman!
Couldn't you have at least waited until a linux fanboi didn't understand the summary and made a dumb comment?
All that aside, the way these projects' being open source will make this better is by making a patch come out sooner. The community knows there is a problem. Someone will get on finding it right away, and in a day or two we will see patches getting pushed out that fix it. There's no sitting around helplessly hoping we don't get DoSed until someone at MegaSoft Corp. decides this is worth fixing and rolls a patch.
Mod points: Guaranteed to remove your sense of humor.
Side effects may include gullibility and temporary retardation
All that aside, the way these projects' being open source will make this better is by making a patch come out sooner. The community knows there is a problem. Someone will get on finding it right away, and in a day or two we will see patches getting pushed out that fix it. There's no sitting around helplessly hoping we don't get DoSed until someone at MegaSoft Corp. decides this is worth fixing and rolls a patch.
This is because the Community has unlimited volunteer resources available on very short notice, and large corporations with many paid full-time employees do not.
Except CSV isn't a standard.
The IETF might disagree with you.
Google for "billion laughs".
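For anyone who would rather not search: "billion laughs" is a document whose internal DTD defines a chain of entities, each one referring to the previous entity several times, so a tiny file expands into a huge string. Below is a deliberately small sketch in Python; the helper names build_laughs and expanded_length are made up for illustration, and the sizes are kept tiny so it is safe to run.

```python
import xml.etree.ElementTree as ET


def build_laughs(levels: int, fanout: int, word: str = "lol") -> str:
    """Build a document whose entity e<levels> expands to fanout**levels words."""
    decls = [f'<!ENTITY e0 "{word}">']
    for i in range(1, levels + 1):
        refs = f"&e{i - 1};" * fanout
        decls.append(f'<!ENTITY e{i} "{refs}">')
    return f'<!DOCTYPE r [{"".join(decls)}]><r>&e{levels};</r>'


def expanded_length(levels: int, fanout: int, word: str = "lol") -> int:
    """Number of characters the parser must produce after expansion."""
    return len(word) * fanout ** levels


# Tiny, harmless instance: 3 levels with 4 references each.
doc = build_laughs(levels=3, fanout=4)
text = ET.fromstring(doc).text

if __name__ == "__main__":
    print(len(doc), "characters of XML expand to", len(text), "characters of text")
```

The classic version uses nine levels above the base entity with ten references each, which by the same arithmetic is 3 * 10**9 characters of "lol" from under a kilobyte of input. Do not try that size against a parser you care about.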
You think I've come to the right place?
Except CSV isn't a standard.
The IETF might disagree with you.
"This memo provides information for the Internet community. It does not specify an Internet standard of any kind. "
Most of the things you ask about can be done with CSV as long as it's quoted properly. If it's not quoted properly, then it would be considered invalid. There's a nice RFC spec for it here: http://www.ietf.org/rfc/rfc4180.txt
What happens when your data contains \r or \n characters?
It's perfectly acceptable as long as you quote it (#6 example of RFC 4180). If Oracle doesn't support that, then I would say their implementation is broken.
What happens if the data has commas in it, and the .csv was generated by something that doesn't add quotes?
It's invalid
What do you do if your data is more complicated than a simple table?
I'd need a better example from you, but you can embed a csv record inside a csv field. It starts to get complicated really fast with all the "escaping" that needs to be done with the double-quotes. Such as something like a record containing "Last Name","First Name","Sub-Properties". The Sub-Properties could be embedded data such as sex, age, and height. For example:
Clearly, you can represent tree style data with CSV, but it has more flexibility than you think. Too many people roll their own CSV, because it seems so simple. Then they don't quote and escape quotes properly blaming any issues on garbage data.
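A short sketch of the quoting rules discussed above, using Python's standard csv module (whose default dialect quotes fields and doubles embedded quotes in the way RFC 4180 describes). The sample records and the helper names to_csv and from_csv are made up for illustration.

```python
import csv
import io


def to_csv(rows) -> str:
    """Serialize rows with the default dialect (minimal quoting, CRLF line ends)."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def from_csv(text: str) -> list:
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Commas, double quotes and a line break inside a field all survive quoting.
rows = [
    ["Last Name", "First Name", "Notes"],
    ["O'Brien", "Pat", 'said "hi", then left\r\nsecond line'],
]
round_trip = from_csv(to_csv(rows))

# A CSV record embedded inside a CSV field: sex, age, height (made-up values).
inner = ["F", "34", "5'7\""]
packed = to_csv([inner]).rstrip("\r\n")
outer = to_csv([["Doe", "Jane", packed]])
unpacked = from_csv(from_csv(outer)[0][2])[0]

if __name__ == "__main__":
    print(outer)
```

Printing outer shows how quickly the doubled quotes pile up once a record is nested inside a field, which is the "gets complicated really fast" part.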
What do you do if your data is more complicated than a simple table?
Are you serious? The same thing one would do in a relational database if your data is more complicated than a simple table...
So, if I need Photoshop as part of my job to feed my family, I'm just a corporate type with the corportista mindset and I should either switch to Gimp and pull my hair and lose time and clients or let my family starve?
But with The GIMP you get to waste weeks of your time trying to wade through its crappy codebase trying to fix its bugginess and try to cram in features that it still doesn't have that Photoshop has had for almost a decade. You non-corportistas just don't understand how this is a benefit and not a flaw of the software!
Interesting. Of course, it was only published in 2005. If they'd written this up 20 years ago, it might have been more helpful. As is, the various CSV writers have been around so long that a lot of non-conformant CSV is out there. So the parsers remain fairly complex, to account for the previously undefined behaviors. And of course, that standard is for a MIME type; non-web focused CSV generators will still ignore parts of it.
Which libraries? libxml2, expat, or some other library?
The last I'd checked, Python could use several XML libraries, and Sun distributed several libraries.
It would be nice if TFA had told us which libraries, or had a link to the actual report listing them.
www.eFax.com are spammers
If that's the case, then exploits in Acrobat Reader and Flash should be fixed the next day, or in a few hours.
Is this another array bounds check not being performed? Another case of copying huge chunks of weird characters into memory and overwriting crap?
With all the extra horsepower, can we not get something added to C++ to make this happen?
DoSes seem harder to fight against. Is it bad code that loops forever, or is it just not optimized? I bet most libraries could be found to have some of that.
Oops that was supposed to be "corportistas" not "non-corportistas".
Hey man, you did "adobe xml vulnerability" twice!! Admittedly, their security record is appalling, particularly as of late, but still, play fair ;)
More seriously, an article comes out about multiple XML vulnerabilities in multiple open-source XML libraries and your immediate reaction is to rush out and try and shine the light on XML vulnerabilities in closed-source code?! How about you first wait to find out the severity of the exploits in the open-source software, and equally importantly, how long they have been in the source first, before you try and divert the conversation? Further, the exploits weren't found by the project authors, but by a security vendor who applied protocol fuzzing tools. Fuzzing tools operate on the binaries, and thus, the source code is irrelevant, you can run these tools on any software irrespective of the source-code ideology behind it. Where the open-source aspect may come into play is largely in patch response times, but the argument that they may have been found quicker in closed-source software is in this case unsubstantiated, and especially tenuous considering the mechanism that found them is equally applicable.
Of course, they might turn out to be entirely mundane, as the specifics of the vulnerabilities have not been disclosed, and security vendors tend to exaggerate the severity of any given vulnerability they find. But still, have you considered fixing your own house before immediately running out to abuse entirely unrelated software? It might not be long before someone is wondering about various vulnerabilities in open-source software. Some restraint is useful considering the complete lack of solid information.
Only just noticed you were replying to a (hidden) troll, which changes the tone of my reply a little, but the point about applicability of fuzzing tools is in my view still entirely valid. Sorry about my remark that you rushed out to change the conversation.
See signature.
random gibberish to make lameness filter happy.
Stupidity is an equal opportunity striker.
Fellow slashdotter Bill Dog
If Oracle doesn't support that, then I would say their implementation is broken.
I'd just suspect it's more than 4 years old (hmm, looks like the 10gR2 we're using was actually released in 2005, and that RFC is dated October 2005). The "standard" is "this seems to be what most people are doing" rather than "here's the definition of a cool new format".
Clearly, you can represent tree style data with CSV, but it has more flexibility than you think.
Hm, cool. Also, ick.
Too many people roll their own CSV, because it seems so simple. Then they don't quote and escape quotes properly blaming any issues on garbage data.
...and then I have to tweak it into the csv dialect that Oracle understands.
\r or \n aren't problems with proper CSV; \r\n combinations define record breaks, and can be included in data fields by enclosing them in double quotes.
Then you should use something that generates proper CSV (which means it either uses quotes properly or doesn't allow anything that needs quoting in data fields.)
You use more than one CSV file in some appropriate wrapper.
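As a small illustration of the comma question, here is a sketch in Python comparing a generator that just joins fields with commas against the standard csv writer. The record and the helper names naive_line and proper_line are made up.

```python
import csv
import io


def naive_line(fields) -> str:
    """What a roll-your-own generator often does: join and hope."""
    return ",".join(fields)


def proper_line(fields) -> str:
    """Let the csv module decide which fields need quoting."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerow(fields)
    return buf.getvalue()


record = ["Acme, Inc.", "42"]

# The unquoted comma silently turns two fields into three.
broken = next(csv.reader([naive_line(record)]))

# The quoted version reads back exactly as written.
intact = next(csv.reader(io.StringIO(proper_line(record), newline="")))

if __name__ == "__main__":
    print(broken)
    print(intact)
```

The reader cannot tell that the first line is wrong; it is a perfectly valid three-field record, which is why this kind of bug tends to get blamed on "garbage data".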
Yeah, because there are never bugs in open source software that linger for months or years without being fixed. Nope, they are all fixed within sheer minutes of the bug report. Oh wait...
Exactly. Unit tests do not prove the absence of bugs. They prove the existence of bugs.
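A toy sketch of that point in Python, borrowing the deliberately broken add() joked about elsewhere in this discussion. The test names are made up; the idea is only that a passing test shows nothing about the inputs it never tried.

```python
def add(a, b):
    # Deliberately wrong: multiplies instead of adding.
    return a * b


def weak_test() -> bool:
    # Passes despite the bug, because 2 * 2 happens to equal 2 + 2.
    return add(2, 2) == 4


def counterexamples(limit: int = 4) -> list:
    # Compare against the built-in + over a small grid of inputs.
    return [
        (a, b)
        for a in range(limit)
        for b in range(limit)
        if add(a, b) != a + b
    ]


if __name__ == "__main__":
    print("weak test passes:", weak_test())
    print("inputs that expose the bug:", counterexamples())
```

The single-case test goes green, while sweeping even a tiny input grid exposes the bug on almost every pair.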
If this "security hole" just means that everybody is forgetting to disable the default way these parsers handle URIs for schemas and DTDs, then this is just a big scam. It's a known issue, although I would not be surprised if it isn't well known to many developers. In the worst case it is some kind of way of letting the XML parser perform a random URL request without the developer having the power to stop this from happening.
I must admit that the default behaviour as well as the API documentation leaves a lot to be desired. Even when security is directly involved, say with XML digital signatures, the API does not even mention how to do this in a secure fashion. I've written an application that verifies XML digital signatures in Java and there are at least 10 things you need to do to be slightly secure against forgeries and DoS attacks. At that time none of these were mentioned in the API; they were probably considered public knowledge by the API designers.
Very "funny" if you try to verify a message using a URI within the message itself. Even worse with XML digital signatures, the signature could be over a completely different message than the one you are trying to verify if you are not careful.
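One conservative way to avoid the DTD and entity defaults altogether is to refuse any document that carries a DOCTYPE. The sketch below does that in Python with the standard library's expat bindings. It is an assumption-laden illustration, not the Java setup described above: the exception DTDForbidden and the function parse_without_dtd are made-up names, and real applications may need DTDs and so need finer-grained settings.

```python
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document tries to declare a DOCTYPE."""


def parse_without_dtd(data: bytes) -> list:
    """Parse data, returning element names in document order.

    Aborts as soon as a DOCTYPE declaration starts, before any entity
    it declares could be expanded or resolved.
    """
    names = []
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DTDForbidden(f"DOCTYPE {name!r} not allowed")

    def on_start(name, attrs):
        names.append(name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(data, True)  # exceptions raised in handlers propagate out
    return names


if __name__ == "__main__":
    print(parse_without_dtd(b"<a><b/></a>"))
    try:
        parse_without_dtd(b'<!DOCTYPE a SYSTEM "http://example.invalid/a.dtd"><a/>')
    except DTDForbidden as exc:
        print("rejected:", exc)
```

The trade-off is bluntness: legitimate documents with a DOCTYPE get rejected too, but nothing in the message gets to tell the parser where to go fetch things.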
You may have noticed that two of the three languages that I mentioned are garbage collected (D and Java). This isn't entirely coincidence. Languages that implement garbage collection in their design, and reduce or eliminate the direct use of pointers seem to eliminate an entire raft of security problems. That they tend to have dynamic arrays and arrays that implement bounds checking is merely one bonus.
C++ was at one time going to implement part of this in the new standard...which has now both had features cut, and been pushed further into the future, but those were cut years ago. Add-on libraries like Boost don't solve the problem. It needs to be designed into the language so that one can count on it being in use. For that reason the STL vectors don't count as a solution to this class of problem. For that matter, I note that C (and presumably C++) now allows one to *specify* that an array has an unspecified size. (I forget the syntax, but it's merely the legitimization of an old and very insecure trick used by C programmers to allow them to implement at run-time variable sized arrays. It was always quite dangerous, and making it legitimate doesn't remove the danger.)
I'll agree that one can write dangerous code in ANY language. One doesn't need to choose a language that goes out of its way to make it the easiest choice. (That's slightly unfair. When C was designed the effort was to get something efficient enough to replace assembler. C did that, and it was, indeed, safer than assembler. And C++ merely copied its approach from C. Indeed, for a long time it was merely a superset of C. But that was then and this is now.)
I think we've pushed this "anyone can grow up to be president" thing too far.
There are always going to be bugs that linger for awhile, but exploits are usually fixed pretty quickly in high-profile F/OSS projects and compared to Adobe or Microsoft, I think they do pretty decent without having tons of paid developers
sed has been around for more than 20 years though and I bet a couple of one liners can fix most anything right up.
Do you have any hard evidence of that or is it just faith?
Don't get me wrong I'm a big fan of open source, free software in the RMS meaning of free. But I just don't really get along with faith. It's quite astonishing how much of the commentary on Slashdot is all about faith with no reference to evidence. I guess we're all human though, even us techie geeks!
It's difficult to say from the information provided, but it sounds like someone just rediscovered XML entity attacks (as I did a few years ago). Assuming it is the same thing, here are some references from 2002 and 2006 with more details:
http://www.securiteam.com/securitynews/6D0100A5PU.html
http://www.sift.com.au/assets/downloads/SIFT-XML-Port-Scanning-v1-00.pdf
I've used these attacks in real-world tests and they are still surprisingly effective - just not new.
CERT-FI advisory: https://www.cert.fi/en/reports/2009/vulnerability2009085.html
Sun advisory: http://sunsolve.sun.com/search/document.do?assetkey=1-66-263489-1
CERT-FI advisory had a link to Codenomicon web page with some more details: http://www.codenomicon.com/labs/xml/
What am I to think now?
That better is not perfect ? Who told you that open source software had absolutely no flaws ? The open source software paradigm does not prevent bugs. It only makes it more likely that those bugs will be caught sooner since so many eyes can peruse the code. If you're not satisfied, ask for a refund...
Well, you obviously haven't tried before, so why on earth start now?
May we live long and die out
It only makes it more likely that those bugs will be caught sooner since so many eyes can peruse the code
Do you have any evidence for this or is it just your belief? I'm sure there are academic papers that look at this and of course there are sizeable historical repositories of vulnerabilities, e.g. US-CERT. It's actually possible to test your hypothesis.
What you find when you do this is that some closed source projects have good track records and some have bad track records. Likewise some open source projects have good track records and some have bad track records. You will find, for example, that there's a huge difference in standard between Microsoft (now actually quite good) and Apple and Adobe (very poor at security).
The only conclusion I can draw from this is that being open source doesn't result in your code being better than closed source code. Likewise vice versa. My belief is that it is the processes and people involved that make the difference.
Does anyone want to argue against this?
I agree that if these vulnerabilities have been found with fuzzing tools, they would have been detected just as easily in closed source software. But they could have crafted the input data a little according to the structure of the parser code.
But yes, you are right, I was basically backtrolling a troll and it shouldn't be modded up..
"This memo provides information for the Internet community. It does not specify an Internet standard of any kind. "
That's a boilerplate header at the start of every RFC, included for mostly archaic formality reasons -- in reality the RFCs are seen as standards by pretty much anyone working on a project which involves them, and the agreement between RFC users /makes/ them standards, as much as the RFCs themselves would like to object :-P
I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
I totally agree with you, hence the word "likely". Of course it's the people that make the difference; the code won't debug itself on its own. What I meant is that MORE people CAN peruse the code than in a closed source environment. Just because an infinite number of people can look at the code doesn't mean that they will.
Do you have any evidence for this or is it just your belief?
Of course I don't have evidence, and I suspect you knew the answer to your question before you asked it. Would you yourself spend an hour or two of your life doing research for the sake of a two line comment and furthermore for something that is as clear as 2 + 2 = 4 ? Who needs evidence anyway ! :-P
What am I to think now?
Start by thinking how nice it would be to wire me some money. Then think about how you don't really need the money. Finally, follow the necessary thought processes that result in action upon these two items.
Well, it may be as clear as 2 + 2 = 4 to you but perhaps I'm not as clever as you. In any case I'm somewhat old fashioned and like to have evidence.
The particular article I was thinking of is: "Is Linux Better than Windows Software?", Adenekan (Nick) Dedeke, IEEE Software, Vol 26 issue 3.
The author says:
The author then goes on to conclude:
I also recall an article that Diomidis Spinellis (an academic and a keen free software advocate) published, I can't remember where. He used automatic code analysis tools to compare the source code for a range of operating systems (Windows research kernel, Linux, some BSDs) and found no significant difference in quality.
So, yes, it's clearly true that more people CAN look at the code for open source software. But how that translates into quality is much more interesting. Just because they can doesn't mean that they will.
I think the "many eyes make all bugs shallow" idea (Linus's law) has some merit but it's not the only factor. For security, the fact that there are such disparate bodies responsible for delivering Linux makes it very hard to get a good security process. Consider the fiasco surrounding the now infamous Debian OpenSSL bug (the one that left OpenSSH keys predictable), where the bug was introduced by downstream packagers who didn't understand the implications of what they were doing. Where was the security audit? Well, there wasn't one.
Traditional closed source companies appear to be able to have much more control and oversight of cross-cutting issues like security. Doesn't mean they will exercise it (see Microsoft in the time before XP SP2 and SDL). But it's certainly possible for them to do security well (see Microsoft today). Then there's a company like Apple which to my mind is like MS was 10 years ago. Security is just not on their radar - it's an irritation to them.
I think the open source bazaar approach has been wonderfully successful and has proven to scale fantastically. Its next big challenge, though, is in the realm of security. As Linux gains ground (which it seems likely to, at least in server space) it will increasingly come under heavy attack. Can the bazaar adapt to handle this? I'm positive that it will, but it remains to be seen how.