Domain: cafeconleche.org
Stories and comments across the archive that link to cafeconleche.org.
Comments · 24
-
Re:Only 30K lines anyway...
What's wrong with reading until you get to the end of the file? That's how the idiom seems to be done in every other language I've used. Why is that an exception in Java?
It's not. The typical Java idiom for reading lines from a file looks like this. (In production, you would actually handle the errors in that exception block rather than swallowing them. I would personally put the close() call in a finally block, not the main try block, as you still want to close even if a read fails. But you get the idea.)
As you can see, you read until it comes back empty, and then you're done. No exceptions will be used except when things are exceptional.
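A minimal sketch of the idiom described, not the commenter's original listing: it assumes a `BufferedReader` over a plain text file, and the names `LineReading` and `readLines` are made up for illustration. The `close()` call sits in a `finally` block, as the comment suggests.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class LineReading {

    // Reads every line of the named file. The loop ends when readLine()
    // returns null, which is how BufferedReader signals end of file.
    static List<String> readLines(String filename) throws IOException {
        List<String> lines = new ArrayList<>();
        BufferedReader reader = new BufferedReader(new FileReader(filename));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            // Close even if a read fails.
            reader.close();
        }
        return lines;
    }
}
```

No exception is involved in detecting the end of the file; an `IOException` only surfaces when something actually goes wrong.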
That's still slightly ugly, but that should rarely matter in Java or in any well-built OO system. You never read files for the sake of it; you're always up to something specific. Even when I'm dealing with files all the time, I almost never write code like this, because I'm reading XML or reading a properties file or something where all the details are taken care of in a method that I almost never see.
-
OO Gui "Bloopers"?
I haven't used Open Office enough to have an opinion, but Elliotte Rusty Harold used it to write a book, and came away with the opinion that the program is full of "GUI Bloopers". More here.
-
Importance vastly overstated
The review almost sold me on the fact that I could actually learn something from this book. Looking at the sample chapters here told me the truth.
-
Re:Just because you CAN...
CDATA sections don't need to nest. If you're trying to nest them, you're doing something wrong. CDATA sections are merely syntax sugar. (Items 9, 14 and 15) You absolutely can include the three-character sequence ]]> in XML documents. You just have to escape the greater than sign as &gt;.
The point is not that escaping is not necessary when creating an XML document. The point is that the escapes you need are predefined and understood by the parser. You don't need to think about them.
I've seen way too many CSV and similar flat-file parsers that keel over and die (or worse, corrupt data without noticing a problem) when presented with data that contains commas, tabs, quotation marks, line breaks and the like.
XML avoids this by providing necessary escapes. Furthermore, when you receive an XML file, you know what the escapes are. You don't have to guess whether this file uses \" or "" or some other mechanism for escaping otherwise reserved characters. It's not that XML's escape mechanism is fundamentally better or worse than other escape mechanisms. It's just that it's standard enough that we can stop worrying about it.
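As a rough illustration of the point about predefined escapes, here is a sketch using the JDK's built-in DOM parser. The class `XmlEscapes`, its methods, and the `<note>` element are hypothetical names chosen for this example. Text containing commas, tabs, quotation marks, and `]]>` is escaped with the predefined entities, and the parser hands back the original text with no format-specific guessing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class, for illustration only.
class XmlEscapes {

    // Escapes the characters that are significant in element content.
    // '&' must be replaced first so the other entities are not re-escaped.
    // Escaping every '>' also covers the "]]>" sequence.
    static String escapeText(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Wraps the escaped text in a made-up <note> element, parses it,
    // and returns the text content the parser reports.
    static String roundTrip(String raw) throws Exception {
        String xml = "<note>" + escapeText(raw) + "</note>";
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }
}
```

The receiving side never needs to know how the sender escaped anything; the parser undoes the standard entities on its own.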
-
Re:Really?
There's a very real tension between making examples too trivial to be interesting and making them too long to be readable. I struggle with it in every book I write, and every other programming book author I know does so too. I've tried putting so-called real-world examples in books, and it's hopeless. It can't be done. There wouldn't be any space left for the explanatory text, nor would anyone put up with reading page after page of code.
Most importantly, while I tend to be writing about just one topic at a time, real world programs wander all over the map. I may be trying to explain how to use callbacks in SAX, but a realistic program also has to consider network latency, GUI design, error logging, numerical algorithms, internationalization, and a hundred other things that aren't on topic. Covering them all would obscure the subject I'm actually trying to explain. Some things you just have to leave for other books and other authors.
As an author, I try to strike the right balance between excessive simplicity and excessive length. Sometimes I hit it. Sometimes I don't. I actually think Effective XML hits it fairly well. In fact, this book was one of the toughest I ever had to write, precisely because it was so short that I couldn't spew pages like I did in Processing XML with Java (1100 pages) or the XML 1.1 Bible (1000 pages). I had to be really picky about how much code I included, and make sure that each example carried its weight, demonstrated just the point at hand, and nothing else.
By the way, the chapter with that specific example is online if anyone cares to see for themselves just what it is that makes names a more interesting and complex problem than "John Doe Ph.D" seems to be at first glance.
-
Re:Just because you CAN...
These days data has to be pretty damn simple to justify using a flat file rather than XML. I wrote more about this in my previous book, Processing XML with Java, than in this one, though. Chapters 1-4 discuss this in some detail.
Real-world data often gets messy in ways that don't lend themselves to flat files. For instance, two of the thorniest problems:
- How do you handle encoding detection and international characters?
- What do you do when the data contains characters you're using as field delimiters?
Both of these are completely solved by XML with no extra effort on your part, and these are hardly the only issues.
I certainly agree that it's easier to write a parser for a flat file format than it is to write a parser for XML. However, it's much easier (and much more reliable) to use one of the existing well-tested, debugged XML parsers than it is to write your own flat-file parsing code.
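To make the delimiter problem concrete, here is a deliberately naive sketch of hand-rolled flat-file parsing. The class `NaiveSplit` and the sample record are invented for this example, not taken from any real format.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class NaiveSplit {

    // Splits a record on commas with no notion of quoting or escaping,
    // which is what quick hand-written flat-file code often does.
    static List<String> fields(String record) {
        return Arrays.asList(record.split(","));
    }
}
```

A record meant to hold two fields, a name and a city, such as `Harold, Elliotte Rusty,New York`, comes back as three fields because the name itself contains the delimiter. Nothing in the code notices the problem.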
-
Re:Really?
I'm very skeptical of so-called binary XML formats, as you'll find in Item 50, Compress if Space is a Problem. There are use cases where XML isn't appropriate (and I discuss these in the book, mostly data scanned from nature such as JPEGs and MP3s) but it isn't at all clear how a binary encoding of XML would help these use cases. There are also environments like the smaller cell phones where XML doesn't (yet) work very well. Again, moving to binary doesn't necessarily address the underlying issues here. Furthermore, developing new formats tailored to special purposes and environments such as cell phones and scientific data tends to deoptimize XML for other uses. XML isn't an optimal format for any one use case, but it's a very nice compromise across many different areas.
The one use case a binary XML encoding does address well is the need of a number of vendors to sell expensive tools for working with data and hide people's data from them. XML is just too obvious and too cheap to justify lots of expenditures on tools. If you hide the text inside an opaque binary format that programmers need special (even patented) tools to view, why then, companies can sell tools again! Surprisingly, I don't find this use case too compelling. :-)
-
Re:JDOM.org
XOM is even easier.
-
XOM!
XOM is an excellent XML-handling library. It makes XML parsing, interpretation, and generation a breeze, and goes to great lengths to ensure that what you do is correct according to the XML specs. It's an absolute pleasure to use, especially compared to the "standard" SAX and DOM libraries.
It's created by Elliotte Rusty Harold, who is one of the bigwigs in both the XML and Java arenas. XOM is at the intersection of those two sets.
Technically it's still in "beta", but the API hasn't changed at all since the Alpha releases, and all the fixes in the beta stages have been either performance boosts or bug fixes dealing with the very fringes of XML.
Probably the best part of the library isn't the code itself; it's the design process that went into making it. Check out the Design Principles for a good read.
Craig
-
Re:List (and reasons)
I think that's a good call, but I'd like to point out that JDOM is a library that sits on top of Xerces (or another DOM/SAX implementation). There are also other 'easy' libraries that inhabit a similar niche, such as dom4j and one that I use quite a bit, XOM. If you have to work with XML documents, mucking about in their innards, you probably want one of these libraries around. They're all XML Object Models, but they're not nearly as painful to use as the W3C DOM (which Xerces includes). But I also wouldn't want to use any of these libraries with Crimson as the underlying parser (which is what ships with the Sun JDK 1.4 series). Xerces *tends to* be more correct and robust.
-
XML 1.1 incompatibility
Elliotte Rusty Harold has a persuasive argument against XML 1.1. He is someone whose opinion should be considered. He writes very thorough, good books on XML and has created the most excellent XOM (same goal as DOM, but easy to use). He also keeps us current on the XML world at Cafe con Leche.
-
XML 1.1 Not necessarily a good idea
I'm just going to syndicate Elliotte Rusty Harold [scroll down to the Feb. 5th entry] on this one and pass along his suggestion that you don't use XML 1.1; Xerces 2.6 will process it, but most things won't, and most of the benefits of what's new in XML only apply if you're putting your documents into a few (mostly Asian) languages.
-
Several chapters are online
Nice review. Thanks! It's interesting how many of the comments here relate directly to chapters in the book. For instance, there's a lot of concern about XML's perceived verboseness. This is addressed directly in Item 50, Compress if space is a problem. This chapter and ten others are online at http://www.cafeconleche.org/books/effectivexml/ . Check it out.
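One way to act on that item's title, sketched with the JDK's `java.util.zip` classes. Whether the chapter recommends gzip in particular is an assumption here, and the class `XmlGzip` and its methods are made-up names. The idea is to keep the document as ordinary XML and compress it only for storage or transport.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class XmlGzip {

    // Compresses the UTF-8 bytes of an XML document with gzip.
    static byte[] compress(String xml) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // Restores the original text, which any XML parser can then read.
    static String decompress(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Repeated tag names are exactly the kind of redundancy a general-purpose compressor removes, so markup-heavy documents tend to shrink considerably while staying plain text underneath.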
-
Re:OK, I Installed Mandrake
I only recently installed jEdit for the first time, and don't have much experience with it yet. It is, of course, a cross-platform editor, and you don't have to be a Java developer for it to be useful.
Apparently, you can write major book projects with jEdit.
-
Re:Bah.(study xml a little more man)
You should go back home and study XML a little more, man, and think twice before saying what's on your mind. Here's a good XML book as your first homework.
-
Full Details
Full details of why this has the potential to break things are on the XML news site Cafe Con Leche.
Please read that before making uninformed comments - news.com isn't where you'll find technical information about this problem.