Netscape Dumps Critical File, Breaks RSS 0.9 Feeds
An anonymous reader writes "In the standard definition of RSS 0.91, there are a couple of lines referring to 'DOCTYPE' and referencing a 'dtd' spec hosted on Netscape's website. According to an article on DeviceForge.com quite a few RSS feeds around the web probably stopped working properly over the past few weeks because Netscape recently stopped hosting the critical rss-0.91.dtd file. Probably someone over at netscape.com simply thought he was cleaning up some insignificant cruft." Some explanation has been offered by a Netscape employee.
Of course they retrieve it - unless they already have a local or cached copy. How else would they be able to parse a document marked up using a custom DTD?
Don't answer - go hang your head in shame.
Actually the DTD is loaded up by pretty much every proper XML library even if validation is "off".
The DTD contains more than just the element definitions and hierarchy. It's also used to define entities (&...;) that are non-standard to XML but may be expected in the file. HTML has tons of predefined entities but XML only has the core five (&lt;, &gt;, &amp;, &apos; and &quot;). All others are defined in DTDs and loaded on the fly as part of the processing.
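To illustrate the point about entities, here is a minimal sketch using the JDK's built-in JAXP parser. The sample document and the class name DtdEntityDemo are made up for illustration; the entity is declared in an internal DTD subset so the example runs without any network access.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DtdEntityDemo {
    // Hypothetical sample: the internal DTD subset declares &copy;,
    // which is not one of XML's built-in entities.
    static final String XML =
        "<!DOCTYPE note [ <!ENTITY copy \"&#169;\"> ]>"
        + "<note>&copy; Netscape</note>";

    // Parses the document without validation and returns the root element's text.
    static String parseText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false); // non-validating, which is also the JAXP default
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The entity reference is expanded even though validation is off.
        System.out.println(parseText(XML)); // prints "© Netscape"
    }
}
```

Take the DOCTYPE line out and the same parse fails on the undeclared &copy; reference, which is why a feed that uses DTD-defined entities really does need its DTD.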
There are ways to turn it off at the lowest levels, but higher-level abstractions/libraries might not give access to that. For example, with JAXP + SAX you can turn off DTD loading, but Jakarta Commons Digester doesn't give a setting where you can trigger that, so Digester tries to load the DTD, and even with validation off you can't change that. My only recourse is to take the DTD lines out of the various config files. (Reason: my JBoss server is deployed in private networks where the server can't reach the internet.)
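For the JAXP + SAX case, a rough sketch of turning off DTD loading might look like the following. The feature ID is Xerces-specific, so this assumes the parser behind JAXP is Xerces or the JDK's bundled derivative of it; other parsers may reject it. The class name, the sample feed and the dtd.invalid URL are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NoExternalDtdDemo {
    // Xerces-specific feature ID (assumption: the JAXP parser in use recognizes it).
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Hypothetical feed whose DOCTYPE points at a host that can never resolve.
    static final String FEED =
        "<!DOCTYPE rss SYSTEM \"http://dtd.invalid/rss-0.91.dtd\">"
        + "<rss version=\"0.91\"><channel><title>Example</title></channel></rss>";

    // Parses the feed without fetching the external DTD and counts its elements.
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);
        factory.setFeature(LOAD_EXTERNAL_DTD, false); // skip the external DTD subset
        SAXParser parser = factory.newSAXParser();
        final int[] count = {0};
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements(FEED));
    }
}
```

The catch is the one already mentioned: with the DTD skipped, any entity the DTD would have declared is unknown to the parser, so this only helps for documents that stick to the built-in entities.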
"But remember, most lynch mobs aren't this nice." (H.Simpson)
-- Joe
Non-validating processors are not required to read any external DTD subset.
From April 2001, "Netscape removed the RSS 0.91 DTD from their website. This means that all RSS feeds which depend on the RSS 0.91 (many, MANY news sites) cannot be used with a validating parser."
It seems as though it just took them 5+ years to follow up on the threat? Primary links are broken, but of course the lively /. discussion (which, um, I haven't read) remains.
my.netscape.com is undergoing a redesign, and when we announced the redesign about 10 days ago, the DNS entry for my.netscape.com was changed to point to the new server where My Netscape will be living. This had the effect of making anything under the old my.netscape.com unavailable, since the only thing public on the new server is a splash page. (Nobody on the team was especially aware of this DTD file since all of the old Netscape employees were let go last year around the time Netscape.com was redeveloped; anybody working at Netscape now was hired since then.)
Now, why this file was living under my.netscape.com is anybody's guess, but we'll have it restored ASAP. I only wish that someone had brought it to our attention so that I didn't have to find out about it from Slashdot.
Christopher Finke
Netscape Developer
You make several good points that I want to respond to more fully, but I've got to run out, so I'll have to do that later. In the meantime, I'll put this out there: my e-mail address is chris@newnetscape.com; my screenname and other contact information is available at my website. Anyone who wishes to do so can contact me regarding issues with any of the Netscape websites or the Netscape browser; if I can't solve your problem, I can definitely get you connected with the right person.
What you need to implement is org.xml.sax.EntityResolver. Its resolveEntity method is how the SAX parser queries for external stuff. Basically it will give you the Public ID and/or System ID and ask you to return an InputSource for what that resolves to. Then, in your code, all you do is run a hashmap that maps a given ID to a local resource (e.g. file or database BLOB) and then do your own stream opening/processing from there. I attempted to post some example code but it seems like that trips the lameness filter :( So, just have a look at the interface. The code required is pretty trivial to implement. If that fails, you should be able to work out my email address from the website address under my profile - send me an email and I'll send you the code we use in one of our projects.
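For anyone who wants a starting point, here is a minimal sketch of the kind of resolver described above. It is not the poster's code. The class name, the dtd.invalid URL and the one-line stub DTD are all hypothetical, and the DTD text is held in memory only to keep the example self-contained; a file or database BLOB would be wrapped in an InputSource the same way.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LocalDtdResolver implements EntityResolver {
    // Maps a system ID to locally held DTD text (in-memory strings are an
    // assumption for brevity, standing in for files or database BLOBs).
    private final Map<String, String> dtdBySystemId = new HashMap<>();

    void register(String systemId, String dtdText) {
        dtdBySystemId.put(systemId, dtdText);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtd = dtdBySystemId.get(systemId);
        if (dtd == null) {
            return null; // unknown ID: let the parser resolve it the default way
        }
        InputSource source = new InputSource(new StringReader(dtd));
        source.setSystemId(systemId);
        return source;
    }

    // Parses xml with the given resolver and returns all character data.
    static String readText(String xml, EntityResolver resolver) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        StringBuilder text = new StringBuilder();
        reader.setEntityResolver(resolver);
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String systemId = "http://dtd.invalid/rss-0.91.dtd"; // hypothetical URL
        LocalDtdResolver resolver = new LocalDtdResolver();
        resolver.register(systemId, "<!ENTITY copy \"&#169;\">"); // stub, not the real DTD
        String feed = "<!DOCTYPE rss SYSTEM \"" + systemId + "\">"
            + "<rss version=\"0.91\"><channel><title>&copy; Example</title></channel></rss>";
        System.out.println(readText(feed, resolver));
    }
}
```

Returning null for unknown IDs keeps the parser's default behaviour for everything you have not mapped, so the resolver only takes over for the DTDs you actually ship locally.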
Life is complete only for brief intervals in between toys or projects -- John Dalton
It's called a non-validating processor and it's totally compliant with the XML 1.0 specification.
Bogtha Bogtha Bogtha
Yes, that's totally feasible. You're mistaking the semantics of document types with the external DTD subset.
It's true that inventing new element types and putting them in your DTD isn't going to magically make software understand what those element types mean. But DTDs provide other information - for instance, what entity references expand to, which attributes are IDs, and so on. This is useful information and can be processed in a generic fashion.
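As a small sketch of the "which attributes are IDs" point: with the JDK's bundled DOM parser, declaring an attribute as type ID in the DTD is what makes getElementById find anything. The document, the attribute name key and the class name are invented for the example, and the behaviour shown is that of the JDK's default parser; other implementations may differ.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DtdIdDemo {
    // Hypothetical document: the internal DTD subset declares "key" as an ID attribute.
    static final String WITH_DTD =
        "<!DOCTYPE items [ <!ATTLIST item key ID #REQUIRED> ]>"
        + "<items><item key=\"a1\">first</item><item key=\"b2\">second</item></items>";

    // Same content without the DTD, so "key" is just an ordinary attribute.
    static final String WITHOUT_DTD =
        "<items><item key=\"a1\">first</item><item key=\"b2\">second</item></items>";

    // Looks up an element by ID and returns its text, or null if no element matches.
    static String textOfId(String xml, String id) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Element element = doc.getElementById(id);
        return element == null ? null : element.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textOfId(WITH_DTD, "b2"));
        System.out.println(textOfId(WITHOUT_DTD, "b2"));
    }
}
```

The parser has no idea what an item means, but it can still use the DTD's attribute typing generically, which is the distinction being drawn above.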
Bogtha Bogtha Bogtha
The 2001 deletion of Netscape Developer. This lost a ton of Netscape-copyrighted JavaScript documentation. Unless I'm mistaken, this has been (quite some time afterwards) transferred to the Mozilla Foundation, and can be accessed at http://developer.mozilla.org/en/docs/JavaScript Cheers,