Netscape Dumps Critical File, Breaks RSS 0.9 Feeds
An anonymous reader writes "In the standard definition of RSS 0.91, there are a couple of lines referring to 'DOCTYPE' and referencing a 'dtd' spec hosted on Netscape's website. According to an article on DeviceForge.com quite a few RSS feeds around the web probably stopped working properly over the past few weeks because Netscape recently stopped hosting the critical rss-0.91.dtd file. Probably someone over at netscape.com simply thought he was cleaning up some insignificant cruft." Some explanation has been offered by a Netscape employee.
I would've seen this post sooner, but my RSS feed was broken... something about a 404?
I don't see how this would break RSS readers. DTDs pretty much never get read except by validators. Normal SGML and XML parsers just treat the DTD URL as an opaque string, not as something that can be retrieved.
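To make that concrete, here is a minimal sketch using only Python's standard library. The feed is made up (the titles and example.com links are assumptions for illustration), and the DOCTYPE line is the one RSS 0.91 feeds conventionally carried, as far as I recall. With default settings, both parsers report the DTD URL as a plain string and read the items without ever requesting that URL.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat

# Hypothetical RSS 0.91 feed; only the DOCTYPE line mirrors real-world feeds.
FEED = """<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
  "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
  <channel>
    <title>Example channel</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
    </item>
  </channel>
</rss>"""


def doctype_of(xml_text):
    """Return the DOCTYPE name and identifiers exactly as written in the document."""
    seen = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # The parser hands over the identifiers as strings; nothing is downloaded.
        seen.update(name=name, system_id=system_id, public_id=public_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return seen


def item_titles(xml_text):
    """Read item titles with a non-validating parser."""
    root = ET.fromstring(xml_text)
    return [item.findtext("title") for item in root.iter("item")]


if __name__ == "__main__":
    print(doctype_of(FEED))
    print(item_titles(FEED))
```

A validating parser is a different story, since it has to load the DTD from somewhere before it can check the document against it.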
According to an article on DeviceForge.com quite a few RSS feeds around the probably web stopped working properly over the past few weeks because Netscape recently stopped hosting the critical rss-0.91.dtd file.
STOP, Grammar time. Ooooh whoooaaa oh oh...
Probably someone over at netscape.com simply thought he was cleaning up some insignificant cruft."
Or Netscape got tired of people using their bandwidth. Regardless of the reasons: if you reference a file on someone's site, it's hardly their fault if they move/change/delete it, and it breaks your stuff.
Please help metamoderate.
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."
Or maybe some smart person at Netscape decided to teach some people a lesson about using a 3rd party as a single point of failure?
And if so, why would anyone rely on AOL to make something on the web work?
http://bgcommonsense.blogspot.com
Suck that, Web 2.0!
Is to have a common component shared among many documents without replication.
Class paths in Java are the perfect example of how it *should* work. Java CLASSPATHs in every application/installation I have seen are site-local, with all paths accessible without going over the internet to another site to get classes.
To be similar, an RSS site should copy this DTD to their local server, or to a server with which they have a concrete understanding of the relationship: either a commercial agreement with a peer, or at least a server from an organization that explicitly defines the purpose of hosting as being a common place to promote it as a standard.
Did Netscape explicitly promise to be the organization sharing that DTD, or did site developers get into the practice because 'it just always was there'?
XML is like violence. If it doesn't solve the problem, use more.
A lot of RSS readers can't parse a custom DTD; they assume that RSS is pretty much fixed, and ignore the DTD line completely.
This is the precise reason why I host everything myself including my own series of tubes, dubbed the Internets. I host not only every file that my site uses, but I also have a program that regularly crawls the entire Internet and compresses it onto my own distributed system. That way I can browse the Internet by myself without worrying if someone else's system will fail. Although I do need to replace systems every now and then. But that's not a problem, b/c the distributed system has 3-5 copies of the Internet, each copy in a different place. Wait, isn't there some other company that does that? I can't quite place the name.
Seriously though, relying on some other system so your site will work is a recipe for disaster. It's similar to relying on someone to take you to work every day. After a while, you get so used to the fact that someone else is driving you that you don't even think about it. Then your driver gets deleted somehow. And you're stuck with no way to get to work.
Funny createSig(Witty remark, Odd reference)
{
return (Funny)remark + (Funny)reference;
}
It is expected that DTDs are hotlinked. For example, if you ever look at the HTML source of a web page, you would see a DOCTYPE declaration at the top, and the hotlink goes to somewhere on w3.org. That is because the W3C is the authority that defines HTML.
Since Netscape is the authority body that defines RSS 0.91, it is a bit strange how they stopped hosting the definition.
In any case, the missing definition won't affect software that processes RSS feeds. It only affects software that checks whether an SGML document is structured properly according to that missing DTD.
The main interest of this article seems to be the speculation about how a deprecated web 1.0 company could end up hiring a clueless webmaster who deletes important files without recognizing their importance.
I once had a signature.
This could seriously affect both of the guys using Netscape.
Fetching the spec is idiotic.
First of all, that's a needless operation. It can take time; don't forget the DNS lookup and all.
Second of all, it's not as if you could handle any random DTD. Software doesn't work that way. (this is one of the reasons why XML itself is a mostly-lame idea) If the XML doesn't match expectations, you can't convert it to your own internal representation. You probably have a C struct that you need to fill in. Even in some wild interpreted language like perl, you just won't have any use for unexpected data structures and you damn well need the expected data structures.
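A rough sketch of that argument in Python rather than C, with a fixed record standing in for the struct. The `Item` fields and the sample elements are assumptions made up for illustration; the point is only that the reader knows in advance which fields it wants, and anything else is either ignored or an error.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Item:
    """The fixed internal representation the reader expects to fill in."""
    title: str
    link: str


def parse_item(element):
    """Fill the fixed Item record; fail if an expected field is missing."""
    title = element.findtext("title")
    link = element.findtext("link")
    if title is None or link is None:
        raise ValueError("item does not match the expected structure")
    return Item(title=title, link=link)


# Hypothetical inputs: one matching the expected shape, one that does not.
good = ET.fromstring("<item><title>Hi</title><link>http://example.com/</link></item>")
odd = ET.fromstring("<item><headline>Hi</headline></item>")

if __name__ == "__main__":
    print(parse_item(good))
    try:
        parse_item(odd)
    except ValueError as exc:
        print("rejected:", exc)
```

No DTD fetched at runtime would change what `parse_item` can do with the second element.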
From April 2001: "Netscape removed the RSS 0.91 DTD from their website. This means that all RSS feeds which depend on the RSS 0.91 (many, MANY news sites) cannot be used with a validating parser."
It seems as though it just took them 5+ years to follow up on the threat? Primary links are broken, but of course the lively /. discussion (which, um, I haven't read) remains.
my.netscape.com is undergoing a redesign, and when we announced the redesign about 10 days ago, the DNS entry for my.netscape.com was changed to point to the new server where My Netscape will be living. This had the effect of making anything under the old my.netscape.com unavailable, since the only thing public on the new server is a splash page. (Nobody on the team was especially aware of this DTD file since all of the old Netscape employees were let go last year around the time Netscape.com was redeveloped; anybody working at Netscape now was hired since then.)
Now, why this file was living under my.netscape.com is anybody's guess, but we'll have it restored ASAP. I only wish that someone had brought it to our attention so that I didn't have to find out about it from Slashdot.
Christopher Finke
Netscape Developer
If I were to create a reader that was dependent on version 0.91 of the distribution, it sure as hell would include the DTD in local storage. It makes no sense to create a reader that can also use, say, version 0.92 since you would not know what had changed (and there is no such thing as inheritance between versions of a DTD afaik). Actually, as other readers noted, it would be terribly stupid to make your web server or client rely on a third-party computer for which you cannot guarantee the uptime.
These URLs are mainly there for their uniqueness, not so much as a place of guaranteed storage. Of course, they are also a nice place to look for the actual definition, but after that you would need a local repository. This is the first thing an XML library should support, and the first thing a moderately intelligent programmer should look at. I get *very* annoyed when this kind of basic rule is ignored. And I've even seen it ignored by people pointing to the XML digital signature definitions, where security and reliability should be the first requirements in the design.
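A minimal sketch of the local-repository idea, assuming the application ships its own copy of the DTD. The catalog contents, the `dtds/rss-0.91.dtd` path and the `resolve_dtd` name are all hypothetical; real XML libraries have their own catalog or resolver hooks, which this does not try to imitate.

```python
from pathlib import Path

# Hypothetical local catalog: public identifier -> file shipped with the application.
LOCAL_CATALOG = {
    "-//Netscape Communications//DTD RSS 0.91//EN": Path("dtds/rss-0.91.dtd"),
}


def resolve_dtd(public_id, system_id, catalog=LOCAL_CATALOG):
    """Return the local path for a known DTD; never fall back to the network."""
    if public_id in catalog:
        return catalog[public_id]
    # The system_id URL is used only in the message, as a name for what is missing.
    raise LookupError(
        f"no local copy for {public_id!r} (declared at {system_id}); "
        "refusing to fetch it remotely"
    )


if __name__ == "__main__":
    print(resolve_dtd(
        "-//Netscape Communications//DTD RSS 0.91//EN",
        "http://my.netscape.com/publish/formats/rss-0.91.dtd",
    ))
```

The URL is treated purely as a unique name here; whether the remote host is up makes no difference to the lookup.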
Also, what would happen if w3c.org or netscape.com go the way of the Gopher? If they go bust? It's a quickly changing world out there.
This blast is not squarely aimed at you, but you triggered it. Treat this in the spirit it is meant please (if I didn't give a crap at all, I wouldn't comment. Show this to your insulated bosses who don't know the first thing about community and transparency. Kudos to you BTW for showing initiative and acting on a Slashdot post. Honestly, I would not have given the "new Netscape" that much credit.).
:-)
>I only wish that someone had brought it to our attention so that I didn't have to find out about it from Slashdot.
This rankles.
Have you EVER tried contacting Netscape from the outside world? Seriously, I can count the number of times:
*) When my.netscape.com locked out Konqueror (1998?)
*) When my.netscape.com WITHDREW the ability to embed RSS feeds on your "my" page -- actually this was PRE-RSS if I recall. Way before it was commonplace, you could embed Slashdot and Linux Today feeds. Then they killed it, presumably because they got enough users or some pointy haired reason. 1999.
*) When my.netscape.com adopted a shitty policy of DELETING all your mail if you don't log in for 30 days. This did not seem to be publicised by an actual email. They don't seem to delete the mailbox itself, which violates RFCs I'm sure and basically insinuates the mailbox is active. I lost tons of mail from 1996-2003 (yeah yeah backups. Some things I didn't think I would need later). ?? Happened in 2003. Note that mailboxes were only 5MB still, so I quickly bailed for a 100MB Yahoo account.
*) The 2001 deletion of Netscape Developer. This lost a ton of Netscape copyrighted Javascript documentation.
Just TRY contacting Netscape from their page. The best you can do is use the WRONG FORM to submit to some contractor who won't forward it. Or, oh yeah - there's a 900 number for by-the-minute support.
Back when it mattered, there was no 'Google Guy' for Netscape, who would act as an unofficial liaison. After Jamie Z left, no one internally tried to fill the shoes of a community-facing employee.
I'll be eternally grateful for Netscape's open sourcing of their browser. What a different world it is now. Too bad that step is something the current management would never have allowed (that's the perception). I can't think of a more opaque Internet company than today's Netscape. I'm sure there are people who disagree or wish it could be changed (you're here..) but that and a $1 gets you a cup of black coffee. Show this to your boss - there are suggestions here.
not trying to be a troll here.. but.. one would think that that file would have been accessed quite often and that would have shown up in the logs...
If I were a new hire at some old company where everyone else had been let go, I'd at least check out the logs and see what is being used. And then if some file is being hit thousands of times a day.. maybe ask a few questions..
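The kind of check meant here could look roughly like the sketch below. The log lines are invented samples in Common Log Format (I have no idea what Netscape's logs actually looked like), and `hits_per_path` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical access-log lines in Common Log Format, for illustration only.
SAMPLE_LOG = """\
192.0.2.1 - - [01/Jan/2007:00:00:01 +0000] "GET /publish/formats/rss-0.91.dtd HTTP/1.1" 200 3000
192.0.2.2 - - [01/Jan/2007:00:00:02 +0000] "GET /index.html HTTP/1.1" 200 5120
192.0.2.3 - - [01/Jan/2007:00:00:03 +0000] "GET /publish/formats/rss-0.91.dtd HTTP/1.0" 200 3000
"""


def hits_per_path(log_text):
    """Count requests per path, using the quoted request line of each entry."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request line, skip the entry
        request = parts[1].split()  # e.g. ["GET", "/index.html", "HTTP/1.1"]
        if len(request) >= 2:
            counts[request[1]] += 1
    return counts


if __name__ == "__main__":
    for path, hits in hits_per_path(SAMPLE_LOG).most_common(5):
        print(hits, path)
```

Anything near the top of that list is worth a question before the server it lives on gets repointed.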
http://www.hawknest.com/
It's not any different, except the w3c is run by intelligent people, and Netscape, apparently, is not.
I've always thought the full paths were a bit stupid too, and they should have some sort of shortcut standard, one that says "Use w3c's HTML4.0 standard", and the web browser knows how to construct a path to find w3c standards. That way, when "Use netscape's RSS0.91" standard stopped working, web browsers could have a trivial update, or their config could even be changed manually, to tell them where to find netscape's standards.
Granted, they already have something like this in the DOCTYPE, that's what '-//W3C//DTD HTML 4.01//EN' is, but then they blow it by then including the path after that. The parser should, instead, have to look at W3C and go 'Hey, I know where that is, that's w3c.org' and construct a standardized path using 'DTD HTML 4.01', like 'http://w3c.org/doctype/HTML4.01.dtd'. (And I just realized that string mysteriously doesn't include 'strict' or whatever in it, so now I'm slightly confused as to what good it's for.)
That way, when something happens to a server, the standard can be trivially updated to say 'W3C now means this domain, instead of w3c.org', and every damn page in existence doesn't have to change. Mandate that every parser should expose these locations to be reconfigured manually if needed, although obviously some sort of automatic updating is a good idea. (Notice, in general, application software doesn't need to be updated, because application software doesn't try to download the stuff in the first place.)
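A sketch of what that scheme might look like, purely hypothetical since no parser works this way as far as I know. The `OWNER_BASES` table, the `dtd_url` function and the URL layout are made up to match the example above; the only real thing is the shape of the public identifier, which splits on '//' into registration, owner, text and language.

```python
# Hypothetical, user-editable table: identifier owner -> base URL for its DTDs.
OWNER_BASES = {
    "W3C": "http://w3c.org/doctype/",
}


def dtd_url(public_id, owner_bases=OWNER_BASES):
    """Build a DTD URL from a public identifier like '-//W3C//DTD HTML 4.01//EN'."""
    parts = public_id.split("//")
    if len(parts) != 4:
        raise ValueError(f"unexpected public identifier: {public_id!r}")
    _registration, owner, text, _language = parts
    kind, _, description = text.partition(" ")
    if kind != "DTD" or not description:
        raise ValueError(f"not a DTD identifier: {public_id!r}")
    base = owner_bases[owner]  # KeyError if the owner is not configured
    return base + description.replace(" ", "") + ".dtd"


if __name__ == "__main__":
    print(dtd_url("-//W3C//DTD HTML 4.01//EN"))
    # Repointing an owner is one table entry, not an edit to every document.
    print(dtd_url("-//W3C//DTD HTML 4.01//EN", {"W3C": "http://example.org/dtds/"}))
```

Moving a standards body to a new host then means changing one entry in the table, which is the whole appeal of the idea.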
Now someone's going to host the DTD at some random place, and everyone will manually update everything to load the wrong URL when someone asks for "http://netscape.com/publish/formats/rss-0.91.dtd". Then Netscape will move it back, and some applications will change back, and some won't, and it will be a big mess.
I understand the point of paths, in that, in theory, everyone can produce their own format and publish their own DTD. This has not happened and probably is not going to, and at this point all browsers interpret DOCTYPE strings as unparsable strings, and the only ones who actually read the things are the validators.
If corporations are people, aren't stockholders guilty of slavery?
Once upon a midnight dreary, while I websurfed, weak and weary,
Over many a strange and spurious bookmark of 'free news galore',
While I clicked my fav'rite feed, suddenly there came a warning,
And my heart was filled with mourning, mourning for my dear amour.
"'Tis not possible," I muttered, "give me back my free news source!"
Quoth the server, "404".
If you were offended by anything I said... No, I'm not sorry. Please lighten up.