W3C Gets Excessive DTD Traffic
eldavojohn writes "It's a common string you see at the start of an HTML document: a URI declaring the document type. But software often fetches that URI while processing the document, causing undue traffic to W3C's site. There's a somewhat humorous post today from W3.org that reads like a cry for sanity, asking developers to stop building systems that automatically request this information. From their post, 'In particular, software does not usually need to fetch these resources, and certainly does not need to fetch the same one over and over! Yet we receive a surprisingly large number of requests for such resources: up to 130 million requests per day, with periods of sustained bandwidth usage of 350Mbps, for resources that haven't changed in years. The vast majority of these requests are from systems that are processing various types of markup (HTML, XML, XSLT, SVG) and in the process doing something like validating against a DTD or schema. Handling all these requests costs us considerably: servers, bandwidth and human time spent analyzing traffic patterns and devising methods to limit or block excessive new request patterns. We would much rather use these assets elsewhere, for example improving the software and services needed by W3C and the Web Community.' Stop the insanity!"
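For anyone wondering what "not fetching the same one over and over" could look like in practice, here is a minimal sketch in Python. It is not taken from the W3C post: the class name DTDCache, the fetch parameter, and the network_requests counter are all made up for illustration. The only assumption is that a DTD at a given URI does not change, which the post says has held for years.

```python
import urllib.request
from typing import Callable, Dict


def fetch_over_http(url: str) -> bytes:
    """Download a resource once. Only used on a cache miss."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


class DTDCache:
    """Hypothetical in-memory cache: each DTD URI is fetched at most once."""

    def __init__(self, fetch: Callable[[str], bytes] = fetch_over_http):
        self._fetch = fetch                  # injectable so it can be tested offline
        self._store: Dict[str, bytes] = {}   # URI -> DTD bytes
        self.network_requests = 0            # how often we actually hit the server

    def get(self, url: str) -> bytes:
        if url not in self._store:
            # Cache miss: this is the only place a request is made.
            self.network_requests += 1
            self._store[url] = self._fetch(url)
        return self._store[url]


if __name__ == "__main__":
    # Demo with a stand-in fetcher, so running this sends no traffic anywhere.
    cache = DTDCache(fetch=lambda url: b"<!ELEMENT html ANY>")
    for _ in range(1000):
        cache.get("http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd")
    print(cache.network_requests)
```

A real tool would probably want the cache on disk so it survives restarts, but the idea is the same: a thousand documents with the same DOCTYPE should cost one request, not a thousand.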
That sounds like a DTD thing to do! If you are a dee, please don't marry a tee, because if you marry a tee, your kids will be DEE TEE DEE.
First off, I don't know much about DTDs, but from what I can tell, a DTD is like a template, like a Cascading Style Sheet or something like that. That said...
Why did they even allow people to link to this thing in the first place? I think they could have predicted that this would happen, simply because the web is huge: if even a small percentage of all the servers on the internet start to link to the code, they are going to get a massive influx of requests for it.
Knowing this, I wouldn't let people link directly to the code. That doesn't mean that they can't use it (they can download the code onto their own computers and host it there), but I would make sure that they can't link directly to my servers. Don't get me wrong, it's nice of them to let us link to their code. However, when you provide a useful piece of software for everyone to link to, you gotta expect that people are going to take full advantage of linking to your code if you let them, whether they link to it efficiently or not.
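The "download it and host it yourself" idea above can be sketched as a lookup from the public identifier in a DOCTYPE to a file you already have on disk. This is a rough illustration, not a standard API: the CATALOG table, the resolve_locally function, and the file names are hypothetical, and it assumes you have saved copies of the DTDs into a local directory yourself. The two public identifiers are the real ones for XHTML 1.0.

```python
from pathlib import Path
from typing import Optional

# Hypothetical catalog: DOCTYPE public identifier -> local file name.
# The file names are an assumption; use whatever you saved the DTDs as.
CATALOG = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "xhtml1-strict.dtd",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "xhtml1-transitional.dtd",
}


def resolve_locally(public_id: str, dtd_dir: Path) -> Optional[Path]:
    """Return the path of a local DTD copy, or None if we do not have one.

    Returning None means "no local copy"; the caller can then skip
    validation or report the gap instead of hitting the remote server.
    """
    filename = CATALOG.get(public_id)
    if filename is None:
        return None
    candidate = dtd_dir / filename
    return candidate if candidate.is_file() else None
```

With something like this in front of the parser, the URI in the DOCTYPE works as a name for the document type rather than an address that gets fetched on every run.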