W3C Gets Excessive DTD Traffic
eldavojohn writes "It's a common string you see at the start of an HTML document: a URI declaring the document type. But software often fetches that URI while processing the page, causing undue traffic to W3C's site. There's a somewhat humorous post today from W3.org that reads as a cry for sanity, asking developers to stop building systems that automatically query this information. From their post, 'In particular, software does not usually need to fetch these resources, and certainly does not need to fetch the same one over and over! Yet we receive a surprisingly large number of requests for such resources: up to 130 million requests per day, with periods of sustained bandwidth usage of 350Mbps, for resources that haven't changed in years. The vast majority of these requests are from systems that are processing various types of markup (HTML, XML, XSLT, SVG) and in the process doing something like validating against a DTD or schema. Handling all these requests costs us considerably: servers, bandwidth and human time spent analyzing traffic patterns and devising methods to limit or block excessive new request patterns. We would much rather use these assets elsewhere, for example improving the software and services needed by W3C and the Web Community.' Stop the insanity!"
"oops"
Add some sort of caching parameter to the DTD spec that specifies how long browsers should cache those DTDs.
Another potential solution: have browsers keep the DTDs cached, then check the file date periodically when a DTD is requested again. This would still put some load on the W3C's servers, but significantly less than complete re-downloads.
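For what it's worth, the "keep it cached and only check the date now and then" idea can be sketched in a few lines of Python. This is a rough illustration under some assumptions, not how any particular browser or parser does it: the names `DtdCache`, `http_fetch` and `recheck_after` are made up for the example, the one-week recheck interval is arbitrary, and the cache lives in memory only. The date check is an ordinary HTTP conditional request (`If-Modified-Since`), which the server can answer with a bodiless 304.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# A fetcher takes (url, if_modified_since) and returns
# (status, body, last_modified). This shape is an assumption for the sketch.
FetchResult = Tuple[int, Optional[bytes], Optional[str]]
Fetcher = Callable[[str, Optional[str]], FetchResult]


def http_fetch(url: str, if_modified_since: Optional[str]) -> FetchResult:
    """Fetch over HTTP, sending If-Modified-Since when we already have a copy."""
    req = urllib.request.Request(url)
    if if_modified_since:
        req.add_header("If-Modified-Since", if_modified_since)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read(), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        # urllib reports "304 Not Modified" as an HTTPError.
        if err.code == 304:
            return 304, None, None
        raise


@dataclass
class CachedDtd:
    body: bytes
    last_modified: Optional[str]
    checked_at: float  # when we last asked the server about this entry


class DtdCache:
    """Hypothetical in-memory DTD cache with periodic revalidation."""

    def __init__(self, fetch: Fetcher = http_fetch,
                 recheck_after: float = 7 * 24 * 3600,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch
        self._recheck_after = recheck_after
        self._clock = clock
        self._entries: Dict[str, CachedDtd] = {}

    def get(self, url: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(url)
        # Fresh enough: answer from the cache without touching the network.
        if entry is not None and now - entry.checked_at < self._recheck_after:
            return entry.body
        since = entry.last_modified if entry is not None else None
        status, body, last_modified = self._fetch(url, since)
        if status == 304 and entry is not None:
            # Unchanged on the server: keep our copy, reset the timer.
            entry.checked_at = now
            return entry.body
        if status == 200 and body is not None:
            self._entries[url] = CachedDtd(body, last_modified, now)
            return body
        raise IOError(f"unexpected status {status} for {url}")
```

With something like this, a validator would call `DtdCache().get(...)` with the DTD's URI instead of opening the URL directly. The full download happens once, and later revalidations cost the server only a header exchange.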
Serves them right for forcing us to include the same long URLs, pointing to files that never change, in every single HTML file ever.
What are the user agents making the requests? Do these programs identify themselves with a UA string or something?
It is dangerous to be right when the government is wrong.
Sorry W3C, but if I don't include it in my webpage, IE goes into the dreaded quirks mode!
*forwards article to Microsoft*