Slashdot Mirror


When RSS Traffic Looks Like a DDoS

An anonymous reader writes "Infoworld's CTO Chad Dickerson says he has a love/hate relationship with RSS. He loves the changes to his information production and consumption, but he hates the behavior of some RSS feed readers. Every hour, Infoworld "sees a massive surge of RSS newsreader activity" that "has all the characteristics of a distributed DoS attack." So many requests in such a short period of time are creating scaling issues." We've seen similar problems over the years. RSS (or as it should be called, "Speedfeed") is such a useful thing, it's unfortunate that it's ultimately just very stupid.

10 of 443 comments

  1. Re:Can't this be throttled? by jcain · · Score: 3, Insightful

    That kind of eliminates the point of having the RSS at all, as the user no longer gets up-to-the-minute information.

    Also, I doubt that the major problem here is bandwidth, more the number of requests the server has to deal with. RSS feeds are quite small (just text most of the time). The server would still have to run that PHP script you suggest.

  2. Re:Can't this be throttled? by mgoodman · · Score: 4, Insightful

    Then their RSS client would barf on the input and the user wouldn't see any of the previously downloaded news feeds, in some cases.

    Or rather, anyone that programs an RSS reader so horribly as to make it so that every client downloads information every hour on the hour would probably also barf on the input of a 500 or 404 error.

    Most RSS feeders *should* just download every hour from the time they start, making the download intervals between users more or less random and well-dispersed. And if you want it more than every hour, well then edit the source and compile it yourself :P

    --
    01100111 01100101 01110100 00100000 01101111 01110101 01110100 00100000 01101101 01101111 01110010 01100101 00101110
  3. Re:Simple HTTP Solution by skraps · · Score: 5, Insightful

    This "optimization" will not have any long-lasting benefits. There are at least three variables in this equation:

    1. Number of users
    2. Number of RSS feeds
    3. Size of each request

    This optimization only addresses #3, which is the least likely to grow as time goes on.

    --
    Karma: -2147483648 (Mostly affected by integer overflow)
  4. Re:RSS needs better TCP stacks by EnderWiggnz · · Score: 3, Insightful

    not needing user intervention is the effing POINT of rss.

    it's like saying - "java is great, except let's make it compiled, and platform specific"

    --
    ... hi bingo ...
  5. Re:Can't this be throttled? by ameoba · · Score: 4, Insightful

    It seems kinda stupid to have the clients basing their updates on clock time. Doing an update on client startup and then every 60min after that would be just as easy as doing it on the clock time & would basically eliminate the whole DDOSesque thing.

    --
    my sig's at the bottom of the page.
  6. Re:RSS needs better TCP stacks by Salamander · · Score: 5, Insightful

    Leaving thousands upon thousands of connections open on the server is a terrible idea no matter how well-implemented the TCP stack is. The real solution is to use some sort of distributed mirroring facility so everyone could connect to a nearby copy of the feed and spread the load. The even better solution would be to distribute asynchronous update notifications as well as data, because polling always sucks. Each client would then get a message saying "xxx has updated, please fetch a copy from your nearest mirror" only when the content changes, providing darn near optimal network efficiency.
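
    A rough sketch of the notification idea, for illustration only. It is an in-process stand-in for what would really be a networked service, and the names `UpdateHub`, `subscribe`, and `publish` are hypothetical, not part of RSS or any existing library. The point it shows: subscribers hear about a feed only when it changes, so nobody has to poll.

```python
from collections import defaultdict


class UpdateHub:
    """Hypothetical hub that tells subscribers when a feed has changed."""

    def __init__(self):
        # Maps a feed identifier to the callbacks interested in it.
        self._subscribers = defaultdict(list)

    def subscribe(self, feed_id, callback):
        """Register a callback to be invoked when feed_id is updated."""
        self._subscribers[feed_id].append(callback)

    def publish(self, feed_id):
        """Notify every subscriber of feed_id; return how many were notified."""
        callbacks = self._subscribers.get(feed_id, ())
        for callback in callbacks:
            # The message only says "this feed changed"; the client would
            # then fetch the content from whichever mirror is nearest.
            callback(feed_id)
        return len(callbacks)
```

    In this sketch a quiet feed costs the server nothing, whereas hourly polling costs one request per client per hour whether or not anything changed.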

    --
    Slashdot - News for Herds. Stuff that Splatters.
  7. Re:Simple HTTP Solution by jesser · · Score: 3, Insightful

    Even if every RSS reader used HEAD (or if-modified-since) correctly, servers would still get hammered on the hour when the RSS feed has been updated during the hour. If-modified-since saves you bandwidth over the course of a day or month, but it doesn't reduce peak usage.
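
    To make that concrete, here is a minimal sketch of the server-side decision behind If-Modified-Since. The function name `respond` and its return shape are assumptions for illustration, not any particular server's API. Note that the server still has to accept and answer every request; only the body is skipped when nothing changed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(feed_modified, if_modified_since=None):
    """Return (status, headers) for a feed request.

    feed_modified: timezone-aware datetime of the feed's last change.
    if_modified_since: raw If-Modified-Since header value, or None.
    """
    # HTTP dates are in GMT with one-second resolution.
    feed_modified = feed_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(feed_modified, usegmt=True)}

    since = None
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: treat as an unconditional GET
    if since is not None and since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)

    if since is not None and feed_modified <= since:
        return 304, headers  # Not Modified: no body, but still one request
    return 200, headers      # full feed body would be sent
```

    If the feed changed at 10:00 and every client checks at 11:00, every one of them gets the 200 branch at the same moment, which is exactly the peak described above.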

    --
    The shareholder is always right.
  8. Re:Can't this be throttled? by mblase · · Score: 4, Insightful

    Most RSS feeders *should* just download every hour from the time they start

    That's also a problem, though, since most people start work at their computer desks on the hour, or very close to it. The better solution would be for the client (1) to check once at startup, then (2) pick a random number between one and sixty (or thirty or whatever) and (3) start checking the feed, hourly, after that many minutes. That's the only way to ensure a decently random distribution of hits.
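
    The three steps above could look something like this. It is only a sketch: the function name `poll_offsets` and its parameters are made up for illustration, and a real reader would schedule timers rather than return a list.

```python
import random


def poll_offsets(interval_min=60, count=3, rng=random):
    """Return the minutes after startup at which to fetch the feed.

    Step 1: fetch once at startup (offset 0).
    Step 2: pick a random delay between 1 and interval_min minutes.
    Step 3: fetch at that delay, then every interval_min minutes after it.
    """
    delay = rng.randint(1, interval_min)
    return [0] + [delay + k * interval_min for k in range(count)]
```

    Because each client picks its own delay, clients that all start at 9:00 sharp end up spread across the hour for every fetch after the first one.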

  9. Re:It just ain't broadcast.. by fiftyvolts · · Score: 4, Insightful

    You make some very good points. The old saying "When all you have is a hammer, everything looks like a nail" seems to ring true time and time again. These days it seems that everyone wants to use HTTP for everything and quite frankly it's not equipped to do that.

    RSS over SMTP sounds pretty cool. Heck, just sending a list of subscribers an email of RSS and letting their mail clients sort it out would be pretty nice.

    Heh, my favorite posts are when someone suggests something that sounds totally novel and then someone else points out, "Yeah! Like <insert old and underused technology>. It seems to do that damn well." The internet cannot forget its roots!

  10. Solution: HTTP 503 Response for Flow Control by Orasis · · Score: 3, Insightful

    The main problem here is that RSS lacks any sort of distributed flow control, much as the Internet did back in the early days with tons of UDP packets flying around everywhere and periodically bringing networks to their knees.

    One completely backwards-compatible way to add flow control to RSS would be to use the HTTP 503 response when server load is getting too high for your RSS files. The server simply sends an HTTP 503 response with a Retry-After header indicating how long the requesting client should wait before retrying.

    Clients that ignore the retry interval or are overly aggressive could be punished by further 503 responses thus basically denying those aggressive clients access to the RSS feeds. Users of overly aggressive clients would soon find that they actually provide less fresh results and would place pressure on implementors to fix their implementations.
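
    A well-behaved client's side of this might be sketched as follows. The function name `next_poll_time`, the 60-minute `DEFAULT_INTERVAL`, and the plain-dict headers with an exact "Retry-After" key are all assumptions for illustration. HTTP allows Retry-After to be either a number of seconds or an HTTP date, so the sketch handles both.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

DEFAULT_INTERVAL = timedelta(minutes=60)  # assumed normal polling interval


def next_poll_time(status, headers, now):
    """Return the time at which the client should next request the feed.

    status: HTTP status code of the last response.
    headers: dict of response headers (assumed to use the exact key
             "Retry-After"; a real client would match case-insensitively).
    now: timezone-aware datetime of the current moment.
    """
    if status == 503:
        retry_after = headers.get("Retry-After", "").strip()
        if retry_after.isdecimal():
            # Form 1: a delay in seconds.
            return now + timedelta(seconds=int(retry_after))
        try:
            # Form 2: an absolute HTTP date.
            when = parsedate_to_datetime(retry_after)
        except (TypeError, ValueError):
            when = None  # missing or unparseable: fall back to the default
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            return max(when, now)  # never schedule in the past
    return now + DEFAULT_INTERVAL
```

    The server's half is just a load check that answers 503 with a Retry-After value instead of the feed; clients that honor it spread themselves out according to whatever delays the server hands back.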