Slashdot Mirror


How Much Bandwidth is Required to Aggregate Blogs?

Kevin Burton writes "Technorati recently published that they're seeing 900k new posts per day. PubSub says they're seeing 1.8M. With all these posts per day how much raw bandwidth is required? Due to inefficiencies in RSS aggregation protocols a little math is required to understand this problem." And more importantly, with millions of posts, what percentage of them have any real value, and how do busy people find that .001%?

2 of 209 comments

  1. Re:How much? If everyone GZipped, a lot less! by Madd+Scientist · Score: 5, Informative
    i used gzip with apache at an old job and we ran into a problem with it... some obscure header problem in conjunction with mod_rewrite.

    so i wouldn't say ANY site using apache... but probably most. the real problem there is the compression load on the servers... gzip compression doesn't just happen, you know; it takes CPU cycles that could otherwise be used to just push data rather than encode it.
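
To put rough numbers on the bandwidth-versus-CPU trade-off described above, here is a minimal Python sketch that gzips a synthetic RSS-like payload and reports the size before and after, plus the time spent compressing. The feed contents, the `build_sample_feed` and `measure_gzip` helper names, and the compression level are all assumptions made for illustration; a real feed will compress differently, and timings depend entirely on the machine.

```python
import gzip
import time


def build_sample_feed(n_items: int = 50) -> bytes:
    """Build a synthetic RSS-like document (hypothetical content, not a real feed)."""
    items = "".join(
        f"<item><title>Post {i}</title>"
        f"<link>http://example.com/posts/{i}</link>"
        f"<description>Body text of post number {i}.</description></item>"
        for i in range(n_items)
    )
    return f'<rss version="2.0"><channel>{items}</channel></rss>'.encode("utf-8")


def measure_gzip(payload: bytes, level: int = 6):
    """Return (raw size, compressed size, seconds spent compressing)."""
    start = time.perf_counter()
    compressed = gzip.compress(payload, compresslevel=level)
    elapsed = time.perf_counter() - start
    return len(payload), len(compressed), elapsed


if __name__ == "__main__":
    raw, packed, seconds = measure_gzip(build_sample_feed())
    print(f"{raw} bytes -> {packed} bytes in {seconds * 1000:.3f} ms")
```

Running it on a sample of your own feeds gives a first estimate of how many bytes you save per request against how much CPU time each response costs.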

  2. Gzip helps, but the real win is conditional get by epeus · Score: 4, Informative

    If your weblog server implements ETag and Last-Modified, my spider can send a one packet request with the values I last saw from you, and you can send a one packet 304 response if nothing has changed.

    Charles Miller explained this well a few years ago.

    (I run the spiders at Technorati).
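
The conditional GET exchange described in the second comment can be sketched in Python with the standard library's `urllib`. This is only an illustration under stated assumptions, not Technorati's spider: the `conditional_headers` and `fetch_if_changed` names are hypothetical, and the caller is assumed to have stored the `ETag` and `Last-Modified` values from the previous fetch.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the validator headers for a conditional GET from the values last seen."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None, timeout=10):
    """Return (body, etag, last_modified); body is None when the server answers 304."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError; nothing changed,
        # so keep the validators we already have.
        if err.code == 304:
            return None, etag, last_modified
        raise
```

A spider would call `fetch_if_changed` on each poll, save the returned validators alongside the feed URL, and skip parsing whenever the body comes back as `None`.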