Twitter Throttling Hits Third-Party Apps

Posted by timothy on Wednesday July 7, 2010 @07:52AM from the you-should-take-some-thritalin-perhaps dept.

Barence writes "Twitter's battle to keep the microblogging service from falling over is having a dire effect on third-party Twitter apps. Users of Twitter-related apps such as TweetDeck, Echofon and even Twitter's own mobile software have complained of a lack of updates, after the company imposed strict limits on the number of times third-party apps can access the service. Over the past week, Twitter has reduced the number of API calls from 350 to 175 an hour. At one point last week, that number was temporarily reduced to only 75. A warning on TweetDeck's support page states that users 'should allow TweetDeck to ensure you do not run out of calls, although with such a small API limit, your refresh rates will be very slow.'"

175/hr is slow? by rotide · 2010-07-07 07:55 · Score: 4, Insightful

Isn't that an update nearly every 20 seconds? How fast do people need to see that you're currently wiping your butt?
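For anyone who wants to check the arithmetic, here is a minimal sketch. The three limits are the ones quoted in the summary; the `feeds` parameter is my own illustration and assumes the hourly budget is split evenly across however many feeds a client queries separately.

```python
SECONDS_PER_HOUR = 3600


def seconds_between_calls(calls_per_hour: int, feeds: int = 1) -> float:
    """Average refresh interval per feed when the hourly budget is split evenly."""
    if calls_per_hour <= 0 or feeds <= 0:
        raise ValueError("calls_per_hour and feeds must be positive")
    return SECONDS_PER_HOUR * feeds / calls_per_hour


# Limits mentioned in the story: 350, 175 and (briefly) 75 calls per hour.
for limit in (350, 175, 75):
    single = seconds_between_calls(limit)
    ten = seconds_between_calls(limit, feeds=10)
    print(f"{limit}/hr: one feed every {single:.1f}s, ten feeds every {ten:.1f}s each")
```

At 175 calls an hour a single feed can refresh about every 20.6 seconds, but a client polling ten feeds separately gets one refresh per feed roughly every 206 seconds.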
1. Re:175/hr is slow? by the_one_wesp · 2010-07-07 07:59 · Score: 5, Insightful
  
  If you're only following a single feed. But I have like 10 lists in TweetDeck that all get individually queried, and there are some who have WAY more than that.
  
  But I am inclined to comment about this bit of "news"... Big. Woop. Twitter's just trying to stay alive. If the service falls over NO UPDATES will happen... at all... Inconvenient, yes, but totally necessary.
2. Re:175/hr is slow? by copponex · 2010-07-07 08:03 · Score: 5, Funny
  
  Isn't that an update nearly every 20 seconds? How fast do people need to see that you're currently wiping your butt?
  It seems you have forgotten how full of shit the average Twit is.
3. Re:175/hr is slow? by Alex+Zepeda · 2010-07-07 08:38 · Score: 3, Informative
  
  The API rate limit is per hour per user (if authenticated) and per IP if not authenticated. Unfortunately the Twitter API does not allow you to aggregate requests even if their web site does (e.g. status updates for all of the people I'm following and all of the things people I'm following have retweeted). If you go through the API docs, you'll find all sorts of horrid-seeming inefficiencies and awkwardness with the API.
  For instance when you request a status (or a list of statuses or whatever) you'll get back: the contents of the tweet, the user name, user id, URL for the user avatar, URL for the user's profile page background image, whether that user is following you, their real name, the number of tweets that user has made, and so-on and so forth. A lot of this information could easily be cached by the client, but is instead sent for every tweet you get back.
  
  --
  The revolution will be mocked
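To make the caching point in the reply above concrete, here is a rough sketch of a client that stores each user's profile once and keeps only a user id on every tweet. The field names (`id`, `text`, `user`) and the `normalize` helper are hypothetical, chosen for illustration; they are not taken from the actual Twitter API schema.

```python
from typing import Any


def normalize(
    statuses: list[dict[str, Any]],
    user_cache: dict[int, dict[str, Any]],
) -> list[dict[str, Any]]:
    """Move the repeated user blob into a cache and return slimmed-down tweets."""
    slim = []
    for status in statuses:
        user = status["user"]
        # Latest copy wins, so profile changes still propagate to the cache.
        user_cache[user["id"]] = user
        slim.append(
            {"id": status["id"], "text": status["text"], "user_id": user["id"]}
        )
    return slim


# Two tweets from the same (made-up) user carry the same profile blob twice.
profile = {"id": 7, "name": "Example User", "avatar_url": "http://example.com/a.png"}
payload = [
    {"id": 1, "text": "first", "user": profile},
    {"id": 2, "text": "second", "user": profile},
]
cache: dict[int, dict[str, Any]] = {}
tweets = normalize(payload, cache)
print(len(tweets), "tweets,", len(cache), "cached profile")
```

This only saves memory on the client; the bandwidth saving the comment is after would need the server to stop sending the profile with every tweet.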
75 updates per hour by VisiX · 2010-07-07 07:57 · Score: 4, Interesting

Any information that needs to be distributed more than once per minute probably shouldn't be relying on twitter.
Are They Employing an Event/Listener Paradigm? by eldavojohn · 2010-07-07 07:57 · Score: 5, Informative

Disclaimer: I'm not familiar with the Twitter API. If the assumptions I make are wrong, I apologize.

Over the past week, Twitter has reduced the number of API calls from 350 to 175 an hour.
Okay, if you're making that many calls to Twitter then there might be an inherent flaw with their RESTful interfaces. I think for a long time, the "web" as we know it has suffered from the lack of the Event/Listener paradigm. This is a pretty simple design concept that I'm going to refer to as the Observer. Let's say I want to know what Stephen Hawking is tweeting about and I want to know 24/7. Now if you have to make more than one call, something is wrong. That one call should be a notification to Twitter who I am, where you can contact me and what I want to keep tabs on--be it a keyword or user. So all I should ever have to do is tell Twitter I want to know everything from Stephen Hawking and everything with #stephenhawking or whatever and from that point on, it will try to submit that message to me via any number of technologies. Simple pub/sub message queues could be implemented here to alleviate my need to continually go to Twitter and say: "Has Stephen Hawking said anything new yet? *millisecond pause* Has Stephen Hawking said anything new yet? *millisecond pause* ..." ad infinitum. I'm not claiming Twitter does this but a cursory glance at the API looks like it's missing this sort of Observer paradigm that allows for the scalability they need.
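A toy sketch of what I mean, in-process only and with nothing Twitter-specific about it. The `Hub` class and the topic strings are made up for illustration: the subscriber registers once, and from then on the publisher pushes to it instead of being asked repeatedly.

```python
from collections import defaultdict
from typing import Callable

Callback = Callable[[str, str], None]


class Hub:
    """Minimal Observer: subscribers register once, publishers push to them."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver message to every subscriber of topic; return how many were notified."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)


received: list[tuple[str, str]] = []
hub = Hub()
# One registration call each for a user and a hashtag, then no more polling.
hub.subscribe("@hawking", lambda topic, msg: received.append((topic, msg)))
hub.subscribe("#stephenhawking", lambda topic, msg: received.append((topic, msg)))
hub.publish("@hawking", "new tweet")
print(received)
```

Across a network the callback would become a delivery over some transport (a message queue, a held-open connection, and so on), which is where the real engineering cost sits.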

I'm not leveling the finger at Twitter, it's a widespread problem that even I have been a part of. Ruby makes coding RESTful interfaces so easy that it's very very tempting to just throw up a few controllers that are basically CRUD interfaces for databases and to call it a day. I suspect that Twitter is feeling the impending pain of popularity right about now ...

--
My work here is dung.
1. Re:Are They Employing an Event/Listener Paradigm? by Late+Adopter · 2010-07-07 08:14 · Score: 2, Interesting
  
  I agree, that's the "right" way to tackle subscription mechanisms. But it's not the right way to tackle Twitter, because one of the defining features of Twitter is its ubiquity: i.e. if you have a phone/computer/netbook that's capable of running any sort of app whatsoever, you can run a Twitter app. As it stands now to write a Twitter client, you need to be able to do HTTP GET requests (every modern environment provides for this) and parse XML. That's it. But to do pub/sub, you'd presumably need to be able to listen, which you can't always do, say, on a smartphone or a Firefox extension.
2. Re:Are They Employing an Event/Listener Paradigm? by Late+Adopter · 2010-07-07 08:17 · Score: 2, Informative
  
  Bad form to reply twice, but I forgot something rather crucial: the "right" way to do this sort of thing might be to offer notifications over XMPP (i.e. Jabber/GTalk). Twitter used to do this, but they couldn't figure out how to keep it running under heavy load (which I would consider a fault on their end rather than as a fault in XMPP as a solution).
  
  XMPP would at least take advantage of established listening pathways (GTalk clients on mobile devices, etc).
3. Re:Are They Employing an Event/Listener Paradigm? by Animats · 2010-07-07 08:40 · Score: 5, Informative
  
  Now if you have to make more than one call, something is wrong. That one call should be a notification to Twitter who I am, where you can contact me and what I want to keep tabs on--be it a keyword or user.
  That's not easy to do on a large scale. A persistent connection has to be in place between publisher and subscriber. Twitter would have to have a huge number of low-traffic connections open. (Hopefully only one per subscriber, not one per publisher/subscriber combination.) Then, on the server side, they'd have to have a routing system to track who's following what, invert that information, and blast out a message to all followers whenever there was an update. This is all quite feasible, but it's quite different from the classic HTTP model.
  It's been done before, though. Remember Push technology? That's what this is. PointCast sent their final news/stock push message in February 2000. There's more support for "push" in HTML5, incidentally.
  If you really wanted to scale this concept, the thing to do would be to rework a large server TCP implementation so that it used a buffer pool shared between connections, rather than allocating buffers for each open connection. The TCP implementation needs to be optimized for a very large number of mostly-idle connections. Then implement an RSS server with slow polling, so that the client makes an RSS query which either returns new data, waits for new data, or times out in a minute or two and returns a brief "no changes" reply. Clients can then just read the RSS feed, and be informed immediately when something changes. A single server should be able to serve a few million Twitter-type users in this mode.
  The client side would encode what it was "following" in the URL parameters. The server side needs a fabric between data sources such that changes propagate from sources to front servers quickly, and then on each front server, all the RSS feeds for all the followers for the changed item get an update push.
  There's a transient load problem. If you have 50,000,000 users, each following a few hundred random users, load is relatively uniform and it works fine. If you have 50,000,000 people following World Cup scores, each update will force 50,000,000 transactions, all at once. All the clients get a notification that something has changed. So they immediately make a request for details (the picture of someone scoring, for example). All at the same time. However, if you arrange things so that the request for details hits a server different from the one that's doing the notifications, ordinary load-balancing will work.
4. Re:Are They Employing an Event/Listener Paradigm? by DragonWriter · 2010-07-07 09:14 · Score: 3, Informative
  
  Okay, if you're making that many calls to Twitter then there might be an inherent flaw with their RESTful interfaces. I think for a long time, the "web" as we know it has suffered from the lack of the Event/Listener paradigm. This is a pretty simple design concept that I'm going to refer to as the Observer [wikipedia.org].
  
  For messaging architectures (like, say, the internet), the pattern is usually described as "Publish/Subscribe". All serious messaging protocols support it (XMPP, AMQP, etc.) and some are dedicated to it (PubSubHubbub). The basic problem with using it the whole way to the client is that many clients run in environments where it is impractical to run a server, which makes receiving inbound connections difficult.
  There are fairly good solutions to that, mostly involving using a proxy for the client somewhere that can run a server which holds messages, and then having the client call the proxy (rather than the message sources) to get all the pending messages together.
  
  I'm not leveling the finger at Twitter, it's a widespread problem that even I have been a part of. Ruby makes coding RESTful interfaces so easy that it's very very tempting to just throw up a few controllers that are basically CRUD interfaces for databases and to call it a day.
  
  Given what's been published about Twitter in the past (including them at one point building their own message queueing system because none of the existing ones that they tried seemed adequate), I don't think what they've done is as simplistic as that on the back-end, though they may be forcing third-party apps through an API which makes it seem like that's what is going on (and produces inefficiencies in the process.)
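Pulling together the ideas in the replies above (an inverted "who follows what" index, a per-subscriber holding area for pending messages, and a slow poll that either returns new data or times out with "no changes"), here is a small in-process sketch using asyncio. The `FrontServer` class, its method names and the timeouts are all assumptions made for illustration; a real deployment would sit behind HTTP and span many machines.

```python
import asyncio
from collections import defaultdict


class FrontServer:
    """Toy front server: inverted follower index plus one mailbox per subscriber."""

    def __init__(self) -> None:
        # publisher -> set of subscribers (the inverted follow graph)
        self.followers: dict[str, set[str]] = defaultdict(set)
        self.mailboxes: dict[str, asyncio.Queue] = {}

    def follow(self, subscriber: str, publisher: str) -> None:
        self.followers[publisher].add(subscriber)
        self.mailboxes.setdefault(subscriber, asyncio.Queue())

    def publish(self, publisher: str, text: str) -> int:
        """Fan the update out to every follower's mailbox; return the fan-out size."""
        targets = self.followers.get(publisher, set())
        for subscriber in targets:
            self.mailboxes[subscriber].put_nowait((publisher, text))
        return len(targets)

    async def long_poll(self, subscriber: str, timeout: float) -> list[tuple[str, str]]:
        """Return pending messages, waiting up to timeout seconds for the first one."""
        box = self.mailboxes.setdefault(subscriber, asyncio.Queue())
        try:
            first = await asyncio.wait_for(box.get(), timeout)
        except asyncio.TimeoutError:
            return []  # the brief "no changes" reply
        pending = [first]
        while not box.empty():
            pending.append(box.get_nowait())
        return pending


async def demo() -> tuple[list[tuple[str, str]], list[tuple[str, str]]]:
    server = FrontServer()
    server.follow("alice", "worldcup")
    waiting = asyncio.create_task(server.long_poll("alice", timeout=1.0))
    await asyncio.sleep(0)  # let the poll start waiting before anything is published
    server.publish("worldcup", "goal")
    got = await waiting
    idle = await server.long_poll("alice", timeout=0.05)  # nothing new: times out
    return got, idle


result = asyncio.run(demo())
print(result)
```

The client still only ever makes outbound requests, so this keeps the "HTTP GET and a parser" simplicity that polling clients rely on, while the server decides when each request returns.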
It's time to ditch the NoSQL bullshit. by Anonymous Coward · 2010-07-07 08:08 · Score: 4, Insightful

It's high time that the so-called "Web 2.0" companies ditch the NoSQL bullshit they've started to put into place. It's not bringing the scalability benefits they all claimed it would, and it's leading to data with very questionable reliability otherwise (not that their data is particularly valuable in the first place...)
A lot of these scalability problems could be solved by using a proper RDBMS on proper hardware that's designed to handle huge concurrent workloads. This level of traffic isn't new by any means. There are many POS systems around the world, from retail operations to airlines, that deal with a similar level of "traffic".
It doesn't matter if they go with a database and hardware stack from Oracle, or a DB2 and hardware stack from IBM, or even use Sybase's ASE on hardware from HP. They just need to invest in some real hardware and some real database systems that are meant for dealing with absolutely huge loads.
Ditch NoSQL databases. Ditch shitty servers. Start using real software, and start using real hardware. That's what other businesses do when they "grow up". If twitter is a viable business, it's time for them to grow up, too.
1. Re:It's time to ditch the NoSQL bullshit. by Amouth · 2010-07-07 09:13 · Score: 2, Interesting
  
  Lowes hardware does - there is a local server in the store that serves as a caching server only if the main trunk fails.
  
  --
  '...if only "Jumping to a Conclusion" was an event in the Olympics.'
2. Re:It's time to ditch the NoSQL bullshit. by Miseph · 2010-07-07 09:13 · Score: 4, Informative
  
  Debit card processing systems require real-time access to the full network for every single transaction. PIN numbers cannot be cached locally, and must be validated before completing the transaction.
  
  --
  Try not to take me more seriously than I take myself.
3. Re:It's time to ditch the NoSQL bullshit. by Knux · 2010-07-07 09:50 · Score: 3, Interesting
  
  Any telecom does way more than that.
  
  I've worked in a big telecom with 40 million+ clients and I've seen an 8-node Oracle RAC responsible for the whole pre-paid client database handle far, far more transactions and queries than Twitter says it does.
  
  Each regional server responsible for authorizing the calls has a 2-node Oracle RAC, and it too handles far more transactions and queries than Twitter.
  
  So, there you go... The excuse to use NoSQL was that it is quicker in some cases. It's not; time to move back to an RDBMS.
4. Re:It's time to ditch the NoSQL bullshit. by LWATCDR · 2010-07-07 10:06 · Score: 2, Insightful
  
  It is all about bang for the buck. I do not think that anyone has ever said that you cannot scale a SQL server to handle a Twitter-like load. The question is one of cost.
  I am sure that you could handle the load with DB2 on a z Machine also, but at what cost?
  I am actually a big fan of SQL and find NoSQL to be extremely cumbersome.
  But then I really do not have a need to scale that big.
  I am just not willing to write off NoSQL yet. I know that Google has used it for some things.
  But when you are talking about Twitter, a key is the cost per transaction. That must be very low. And if I have to wait 30 seconds for a tweet, that is a good trade-off for me.
  
  --
  See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
5. Re:It's time to ditch the NoSQL bullshit. by jo42 · 2010-07-07 11:57 · Score: 2
  
  Start using real software, and start using real hardware. That's what other businesses do when they "grow up". If twitter is a viable business, it's time for them to grow up, too.
  Well that's the problem right there. Many of these 'businesses' (Twitter, Foursquare, Gowalla, etc.) are not viable businesses. They are still bleeding their initial (or secondary, or tertiary) funding rounds dry with expenditures greatly outpacing their receivables (if any). Any hope of being around in a few years is if someone comes along and buys them out or dumps even more $$$$ into their black holes. Or get into advertising - where Google is the million tonne gorilla that you have to compete with.
Protocol overhead by ickleberry · 2010-07-07 08:08 · Score: 3, Interesting

I wonder if it would have much of an impact if they switched from the verbose JSON/XML over HTTP formats for the API to a binary UDP-based protocol. Twitter seems well suited to such a protocol since it is so simple and the messages are so short.
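As a rough way to see the size difference, here is a sketch that encodes the same message as JSON and as a fixed binary header plus UTF-8 text. The three-field layout (status id, user id, text length) is invented for illustration and is not any real Twitter wire format. It also ignores HTTP headers, and UDP itself would not guarantee delivery or ordering.

```python
import json
import struct

# Hypothetical header: 8-byte status id, 8-byte user id, 2-byte text length,
# all in network byte order with no padding (18 bytes total).
HEADER = struct.Struct("!QQH")


def encode_binary(status_id: int, user_id: int, text: str) -> bytes:
    body = text.encode("utf-8")
    return HEADER.pack(status_id, user_id, len(body)) + body


def decode_binary(packet: bytes) -> tuple[int, int, str]:
    status_id, user_id, length = HEADER.unpack_from(packet)
    text = packet[HEADER.size:HEADER.size + length].decode("utf-8")
    return status_id, user_id, text


def encode_json(status_id: int, user_id: int, text: str) -> bytes:
    payload = {"id": status_id, "user_id": user_id, "text": text}
    return json.dumps(payload).encode("utf-8")


sample = (123456789, 42, "Just setting up my example tweet")
binary_packet = encode_binary(*sample)
json_packet = encode_json(*sample)
print(len(binary_packet), "bytes binary vs", len(json_packet), "bytes JSON")
```

The saving per message is real but small in absolute terms, so whether it matters depends on whether bandwidth or something else (processing, database load) is the actual bottleneck.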

Is it that they are doing too much processing on the data, wasting too much bandwidth, or is their database causing trouble? Since it's Twitter, obviously any bandwidth used is a waste, but you know what I mean.
Re:Monty Python by spazdor · 2010-07-07 08:20 · Score: 4, Insightful

Company bases a business model on offering their resources for free, only to discover to their chagrin that people will take them up on it. Where oh where have I heard this one before?

--
DRM: Terminator crops for your mind!
Twitter has jumped the shark by Locke2005 · 2010-07-07 08:52 · Score: 2

Nobody goes on Twitter anymore -- it's too crowded! (With apologies to Yogi Berra.)

--
I've abandoned my search for truth; now I'm just looking for some useful delusions.
Re:Uhh, what are you talking about? by Anonymous Coward · 2010-07-07 08:52 · Score: 3, Insightful

What about clients who don't have a constant connection to the Internet, or who have a dynamic IP? Now twitter has to poll them, to see if they exist. You end up with the same situation, except worse.
E-mail seems to be doing just fine, despite these "shortcomings".
Re:Uhh, what are you talking about? by bored · 2010-07-07 08:58 · Score: 2

Yawn, just because the client isn't polling (REST is just a way of saying polling to make people feel better) doesn't mean this doesn't work on just about every damn device out there. TCP keep-alives are supported by all the major TCP stacks and all the minor ones I've ever used (although not strictly required per RFC 1122). With reasonable configuration parameters for maintaining connections with little data transfer, it's possible to keep a port open for basically an indefinite time period. Once the port is open, it's going to consume server resources (and having more than a few 10k ports per IP is a problem, and is itself probably a good reason for having some kind of periodic queue poll type mechanism), but it's going to significantly lower the bandwidth vs a polling mechanism.
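For reference, turning keep-alives on from a client is only a few lines. This is a sketch using Python's standard `socket` module; `SO_KEEPALIVE` is widely available, while the tuning constants (`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`) are platform-specific, so the code skips any that are missing. The `enable_keepalive` helper and the timing values are arbitrary examples, not recommendations.

```python
import socket


def enable_keepalive(
    sock: socket.socket, idle: int = 60, interval: int = 15, probes: int = 4
) -> None:
    """Turn on TCP keep-alive and, where the platform allows, tune its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of silence before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # failed probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:  # not every OS exposes every knob
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


# Options can be set before connecting; no network traffic happens here.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print("keep-alive enabled:", keepalive_on)
```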
That said, a big part of the problem is HTTP, and the insistence on using it as an API data transport even when it's not well suited for that. Even worse, though, is the use of web servers like Apache that consume significant resources for keep-alive transactions. To be fair, Apache was designed more for an environment where a lot of different machines were connecting for short periods of time, and then they were done. The HTTP 1.1 keepalive mode didn't mesh well with the one process per connection model, and works only marginally better using the one thread per connection model now in use.
So, basically I don't think any of your arguments hold. Even over actual network failures, client standby, network changes, etc. The client will be notified of connection loss and can simply reconnect. Once reconnected, queued notifications can be issued, or the client can repoll before reconstructing the notification system.
Frankly, as someone who works with extremely high bandwidth (many GBytes/sec), high IO rate systems (100k/sec transactions) per node, I'm shocked at the problems Twitter has. Fundamentally, I'm betting someone who didn't have to deal with the BS could get the whole system running on a few fairly high power server nodes. The entire data set probably could be fit in RAM on a modern high end server. It's not like they are moving a lot of multiple MB messages around, or running really complex searches.
Just imagine what google would be like if written the same way.
Perhaps it's a monetary issue... by statemachine · 2010-07-07 21:04 · Score: 2

Does Twitter make money? I'm not trolling, I'm serious. A quick search yields this article:
http://www.pcworld.com/businesscenter/article/200635/twitter_to_promote_marketers_special_offers.html
Even the author of my linked article has doubts. If I wasn't making money, I'd try to limit my expenditures (bandwidth costs, etc.) too. It's not surprising to me.
So how do they make money?
genius web2.0 marketyng idea by rawtatoor · 2010-07-08 01:34 · Score: 2, Insightful

Twitter is a fundamentally stupid idea. It is like trying to run all of the mailing lists in the world from one server (and by 'like' I mean exactly the same). The end result is half as useful and twice as shitty. Seriously, write a web2.0 listserv interface and you will amaze tweeters. You can tweet with email, holy cow!
Yes, it's a mailing list, surprise!