Slashdot Mirror


On World of Warcraft's Network Issues

alphaneutrino writes to mention a C|Net article discussing some of the recent problems the World of Warcraft playerbase has experienced. From the article: "'Being a system administrator myself, I have some understanding of what goes on in a corporate data center,' said Evgeny Krevets, a sometimes-frustrated WoW player. 'I don't know Blizzard's system setup. What I do know is that if I kept performing 'urgent maintenance' and taking the service down without warning for eight-hour periods, I would be out of a job.' Blizzard blames some of the problems--such as the disconnection, for several hours on Friday, of players linked to several servers--on AT&T, its network provider. (AT&T did not respond to a request for comment.)"

33 of 407 comments

  1. A typical week on Mal'Ganis by SaguratuS · · Score: 5, Informative

    Sunday: The day the server stood still
    Monday: *gasp*, playable (until 11pm)
    Tuesday: Weekly Maintenance Day. Nothing else EVER needs to be said about this day.
    Wednesday: Playable (until 11pm), good chance maintenance aftermath.
    Thursday: The 10 second instant-casts day for MC & BWL.

    Yeah, it goes on. Our server reliably bites the dust around 11pm every night for 6 hours, not to mention the constant plague of login issues and 30-minute loading screens during peak hours. Funny how this is all on a low-medium population server.

    1. Re:A typical week on Mal'Ganis by Anonymous Coward · · Score: 4, Funny

      What are Friday and Saturday? Go outside in the scary real world days?

    2. Re:A typical week on Mal'Ganis by whoop · · Score: 3, Informative

      Yes. If your server reaches some limit on the number of people, you get put into a waiting queue. The wait time of this queue varies, it seems. A few months ago, it was an average of 15-25 minutes. In recent days, there haven't been any wait times (though I don't play hardcore so it might vary day-to-day).

    3. Re:A typical week on Mal'Ganis by drdewm · · Score: 3, Insightful

      I love that this is all the same as with the Everquest servers. People constantly said that they would not buy from Sony etc. again because of the problems, nerfs, lack of support etc. It seems as if these issues are inherent to MMORPGs.

    4. Re:A typical week on Mal'Ganis by CTachyon · · Score: 3, Funny

      They BROKE a continent? I hope it was still under warranty...

      --
      Range Voting: preference intensity matters
  2. Ahhh.... by popeguilty · · Score: 3, Funny

    ...so THAT'S how Blizzard is combatting server lag.

  3. 8 hours? Coincidence? by ziggamon2.0 · · Score: 5, Funny

    Maybe it's the Blizzard guys' moms that come in and say "Enough of those stupid games already, go to bed!"? ;)

    Or are they too cool to be running the servers out of their parents' basements like the rest of us?

  4. Last heard from the WoW datacenter... by fuyu-no-neko · · Score: 5, Funny

    "Well, at least I have chicken!"

    --
    Don't take the above poster too seriously. He doesn't.
  5. Re:$15/mo times six million users.... by Anonymous Coward · · Score: 5, Funny

    Blizzard, I can guarantee this: if you spend $35 million per month on refactoring, hardware and bandwidth, all your shareholders go away. Guaranteed. I promise.

  6. Re:$15/mo times six million users.... by stevesliva · · Score: 3, Funny

    Just like with nine women you can have a baby in one month.

    --
    Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
  7. Re:wow by JavaLord · · Score: 4, Insightful

    I've barely even seen any issues since patch 1.10. I think patch day the servers were down all day, but that's to be expected.

    Server performance varies from realm to realm. I hadn't really had any issues until the last week or two when my server decided to drop 40 minutes into our 45 minute baron run, and then again in the BG's later on.

    As someone else mentioned, I think they are still a victim of their own success. Sure it's been over a year since launch, but they were expecting 250,000 subscribers and got 6,000,000.

  8. Re:It seems... by Armando_Mcgillicutty · · Score: 4, Funny

    No one except the 6 Million users that play the game.

  9. Code patches? by lawaetf1 · · Score: 3, Interesting

    I'm not a WoW player but if it's true that these systems regularly go dark for 8 hours at a time I have to wonder if they're not racing through some software patch. In other words, I don't know an architecture out there that can't be rebooted in 8 hours so a straight-up crash seems unlikely. I would assume they've taken care of scalability problems by now so system load / tablespace, etc, ought to not be an issue.

    Could it be that WoW suffers constant attempts at subverting the framework of play ... and some succeed, requiring a quick patch to the code base? I wouldn't doubt that they have monitoring mechanisms in play which detect unreasonable changes in a character's level / gold, etc.

    --
    CommentBot 0.7a running with args "-module irritate,disagree -target random"
  10. Nothing new for MMORPGs by Speare · · Score: 4, Interesting

    Having been intimately involved with the server management of one of the first graphical MMORPGs (3DO's Meridian 59), all I can say is that this is nothing new for MMORPG server clusters or services.

    Our game had its server problems and we were in "learning mode" to deal with some major outages, major gameplay renovations, major strife from jerks, and major socio-legal issues behind the scenes such as player-to-player harassment and real-life stalking. EA/Origin's Ultima Online started later and had some of the same issues in an almost predictable order and timing. Then EverQuest repeated our mistakes, and so on.

    I would think that as an industry, as a set of geeks, we MMORPG server managers would learn from each other's mistakes, but apparently, we do not. It is also a problem that the management in *product* companies thinks it is easy to become a world-class *service* company, where the service is being sold to thousands to millions of *household* mass market customers.

    --
    [ .sig file not found ]
    1. Re:Nothing new for MMORPGs by garylian · · Score: 4, Interesting

      You are quite correct, just about every game has had these kinds of problems, especially if they break new ground in subscription numbers. EQ had a lot of problems at launch, and for the first year or so. UO did have problems, as well. Blizzard certainly blew away everyone with its subscription numbers with WoW.

      However, Blizzard has really dragged their feet when it comes to fixing things. The article makes it sound like this is a recent phenomenon for WoW, but it has been around since the game was first released.

      Granted, they didn't anticipate quite the initial subscription numbers they got, but within weeks we saw login queues show up, and Blizzard hastily added more servers. In fact, I do believe the extra servers they added happened to be all that they had originally contracted for, so they used up that "growth servers" room right away. At that point they had maxed out their server capacity with their ISP, and they were sorta screwed. Not that they couldn't have thrown money at the issue, but this is a game company owned by a media company. Throw money at the problem? Bwahahahaha

      Heck, I was on one of the original "terrible 20" servers; Uther. It was down so much it was scary. I think I ended up with more than 2 weeks of free play time for service outages, and probably closer to a full month.

      Also, this whole thing about "a patch caused a new set of problems" is also not new for Blizzard and WoW. Every patch they did for the first several months would break half the server lag fixes they put in. Loot lag was so bad you could be stuck for more than a minute looting a corpse. From launch to when I quit playing 9 months later, they still had the problem of ore nodes and/or harvest nodes that would lock your toon up because it had nothing on it but failed to clear. I suspect that bug is still in place, but I don't care anymore. After a while, things got better, but as the queues came back, so did the content breaking patches, and the wife and I got out. Heck, we were 60, and bored.

      What is different is that most of these game companies have had their act together after 1 year, give or take a few months. It's been what, about 16 months since WoW was first released? They should really have their act together about now, or damn close to it. But they don't.

  11. Re:$15/mo times six million users.... by stupidfoo · · Score: 3, Insightful

    The problem doesn't seem to be how much they spend but where they spend their money. According to the article AT&T seems to be their only network provider. Who thinks that makes sense? To have such a huge bandwidth hungry product and rely on one provider for it. I would never host a commercial web site on a host with a single provider, let alone a huge undertaking like WoW.

    But, then again, I may also be an idiot... who knows?

  12. More Crafty by Doc+Ruby · · Score: 4, Insightful

    Sounds like WoW has a house of cards network with single point of failure architecture problems.

    And that AT&T is exploiting them, marketing a new "premium service/support" contract by letting them go down.

    I can't wait until WoW has to pay AT&T (and its handful of competitors, if they get rid of the SPOF) the extra "premium tier" routing fees, once the telcos market their "nonneutral" Internet. Because a world of angry Warcraft players jonesing for their fix will be a nice gift for telco suits just trying to make it home from work.

    --
    make install -not war

  13. NSA Agents Hot on the Trail of Horde Terrorists by Anubis333 · · Score: 5, Funny

    It's hard for AT&T to cater to so many millions of users *AND* filter/direct all of their customer data illegally and directly to the NSA.

  14. Service Providers In General by pandrijeczko · · Score: 4, Interesting

    Not a direct comment about Blizzard (I don't even play WoW) but I am totally disgusted with the way some service providers treat us general Joe Public customers.

    As an example, I came home from holiday (I'm in the UK) on Sunday evening & I immediately noticed my ADSL connection was down. So I phoned my ISP to report the fault, only to be told that they knew about the problem - a faulty server had been down for 48 hours!!! And when the tech support person could not tell me when the service would be restored, she seemed totally bemused as to why I was angry about the duration of downtime & demanded to speak to her manager.

    The manager was even worse... polite and courteous but did not have a clue as to the cause of the problem or when the ADSL service would be back up. He even admitted that they'd been making some network changes to accommodate a recent merger with another company and that they had no backup server to put in place to at least give some degree of restricted service.

    I may pay (the equivalent of) $30 a month for my ADSL service but am I the only person who expects good service from any company I deal with, whether I spend £3 or £30,000 with that company? I accept that sometimes there are service outages, I'd even view an 8-hour outage a few days a year as being understandable. But 48 hours???

    I've been in the telecoms/computer industry for about 20 years now and I've seen the whole perception of what is and isn't good customer service change over that time - it seems now that customers are forced to accept worse service because every company has reduced the level of service they give.

    And when it comes to poor Joe Public "peons" like ourselves, who only spend a small amount each month with these companies, we're expected to endure countless menu selections, long delays in call-centre queues and lengthy outages as a matter of course.

    It would be good to see a lot more people complain more and cancel their services with some of these providers - I'm sure this is the only way that they will be forced to offer better service to us.

    --
    Gentoo Linux - another day, another USE flag.
  15. Re:Network and other WoW performance issues by EnronHaliburton2004 · · Score: 3, Funny

    This was a business decision, not a technical decision. Probably an exclusive agreement made with the PHB in charge of WoW.

    "But why do we need two providers? ATT has assured me that they can provide all the bandwidth we need, and that they have failover capability! *plus* their datacenters are built on SPRINGS!"

  16. Where it really shows by Vicegrip · · Score: 3, Informative

    The problem really is visible when you are adventuring in difficult-to-beat places. You depend on having your team perform to the best of their ability. It is then so frustrating to be constantly dealing with part of your team getting disconnected or being lagged to the point of ineffectiveness.

    My guild is doing MC, BWL, ZG and AQ20 right now. It is a regular occurrence to wait 20 minutes to start a fight because of disconnected people, only to then lose that battle because you lost two priests to a disconnect during it.

    The anger may not be at the threshold point yet, Blizzard, but it is most definitely building fast. The thing about angry customers is that there is a point of no return when they are forever lost. Blizzard has a lot of customers right now, but they would lose them fast if somebody else stepped up with a great game and more reliable game play.

    Blizzard, you executed very very well on game content by effectively removing much of the grind that other games are plagued with, but you have failed with customer interaction. Some of your representatives treat your customers with borderline contempt (Tseric) and you fail miserably at explaining properly the multitude of changes you make to the game.

    Blizzard, your six million customers are waiting; it's your move, take too much time and you could lose them. Start with being public about your server improvement plans, telling people what you're doing and why and how it's going to make things better. Not knowing when things are going to get better is really making people angry.

    --
    Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
  17. Re:$15/mo times six million users.... by otis+wildflower · · Score: 3, Funny

    Given their reliance on only ATT as their network provider, this is precisely the problem they have _now_, and what they need to spend bucketloads of cash fixing.

    They need multiple sites around the world, with multiple OC192s to multiple providers, all BGP'd to the gills. They need to buy dark fiber and light that shit up.

    Then again, why bother, it's not like it's a free market out there and there won't be any competitors to WoW that can get their act together, right? I mean Blizzard owns the patents on MMORPGaming, right?

    Oh wait.

  18. Ill communication by Phanatic1a · · Score: 4, Insightful

    A large part of the problem is that Blizzard's communication with the player base sucks, to speak frankly. The login server for their forums seems to be one and the same as the login server for the game itself, so when that goes down the forums tend to shut down as well. There is a "Realm Status" page which purports to show the real-time status of the various servers, but which is frequently unreachable. There is a "Realm Status" forum which *might* contain some acknowledgement of a problem while the problem is still ongoing, but usually doesn't. When you start up the game client, Blizzard can stick up a 'News' window on your screen but, again, the appearance of any news often lags the problem, even severe problems, by a matter of hours. And, of course, Blizzard's chief form of communication with players is Community Managers on the forums, who themselves tend to be given dick in the way of information, are extremely controlled in what they can and cannot say, and who are (honestly, I'm not joking) tasked with yelling at users for stuff like posting subject headers that contain excessive capitalization; what an obscene waste of resources.

    Seriously, a little timely information goes a long way. Yes, I agree that the downtime they have is absurd; consider that *every Tuesday* the game goes offline for *six hours* of maintenance. That's *planned, scheduled* downtime, folks, so that *alone* means they aren't even attempting to have greater than 96.4% uptime, and I can't think of another commercial service for which you pay a monthly fee where that would be even remotely acceptable; if your cable or your phone just plain didn't work for 6 hours every Tuesday, heads would roll. Then things just get asinine when you factor in all the spontaneous, freewheeling, unplanned downtime as well.
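    The 96.4% figure above is easy to check. A minimal sketch, assuming the only downtime in a week is the six-hour Tuesday window (the function and constant names are just illustrative):

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def max_uptime_percent(planned_downtime_hours: float) -> float:
    """Best-case weekly uptime if planned maintenance is the only downtime."""
    return 100.0 * (HOURS_PER_WEEK - planned_downtime_hours) / HOURS_PER_WEEK


weekly_maintenance_hours = 6  # the Tuesday window described above
print(f"{max_uptime_percent(weekly_maintenance_hours):.1f}%")  # 96.4%
```

    Any unplanned outage only pushes the real number lower than this ceiling.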

    But know what? I'd feel a lot better about it if, when something shits the bed, or goes tits-up, or whatever colorful metaphor you'd use to describe a server-killing technical problem, Blizzard would tell us, promptly, as they receive the information themselves:

    1. We know there's a problem.
    2. We know what the problem is.
    3. Here's what we're doing to fix it.
    4. Here's when we expect it to be fixed.
    5. Update as old information is obsolete.

    They don't do this. A few hours after something happens, you might get some of the above information. Or you might not. Usually, it's the latter.

  19. Re:Oh please... by pandrijeczko · · Score: 3, Insightful

    But then the other argument is that it's people like you, who endure these outages without complaint, who make it bad for everyone else?

    I don't play any online games but I thought the whole idea of them was that you subscribe to that service for it to be available just about 24x7 whenever you feel like jumping in. Sure, occasional outages are to be expected but if it gets to the stage where the game is frequently slow or unavailable, the common sense solution would be to cancel your subscription until Blizzard (or whomever) improves the service they deliver you. If enough people did this, they'd have to do something about it...

    I'm sorry but I think far too many people have become "slaves" to marketing by truly believing that they simply cannot do without a lot of the products & services that they pay good money for - to the point where they "need" those items so much that they're afraid of complaining in case they're denied those things completely.

    --
    Gentoo Linux - another day, another USE flag.
  20. thank you AT&T! by tidokoro · · Score: 5, Funny

    WOW server downtime is saving my marriage.

    --
    tidokoro
    what turns a man's karma neutral? lust for gold? power? or just a heart born full of neutrality?
  21. Re:$15/mo times six million users.... by Ryan+Amos · · Score: 4, Interesting

    Because they have to pay developers, bandwidth fees, datacenter fees, customer service people, billing people, web designers, janitors, office supplies, and basically everything else it takes to run a business. $35 million / month with probably 15-20 million a month in overhead.

    Yes they are making money (businesses are allowed to do this, remember?) Re-architecting a massively distributed game like this takes time *and* money. They underbuilt their infrastructure to begin with, which is where they really went wrong. They are supposedly trying to remedy that, but by the time you have re-architected the system it has grown to the point where you have to do it again.

    Also, they're pulling so much bandwidth from so many disparate places that when a link close to them goes down, all the other links have to compensate and there aren't necessarily enough fat pipes close to their datacenters to allow everyone on. I would be curious to see what percentage of traffic flowing over certain core routers can be attributed to World of Warcraft; I am betting it is non-trivial.

  22. Same problem in Germany by MorteSicura · · Score: 4, Insightful

    If these problems are really related to AT&T, then why do we Germans experience exactly the same problem? Over here T-Online is the bad guy. To solve the problem, Blizzard even suggested altering the MTU for your DSL connection to 1400. I don't know how many people have ever heard of a thing called MTU. (the common people, not the nerds here ;-) ) Blizzard should ask themselves why the whole IT infrastructure is having problems with their product and whether it is really the ISPs' fault.
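    For anyone who hasn't met MTU before: it caps the size of each packet on a link, and packets larger than the smallest MTU along the path get fragmented or dropped. A rough sketch of what an MTU of 1400 means for TCP payload per packet, assuming plain IPv4 and TCP headers with no options (why Blizzard picked 1400 specifically isn't stated, so treat the comparison values as illustration only):

```python
IPV4_HEADER_BYTES = 20  # minimum IPv4 header, no options
TCP_HEADER_BYTES = 20   # minimum TCP header, no options


def tcp_payload_per_packet(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


# 1500 is standard Ethernet, 1492 is typical for PPPoE-based DSL,
# 1400 is the value suggested in the comment above.
for mtu in (1500, 1492, 1400):
    print(mtu, tcp_payload_per_packet(mtu))
```

    Dropping to 1400 costs a little payload per packet but leaves headroom for links that add their own encapsulation overhead.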

  23. Re:What I love about patches and hotfixes... by Glonoinha · · Score: 3, Insightful

    Actually, that's how software maintenance happens in the real world.

    Real code is complex, and generally written as a massive matrix of inter-related side-effects causing things to happen*. When it gets written, the entire matrix is designed, intended, documented, and understood. Two years later the guys working on the code have no clue about the matrix of side-effect driven code, no clue about the complex set of business factors driving the technical aspects of the code (and by business factors, in a MMORPG I mean things like class X has bad faction with everybody making it more difficult for him to start out, but in return for overcoming that challenge has more powerful magic later in life - stuff like that) and when they are making a change they go in, find the one line of code that looks like what needs to be fixed and just change it without knowing all the places that change will ripple back to, invisibly, via the side-effect matrix.

    A technical phrase to understand here is 'globally scoped variables' - and another one is 'design intent' - and as the current set of hacks don't understand the ramifications or scope of either, this is what happens.
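    A toy illustration of the globally-scoped-variable trap, with entirely made-up game rules (none of these names come from any real codebase). One global quietly drives two unrelated systems, and a hotfix aimed at one of them changes the other:

```python
# Hypothetical game rules; none of these names come from a real codebase.
FACTION_PENALTY = 0.5  # globally scoped: read by several unrelated systems


def vendor_price(base_price: float) -> float:
    # Design intent: an unpopular class pays more at vendors early on.
    return base_price / FACTION_PENALTY


def late_game_spell_power(base_power: float) -> float:
    # Design intent: the same class is compensated with stronger magic later.
    return base_power / FACTION_PENALTY


def hotfix_vendor_prices() -> None:
    # A maintainer "fixes" expensive vendors by changing the one obvious value.
    global FACTION_PENALTY
    FACTION_PENALTY = 1.0


power_before = late_game_spell_power(100.0)
hotfix_vendor_prices()
power_after = late_game_spell_power(100.0)
print(vendor_price(10.0), power_before, power_after)
```

    The vendor complaint is fixed, and the class's late-game magic has been halved without anyone touching the spell code. That is the side-effect ripple in miniature.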

    Footnotes
    * I didn't say it was a good idea. I just said it happens.

    --
    Glonoinha the MebiByte Slayer
  24. Re:wow by Corgha · · Score: 4, Insightful

    As someone else mentioned, I think they are still a victim of their own success. Sure it's been over a year since launch, but they were expecting 250,000 subscribers and got 6,000,000.

    The controlling factor for their server performance should not be the total number of subscribers, but the number of subscribers per realm, and Blizzard has complete control over that number, because they can mark a realm as "full" and disallow logins/signups. IOW, as you know, those 6,000,000 people are not all playing in the same game at the same time.

    It should be possible to make the realms completely independent, so that this just becomes a matter of horizontal scaling, and having hardware/systems monkeys roll out new realms via some standard operating procedure.
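    A minimal sketch of that idea, assuming realms share nothing and each has a hard population cap. The class names, the cap of 2, and the "open a new realm when all are full" policy are all hypothetical, not a description of how Blizzard actually does it:

```python
from dataclasses import dataclass, field


@dataclass
class Realm:
    name: str
    capacity: int
    players: set[str] = field(default_factory=set)

    @property
    def full(self) -> bool:
        return len(self.players) >= self.capacity


class RealmPool:
    """Independent realms with a per-realm cap; scale out by adding realms."""

    def __init__(self, capacity_per_realm: int) -> None:
        self.capacity_per_realm = capacity_per_realm
        self.realms: list[Realm] = []

    def _open_realm(self) -> Realm:
        # Stand-in for the "standard operating procedure" of rolling out hardware.
        realm = Realm(f"realm-{len(self.realms) + 1}", self.capacity_per_realm)
        self.realms.append(realm)
        return realm

    def sign_up(self, player: str) -> Realm:
        # Full realms refuse signups; the first realm with room takes the player.
        realm = next((r for r in self.realms if not r.full), None)
        if realm is None:
            realm = self._open_realm()
        realm.players.add(player)
        return realm


pool = RealmPool(capacity_per_realm=2)
for player in ["ann", "bob", "cat", "dan", "eve"]:
    pool.sign_up(player)
print([(r.name, len(r.players)) for r in pool.realms])
```

    The point is that per-realm load stays bounded no matter how large the total subscriber count gets, provided nothing else is shared between realms.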

    Unfortunately, based on the rumors I have heard, Blizzard has chosen to tie a bunch of stuff together. For instance, the common web forums use the characters from all the realms (the web forums know about your level 23 mage), they have a single set of auth servers, it's not clear that the item databases are not shared between realms, and so on. This is sort of sad, because it's not like Blizzard are the first people to roll out an MMORPG.

    Now, some might argue that tying some of this stuff together makes for a better user experience. However, when this entanglement leads to downtimes, one could make the argument that it's not worth it.

    Anyway, my point is not to bash on Blizzard; I'm sure they've made some difficult design decisions correctly, and some difficult ones incorrectly. My point is that "we have lots of users" is not a good excuse when you have a service that lets you divide those users into sub-populations, and that there are probably architectural improvements they could make to improve their scalability. The real question is whether they have competent and experienced systems engineers to help them make those improvements, and whether management is committed to supporting them.

    Anyway, so much for pre-coffee ramblings....

  25. Well... by Svartalf · · Score: 4, Informative

    ...while you're not an idiot, I can understand where they could end up with one supplier for bandwidth.

    1) You need an SLA with each ISP you pull backbone level feed from. You can use InterNAP and hook into the peering points in the US and a few other places, but it's got its own issues- and if you just use them, you're still with only one ISP; if they fail, you're still up a creek without a paddle.

    2) You'd need to frame the servers into one massive data center with a HUGE honking data-pipe from each ISP with BGP routing on the inbound routers from the ISPs to your DMZ to establish one IP address range for the front-facing servers

                OR

    Come up with some sort of nasty DNS trick to hopefully make the server front-ends transparent to the clients and spread them across multiple IP blocks (Which is what epicRealm did to make their CDN actually completely transparent to client and customer- and to be able to handle dynamic HTTP content...)- but be prepared, because in order for this to work right, you either need to trust the client's state, share state across server pools on different IP blocks, be stateless, or somesuch like the previous.

    There's a bunch more, but those above two and the first item will hopefully show you why someone (a bean counter, most likely...) will make the decision to just simply hold the ISP or Tier-1 host (Which is the most likely case here- they're very probably colocated at an AT&T Tier-1 facility...) to the SLA they promised- because it's cheaper and waaay simpler if everything goes right and they're "not to blame" if things go wrong. If you went an alternate route and had a mishap that wasn't server related, then you'd be to blame and have nobody to point fingers at when it all broke (And you just KNOW it will at some point- it always does... :-)

    --
    I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas
  26. Re:Monthly fee by podperson · · Score: 4, Interesting

    Call me anal, but it's bad enough when I pissed half my college years away playing Diablo II online for free. I don't see the point in having to pay for the privilege to waste my time.

    Actually I think it's a good thing to charge a monthly fee, that way even folks who don't understand the concept of opportunity cost won't be blissfully unaware that playing games all day is never "free". The really annoying thing for me is that most of these games require you to, basically, work (in the game).

    E.g. in WoW at some point you'll want to collect a set of gear from Molten Core. Each class has eight pieces of "tier 1 set gear" which can be obtained from Molten Core (we'll ignore the other stuff you can get there). It takes 40 people to clear Molten Core, you can only do it once per week, and you get about 20 pieces of set gear from one trip. Do the math and, optimistically, you'll need to do Molten Core 16 times to equip each of those forty people (of course, it will actually take much longer -- say six months -- to get most of the people most of their pieces).

    Now, every visit to Molten Core -- once you figure out how to do it -- is pretty much the same. So after your first few nightmarish two-three evening death-a-thons, you'll eventually be able to "do" MC (as it's known) in maybe three hours. So we're talking at absolute minimum 48h solid gameplay, much of it mindless repetition. (You know how to do everything, you're just waiting for your helmet to "drop".)

    But that's not all. At least until you all become very well equipped, Molten Core takes a toll on your equipment and consumables (e.g. potions and ammunition). To stock up on victuals and repair your gear, you'll probably need to spend another couple of hours prep time for each "adventure". So, we're now talking, at absolute minimum, 80h of solid grind to get a complete suit of "tier 1" gear. Again, all of this is mindless repetition.
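    The arithmetic above, spelled out. This is a best-case sketch using only the numbers from the post, and it assumes every drop goes to someone who still needs it, which never happens in practice (constant names are illustrative):

```python
import math

RAID_SIZE = 40              # players needed to clear Molten Core
SET_PIECES_PER_PLAYER = 8   # tier 1 pieces per class
SET_DROPS_PER_RUN = 20      # approximate set pieces per clear
HOURS_PER_RUN = 3           # a practiced guild's clear time
PREP_HOURS_PER_RUN = 2      # restocking consumables and repairing gear

# Best case: no drop is ever wasted on someone who already has it.
runs = math.ceil(RAID_SIZE * SET_PIECES_PER_PLAYER / SET_DROPS_PER_RUN)
raid_hours = runs * HOURS_PER_RUN
total_hours = runs * (HOURS_PER_RUN + PREP_HOURS_PER_RUN)

print(runs, raid_hours, total_hours)  # 16 48 80
```

    At one clear per week, 16 runs is about four months even with perfect luck, which is why six months is the more realistic guess.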

    Now Molten Core is just one instance. I don't know how long it took to assemble it, but I suspect it would take a team of developers fewer person hours to put something like Molten Core together than it will take a typical guild to finish collecting set armor. Of course, they had to attend meetings and so on, so multiply that by ten, but what you're looking at is the fundamental flaw in all current MMORPGs ... they leverage a small amount of content with a gigantic dollop of tedium to keep people online as long as possible, paying their monthly fees and ruining their expensive college educations.

  27. Translated by Danathar · · Score: 3, Funny

    "My crack pipe...My crack pipe!....suck...suck....It's not working right!"

  28. Well, no, but... by mbessey · · Score: 4, Funny

    With proper pipelining, you CAN get one baby after an initial nine-month delay, then a baby a month of throughput until your cache is depleted.
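    The joke is the usual latency-versus-throughput distinction. A tiny sketch, assuming one job is started per month and each takes nine months (names are illustrative):

```python
GESTATION_MONTHS = 9  # latency of a single job; adding workers never shortens it


def delivery_months(workers: int) -> list[int]:
    """Completion month for each job when one new job starts every month."""
    return [start + GESTATION_MONTHS for start in range(workers)]


print(delivery_months(9))  # [9, 10, 11, 12, 13, 14, 15, 16, 17]
```

    Nothing arrives before month 9, then one per month until the pool of workers runs out.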