WoW Downtime Interview at Penny Arcade
The short answer is "The game is capable of supporting this many players," but it would probably be helpful to provide some background information. Based on our market analysis, we made some initial calculations about the size of the massively multiplayer online games market in the United States. We then accounted for new customers to the genre based on our previous games. Looking over this data, we did believe that there was the potential for an extremely sizable interest in a Blizzard MMOG. According to our research, other successful MMOGs in the U.S. had achieved roughly 300,000 subscribers after 12 months of operation. What ended up happening with World of Warcraft is that we achieved double these numbers in approximately the first six weeks of launch. We absolutely can support the number of copies we put on shelves, but we believed it would take us longer to get to this number in terms of players purchasing the game and logging on.
We had not anticipated this amount of growth in such a short time; however, we did have a backup plan that was deployed rapidly. In the first week of launch, we more than doubled our number of game servers and server infrastructure to accommodate the demand. The fact that we had planned to grow the service over the first 12 months of operation was evident, as we had server hardware waiting to be deployed. We just anticipated that this server rollout would be gradual. Copies of the game were being purchased at a much faster rate than anticipated, so we had to abandon our slower-paced plan and go into rapid deployment to accommodate these additional customers. This meant we also had to advance our timetable for additional server purchases.
With such rapid growth of the network, we started to see several bottlenecks in the infrastructure that exposed themselves very quickly when the expanded hardware immediately took on massive load. These bottlenecks were solvable, but they required additional upgrades to the backend systems to accommodate the load--which, again, we hadn't planned to see, even with the extreme estimates, until later in the year. Regardless, server stability has remained our number-one priority, and so we acquired and deployed even more equipment as part of the process of addressing these issues. All of this new hardware also required additional software and operating system upgrades on the backend. The problems that some players on the 20 or so most populated servers (out of the current total of 88 servers) have been experiencing are related to some of the upgrades not functioning as desired. We are working diligently with our vendors and internal technical staff to get as quick a resolution to the problems as possible, and we believe there should be noticeable improvements soon. When our community team commented that people are working 24/7, they weren't exaggerating.
2. If it's true that the server problems are related to the overwhelming number of players, why was no effort made to better distribute players evenly across realms, or allow players and guilds to transfer to less populated servers?
We actually did have a number of checks in place at launch to distribute players as evenly as possible across realms. When a new account logged in, the game would ask what realm rule set and time zone the player preferred, and then it would suggest the realm with the lowest population that matched the selected preferences. That said, we're definitely working on resolving the overpopulation problems that ended up occurring on some realms despite our preventative measures. A realm-transfer option that would allow players to move from their high-population realm to one with a low population is one of the things we're investigating. We're exploring this option fully and hope to be able to communicate more detailed information about it to our customers in the coming weeks.
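The realm suggestion described in this answer can be sketched in a few lines. This is only an illustration of "filter by preferences, then pick the least populated match"; the `Realm` type, the `suggest_realm` function, and the sample realm names and populations are all hypothetical and are not taken from Blizzard's actual implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Realm:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str
    rule_set: str      # e.g. "PvP" or "PvE"
    time_zone: str     # e.g. "Pacific" or "Eastern"
    population: int    # number of characters or accounts on the realm


def suggest_realm(realms: Iterable[Realm], rule_set: str, time_zone: str) -> Optional[Realm]:
    """Return the least-populated realm matching both preferences, or None."""
    candidates = [
        r for r in realms
        if r.rule_set == rule_set and r.time_zone == time_zone
    ]
    if not candidates:
        return None
    # Lowest population wins among the realms that match the preferences.
    return min(candidates, key=lambda r: r.population)


# Made-up sample data, purely for demonstration.
realms = [
    Realm("Alpha", "PvP", "Pacific", 4200),
    Realm("Bravo", "PvP", "Pacific", 1800),
    Realm("Charlie", "PvE", "Pacific", 900),
    Realm("Delta", "PvP", "Eastern", 600),
]

suggestion = suggest_realm(realms, "PvP", "Pacific")
print(suggestion.name if suggestion else "no matching realm")
```

A check like this only steers new accounts toward a suggestion. Players who pick a different realm, for example to join friends, bypass it, which is consistent with some realms becoming overpopulated anyway.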
3. Currently, large-scale player raids experience a huge amount of latency. How do you plan to compensate for this in your upcoming PvP Battlegrounds feature?
The player raids often have hundreds of people per side in one area; that area is on a server that is also running the rest of the continent, and that can result in the latency you describe--depending, as well, on the total population of that server. We're continuing to look into the issues surrounding this dip in performance. Battlegrounds, on the other hand, will run on the instance server, so there should be no such issues. Additionally, players will be unable to "zerg" in Battlegrounds; there will be a limit to the number of players per side.
4. What accounts for the frequent "emergency" maintenance downtime? What issues are you attempting to resolve?
The emergency maintenance periods are to restore stability while we continue to narrow down the cause of the problems. Some of them are also to deploy temporary fixes to various in-game systems while we continue to develop a longer term, more stable solution. World of Warcraft delivers many complex features that are unique to MMOGs. Features such as the in-game mail system, auction houses, player inventories, flight paths, quest states, etc. use a lot of server bandwidth, which makes pinpointing problems on the server infrastructure much more complicated.
Recently, the extended emergency downtime for a certain number of realms was needed in order to better accommodate our growing player base. Some of the upgrades that we planned for all of the realms were made to these realms first, as they are among the most populated and thus most in need of aid. We set the realms up on the latest top-of-the-line hardware and made the software upgrades accordingly, but some unforeseen issues cropped up with the database that resulted in the problems players currently see. This is no fun for our player base, of course, and we don't want to keep the realms running in a condition that frustrates our customers when we can attempt to fix things. So, these downtimes have been used to change hardware and apply fixes that will hopefully alleviate the issues. We have not yet resolved the problems, but we're working on this around the clock.
5. What issues are you experiencing with your login/authentication servers? It is often the case for myself and the people I play with that we cannot access realms our friends are already logged into.
These types of issues stem from the problems described above. Conflicts occur between some of the internal applications running in the background, and the end result can take the form of temporary login issues. We're working to resolve these conflicts so that they are no longer a factor.
6. When do you expect to have the worst of these problems resolved?
We'll be constantly working on these issues each day moving forward until they're resolved, but we don't currently have a set date for when that will be. We're doing all we can to make sure these problems no longer occur -- it's our top priority, and we hope to have the issues fixed as soon as possible. We'll continue to provide players with regular updates on our progress.
7. Will the European launch utilize the same realms, or will these players be hosted on all new equipment? If they are hosted on new servers, what have you done to ensure that the launch will be free of the problems mentioned above?
They will be on their own set of hardware, as with our Korean release. Our teams are learning from the experience of our North American launch and are applying that knowledge to the servers in Europe. We hope to provide them with a smooth launch.
8. What would you have done differently?
It would be easy to speculate about what we could have done differently, but that wouldn't turn back the clock. Right now we're extremely focused on the issues at hand, and this focus is helping us methodically chase down the problems that are causing frustration for some of our players. The foundation of our company is based on providing a top-notch game experience and an equally top-notch level of customer satisfaction; we won't be happy until we feel we're consistently meeting those standards.
The people at Penny Arcade act like they are investigating a space shuttle disaster. It would be nice to see what level of respect the people who play games 24/7 and actually care about intermittent downtime would give a customer paying them $14 / month. I bet most of them would tell the customer to "F off you n00b".
It is a $14-a-month service for unlimited entertainment; you can't expect that every single kink will be ironed out at launch.
Real-world devices have real-world problems, and a whole host of gamers, like Tycho, fail to realize that.
Imagine if Tycho had to deal with hundreds of thousands of people complaining that one stroke in one particular comic was 1 pt off.
The problem is that if Blizzard put the resources into making the game so that there were no problems at launch, and had the server infrastructure to support the entire planet logging into the same screen all at once, the subscription fee would be so incredibly expensive that no one would play the game.
Manufacturing defects are a trade-off. Yes, Blizzard could build a game with no bugs, but how many players would want to pay $5,000 for a copy, and $1,000 a month?
Blizzard actually made an effort to let the community know what was going on... Now the forum trolls will have to find something else to whine about. Something tells me it won't be pretty.
They shipped an estimated 2 years worth of product at release? Even if you expected to be more popular than the other MMORPG, it seems silly that you would create so much, when it would be considerate to remaster the game at some point so that new users aren't forced to download a huge patch.
It sounds like the answers are spin for "we weren't ready to handle a game of this scale. We may be soon, but you all are going to have to put up with us through the learning process."
By the way they had to release early not because of their publisher, but they needed to release near EQ2 or lose lots of potential subscribers.
It's great that they responded and all; but I have no idea why I keep expecting something different from an MMO company response when we always get the same stuff.
It's what we already knew: They 'could' have supported many players, they 'tried' to evenly distribute players and they 'will' fix the problems at some unknown date.
I'm not sure what I was expecting, but a watered-down version of what we already know that doesn't cast any blame in the slightest wasn't it. Where is that 'Blizzard Difference' everybody keeps ranting about?
Have you played this game? The AH in Ironforge is packed on my server, and it is one of the low-population servers. I have seen screenshots of it from a high-population server. Can you imagine what it would be like if there was only one server?
The problem with having one realm is not about servers, because you could have each area on its own server. The problem is overcrowding -- unless you could find some way to forcibly keep people spread out on the entire world, you're going to have popular areas that everyone flocks to.
Even if you could forcibly spread people out, having 100,000 people in the same world at one time would require a vastly larger game world. For the smaller-scale MMORPGs, like EVE-Online, this is feasible, but for the really popular games like EQ and WoW, it's just not possible.
Note to self: Stop putting jokes in my insightful comments so I can get something other than +1 Funny!
That wouldn't work.
I don't know if you've played WoW at all, but in the beta I quite often had problems with areas being just too damned crowded. In some areas (e.g. Westfall), the mobs you had to kill for your quests were always dead already, with several players camping their spawn points. And then there's contested territory. Horde player doing Tarren Mill on a crowded PvP server? Good luck staying alive more than two minutes at a time!
That was with about 1200-1500 players logged in to the realm, IIRC. I'd hate to think what it would be like with 10,000 players, much less 100,000.
Now, if Blizzard were somehow capable of creating a game world that was 50 times larger than the one they have, then putting 50 times the players in one realm would be great. But, that would obviously take 50 times the development time, or would require drastic reductions in quality.
Frankly, when I get around to installing the retail version, I'll probably explicitly join the lowest-population server I can find. I like having people around (it's sort of the point of an MMORPG), just not so many.
I thought WoW had like 500K beta accounts/testers?
The bean counters were either sleeping or stupid.
How could they NOT have seen this coming?
Are they full of BS?
or more likely:
This is an example of a large corporation with very slow turning wheels when it comes to planning/changing plans.
I give them credit for having a backup plan in the first place... but personally, I expect MORE of Blizzard. So many beta accounts should have been an obvious indicator for what was in store, I would think..?
That approach won't work for a lot of games. In these games, people usually don't try to go off and find a place of their own, they go to the popular places and farm the mobs everyone else is farming. In many cases this is because certain spots are genuinely better for gaining experience or getting cash or items (and it's difficult for the developers to prevent any spots from being better than the others without making the game bland). People will even wait in line to get into a group camping a really good spot.
There are other factors as well. Capital cities, or any place that players use as a base of operations, would be heavily overpopulated too. Also, as an MMOG ages, the lower-level areas tend to become less crowded and the higher-level areas tend to become more crowded.
Having a massive game world running on a large cluster of servers doesn't help anything if your players only inhabit a fraction of them.
My only political goal is to see to it that no political party achieves its goals.
The workload scales faster than linearly. Even with some wicked parallel code running on computing clusters the problem is intractable - you can support 10k people on 10 server racks, or 100k people on 200 server racks. Since the 10x increase in people requires 20x the servers it makes more sense to split things up.
There is a balance here.
As processing power becomes cheaper this problem will ease, but in the end it's still O(n) hardware trying to solve an O(n^x) problem.
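To put a number on the O(n^x) remark: the two data points given in this comment (10k players on 10 racks, 100k players on 200 racks) can be fitted to a power law. The sketch below does only that arithmetic. The function names are made up for illustration, and the rack counts are the commenter's figures, not measured data. The fit gives x of about 1.3, which is polynomial growth, not exponential growth in the strict sense.

```python
import math


def implied_exponent(player_ratio: float, rack_ratio: float) -> float:
    """Solve rack_ratio = player_ratio ** x for x."""
    return math.log(rack_ratio) / math.log(player_ratio)


# Figures quoted in the comment above: 10k players -> 10 racks, 100k -> 200 racks.
x = implied_exponent(100_000 / 10_000, 200 / 10)


def racks_needed(players: int, base_players: int = 10_000,
                 base_racks: int = 10, exponent: float = x) -> float:
    """Racks for a single world of `players`, under the fitted power law."""
    return base_racks * (players / base_players) ** exponent


one_big_world = racks_needed(100_000)      # a single 100k-player realm
ten_small_worlds = 10 * racks_needed(10_000)  # ten separate 10k-player realms

print(f"implied exponent x = {x:.3f}")
print(f"one 100k realm:  {one_big_world:.0f} racks")
print(f"ten 10k realms:  {ten_small_worlds:.0f} racks")
```

Under these assumed numbers, splitting the same 100k players across ten realms needs about half the hardware of one combined realm, which is the argument for sharding.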
If they could distribute the processing to the users securely (right!) then the problem would become almost trivial.
-Adam
This is not flamebait. It is a critical analysis of responses laden with evasive language. The game might be capable of handling all the players, but they did not plan correctly and blew it. The poster's comment on #2 is DEAD ON. #6 is probable. #7 is accurate. #8 is a great summary as well: somehow Blizzard failed to realize that WoW was just about the most anticipated game of the year, maybe after Halo 2. "Here is the pulse, and here is your finger, far from the pulse, shoved straight up your ass."
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
In respectful defense of our colleagues at Blizzard, I think it's safe to say there's virtually no way to have predicted the kind of volume they saw. My hat's off to them for having 88 servers ready to go around launch time. Despite all of the servers and the huge open beta they ran, the problems they are seeing simply don't surface until you reach scale on the order of what they're seeing. In my opinion they are doing right by their customers and are being respectful of their players' downtime by adding free time. In this business, you can spend a lot of time planning, and even a lot of time in beta, only to find some crucial piece of technology doesn't scale as well as advertised. I for one appreciate and respect the way they're dealing with this situation. I realize the frustration of their customers, but I think it's safe to say that the Blizzard team is filled with stand-up people that are just focused on fixing the problems as quickly as possible.
John Smedley, President, Sony Online Entertainment
I can't think of a better way to handle the situation than Blizzard has so far. They're a business, not an Entity of Gaming Awesomeness that has a crystal ball to see market demand in the future. They did some research, saw the numbers, and bought the number of servers that they thought they'd need. When a problem came up, they let everyone know, allowed another free month to some, and are working hard to fix the problem while still keeping in touch with the public about what's going on.
What more do people want? They have tried their best based on what they know, and when things went wrong, they responded very quickly. Several MMOs have problems and are dickheads about it.
The fact that Penny Arcade yanked the Game of the Year award away from WoW is just immature, in my opinion, given Blizzard's response to the situation.
By the way, I've been noticing that Penny Arcade takes a shitton of time to load. I demand they fix the problem instantaneously! I don't care what it takes! I'm going to yank their links from my site unless they get on it right now! I'm sending "terse questions" next week.
One thing that people don't often mention is that, while the lag was getting pretty terrible, the game was still mostly playable. It might take several minutes(!) for an Auction House query to return, or for an auction to be created, or for your email to show up, but the transactions DID eventually happen.
:-)
Instead of just blowing up completely, by and large, WOW fails fairly gracefully. The engineering in that is non-trivial. I don't think people realize just how good that code is. Getting a system that stays reasonably stable under completely untested loads is really, really hard. I am VERY impressed with the quality of their design and code work.
And, yes, some data did get lost... the servers did eventually seize up and crash completely, and often there'd be some data lost. But, at least in my case, it was never much more than a skill point or two, or a few hundred experience.... twenty minutes of lost playtime. I'm sure some people had a worse experience than I did, that's the nature of random data loss, but I wasn't badly affected.
When you consider the sheer scale of what they're doing.... they had TWO HUNDRED THOUSAND PEOPLE AT ONCE playing their game not long ago. The scale of that just boggles the imagination. If you assume 32kbits/second down, and 2kbits up (probably a bit skinny, but I'll try to err against Blizzard here), that's an aggregate total of 10,880,000,000 bits per second. Roughly 10 gigabits, or total saturation of an OC-192. Just the FIREWALLING on that kind of traffic is a HUGE project! Admittedly, they've broken that up into 3 or 4 datacenters, but doing firewalling and connection tracking on a mere 2.5 gigabits is still pretty daunting. And that is completely ignoring the application servers, the load balancing, the inter-server communication, the databases..... just 1% of this project is a HUGE BIG DEAL.
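For anyone who wants to check the back-of-the-envelope math, here is the same estimate as a short script. It assumes 1 kbit = 1,000 bits and uses the per-player rates given above. The variable names are made up for illustration, and the OC-192 line rate of 9,953.28 Mbit/s is the standard SONET figure. Taking the numbers exactly as stated, 200,000 players at 34 kbit/s comes to 6.8 Gbit/s; the quoted 10.88 Gbit/s total would correspond to 320,000 simultaneous connections at the same rate.

```python
# Per-player rates from the comment above (assuming 1 kbit = 1,000 bits).
DOWN_BITS_PER_SEC = 32_000
UP_BITS_PER_SEC = 2_000
PER_PLAYER = DOWN_BITS_PER_SEC + UP_BITS_PER_SEC   # 34,000 bit/s

CONCURRENT_PLAYERS = 200_000
QUOTED_TOTAL = 10_880_000_000          # aggregate figure quoted in the comment
OC192_LINE_RATE = 9_953_280_000        # SONET OC-192 line rate, bit/s

# Aggregate traffic for the stated player count.
aggregate = CONCURRENT_PLAYERS * PER_PLAYER

# Player count that the quoted total would imply at the same per-player rate.
implied_players = QUOTED_TOTAL // PER_PLAYER

print(f"aggregate for 200,000 players: {aggregate:,} bit/s")
print(f"share of one OC-192:           {aggregate / OC192_LINE_RATE:.0%}")
print(f"players implied by quoted sum: {implied_players:,}")
```

Either way the order of magnitude holds: several gigabits per second of sustained game traffic, before counting anything on the backend.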
The fact that we were able to pile in there at that kind of speed and the game didn't seize up and die completely is a resounding, amazing success. I'm sure the Blizzard guys aren't feeling too great about how things went by now, but.... guys, you kicked ASS. You did something that has never been done before, a level of complexity nobody else has ever reached, got loaded down with about four times as much traffic as you were expecting.... and STILL mostly succeeded.
I have every faith that you'll work out the remaining kinks and bottlenecks.
By the way, the last couple days, on Uther, have been quite good... I think I got one disconnect in two days, and there really hasn't been any lag. They may nearly have the problems fixed. I haven't been thinking about bugs, just about slaughtering beasties.