Slashdot Mirror


LiveJournal Servers Go Down

Wind writes "According to any journal hosted off of LiveJournal.com, the LiveJournal data center Internap has suffered a critical power failure, leaving all of LiveJournal and its content temporarily offline and requiring the revival of 100+ servers. Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size? Updated information is posted here."

22 of 596 comments

  1. slashdot has repeated 503 errors, by Anonymous Coward · · Score: 5, Insightful

    and search.pl is constantly being trashed by distributed xanga botnets. perhaps michael wasn't quite prepared to be an editor of slashdot?

    1. Re:slashdot has repeated 503 errors, by stupidfoo · · Score: 5, Insightful

      How is this a troll? It's funny that an "editor" at a site with as many problems as slashdot has feels that it isn't amazingly hypocritical to mock another site that is currently having problems. People in glass houses indeed.

      Slashdot has semi-major problems almost every day. 503 errors, "nothing for you to see here" annoyances, and a search engine that goes down more than a Thai hooker.

  2. What a cock by realdpk · · Score: 5, Insightful

    "Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?"

    Perhaps shit happens, and a blog service doesn't warrant the necessary investment to survive whatever caused this outage?

    1. Re:What a cock by qcubed · · Score: 2, Insightful

      i'm sorry, how exactly does this reflect poorly on sixapart?

      THIS doesn't reflect poorly on them. their licensing scheme for movabletype does.

    2. Re:What a cock by casparianaremi · · Score: 2, Insightful

      I was prepared for my friends page on LJ to be full of "It's SixApart's fault" but didn't expect to see it here! Six Apart bought the company but has made no changes; the servers would be down whether they'd bought it or not, so to claim it's their fault is just pretty dumb, IMO.

    3. Re:What a cock by mlefevre · · Score: 2, Insightful

      "half at one co-lo and the other half at another co-lo"

      Then they'd either need multi-gigabit bandwidth between the two co-los (which would probably cost for a week what they make per year), or they'd have to make separate, semi-independent communities. Google's servers don't stay in sync - you get different results according to which servers you hit, which isn't something you can do with "live" journals.

  3. Was that really called for? by Anonymous Coward · · Score: 3, Insightful

    Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?

    Ok, I understand that you don't like Six Apart; I'm no fan of their new licensing scheme either. However, I really doubt that SixApart has any control over any power failures that might occur at Internap.

  4. Re:Disclaimer: I am Not an Electrical Engineer by DrBlubGut · · Score: 2, Insightful

    Happens some days. A key breaker at a data center some of my friends work at went bad and took the whole floor with it. The generators didn't even get a shot at taking the load because the breakage happened later in the circuit. No matter how big and bad your infrastructure, some points in the design will not be 100% resistant to all problems. We do our best, make plans, design good systems, but the world teaches us shit happens.

  5. Six Apart is hosting them already? by MattW · · Score: 3, Insightful

    Er, they just announced Six Apart was buying them like days ago. I doubt they transitioned the servers in the first week.

  6. Re:./ed !!!! Server Reboot Time? by bradfitz · · Score: 5, Insightful

    They all came back up when the power came back.

    But we intentionally don't have databases come back up on boot because if there was a blip, we want to do an integrity check first. (we run InnoDB, so it's ACID, but we're paranoid ...)

    We have clusters of 2 identical databases in separate cabinets, separate switches, separate Internap power feeds... so normally losing one database in each cluster doesn't matter: the other one gets used. But when we lose every single database, in all clusters, all at once... that's the time to be paranoid and double check stuff.
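
    The policy described above (use the surviving database of a pair, but refuse to serve anything after a total loss until a check has passed) can be sketched roughly as follows. This is an illustration only, not LiveJournal's code: `Database`, `pick_database`, `power_loss`, and `manual_recover` are hypothetical names, and the integrity check is a stand-in callable.

```python
from dataclasses import dataclass


@dataclass
class Database:
    """Hypothetical stand-in for one database host in a two-node cluster."""
    name: str
    up: bool = True
    verified: bool = True  # False after a crash until an integrity check passes


def pick_database(cluster):
    """Return a database that is up and verified, or None if none qualifies."""
    for db in cluster:
        if db.up and db.verified:
            return db
    return None


def power_loss(cluster):
    """Simulate losing every database in the cluster at once."""
    for db in cluster:
        db.up = False
        db.verified = False


def manual_recover(db, integrity_check):
    """Bring a database back only if the supplied check passes (no auto-start)."""
    if integrity_check(db):
        db.up = True
        db.verified = True
    return db.up


cluster = [Database("db-a"), Database("db-b")]
cluster[0].up = False                              # lose one node: the other is used
print(pick_database(cluster).name)                 # db-b
power_loss(cluster)                                # lose both: nothing is served
print(pick_database(cluster))                      # None
manual_recover(cluster[0], lambda db: True)        # assumed check that passes
print(pick_database(cluster).name)                 # db-a
```

    The point of the sketch is that recovery is an explicit step: nothing comes back into rotation just because the machine booted.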

  7. Where's my irony stick? by gmhowell · · Score: 4, Insightful

    Because michael needs a beating. The site that rolls beta (alpha?) code onto live servers complaining and making jokes because another site goes down through no fault of its own?

    --
    Jesus was all right but his disciples were thick and ordinary. -John Lennon
  8. No... by EdMcMan · · Score: 4, Insightful

    Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?

    What does Six Apart have to do with Internap? Livejournal has been using - and wanting to switch from - Internap for a long time.

  9. Re:Elsewhere by kd5ujz · · Score: 1, Insightful

    I think he intended "like".

    --
    -William
    God is everything science has yet to explain.
  10. Not related to Six Apart by wersh · · Score: 3, Insightful

    From the article write-up (and reflecting the thoughts of quite a few of the comments I just read):

    Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?

    I'd love to know what makes you think this has anything to do with Six Apart. The very first line at http://www.livejournal.com states:

    Our data center (Internap, the same one we've been at for many years)...

    They've been with Internap for years, predating Six Apart's takeover. Unless LJ staff is lying, the fault here sounds like it lies entirely with Internap.

    And as far as I can tell, Six Apart didn't ditch the LJ team when they bought them out, so you probably have the exact same people working on bringing the site back up now as you would have if Six Apart had never got involved.

  11. Re:Update by Anonymous Coward · · Score: 1, Insightful

    sorry, I missed the part where Six Apart actually changed something...

  12. Re:Update by Anonymous Coward · · Score: 1, Insightful

    uh, this is not Six Apart's fault. This has nothing to do with them. And, they won't lose subscribers.

    How do I know this? Outages are common on LJ....

  13. bigger explination by moosesocks · · Score: 4, Insightful

    I'm surprised to see that Internap's main servers are back up. It's pretty irresponsible to bring up your corporate servers before those of your clients.

    That being said, LJ's servers are back up now, but they're making sure that the databases are all in sync -- LiveJournal has one of the most massive distributed MySQL clusters in existence along with a complete caching system.

    They need to make sure that the database is all synchronized before bringing it back up -- chances are they're going to rebuild the cache too. If they didn't, the initial strain on the DB servers would probably bring the site down again.

    This does, however, bring up some questions about LiveJournal's network infrastructure. Danga (the creators of LJ, recently purchased by Six Apart) are heavy users of Perl and MySQL. Needless to say, they have made numerous contributions to both projects and have developed an innovative memory caching system for Linux.

    The questions raised, however, come from Perl and MySQL. Both are questionable in terms of scalability. Although I'm not qualified to comment on this, I believe that the general consensus is that MySQL is one of the least efficient databases today. Livejournal has 100+ servers. I honestly don't think that a system the size of LiveJournal should require a server cluster that big. It seems that they are trying to solve their performance/reliability problems by blindly throwing hardware at them.

    Of course, I love livejournal. It's simple, easy to use, and is a great tool for building communities. Just as it is simple, it can also be incredibly nerdy (there's actually a command prompt!). They're also completely open source.

    Hopefully, Six Apart can make their network infrastructure more 'professional' while still maintaining the community spirit that has made it so successful.

    --
    -- If you try to fail and succeed, which have you done? - Uli's moose
    1. Re:bigger explination by Kyrrin · · Score: 4, Insightful

      As we've said a bunch of times in the past, moving away from MySQL would be prohibitive. By now we know how to make it work for us; switching away from MySQL would not only involve massive rewriting of stuff and alterations on the existing DB, it'd take the next five years before we got as comfortable with the flaws and advantages of another DB package.

      Sure, MySQL has its flaws -- some of them pretty big -- but we can work around them.

      As for the "not needing a server cluster that big" -- do you have any clue how much data we push in an average day? We maintain so many DB clusters to improve reliability, and we maintain so many web nodes because we push a screaming shitload of traffic.

  14. I call bull on all this by Anonymous Coward · · Score: 3, Insightful

    There seems to be a lot of latent hostility towards teenage girls. WTF? Your outlet is geeking out on Slashdot. Theirs is LJ. And how do you all know so much about the content of LJ anyway?

  15. Re:./ed !!!! by Anonymous Coward · · Score: 1, Insightful

    No, it is based off them having toolbars installed on people's computers monitoring (with the user's semi-agreement) what URLs they hit.

    It all depends on their sample representing the internet as a whole, whatever that is.

  16. Re:Internap is *down*? by slavemowgli · · Score: 3, Insightful

    That's debatable, and depending on what database you use, having more than one database server (or pool of database servers) in different physical locations that are kept in sync at all times is definitely possible. I'm not sure whether MySQL allows this, but I think if you have a site that has nearly 6 million users, more than 100000 of which are paying you for the service you provide (I'm one of those, one might add), then you really should look into doing just that - or at least I hope the LJ people will do now (I don't really want to blame them for the problem).

    That being said, I think you didn't quite understand what I was trying to say. I really don't care whether they have "plenty of backup power", "plenty of generator capacity" and "top-of-the-line big datacenter grade stuff" (which really sounds more like a collection of buzzwords than anything else, anyway). If a wiring fault (of whatever kind) can bring down the entire UPS system as well as the "generator capacity behind that" and all other safeguards they supposedly had in place as well, then it's just worthless and a waste of money - a UPS is supposed to be an *uninterruptible* power supply.

    And while I admit that it's not possible to guard against *all* problems, saying that the colo facility is "one of the most solid in the state" and supposedly can't be taken offline by something "short of a direct strike from a comet" is just silly when a "wiring failure" can bring down the whole thing, and even more so when it's not the first time that has happened.

    Really, this just stinks of an attitude that's all too prevalent in parts of the IT industry - just piecing together the components of a reliable system won't necessarily give you one, and if you can't build one properly, then don't go advertising that you have one. Don't you think the fact that the LJ people are now planning to buy their own UPS equipment to use on top of the facility's should tell you something?

    Oh, and regarding six nines of uptime - I don't think you actually realize how little downtime that would allow. It's about 30 seconds per year, and Livejournal has been down for at least 16 hours, which corresponds to an uptime of about 99.8% - only two nines left. They probably (hopefully!) won't fall down to one, but things are bad enough as it is, and I, at least, fully blame Internap for that (and, again, I'm a paying user on LJ, so I reserve the right to do just that. ^_~)
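
    For anyone who wants to check the arithmetic, here is a quick sketch. It assumes a 365-day year and that the 16 hours are the only downtime in that year; the function names are made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a non-leap year


def allowed_downtime_seconds(nines: int) -> float:
    """Yearly downtime budget for `nines` nines of uptime (6 -> 99.9999%)."""
    return SECONDS_PER_YEAR * 10 ** (-nines)


def uptime_percent(downtime_hours: float) -> float:
    """Yearly uptime percentage given total downtime in hours."""
    return 100.0 * (1 - downtime_hours * 3600 / SECONDS_PER_YEAR)


print(round(allowed_downtime_seconds(6), 1))  # 31.5 seconds per year
print(round(uptime_percent(16), 2))           # 99.82 percent
```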

    --
    quidquid latine dictum sit altum videtur.
  17. Re:Value of Livejournal - "Open Source Philosophy" by TiggsPanther · · Score: 2, Insightful

    Oh yes. If I ever feel the need to post any of those quiz-things I make good use of the <lj-cut> tag. So if anyone on my Friends list (or a random person finding my Journal) doesn't want to see the results they don't have to.

    Actually one of the more useful LJ features I know of is one that allows you to screen out images over a set size from your Friends list. So you need to view the entry in question to see the image, which is good for your bandwidth and/or narrow page layout.

    --
    Tiggs
    "120 chars should be enough for everyone..."