LiveJournal Servers Go Down
Wind writes "According to any journal hosted off of LiveJournal.com, the LiveJournal data center Internap has suffered a critical power failure, leaving all of LiveJournal and its content temporarily offline and requiring the revival of 100+ servers. Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size? Updated information is posted here."
Sounds like someone was taking a nap over at Internap
You can't imagine the withdrawals I'm going through. It's like the great Slashdot brownouts of '98.
I need my fix, man!
We should tell them so. =)
----geppy -
Oops?
I prefer a void in conversation to a vacuous one.
In related news, 6,000 teen-age girls were heard yelling "OMG! WTF! How will John know I life him if I can't blog about it!"
An effective signature identifies a particular user amongst a base of thousands.
Well, it wasn't Slashdot at least... Bringing 100+ servers back online isn't an easy task lol ^^.
Good luck to them.
...the collective IQ of the internet has raised about 20 points.
but that's ONE HELL of a Slashdotting! :)
Join the TWIT army now!
and search.pl is constantly being trashed by distributed xanga botnets. perhaps michael wasn't quite prepared to be an editor of slashdot?
Someone forget to buy gas?
Bush just appointed Internap's CEO to his National Infrastructure Advisory Council, yet the man can't keep a co-lo facility switched on.
I'm not sure what that says of Bush or of Internap. And it certainly doesn't seem to have anything to do with SixApart.
Man I am sooo putting this in my LiveJournal!
Why did you have to go and cause a power outage?
I meta-mod all positive moderation Unfair, because it's abuse of the system.
"Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?"
Perhaps shit happens, and a blog service doesn't warrant the necessary investment to survive whatever caused this outage?
Internap.com is still up, they aren't stupid enough to use their own servers.
so it's deadjournal now ?
How will the world find out how it sucks to be a freshman in Podunktown High School and how Angela is such a slut for making out with that guy in History class and how my parents are total dimwits when it comes to gangsta rap?
Well now the millions (?) of users might actually have something to write about when the servers are back up. "Today I went outside. My pupils have never been tinier..."
Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?
Ok, I understand that you don't like Six Apart; I'm no fan of their new licensing scheme either. However, I really doubt that SixApart has any control over any power failures that might occur at Internap.
What a sad, lonely, despicable person you must be.
Where will I write about my depression over this event?
Oh. Slashdot.
"Our data center (Internap) lost all its power, including redundant backup power, for some unknown reason. (unknown to me, at least)"
Coffee is a hell of a drink.
That's too bad. [sarcasm]I really enjoyed the great editorial features![/sarcasm]
There are other hobbies in this world besides complaining about something you can't control.
That's what you get when you hire Tim Allen as your electrician at a Data Center.
Al Borland was nailing Heidi behind the stage when the outage occurred.
Where were the APC backups?
No, because there are still such things as chat rooms and forums.
Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?
Perhaps the submitter doesn't have a clue what the fuck he is talking about? Their hosts suffered a complete power failure. How the hell do you protect against that, short of buying out the data centre company and running it yourself?
Use the Coralized link. No sense in crashing their status page. Plus it'll respond a lot quicker than loading the actual web page.
It was an on-site power failure -- I don't see how you can blame them (new owners) on that...
http://www.google.com/search?q=livejournal%20outage&sourceid=mozilla2&ie=utf-8&oe=utf-8
lj_maintenance: Livejournal Outage ... Livejournal Outage LiveJournal is currently under a Distributed Denial of Service
attack, and has been since about 5:30pm PST (1:30 AM GMT) tonight. ...
www.livejournal.com/go.bml?journal=lj_maintenance&itemid=55410&dir=next - 11k - Cached - Similar pages
lj_maintenance: LiveJournal Outage Update ... LiveJournal Outage Update The Distributed Denial of Service attack that
began on Thursday has not subsided. We were able to make ...
www.livejournal.com/community/lj_maintenance/55947.html - 10k - Cached - Similar pages
[ More results from www.livejournal.com ]
This is my signature.
I think it was a bad idea to have a site slashdotted while it's down . . . . it shouldn't be able to stand a chance. No really, I wish they would have waited a little while. Now the admins are wondering why they suddenly are getting 200,000 hits.
Well the power outage was not a person's or organisation's fault, it just happened. I wonder what Danga would have done that Six Apart is not doing to bring the servers back online. By the way, don't they have diesel generators for backup?!
I feel a great disturbance in the force..... It's as if a million bloggers cried out all at once..... and became silent.
The population of depressed pre-teens has just dropped by 20%
are servers with LOM (lights out management) superior in this case?
Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
It's not like most LiveJournal users have enough to worry about, here's something for most LJ users to get melodramatic about. I'm serious, randomly pick 5 LiveJournal blogs, and I guarantee 4 out of 5 are going to be "Fuck the World" posts.
Mood: anxious...
Check out this page on the Internap site for a real laugh. The flash page is a real hoot too.
Anyone seen my jagged little pill?
LiveJournal's offsite status page is status.livejournal.org.
It's terrorists, obviously.
OMG OMG OMG!
Gonna tell my LJ friends! They're going to FREAK!
OMG OMG!!!!!!! WAIT LOL!!!! How do you blog that the blog is down when the *blog is down*? This sucks! What do I do! OMGLOLWTF!
I'm going to take a picture of my cat and put it in flikr like, ten times. LOL OMG !!
"Update #1, 7:35 pm PST: we're up on 'dirty' power for now (it works, but it's unreliable), and we're working to assess the state of the databases. The worst thing we could do right now is rush the site up in an unreliable state. We're checking all the hardware and data, making sure everything's consistent. Where it's not, we'll be restoring from recent backups and replaying all the changes since that time, to get to the current point in time, but in good shape. We'll be providing more technical details later, for those curious, on the power failure (when we learn more), the database details, and the recovery process. For now, please be patient. We'll be working all weekend on this if we have to."
- Captbaritone
...the collective blogger angst should be a sight to see.
Those fools!
Didn't they see the option in the BIOS clearly marked as "Restart after power failure"? You just can't get the staff these days! * rolls eyes *
People that believe in their opinions don't post AC.
sounds like all the fucking spammers they host overtaxed spammer-nap's power resources and brought it all down.
Seriously though, spammer-nap is a massive spam haus, see for yourself
Lawyers, MBA's, RIAA? A jedi fears not these things!
I know nothing of how InterNap is set up. I just want to throw that out there ahead of time. Now, it's time for my patent pending "Bull Shit Theory of the Day."
Ok, here is the rant. I used to work for a Colocation facility. Nothing special, small by Telco terms. The whole facility only had about 1500 cabinets. (Though I hear they are now full, and going to be expanding.)
We had a main power draw off of the local grid. We had a backup power draw off of the *next* city's power grid. (ie, when all the offices around us went dark, we still had power.) And you don't even want to know the kind of red tape we had to go through for *that* pull. I'm still not sure how they did it. We had flywheel kinetic electricity storage systems, battery backups, and a diesel engine from a train so large it had its own building.
We used to joke that if we lost power, we had more important things to worry about. And again, we were small time compared to some of the massiveness that is out there. *cough*AADS Chicago*cough*
So I'm kind of in agreement with the statement currently on LiveJournal. It's unknown to me how any self respecting colo facility can say "We've had a power outage that also took our redundant systems."
I have to call bullshit on that entire train of thought. If that's true then they don't *have* any redundant systems, and I'd be looking for a new provider. The most likely thing (at least in my mind) is that someone, somewhere got mad at something specific and decided to make a point by popping the main breaker to their portion of the facility.
Oh, that was another thing, each room had several "main" breakers. It took a hell of a power surge to pop all of them, and the Liebert systems had power filters of some kind, really really big capacitors or something I think, so a surge really never made it to the other side anyway, it got stored in the cap and then trickled out like the rest of the power.
But I was a UNIX admin, not the EE that was planning the power generation aspects of the facility. So take some of it with grains of what ever white powdered spice you prefer.
"Genius may shine aloof and alone, like a star, but goodness is social, and it takes two men and God to make a Brother."
How to Use Your Browser Tip #63:
When you put your mouse (that's the little box with the long cord and the buttons) over a link (that's a shiny word), it will display the address (that's a bunch of words with slashes and colons and stuff) that it points to in the bar (that's a horizontal strip, probably grey) at the bottom of your browser window (that's the magic box your browser lives in inside your computer). If you see the words "livejournal.com" in that address, it probably means the link (refresher: that's a shiny-looking word) goes to LiveJournal.
Now, assuming you hate LiveJournal, clicking (pushing the button) on that link (one more time: a shiny-looking word) will make you unhappy, forcing you to search the Internet for child porn to find solace (see How to Use Your Browser Tip #91: "Boffo the Clown Shows You Everything You Ever Wanted to Know About Smut, You Filthy, Filthy Perv!").
We're out of combinations of phonetic sounds. When we make up new words, they sound so fucking retarded, like blog. Say it out loud. Tell me you don't feel like you just lost 100 IQ points.
Why can't we just call them what they are: online diaries/journals?
I don't need no instructions to know how to rock!!!!
Update from the site:
"Update #1, 7:35 pm PST: we're up on 'dirty' power for now (it works, but it's unreliable)".
Congrats to LiveJournal for assembling a coal generator in record time.
On the Livejournal main page:
Update #1, 7:35 pm PST: we're up on 'dirty' power for now (it works, but it's unreliable), and we're working to assess the state of the databases. The worst thing we could do right now is rush the site up in an unreliable state. We're checking all the hardware and data, making sure everything's consistent. Where it's not, we'll be restoring from recent backups and replaying all the changes since that time, to get to the current point in time, but in good shape. We'll be providing more technical details later, for those curious, on the power failure (when we learn more), the database details, and the recovery process. For now, please be patient. We'll be working all weekend on this if we have to.
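For the curious, the "restore from recent backups and replay all the changes since that time" step in that update can be sketched roughly like this. It is only a toy illustration, not LiveJournal's actual tooling: the `Change` record, the `restore_and_replay` function, and the idea of a single integer log position are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    position: int  # hypothetical, monotonically increasing log position
    key: str
    value: str


def restore_and_replay(backup, backup_position, change_log):
    """Start from a backup snapshot, then re-apply every logged change made after it."""
    state = dict(backup)  # copy, so the backup itself stays untouched
    for change in sorted(change_log, key=lambda c: c.position):
        if change.position > backup_position:
            state[change.key] = change.value
    return state


# The backup was taken at log position 1; positions 2 and 3 happened afterwards.
backup = {"entry:1": "first post"}
log = [
    Change(1, "entry:1", "first post"),
    Change(2, "entry:2", "second post"),
    Change(3, "entry:1", "first post (edited)"),
]
recovered = restore_and_replay(backup, backup_position=1, change_log=log)
print(recovered)
```

The point of the design is that the backup alone is stale, and the log alone is only trusted from a known-good starting point, so the two get combined.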
Lovely. I just bought another year's subscription for my wife, figuring the change to Six Apart wouldn't change anything for a few months at least. LJ could lose a lot of subscribers with an outage just after the takeover.
live journal is dark like my soul like my heart a void its link is cut just like i'll be doing to my arm i blame my parents
Michael, SHUT THE FUCK UP with your stupid editorial snide remark.
1) How the fuck is this Six Apart's fault when they * just * took over?
2) Sometimes, no matter what you do, shit happens.
3) If this has a human fault, it was the previous owners who didn't pay or check their servers would survive a power out.
4) Just what is your problem with Six Apart that you make such comments anyway? You're an editor, it is not your place to make stupid snide comments that are clearly BS.
I'm sure a slashdotting is exactly what LiveJournal needed right now.
But i feel like this could be turned into some sort of brilliant DDOS attack scheme in the future...
Looks like the angst over at Livejournal is no longer limited to the database.
... as if millions of teenage girls suddenly cried out in terror and were suddenly silenced.
half of the newest entries.....oh wait, I see someone's already got the emo lj user-base jokes covered. I have so much room to talk because I have an lj too, haha. But, seriously, how can this have anything to do with six apart?
This sig is o Unfunny o Funny
This is another thing that bothers me about this scenario. I can't say that I've ever admined 100 servers, the most I've ever had was about 30, but if we had a power loss of any kind, you'd just repower them and walk away. Most of them were DEC Alpha gear running Tru64. Why would you spec out a box that has to be handheld every reboot? The only time you should have to handhold a server is during an upgrade. A power cycle without proper SIGHUP or term signals should just run fsck on its way back up. (K, so it might take an hour for the server to go live again, but still.) I mean, am I missing something here? Maybe since nothing I've admined got the traffic these things do .... I'm just lost. Some one hit me with the clue by four.
The only thing I can even think of is they have explicit services that must be started manually ..... but why would you want that? If you have a power hiccup in the middle of the night, you want it to come back up, and be live and happy again *before* you even get the first page. I mean sure, if there was a surge, and that destroyed components, and those components have to be replaced ..... but ..... a reboot is a reboot, man. Here, smoke some source. It's the good stuff.
"Genius may shine aloof and alone, like a star, but goodness is social, and it takes two men and God to make a Brother."
Er, they just announced Six Apart was buying them like days ago. I doubt they transitioned the servers in the first week.
That's one 1 down...
And a massive cheer was heard across the land...
Wow. Can't the poor guy do anything right?
I'm not an electrical engineer, either, but I'm wondering what Dirty Power is? Is that the unfiltered power that tends to anomilate, per the Monster Cable surge protectors advertising? Or am I thinking of something else?
For anyone who uses Authorize.net, you may have noticed a little downtime a few hours ago. They run their stuff through Internap, and went down for a decent while before coming back up. I called the tech guy and apparently there was a huge power failure at their ISP that took them offline. Wasn't entirely sure if that was 100% accurate, but this new info about Livejournal certainly corroborates the guy's story.
I'd like to request that the links to LiveJournal be removed from the Slashdot post. It's not as if people don't know how to get there, but at least they are less likely to click just for the hell of it or because they didn't think before hand. I doubt that it will get the servers back up quicker, but at least we won't have to be concerned that we're part of the problem. Thanks for hearing me out.
watch it'll come back up as a subscription site and all of your journals are erased if you don't pay..
Regards, Joseph
from the updated message they're giving, I'd have to assume that they're going through all the databases, looking for data incongruencies caused by calls that were not completed due to the power loss... especially if there were data that had to propagate its way through the network, it could be quite troublesome if all the data connections were to just drop like they did
hee hee ho ho ***clunk*** hahahahahahahahahahahahahahaha
well said mate.
They all came back up when the power came back.
But we intentionally don't have databases come back up on boot because if there was a blip, we want to do an integrity check first. (we run InnoDB, so it's ACID, but we're paranoid...)
We have clusters of 2 identical databases in separate cabinets, separate switches, separate Internap power feeds... so normally losing one database in each cluster doesn't matter: the other one gets used. But when we lose every single database, in all clusters, all at once... that's the time to be paranoid and double check stuff.
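A rough sketch of the policy described above, for anyone trying to picture it: each cluster has two identical databases, either one can serve, but a database that has just rebooted does not serve until someone has verified it. The `Database` class, the `verified` flag, and `pick_serving_database` are hypothetical names for illustration, not anything from LiveJournal's code.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str
    powered: bool = True
    verified: bool = False  # set only after an integrity check has passed


def pick_serving_database(cluster):
    """Return the first member that is powered on and verified, else None."""
    for db in cluster:
        if db.powered and db.verified:
            return db
    return None


# Ordinary failure: one member dies, the other (already verified) keeps serving.
cluster = [Database("a", powered=False, verified=True), Database("b", verified=True)]
normal = pick_serving_database(cluster)

# Total power loss: both members come back unverified, so nothing serves yet.
after_outage = [Database("a"), Database("b")]
blocked = pick_serving_database(after_outage)
```

Under these assumptions, losing one member is invisible to users, while losing both at once leaves the cluster offline until a check flips `verified` on at least one of them.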
Tonight is the night for power failures. My employer's main data center in Ann Arbor experienced an external power failure just before 5.
OMGWTFBBQ!
Reviews with a twist! http://www.sardonicbastard.com
Brad, get back to work! I need my friends page!
LiveJournal Servers Go Down
With thousands of teenage girls unable to ponder in an open forum whether or not to blow their boyfriends, thousands of teenage girls go down.
500GB of disk, 5TB of transfer, $5.95/mo
Because michael needs a beating. The site that rolls beta (alpha?) code onto live servers complaining and making jokes because another site goes down through no fault of its own?
Jesus was all right but his disciples were thick and ordinary. -John Lennon
Does anyone know why the SomethingAwful forums are down? This is much more dire of an Intarnet catastrophe than ten livejournals!
lj == lockjaw
tell me that many will blog about how they couldn't blog. Some will complain about the stress of not being able to express themselves, others will question the engineering prowess of LiveJournal and wildly speculate about the cause of the power outage followed by plans to re-engineer the data center, the LiveJournal infrastructure and 93% of the Internet to ensure this never happens again (X.25 over barbed-wire will be suggested).
Several individuals will join together to file a class action lawsuit against LiveJournal and the data center citing their inability to express themselves due to negligence and will seek real and punitive damages totalling over $2.5 billion.
Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?
What does Six Apart have to do with Internap? Livejournal has been using - and wanting to switch from - Internap for a long time.
Doesn't California have a long history of power problems? Why burden oneself with the costs of putting a server center there instead of moving to another state with a better infrastructure? Another negative externality of doing business in California.
I don't keep a lid on my coffee so when I walk around I look busy -me
I found it ridiculously ironic that as soon as I wanted to bitch about livejournal being down in my livejournal..... I couldn't.
For those people who might not know, Brad Fitzpatrick is Livejournal User #1.
I'd have to agree with the AC, Brad, stop posting to slashdot and hover over that DB rebuild a bit more.
(Yes, posting to slashdot relieves tension... Whatever it takes, Brad.)
"like, how m i suppozd 2 tell meh bf bout dat par-t?"
The LiveJournal status page claims "Our data center (Internap) lost all its power, including redundant backup power". This is nothing to do with "cheapskate blog admins" and everything to do with a serious and quite likely unacceptable problem at Internap.
Of course, that's why Anonymous Cowards start out with zero points. Guilty of idiocy until proven innocent.
If other LiveJournal users ever found out you post here Brad, /. might end up with 5000+ replies to every one of your posts :p
:)
Other than that, cheers and keep up the good work
"YESTERDAY MY MOM GOT A FONE KALL ON HER CELL FONE FROM MY MATH TEACHER SAYiNG THAT i DO BAD iN CLASS AND ALL THiS. SO iM NOT REALLY ALOUD ON THE COMPUTER WHEN SHE'S HERE CUZ i GOTTA BRiNG UP MY GRADES AND ALL THAT. THANKS ALL FOR THE PROPS. iLL BE ON HERE TOMORROW TO UPDATE iF SHE GOES TO WORK. iLL STiLL BE UPDATiNG ON THiS JUST WHEN SHE AiNT HERE. HAHAH iM BAD! OH AND ON ANOTHER NOTE. MY DiET iSNT GOiNG THAT GOOD. i SORTA BROKE iT A FEW TiMES. BUT iM COMMiTED NOW AND STARTiNG TOMORROW MORNiNG NOTHiNG OVER 100 CALORiES, NON FAT, OR LOW FAT FOOD. GOiNG OUT TO DiNNER TONiGHT SO iMMA GO CLEAN UP A BiT AND GET READY"
A direct quote, sadly. Is this a good thing?
Yeah, but they left out "paradigm" and "synergy" - upper management will never take them seriously without those!
this is redundant but... PWNED!!!
"Ted, it seems that we the LiveJournal outage has caused a massive wave of young emo writing singers who just want to be heard."
In Minneapolis, Unisys has I believe two or three large diesel generators. One time when their part of the city lost power, they fired them up and had a lot of juice left over. Northwest Airlines bought some of their power and they still had electricity to spare, and ended up powering thousands of homes in the southeastern suburbs, if I remember the story right.
;)
A friend who worked for Exxon once told me about their power backups...I think it was almost cheaper for them to run on diesel than on the local grid.
"I'm blogging this locally, and will post it when the servers come back up"
Assume I was drunk when I posted this.
Yeah, just like on LiveJournal.com. Thanks for the heads-up though! =)
RTJKJAS
I've had a lot of experience with Internap, and I can tell you, the quality of their service is nothing but great. It's possible this could turn out to be one man's mistake, but come on. Half of the United States lost power once, because of an unforeseen chain reaction.
I can give nothing but the highest praise for Internap. Every time I ever upgraded the IOS on the core router, or took BGP down for testing (ensuring our failover worked properly; the only real test is a live test), I received e-mails in advance about every maintenance, even though 99% of the time I wouldn't lose connectivity during the maintenance, a testament to their engineering capabilities.
I was employed, at the time, with a wholesale VoIP company. This means, to provide top-notch quality, we could have no jitter (variation in the RTT of ping times). Internap monitors their routes for quality, and has customized software to take alternate paths if problems are recognized on one carrier.
I've never worked for Internap, but I've worked with a hell of a lot of ISPs over the years. They are, far and away, without doubt, the best carrier I've ever dealt with. When I was having problems at one point, which was affecting only their BGP session, they had a team of 4 engineers helping me look at the debug information. A reload solved the problem. Heh...Cisco never did find out what happened. Anyway, it sucks to see them slandered with the strength of their service, and I just wanted to give the community one engineer's humble input.
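Since jitter came up: one simple way to put a number on "variation in the RTT of ping times" is the average absolute difference between consecutive samples. This is just one possible definition, chosen for illustration (VoIP gear typically uses a smoothed estimator instead), and `mean_jitter` is a made-up helper name.

```python
def mean_jitter(rtts_ms):
    """Average absolute difference between consecutive round-trip times, in ms."""
    if len(rtts_ms) < 2:
        return 0.0
    diffs = [abs(later - earlier) for earlier, later in zip(rtts_ms, rtts_ms[1:])]
    return sum(diffs) / len(diffs)


steady = mean_jitter([20.0, 20.0, 20.0, 20.0])  # perfectly even link
bumpy = mean_jitter([20.0, 35.0, 18.0, 40.0])   # same ballpark latency, uneven delivery
print(steady, bumpy)
```

Note that a link can have a high but steady RTT and still score zero here, which is why voice traffic cares about jitter separately from latency.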
At this point all my whiteboards are full of boxes of each database cluster, the machines in that cluster, which have passed their checksum tests (InnoDB checksums each 16k page), which replayed their replay/undo logs, where in binlogs each was writing/reading/executing, etc...
So lots of waiting now on the checksum validators. I don't want to put a machine back in and find out in a week there was a database page that was corrupt because the battery-backed write-back cache on the RAID card didn't work as advertised. (which happens on about 95% of RAID cards, in my experience, because they're mostly crap, even the most expensive ones...)
Also whenever there's any doubt about something's integrity, we backup or snapshot the potentially corrupt version before operating on it. That operation can take time too.
It's going to be a fun night.
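To give a feel for what a per-page checksum pass like the one described above is doing, here is a minimal sketch. It assumes 16 KiB pages as mentioned in the comment, but everything else is a stand-in: it uses `zlib.crc32` and a separately stored list of expected values, which is not InnoDB's real on-disk format or checksum algorithm, and `page_checksums` / `find_corrupt_pages` are hypothetical names.

```python
import zlib

PAGE_SIZE = 16 * 1024  # 16k pages, per the comment above


def page_checksums(data: bytes):
    """Checksum each fixed-size page independently (CRC32 as a stand-in)."""
    return [zlib.crc32(data[i:i + PAGE_SIZE]) for i in range(0, len(data), PAGE_SIZE)]


def find_corrupt_pages(data: bytes, expected):
    """Return the page numbers whose current checksum differs from the recorded one."""
    actual = page_checksums(data)
    return [n for n, (got, want) in enumerate(zip(actual, expected)) if got != want]


good = bytes(PAGE_SIZE * 3)          # three all-zero pages
expected = page_checksums(good)      # checksums recorded while the data was known-good

damaged = bytearray(good)
damaged[PAGE_SIZE + 5] ^= 0xFF       # simulate a torn write inside page 1
corrupt = find_corrupt_pages(bytes(damaged), expected)
print(corrupt)
```

Checking page by page is what makes the wait worthwhile: a single bad page gets pinpointed instead of the whole file just being declared suspect.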
Yeah, keeping a journal or diary is only for the sad and lonely. Get a clue.
Let people do what they want; we don't need others to put them down just because it makes them not feel as insignificant. Go do something constructive, and actually try to better society, and you'll be more significant in the lives of others, which makes you feel better about yourself. Everyone benefitting is better than being a parasite, isn't it?
One could say, "At least they're not sad or lonely enough to have a slashdot account", and it would sound more "realistic", thanks to the "closet geek" stereotype. But then, wouldn't you say, "That isn't true. Besides, if you don't like it, you don't have to visit. You can just totally ignore this part of the internet."? What's different with Livejournal? Or anything else that's "uncool" in your eyes?
But I digress, since you're so much better than everyone else and can decide what's universally "cool" and what's not, and especially since I overlooked the point that the Livejournal admins hired thugs to force you to read teen girl journals for hours every day...
*hugs* Good luck man!
What am I gonna do, now that I can't update my blog!
I just went there for some information before coming here. Didn't think much about it but apparently somebody else did.
-Tim Louden
I, for one, welcome our website-killing overlords.
Hey, good luck with this whole thing. I hate it when it happens. Did you have both of your clusters in the same center?
(Yes, I really am Simson Garfinkel)
A long night indeedy.
Is there some sort of load threshold you're willing to live with? Perhaps 50% or 80% of all servers up before starting a cluster? You know your system load distribution based on time of day better than anyone though...
An update #2 on the status page might be called for at this point. People might appreciate some reflection of how many checkboxes are checked off on that whiteboard. It would also give the impression you're busting ass for LJ, which would go over well after the panic some users had over SixApart. "As long as Brad is still around, we're in good hands", that sort of thing.
Good Luck tonight. SysAdmin crises nights suck, but they do actually pass.
I only hope it lasts forever.
Update #1, 7:35 pm PST: we're up on 'dirty' power for now (it works, but it's unreliable), and we're working to assess the state of the databases. The worst thing we could do right now is rush the site up in an unreliable state. We're checking all the hardware and data, making sure everything's consistent. Where it's not, we'll be restoring from recent backups and replaying all the changes since that time, to get to the current point in time, but in good shape. We'll be providing more technical details later, for those curious, on the power failure (when we learn more), the database details, and the recovery process. For now, please be patient. We'll be working all weekend on this if we have to.
For those who don't know what's so hot about it and for those who think Livejournal is just a bunch of teenage girls whining.... Livejournal has just about four years of my life documented. The ease of use and the ability to "vent" is comforting, but the real value comes in the interaction. My friends see my life at their convenience and I see theirs at mine. We can choose to ignore the whining of others or we can choose to relate and comment on our own experience. Think of it this way: Open-source philosophy, emotion, and life. I put my own out there and others add to it. I add mine to others. Granted ... those quiz/meme things HAVE TO GO. I do not want to read about "what frog best resembles me" or "which 80's hair band song is me." Grrr.
The livejournal servers are provided by Internap. Anyways, Bush just appointed Internap's CEO to his National Infrastructure Advisory Council (http://www.tmcnet.com/usubmit/2005/Jan/1104954.htm), and I'm wondering if this is some sort of terrorist attack, and then I thought, that is what they want you to think. Rather, it's just another step towards the republican squashing of the independent media. Perhaps the republicans are following me. Or you. They know that without live journal, the teenage adolescent girls will surely flock to forums, such as this one, and post. The sheer amount of posts will crash another server, and therefore, we have a domino effect (also a technique used by the RIAA to crush peer to peer services, such as BitTorrent, by causing more strain on a website not meant to handle it). So with livejournal going down like a Korean hooker, and Slashdot in hot pursuit due to the flock of teengirls, we will be unable to communicate our ideas. And without communication, the left wing majority of this website will be unable to unite, thus ensuring the republicans remain in power and control of the "free" world.
-------
Support Indy Music. Buy
Thank the gods for user settings.
Sorry about the writing. Robot fingers, you know? Cliff Steele in DOOM PATROL #23
Just remember it's not ALL obnoxious, over-emotional teen-angst teenage girls. I use mine to showcase (non-depressing) poetry and make intelligent comments about intelligent topics. Basically, if someone makes an LJ about their own life, it sucks. If you can manage to write an LJ and make it about things that matter to more people than just you (ie, "Why Bush's Iraqi war is unjust" vs. "Why this babe I know should bang me"), and at the same time make it funny and enjoyable to read, then you have a good LJ. Most LJs DO suck, but there are some diamonds in the rough.
.. post this on my Live Journal blog... wait.. it's not loading?
Blog blog blog blog.
Lovely blog!
Wonderful blog!
Blog blo-o-o-o-o-og blog blo-o-o-o-o-og blog.
Lovely blog! Lovely blog!
Lovely blog! Lovely blog!
Lovely blog!
Blog blog blog blog!
-- The Viking Blog Song
All right, who did it? Who pressed the shiiiny, candy-like history eras... I mean emergency stop button?
ROFL on the hugs...
That is so LJ.
I'm getting those too now. And they're not all that nice.
--
# Canmephians for a better Linux Kernel
$Stalag99{"URL"}="http://stalag99.net";
For god sakes people, it's a Friday night! If Google went down I could see people panicking, but LiveJournal? Whatever... I'm going out.
The last time I lost my journal, it was accompanied by a loud popping sound and the smell of ozone coming from the power supply of the linux box in the corner of my living room.
As much as I like having an easy interface to the online writings of all my friends, I miss the days of having my own pet web server, of being able to do something myself other than twiddle my thumbs and wait when something breaks.
(sorry ... I know ... random meanderings ... I feel this sudden urge to post a few memes, links to personality tests and a few "what happened at work today!" comments).
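From the article write-up (and reflecting the thoughts of quite a few of the comments I just read):
"Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?"
I'd love to know what makes you think this has anything to do with Six Apart. The very first line at http://www.livejournal.com states:
"Our data center (Internap) lost all its power, including redundant backup power, for some unknown reason. (unknown to me, at least)"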
They've been with Internap for years, predating Six Apart's takeover. Unless LJ staff is lying, the fault here sounds like it lies entirely with Internap.
And as far as I can tell, Six Apart didn't ditch the LJ team when they bought them out, so you probably have the exact same people working on bringing the site back up now as you would have if Six Apart had never got involved.
Slandered?
Excuse me, but they LOST POWER TO CUSTOMER'S SERVERS. Oh, did I slander them by telling the truth?
Yeah, and I don't even have an LJ! Oh, the irony. I never really read much into the infrastructure behind LiveJournal - does anyone have details on it?
Dear editors, please don't let morons submit news. It's bad.
LiveJournal has not changed any of its hosting or any other infrastructure yet. This has nothing, nothing, nothing to do with the recent acquisition.
Sorry, I'm just cranky because of the withdrawal.
I passed the Turing test.
This outage has nothing to do with Six Apart. Nothing. Nada. Zero.
It has everything to do with Internap, where LiveJournal has been hosted for years.
Something obviously fucked up at Internap's facility when they have two independent power sources plus backup and there's no power at all.
As usual, you can thank Michael for his posting of braindead and incorrect opinion pieces submitted by readers. Were it not for his leaving in the baseless opinion of the reader, you'd be making your comments on facts.
I just think it's ironic (this is irony, right? I remember that all the things in that Alanis song were in fact not irony) that Wikipedia has been experiencing some rather major server issues, recently resolved but not really explained to anyone outside of the server maintenance IRC channel.
While it was down, the OpenFacts status page was the place for immediate info, but the log of activity was kept on the 'wikitech' account on, you guessed it, LiveJournal.
Sweet, sweet irony.
--grendel drago
Laws do not persuade just because they threaten. --Seneca
I am curious to know what location went down. What janitor/sanitation engineer plugged the Buffer into the wrong socket? What electrician was fired over this? Who maintains all of the back-up systems? Who was too drunk to find the Actual Light Switch? Who is getting fired for this? Who are they hiring in his place? Do they need talented replacements? If so, reply with your e-mail address and I will send you my resume, or if you need a good person and are not related to this, please still reply. Last but not least, who got stuck in the elevator (doing what), and who got stuck in the bathroom, and how did they make it out? If none of the above applies, I am curious to know what the ROOT cause analysis is for this situation. gk
they notify everyone to almost excessive levels, however. i'm not sure that downloading a config off a router via TFTP really necessitates notifying all their customers :)
much, much better than most providers that don't notify you when they *are* doing something big, but it gets a bit like crying wolf eventually, and you learn to just quickly glance over anything from internap...
No, but I've seen plenty of people say they are a crap carrier, or shoddy, because of one incident. Work at an ISP, you'll learn, that despite your best efforts, s**t happens.
Wait, what new licensing scheme? I didn't even think they'd rolled out the new lawyer-friendly TOS yet.
--grendel drago
Laws do not persuade just because they threaten. --Seneca
I use an ISP that peers with Internap for upstream connectivity to its tier 1 IP network.
I noticed a few unreachable hosts earlier, as well as DNS delays. Didn't think much of it, but now those sites are all back up. They are all in the LA area, and I suspect you are as well.
Is this the case? (I can't tell if it's LA or not from current traces, we appear to be using Level(3) to get there at the moment however)
Also, what about redundant power? Internap is huge, they must have redundant systems in place.
Internap animated site status
"At this point all my whiteboards are full of boxes of each database cluster, the machines in that cluster, which have passed their checksum tests. (innodb checksums each 16k page), which replayed their replay/undo logs, where in binlogs each was writing/reading/executing etc..." But at least there is still time to read and post on slashdot. :-)
------
insert sig here,here, and here
Someone probably hit the big red switch on the wall, the one covered in a plastic case
That does happen. I remember working at Purolator Courier's data center in NJ back in -- oh, geez, mid-80s some time. I was a third shift print operator, helped out with the mag tape library too. One night the trouble alarm went off on the fire suppression panel. We'd been having trouble with it all week, and the alarm guy was due in in the morning. One of the newbie operators -- the only one at the console at the time, the others being on a smoke break or asleep in the tape library -- panicked and went over to the annunciator panel. He opened it as I watched him from the console area. I think he thought the halon was about to dump because he reached around the panel and instead of hitting the halon dump abort, he hit the emergency power cutoff.
BLAM! It was as if a firecracker went off as all the breakers tripped and the fans came to a sighing halt. Both on this floor -- the one with the console and the tape drives -- and the floor above, with the CPU and the disk farms. Dead as a doornail.
Now, this was Purolator COURIER. We had AIRPLANES coming in to land at Indy center and as of this moment, no way to tell the crews which gate to go to, where to unload their stuff, or how to sort it.
Not only that, but this was an IBM mainframe shop -- S/390, the Big Iron, with 3380 disk drives. You don't just flip the power switch back on. An emergency power cutoff blows breakers in the power supplies on those DASD strings. The IBM Field Engineer was duly dispatched and arrived with cases of breakers the next morning. But we were still dark when I got off shift the following morning.
The next night a brand new plexiglass cover was mounted over the Big Red Switch.
Mit der Dummheit kämpfen Götter selbst vergebens.
The company I work for also colocates at Internap, and I just got back from their data center after bringing our own servers back up. I was talking to the techs and they said it was a cascade failure that started outside the building and took out a UPS. There were so many people there bringing their servers back to life that there was an hour wait for crash carts (:
Internap is supposed to be one of the best in this area also, they certainly are the most expensive.
I'm surprised to see that Internap's main servers are back up. It's pretty irresponsible to bring up your corporate servers before those of your clients.
That being said, LJ's servers are back up now, but they're making sure that the databases are all in sync -- LiveJournal has one of the most massive distributed MySQL clusters in existence along with a complete caching system.
They need to make sure that the database is all synchronized before bringing it back up -- chances are they're going to rebuild the cache too. If they didn't, the initial strain on the DB servers would probably bring the site down again.
This does, however, bring up some questions about LiveJournal's network infrastructure. Danga (the creators of LJ, recently purchased by Six Apart) are heavy users of Perl and MySQL. Needless to say, they have made numerous contributions to both projects and have developed an innovative memory caching system for Linux.
The questions raised, however, come from Perl and MySQL. Both are questionable in terms of scalability. Although I'm not qualified to comment on this, I believe that the general consensus is that MySQL is one of the least efficient databases today. LiveJournal has 100+ servers. I honestly don't think that a system the size of LiveJournal should require a server cluster that big. It seems that they are trying to solve their performance/reliability problems by blindly throwing hardware at them.
Of course, I love livejournal. It's simple, easy to use, and is a great tool for building communities. Just as it is simple, it can also be incredibly nerdy (there's actually a command prompt!). They're also completely open source.
Hopefully, Six Apart can make their network infrastructure more 'professional' while still maintaining the community spirit that has made it so successful.
-- If you try to fail and succeed, which have you done? - Uli's moose
LiveJournal should offer those who are willing to put in some sweat equity, charging UPSes or powering servers by pedaling/running in a hamster wheel/other old school electricity generation technique, the chance to update their journals, so they can let the world know how they felt about not being able to access livejournal... to be read once the site comes up.
Colour me redundant, but there must be at LEAST a hundred or so screamo/emo/angst-filled poems, songs and interpretative dances about not being able to let the world know how you feel about livejournal being down.
I have 1 million monkeys on a million year contract to make me a better sig.
A cam whore gently weeps.
Dude, where's my packet?
"The next night a brand new plexiglass cover was mounted over the Big Red Switch."
Hehehe, it's not just 3 year old girls named Molly that hit the Big Red Switches by mistake.
Finally a break from the never ending angsty teenage bullshit.
This space is powered by Google Ad-nauseam.
I hope they do a look-back analysis on this and publish the results.
It will be interesting to see what caused the failure, what they didn't do that could've mitigated the failure, and whether such mitigation makes economic sense.
Should make interesting reading.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Single point of failure.
The systems all pass through a single cable at some point. That means at your UPS or later.
If a circuit at the UPS goes out, or god forbid they decide to test the UPS and it fails the test, everything goes down.
It isn't impossible. It does happen.
Earlier status.livejournal.com pages have been mostly (or completely) plain text. Now that /. has linked to it? What's that I see? A banner ad for warped.com, the ones who host the status.livejournal page? Could they be trying to say they don't suck like Internap, so pretty please buy services from them?
I love it.
There were already lots of LiveJournal users who were upset and confused and unhappy with the idea that LJ and Danga (the company which made LJ) had been bought by SixApart. No doubt, as there have been no downtimes of this magnitude at LJ before, doomsayers will be claiming that it's SixApart's fault.
Never mind common sense; it won't matter that if SixApart can be held responsible for failures at InterNAP's colocation facilities, they're a much bigger -- and more powerful -- company than most people have ever given them credit for...
--Rachel
hey, you sound cool. can I add you to my journal? kthx. ;-)
"PC Load Letter? What the $@#% does that mean?!"
This makes me wonder if we will see a slashdot article if slashdot ever goes down for more than 2 minutes.
Video Production Support
Update #2, 10:11 pm: So far so good. Things are checking out, but we're being paranoid. A few annoying issues, but nothing that's not fixable. We're going to be buying a bunch of rack-mount UPS units on Monday so this doesn't happen again. In the past we've always trusted Internap's insanely redundant power and UPS systems, but now that this has happened to us twice, we realize the first time wasn't a total freak coincidence. C'est la vie.
According to some LiveJournal employees, a massive UPS exploded. From IRC:
<rahaeli> As far as we can tell, a UPS exploded.
Their site now says that they're buying their own UPSes, because this is the second time that the entire data center has lost power. Details on the first outage can be found here (a Google cache since LJ is down).
For the paranoid: This has nothing to do with Six Apart buying LJ. They're still in the same "world-class" data center they've been in for years.
you want beer and pizza? email me an address/zipcode at the sig email and ill do my part to support restoring lj.
;)
if my wife cant post this weekend, im gonna hear about it. and not even be able to post my lj about getting yelled at about lj being down as if i caused the power outage myself.
not really.
well maybe.
Cheers.
This is my sig. There are many like it, but this one is mine.
"He wants to shut down the LiveJournal grid, Peter."
"You shut that thing down and we are not going to be held responsible for whatever happens."
Hey, thanks for all you're doing.
I might be impatient for lj to get back up, but I'd rather it be up and running right than it all being screwed up because you didn't wait and check everything.
Remember when teenagers were happy when people couldn't read all the personal details in their diary?
One line blog. I hear that they're called Twitters now.
Haha, yeah right. Using mysql, you're probably fucked. Switch to a real database - Postgresql.
"I have felt a great disturbance in the force; as if a million voices suddenly cried out in terror."
Those poor, poor children.
there is an article on a few russian sites stating that it was shut down because of a russian 'separatist' student group that wanted free transportation and was associated with anti-Putin statements. can anyone find that in english?
Dont Judge The situation by the Misfortunate. Goga.
i won't exaggerate if i tell that in recent years most of "social life" in .ru zone moved to livejournal.
it's 10 a.m. in russia now, and most of russian lj-addicts still don't know about apocalypse in lj.
i hope everything will be turned up in the nearest future. brad, we believe in you! :)
It was down for about a half hour, maybe a little longer. Most obnoxious for the colo facility that *is never supposed to go down*
If you look at his userpage here, you'll see he only posts a couple times a year.
Be relentless!
there was info in some russian online media that this outage was organized by russian officials who cracked down on opposition on the internet. conspirology rules :)))))
Or, as in the immortal cartoon "Dexter's Laboratory"...
Dee-Dee: OOOOOH! What does THIS button do?!?!?!?
Dexter: GET OUT AUF MY LAH-BOR-AH-TOR-EEE!!!!!
Knowledge is power. Knowledge shared is power multiplied.
Authorize.net (a fairly popular credit card gateway) is also an Internap client - I wonder how many sites (like ours) potentially lost revenue as a result of this outage.
http://www.theboyz.biz/
If you're not living on the edge, you're just taking up space!
What's so unusual about Brad posting here, I thought everyone posted here.
What Would Scooby Do?
i rly do think so this isnt a troll
Sure, go to www.livejournal.com when it's back up. It's fairly self-explanatory.
There are literally tons of conservative communities at LJ, it's one of the bigger conservative communities out there.
When $DAYJOB had a present from Sierra Pacific Power of a two-hour blackout, and we discovered there were major problems with our generator, the poor APC UPS batteries weren't able to hold up the 150 servers I run.
When the power came back on, we had 143 servers back on-line in ten minutes. We had 149 on-line in fifteen minutes. We had two servers (leased dedicateds) that required some file system repairs before they would come back on-line, but that task was finished 30 minutes after power restoration.
What's so hard about that?
(With the addition of a three-phase power transformer, our generator is working properly.)
Customers kept calling asking us why it took us so long!
Damn you fuckers move quick.
-- The doctor said I wouldn't get so many nose bleeds if I just kept my finger out of there!
Given the fact that a pyramid scheme is guaranteed to leave the vast majority of the people who get sucked into it with absolutely nothing, do you actually expect you have a good chance to get your free Mac Mini? What makes you luckier than the next guy?
Mod down posts with a "Free Mac Mini/iPod" sig, they're spam!
I mean having the uber-redundant, diesel-powered backup power in the server room fail.
Except the power didn't fail outside the server room, just inside it. There was a faulty breaker that died unexpectedly. Now, we had 100+ servers go down as a result of that, but we were pissed just the same, right along with about 20 other companies.
I'd post a link to the livejournal entry about the incident, but...
"No problem. I have the capacity to do infinite work so long as you don't mind that my quality approaches zero."-Dilbert
What bothers me is that you don't have separate data centres. I run a reasonably large web site, but it's nowhere near the size of LJ. Yet we have multiple geographic sites, so even if the (N+1) power fails completely in one hosting centre, we're only down on capacity, not out completely. I can't believe a site the size of LJ doesn't do the same...
"The invisible and the non-existent look very much alike." -- Delos B. McKown
Unless it means that the "cheapskate blog admins" were too cheapskate to buy proper dual-power supply boxes so that they can have dual power paths right to the servers.
You can have all the great redundant mains and backups you want, and it's for shit if you only have one power line to the system and that power bus loses juice.
Yeah yeah, it's funny and all, but it's pretty fucking uncool on a number of levels. People cutting themselves is really bad news; please don't make fun of it.
I've been waiting almost twelve hours to change one freaking word of the post, I added to my journal right before it went down. One f'ing word is all I want to change in an otherwise wonderful treatise, but because of the depth of the subject, the lesser word might detract. Is the damn thing ever going to come back? I'd like to go to bed!
LJ User for all of 45 days...
Ever heard of "dynamic content"?
Ever heard of "high update to read ratio"?
Do not pass go, do not collect $200.
Look, Perl rubs me the wrong way. I loathe it, and it makes me wanna hurl. More than that - it's Postgres that rocks my DB world. But personally, I think I'd at least read up on LJ's infrastructure before bashing it.
I mean they've got what? 2.5 million active users?
And how many hits are DB-backed?
Sweet fuck, man. How many servers do you think they're wasting? Assuming no redundancy (ha!), right now they're sitting at an approximate ratio of about 25,000 users per server! What morons they must be to not be squeezing more out of them. (And yes I know that I'm way oversimplifying, but... really?)
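For anyone who wants to check the back-of-the-envelope number: a tiny sketch using the figures from this thread (2.5 million active users, and a flat 100 servers assumed since the write-up only says "100+"). The function name is made up for illustration.

```python
def users_per_server(active_users: int, servers: int) -> float:
    """Naive ratio: ignores redundancy, caching tiers and uneven load."""
    return active_users / servers


# Figures quoted in the thread; 100 is an assumed round number for "100+".
print(users_per_server(2_500_000, 100))  # 25000.0
```

More servers than 100 only pushes the ratio lower, so 25,000 is an upper bound under these assumptions.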
What does this button d$#%* NO CARRIER
Hopefully as lambda switching becomes more common, it will be perfectly feasible to run a SAN spreading across 2 or more datacentres.
Update #3: 2:42 am: We're starting to get tired, but things are almost done. Unfortunately a couple machines had lying hardware that didn't commit to disk when asked, so InnoDB's durability wasn't so durable (though no fault of InnoDB). We restored those machines from a recent backup and are replaying the binlogs (database changes) from the point of backup to present. That will take a couple hours to run. We'll also be replacing that hardware very shortly, or at least seeing if we can find/fix the reason it misbehaved. The four of us have been at this almost 12 hours, so we're going to take a bit of a break while the binlogs replay... Again, our apologies for the downtime. This has definitely been an experience.
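For those wondering what "restore from backup and replay the binlogs" amounts to, here is a toy sketch of the idea in Python. It is not MySQL's actual mechanism or file format: the `Change` tuple, the integer positions and `replay_from_backup` are all hypothetical stand-ins. The point is only that the backup records a log position, and every logged change after that position gets reapplied in order.

```python
from typing import Dict, List, Tuple

# Hypothetical log entry: (log position, key, new value).
Change = Tuple[int, str, str]


def replay_from_backup(
    backup: Dict[str, str], backup_pos: int, log: List[Change]
) -> Dict[str, str]:
    """Rebuild current state from a backup plus the changes logged after it."""
    state = dict(backup)  # start from a copy of the restored backup
    for pos, key, value in sorted(log):  # apply changes in log order
        if pos > backup_pos:  # skip changes the backup already contains
            state[key] = value
    return state
```

With a multi-gigabyte log, the loop is the part that "will take a couple hours to run".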
I'm replying to myself and going to bed
It's funny how I just met with some Internap sales people a few months ago. They were bragging about how their network infrastructure was superior to most others, since it intelligently routes traffic to the path of shortest response (not hops).
They even bragged to me how their network uptime SLA is 100%! I mean good god, now I find out this is the SECOND time it's happened (from the livejournal update site)???
I'm glad I didn't go with them...
eTrade SUCKS
The comments seem to be full of contempt for the inane teenage-angst ramblings that are common on LJ. Come on. It's not like you are forced to read through this stuff.
I have a few "friends" there at LJ, some of them net.celebs, and I like their posts. It's the matter of whose writings do you find interesting, and you are free to be completely unaware of the rest. Why all the vitriol?
My exception safety is -fno-exceptions.
That's why you have redundancy in your payment gateways. Use two or more. Use anet and plugnpay.
Anet was hosed a few months ago due to DoS attacks. But all was good because we had a backup provider.
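A rough sketch of what "use two or more" can look like in code. Nothing here is a real gateway API: `GatewayError`, `charge_with_failover` and the gateway callables are hypothetical, and each callable is assumed to raise `GatewayError` only when the charge definitely did not go through.

```python
from typing import Callable, List, Sequence, Tuple


class GatewayError(Exception):
    """Raised by a gateway callable when a charge was not processed."""


def charge_with_failover(
    gateways: Sequence[Tuple[str, Callable[[int], str]]], amount_cents: int
) -> Tuple[str, str]:
    """Try each gateway in order; return (gateway name, receipt) from the first success."""
    errors: List[str] = []
    for name, charge in gateways:
        try:
            return name, charge(amount_cents)
        except GatewayError as exc:
            errors.append(f"{name}: {exc}")  # remember why this one failed
    raise GatewayError("all gateways failed: " + "; ".join(errors))
```

The caveat is the assumption above: if a gateway can time out after it has already taken the money, blind failover risks a double charge, so real setups usually need an idempotency key or a manual check.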
Speaking as a former teenage self-harmer, they only do it because they take it seriously and believe others will. Less therapy and more perspective would have helped me loads, even if some of that perspective came from strangers' shitty jokes :)
In South Korea, only Old People blog on LiveJournal
s'wut i sed.
http://www.cafepress.com/blogwhine
:)
Can't whine about it in your blog if your blog isn't there for you to whine about it in.
I can just imagine the huge pile of traffic that LiveJournal is going to get hit with once everything *does* come back up online.
Hrm. Is there any way that they can blame this on Microsoft?
Unfortunately a couple machines had lying hardware that didn't commit to disk when asked, so InnoDB's durability wasn't so durable (though no fault of InnoDB).
Um, yeah. That happens when you configure the raid cards for write-back instead of write-through but forget to buy the cards with batteries.
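To make "commit to disk when asked" concrete, here is a minimal sketch of the asking side, using Python's `os.fsync`. The helper name `durable_write` is made up. The catch this thread is about sits below this code: `fsync` is a request, and a controller with a write-back cache and no battery can report success before the data is really on the platters.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and ask the OS to flush it to the storage device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # blocks until the OS says the device has the data
    finally:
        os.close(fd)
```

A database does the same kind of call on its log before acknowledging a commit, which is why hardware that lies about it breaks durability through no fault of the database.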
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
I have more arguments, but I'll let it drop. No sense in arguing over something that neither of us can prove ;-)
I'll admit though, LJ is a major undertaking and they have produced some nice code for the community at large to use.
I seem to remember that a few years back they had a similar problem (Internap lost all power) and it turned out that some idiot had hit the big red "shut down all power to the entire datacenter" emergency button. This isn't the first time this has happened, and last time it wasn't under Six Apart's management.
I'd say it's Internap's incompetence that caused this problem. If they can't keep their datacenter running even though they have multiple redundant power supplies then something is very wrong. I see from the outage page that LJ people are now planning to buy their own UPS so that they don't have to trust Internap anymore.
For power outages, my house has a better record than Internap right now, and I don't even own a UPS!
I wonder if this was the same outage that took down Geocaching.com? Talk about your worst case scenarios...
Happiness is like peeing yourself. Everybody can see it but only you can feel its warmth.
You can read Brad's presentations on LiveJournal's setup. The LISA one is the most recent, I think.
Ironically, I had just finally got around to acquiring a consumer-grade UPS for my own system, installed it, and posted on LJ about it shortly before all of this happened. Go figure.
This frightens me. Now they will have to actually live a life rather than write about one. I like the way LJ & AOL keeps them online. Or maybe, hopefully, they are crying & hitting refresh...
It looks like you're running on Linux...what filesystem are you using? Reiser(4(?)) would be a big help here.
ROMANES EUNT DOMUS
According to the Russian-Israeli news service MigNews (http://www.mignews.com/news/technology/world/15010 5_54115_61571.html/), a student group, "Going without Putin", advocating the preservation of the existing public transportation discounts and army draft delays for students, got their website http://www.idushiespb.narod.ru/ busted.
When they moved their discussions to LiveJournal, LJ got its UPS explosion. This might be paranoia if LJ had not been a major meeting point for students during the recent Orange Revolution in Ukraine that gave the Kremlin a black eye.
However, this outage would never have happened, KGB or not, if Brad had been using some mature high-availability suppliers like Stratus, which AOL used to use, or Tandem. One should not be using unix hacks, clustered or not, for a system that requires high availability. Hire me Brad, I'll fix it for you!
People posting news reports to /. should refrain from posting kneejerk assumptions/opinions such as this one:
"Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?"
Unless you've researched the cause of the downtime and researched what happened, how it was resolved, etc, etc, leave your comments out.
I don't post here. Oh, wait...
Pope Felix the Scurrilous.
Computer Geek by day, religious Icon by night.
Internap has been hosting LJ since long before Six Apart thought about taking over LJ. If you bother to read you'll know this.
"There is a way that seems right to a man, but its end is the way of death." Proverbs 16:25 (NKJV)
Anyone know of other sites that have been affected by the outage?
OMG I LOVE BRAD!!!!
If you get it, can you pass it on? I've done a whip-round for a gift basket and we're uncertain of the address we've found...
copperbadge@gmail.com (lj user copperbadge)
2. Not filled with teenage girls and emo boys. Despite the fact I like emo...a bit...
3. OK, so maybe there are teenage girls on Blogger but they don't go "OMFG LYK i GoT kIsSeD bY jOhN tOdAy!!!111"
4. You get a subdomain, not some crappy
5. It can be exported to sites.
6. It never went through a temporary phase in which you had to buy a damn "invite code" or get one from a fucking friend.
Also according to Wikipedia:
In America, you spam computers In Soviet Russia, computers spam you!
Does every frickin' LiveJournal thread on Slashdot have to devolve into a bunch of losers insisting blogs are a contribution to society?
It's always the same defense, too. "LiveJournal lets me keep everyone apprised of my life, without sending mass emails." Like that's new. When I was a kid -- before email, jackasses -- there were people like you. They mailed "family updates" every Christmas, long letters about what they named their dog's puppies and who their cousin saw in Orlando. We mocked those people. No one took them seriously, and their "updates" always ended up in the trash.
No one cares about your life. When you graduate college, send us a letter. When you get married, send us an invitation. Those are landmark events. But no one -- NO ONE, even your family members who pretend to read your LiveJournal -- wants daily, weekly, or even monthly updates on your life. Not via snail mail, not via email...and not even via one, convenient website.
Yeah, the hosting center that they've been at for years has a power failure 3 days after a new place buys LJ. Must be Six Apart's fault, they're not "ready to handle this." Or, maybe it's a coincidence?
That's funny, I see subdomains on LJ.
/., killer. I hope it keeps you warm at night, loser.
And you're making it seem like the fact that it's mostly female is a bad thing for an online community, eh? You'll fit in quite well at
I just wanna rant about the doofus who followed me around my mom's condo building muttering "9/11", "you just don't get it", "do you live here", etc.
Ok, there, whew.
Mix the failings of Usenet with the shortcomings of the World Wide Web and the result is slashdot.
Didn't they think, after it first went down with Internap, that they should switch? Honestly, servers these days...
I KNEW that this story would be on slashdot, and I knew someone would make a crack at Six Apart (who are brand new to the scene) and imply that somehow magically the power wouldn't have gone out if it weren't for Six Apart owning Livejournal.
Hi. Perhaps it's more than just the power failure? Perhaps BradFitz's last cached LJ entries may hold some additional clues as to why it's taking so long after the power failure to bring LJ back up? Yes, perhaps they do. Enjoy:
Jan. 12th, 2005 @ 02:31 am
*yawn*
It's one of those database nights. Watching 3 sets of progress bars move way too slowly. Wish there were something on Tivo at least to kill some time.
Jan. 11th, 2005 @ 09:15 pm
smart migrator
I just wrote a database migration script which, after each chunk of data moved, asks the load balancer (Perlbal) the free user queue depth. If more than 10 (less is just noise), it sleeps a second and asks again. Only once the queues are empty does it migrate more data. End result: it moves data as fast as it can, without affecting page response times. I've been meaning to make a generic wrapper for any utility, where the wrapper parent watches the load balancers, and the child does its work at full speed, but the parent will occasionally SIGSTOP/SIGCONT it...... haven't got around to that yet. The generic wrapper could even have another pluggable-child as its rate-limit determiner, so anybody could use it.
Jan. 11th, 2005 @ 01:24 pm
Parallel compression
Are there any multi-threaded compression algorithms, or at least wrappers/formats for interleaved compression, with variable interleave size? It'd be nice to take advantage of multiple processors when gzipping 380 GB, while still doing sequential reads, even if the resultant file was non-standard.
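The "smart migrator" entry describes a simple back-pressure loop, and it might look roughly like the Python sketch below. This is not Brad's script (his would be Perl talking to Perlbal); `migrate_throttled`, `move_chunk` and `queue_depth` are hypothetical names, and the load balancer query is passed in as a plain callable. The threshold of 10 and the one-second sleep come from the entry itself.

```python
import time
from typing import Callable, Iterable, List


def migrate_throttled(
    chunks: Iterable[List[str]],
    move_chunk: Callable[[List[str]], None],
    queue_depth: Callable[[], int],
    max_depth: int = 10,
    poll_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Move chunks one at a time, pausing while the balancer queue is deep."""
    moved = 0
    for chunk in chunks:
        move_chunk(chunk)
        moved += 1
        # After each chunk, ask the balancer; anything at or under
        # max_depth is treated as noise, anything above means back off.
        while queue_depth() > max_depth:
            sleep(poll_seconds)
    return moved
```

The generic-wrapper idea from the same entry would keep the polling loop in a parent process and swap the sleep for sending SIGSTOP and SIGCONT to a child, so the child needs no changes at all.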
LJ has always been a somewhat cash-starved operation; they make a significant amount of money from their paid users, but they also have a lot of expenses--full-time employees, an ever-expanding user base on a technology that isn't easy on hardware, bandwidth use...
As it is, most (all?) of their employees are in Portland, so they keep all their servers there, where they can quickly get at them if something happens. Having a second datacenter would be hard on their employees, hard on their budget, and hard on their architecture--for a site that, in the end, isn't critical to have running 24/7.
Hey, you try to find an open nick these days!
Cut the poor bastards some slack, at least they have the excuse of "teen hormones".
Nothing, on the other hand, can excuse Taco's lame blog:
Why is it that my personal value as a human being is always tied 100% to the status of my server? Since last week the box has been cranky (a blown power supply resulted in the hard drive being happily moved to a machine with 128 megs less RAM), which means the whole thing is just sluggish as hell today. And suddenly I feel like shit. I feel tired, unhealthy, and burnt out. A few weeks ago, I was on top of the world: the machine was stable, kicking out 640,000 pages in one day, and performing snappy for everyone. And I was cheerful. It's really strange that a chunk of steel and silicon 3 time zones away defines my mood.
The airline lost my luggage... it contained 4 pairs of boxers.
So ya know that annoying ad with the damn taco bell dog and the cops that keep saying 'Drop the Chalupa' over and over again? I hate that ad.
Good gawd... Taco can put most teenage girls to shame when it comes to lame personal details publicized.
Wow, too many LiveJournal haters here. People, you're wasting your time. What's so wrong with LJ, anyways? There are all types of people that use it, not just 13 year old girls that can't spell, so don't be so goddamned ignorant. And at least it's not as slow as this lame site.
Personally, I'm finally switching from postgres to mysql after 8 years of happy use of the former because it's finally let me down.
They are hoping to have limited capacity on the site in a few hours. They have not slept very much and called Six Apart the minute everything went to hell. Plans for the holiday weekend for the Fitzpatricks went to hell and no one has had much rest. They are testing everything and all the rumors out there are just that - rumors. Mena and her crew were notified the minute it went down. So the people calling up LJ's customer service threatening to slit their throats, saying Six Apart got punk'd, and everything else - can you be more emo? Come on now, if you need to journal that bad - head over to GJ just to get your fix. Otherwise, give them time. This has only happened one time before and that was confirmed by Sandy when she called me back (I called asking for an interview and more information as I am doing a story on blogging and this outage for a site I write for). My god, you would think someone is killing kittens or something the way people are crying in chat rooms. http://www.mandelion.com
OK...I just couldn't resist doing this...In keeping with the American tradition of compassionate response to disaster...
Get your souvenir T-shirts and coffee cups HERE!
this is loaner...my sig is in the shop
Those LiveJournal users are everywhere!
Yes, we do want our fix.
Go, Team LJ, Go!
Most of the employees are in Portland, but there are at least three who work remotely from other parts of the US. The servers are physically hosted in Seattle, and the two sysadmins (Lisa and Matthew, iirc) live and work there.
LJ user Emmavescence, too lazy to sign up...
A-ha! Our credit card processor -- Authorize.Net also went down hard yesterday. I did a quick traceroute just now and see that they're also located at Internap.
:)
Whoops
Now somebody's going to start a ribbon campaign.
It's back up. Unless you're on Filet Mignon or Madcow (like myself.)
you are such an ass
Quite possibly, the "ass" has never managed much of anything and he may have cost us further updates. Plus, he insulted a good guy. Perhaps "ass" isn't the right word and he's more of a punk...
http://www.livejournal.com/community/artinnudit
Been a bad 24 hours without LJ :(
It's back now though, yay! :D
Speaking as a former teenage self-harmer, they only do it because they take it seriously and believe others will.
Though perhaps not everyone is the same as you. In my experience, many if not most self-harmers (both teenage and older) do not do it because of what they think others will think of it.
OMGWTFLOL!!!1!1!!! me2. :-p
Bizarrely enough we had major downtime for both WoW and LJ this week.
Maybe it's a conspiracy to see if they can force all the computer geeks out of the house and into the sunshine for a few hours this month?
Sara
Designer, Gamer, Macgrrl in an XP World
True, one man can't control it. One man with a backhoe, however, is a different story entirely. ;)
He who can destroy a thing, controls a thing.
The REAL jabber has the user id: 13196
What you do today will cost you a day of your life
Perhaps Six Apart wasn't quite prepared for the responsibilities of a website of this size?
Last time I checked, LiveJournal wasn't experiencing 503 Service Not Available error downage for 1 - 2 hours every day (unlike certain other web sites).
can we say slashdot effect?
Speed/Duplex negotiation is an OS configuration issue, not a hardware NIC issue.
If the OS can't configure the negotiation, that's still the OS, not the hardware. It just means that the driver isn't capable of properly configuring the NIC. Just because your workaround was in hardware, does not mean that that is the cause of the problem.
As for adding your own UPSes which ignore the EPO, surely that defeats the object of the EPO? I don't know USAian requirements, but if, as you say, the EPO is required, is it legal to bypass it with your own UPSes?
LJ clearly have not heard of DR; although a true DR configuration is probably overkill for this type of site, this report gives the strong impression that basic sysadmin competencies were not followed when there was time available - during design and deployment, and then later during normal running. These problems had apparently not occurred to anyone until it happened. Isn't "what's the worst-case scenario" a common-enough question? Wouldn't "total power failure" be one of those answers?
Even with write-through caches, a small battery in the array can flush data to disk after a power failure. This isn't rocket science - buy the right kit for the job, understand what you're buying, and how to configure it. If you don't understand what it is, what on earth made you decide to buy it?!! You've got dual-powered systems, but didn't use that feature - why did you buy it then? It wasn't a conscious decision to take the risk, and it wasn't a conscious decision to get dual-powered hardware for resilience. No thought was given to power. Most colos provide dual-sourced power supplies for this type of problem - power from separate grids, so even if the grid providing power to the datacentre goes down, the alternate grid continues running.
Sigh. I often have customers nearly as daft as this, though I don't think I've come across such a poorly considered deployment for a long, long time.
Author, Shell Scripting : Expert Re
It isn't that someone did a breakfast journal. I could see someone doing that as a joke. It doesn't shatter my worldview.
What gets me is the sheer volume of comments! Not just from a small group of people, but if you look, you see lots of different people doing it.
That DOES shatter my worldview.
Slashdot. It's Not For Common Sense
oh the nostalgia...
the ohnosecond as the 50Hz and 400Hz supplies go, then the lights, then a short pause as the brakes on the 3380 HDAs' belt drives kick in, literally clank clank clank right through the disk farm..
the aircon isolation switch was next to the EPO's and our site sparky was due to do Liebert maintenance...
We had some nice simple technology (which of course was only useful if you lost external power....) a room full of hundreds of car batteries in series.. as a last ditch backup in case gennys and thycon both failed.
Facebook is a woodpecker tapping on the skull of Humanity, Forever.