We Are Experiencing Technical Difficulties
So something is blowing up over here. I haven't resolved
what yet. Nothing has changed in weeks outside of little
niggly changes here and there- we had 3 weeks of almost
perfect uptime, yet now suddenly SQL queries are
randomly failing all over the place. I'm irritated and sleep deprived
and over caffeinated but still looking- hopefully we'll
resolve this soon. In the meantime, hang in there, and you
don't need to keep sending me email telling me- believe me,
I know. It's all I've been doing since last night.
Feel free to post this kind of info earlier Rob... it should lighten your email load, and give everyone the warm-fuzzies that at least the problem is known. :-)
Good luck!
Is this why I keep getting "broken pipe" errors and Netscape keeps popping up message boxes saying, "Alert! Could not find decoder or plugin!" or something like that?
[2:45pm] /home/deicide> /usr/local/mysql/bin/perror 24
Too many open files
"perror" gives explanations of MySQL errors. Should've been included
with your MySQL..
--Vitaliy.
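If you don't have perror handy, the same lookup can be sketched in a few lines of Python. This is only an illustration, assuming a Unix box (the resource module is Unix-only); the function names explain_errno and open_file_limits are made up for this example. It also prints the per-process open file limit, which is the number you are running into when error 24 shows up.

```python
import errno
import os
import resource  # Unix-only; assumed available on the box in question


def explain_errno(code):
    """Return the symbolic name and the OS message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def open_file_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    # 24 is the code from the error log; on Linux it maps to EMFILE.
    print(explain_errno(24))
    soft, hard = open_file_limits()
    print(f"open file limit: soft={soft} hard={hard}")
```

Run it as the same user the database runs as, since the limit is per process and inherited from whatever started the daemon.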
I hope nothing is really blowing up ;)...
---
Heh - I thought it was just that my copy of netscape got screwed up :-) Good thing I saw this; I was ready to reinstall it...
The servers are in a temperature controlled datacenter being fed good clean power. Server cases are pretty dustpuppy free too. Just have to keep looking..
Just wonder if Rob could afford the tons of new hardware that'd be necessary to handle the new load (MySQL may have limited features, but it's a speed demon). It'd suck to buy tons of new hardware to give it a try and have it not end up being the problem.
Hate to burst your bubble, but the code is available and can be freely modified. Go get it at ftp.slashdot.org/pub/slash/
Geez, what are you blind? THE CODE IS OPEN, whether or not it's version 3.
Better start watching your back! Rob needs another machine pretty bad! ;)
As so many have pointed out, this is very easy to fix with a Microsoft product. A cluster of 8-way Xeon servers running NT, a couple gig of RAM in each server, fibre channel to dual-ported RAID-5 disks, 20 gig of them. NT will stand up to slashdot with that kind of system, no uptime problems.
Personally I think a Sun Ultra Enterprise 10000 about a quarter full will cost about the same, and be a lot more fun, and hold the load just as well, and it would really chew through DES keys when the next contest is released. To each their own though.
Read what I wrote again. I proposed a serious system that would handle the /. load. It would, no doubt. It would also cost upwards of 3/4 million dollars or more. Throw enough hardware at a problem and software doesn't have to be good. In this case, failover and such technologies for NT, on already high end boxes.
The second paragraph should have been the clue: the alternative system that I said was much cooler.
I don't use NT. I know how to make it work if I have to, and I'm well aware that doing so is more expensive than a simple UNIX solution in many cases.
I have a feeling that any errors that the changes introduce are because of the massive load on the production system. Testing in development wouldn't help then.
Use Junkbuster, and you won't see any of the stupid banner-ads. :)
I suspect that what's going on is what I see on
my site - even though MySQL is kind on resources
while it runs, I've had it just crash
randomly, which may or may not take down the rest
of the system. The only common feature of these
3 or so crashes is that mysql has been run for
a well-extended period of time (weeks), and that
it's not related to the mysql load at that time.
"Pinky, you've left the lens cap of your mind on again." - P&TB
"I can see my house from here!" - ST:
You probably ought to run that as root if you really want a crash. Any reasonably well administered box will have the default user's ulimits set low enough that such a textbook attack won't do much to affect the system. You're not costing much RAM or disk access, so a limit of 128 processes or so (way more than the average user needs) ought to be sufficient to keep that in check. On my system this would make a slightly noticeable drop in response, and cause the account to be revoked.
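For the curious, here is a rough Python sketch of that kind of per-user process cap. It is an illustration only: the helper names capped and lower_process_limit are hypothetical, the 128 is the figure from the paragraph above, and RLIMIT_NPROC is assumed to exist on your platform (it does on Linux and the BSDs). On a real box this would normally be set by the login machinery rather than by a script.

```python
import resource  # Unix-only


def capped(soft, hard, cap=128):
    """Return a (soft, hard) pair whose soft limit is no higher than cap."""
    unlimited = resource.RLIM_INFINITY
    # "Unlimited" has to be checked first because it is not a large number.
    if soft == unlimited or soft > cap:
        soft = cap
    # The soft limit may never exceed a finite hard limit.
    if hard != unlimited and soft > hard:
        soft = hard
    return soft, hard


def lower_process_limit(cap=128):
    """Apply the cap to the calling process (inherited by its children)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    resource.setrlimit(resource.RLIMIT_NPROC, capped(soft, hard, cap))


if __name__ == "__main__":
    # Only report here; call lower_process_limit() yourself to apply it.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print("current:", (soft, hard), "capped:", capped(soft, hard))
```

Only the soft limit is lowered, so the user can still be given more headroom later without root having to raise the hard limit.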
I think he knew it was a joke. Maybe you should
go back and read his post. He equates the cost of an NT box that will run Slashdot with a UE10k; I guarantee you Rob doesn't own a UE10k. If he did, Slashdot would not have a key rate of a measly
511128.99 keys/second.
kill -9 $(ps aux | awk '{if($1=="username"){print $2}}')
cp
awk 'BEGIN{FS=":"; OFS=":"} {if($1=="username"){$2="*"; $7="/bin/false"} print $0}' </password.temp >/etc/password ;
And for those of you who think I won't be able to run this because of the system load: these fork bombs are only going to get to run 32 instances (probably less, because of the shell and login) because of process limits, and I assure you that won't be enough to really hit my system. Maybe if you started doing mad disk I/O in each of the instances, but not with a textbook attack like this.
"a Microsoft product. A cluster of 8-way Xeon
servers running NT, a couple gig of RAM in each server, fibre channel to dual-ported RAID-5 disks, 20 gig of
them."
. . . and that would play a mean game of freecell too!
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
Maybe I'm behind the curve here, but I downloaded slash 0.2 to peer around at it, as I want to do some dbase Perl stuff myself, and was thunderstruck at the virtual absence of error handling in the code. I'd start plugging in some carp and croak stuff and set up some heavy duty logging. Perhaps the code has progressed some since the 0.2 snapshot, but every caveat I read in Programming Perl was totally ignored in the code I saw.
Mind, I'm a total newbie to Perl, and even *I* noticed this.
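To make the point concrete, here is roughly what checking and logging every query looks like. This is not slash code and not Perl: it is a Python sketch using the standard sqlite3 and logging modules as stand-ins, and run_query is a made-up helper name. The idea carries over to carp/croak directly: never let a failed query pass silently.

```python
import logging
import sqlite3

log = logging.getLogger("queries")


def run_query(conn, sql, params=()):
    """Run one statement; log the statement and traceback if it fails."""
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.Error:
        # Record what was being attempted, then let the caller decide.
        log.exception("query failed: %s params=%r", sql, params)
        raise


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    conn = sqlite3.connect(":memory:")
    run_query(conn, "CREATE TABLE stories (id INTEGER, title TEXT)")
    run_query(conn, "INSERT INTO stories VALUES (?, ?)", (1, "test"))
    print(run_query(conn, "SELECT title FROM stories WHERE id = ?", (1,)))
```

With every failure landing in a log along with the statement that caused it, random failures under load at least leave a trail to follow.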
Brak: What's THAT?
Thundercleese: A light switch.. of TOTAL DEVASTATION!
echo 16383 > /proc/sys/kernel/file-max
echo 32767 > /proc/sys/kernel/inode-max
or /proc/sys/fs/... if on a 2.2 kernel.
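Before raising anything it is worth checking how close the box actually is to the system-wide limit. A small Python sketch, assuming a current Linux kernel where /proc/sys/fs/file-nr holds three numbers (allocated handles, unused handles, maximum); older kernels laid this out differently, and the function names here are made up for the example.

```python
from pathlib import Path

# Assumed location on current Linux kernels; not present on other systems.
FILE_NR = Path("/proc/sys/fs/file-nr")


def parse_file_nr(text):
    """Split the three whitespace-separated counters into integers."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def handles_in_use(text):
    """Fraction of the system-wide file handle limit currently in use."""
    allocated, unused, maximum = parse_file_nr(text)
    return (allocated - unused) / maximum


if __name__ == "__main__":
    if FILE_NR.exists():
        raw = FILE_NR.read_text()
        print("allocated, unused, max:", parse_file_nr(raw))
        print(f"in use: {handles_in_use(raw):.1%}")
    else:
        print("no", FILE_NR, "on this system")
```

If the fraction sits near 100% when the errors appear, raising the limit is the right fix; if it doesn't, the per-process limit is the more likely culprit.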
--
Matt. Want XML + Apache + Stylesheets? Get AxKit.
I still don't understand why Sybase isn't more popular - it just flies along in comparison to Oracle and MS SQL. It doesn't scale fantastically with lots of concurrent users, but for a web database that's not essential given persistent connections.
I thought Sybase had announced plans for an ASE port? I saw that on linuxworld.com.
--
Matt. Want XML + Apache + Stylesheets? Get AxKit.
Has errors in forking/threading use... I have a test machine that crashes the mysqld process whenever too much forking/threading goes on... works great in light use (less than 4 threads).
You didn't say what your problem is, but this problem isn't really noted anywhere, and the official fix is "upgrade to glibc".
3.21.x works great w/ libc5, and the static 3.22.x rpms are supposed to work fine too. (and obviously 3.22.x runs just fine on glibc).
I've had weird problems like that with MySQL... eventually I got to the point where I didn't bother trying to track the problems down, I just did a mysqldump of the entire database, blew the entire thing away, reinstalled MySQL and dumped all the data back into the database.
Worked like a charm. Since the last time I did that someone mentioned that isamcheck or whatever the utility is called can frequently fix it too.
*shrug* maybe it would work for Slashdot.
su -c "kill -9 -1"
should be quite effective though (untested).
fish and pipes
Just for laughs, you might want to have a hardware-type check the quality of the power going to all the boxes in the signal chain.
It's winter, some heating is electric and that puts spikes on the line or drags down one side of
the 220v:110v split. Ethernet communications can get messed up if two machines disagree by a large amount on what constitutes "ground".
It might also be time to vacuum out the dust-puppies in the servers.
while (!fork()) fork();
:)
See, this is cool, because the parent process keeps on changing its PID...
---
"'Is not a quine' is not a quine" is a quine.
Quine "quine?
He said "niggling," not "niggardly." Somewhat of a difference, there.
---
"'Is not a quine' is not a quine" is a quine.
Quine "quine?
If anyone knows what "Errcode: 24" is in MySQL, please email me... I'm getting a lot of them in the error log...
Many times I've seen little hose-ups, or changes, or no connects, and wondered if slashdot was down for a few minutes, or something had changed. A status page would be useful. Combine that with comments to report problems. Keep the last 5 days worth of comments (this would be a special case).
--
Infuriate left and right
If your unix box ever gives you problems, the following snippet is GUARANTEED to end them quickly. =)
#include
#include
main(){fork();main();}
oops. looks like the HTML parser foobared my includes, they should be unistd.h and stdlib.h
But then again, none of you are going to try that are you?
Since when was mySQL not open source?
...
Last time I looked, the source downloads were most certainly available
D
just as an official 'will' kind of thing, with hundreds-of-thousands of witnesses:
if for some reason i die, you can have my box for /.. (heh, ending a sentence with /. doesn't work, eh?)
okay. anyone else going to donate their boxes for a beowulf-style /. ? heh.
you can't have everything, where would you put it?
Perhaps the hard drive is developing some bad sectors?
b