Introducing The New Slashdot Setup
The original version of this document was written by Andover.Net Alpha Geek Kurt Grey. The funny jokes are his. The stupid jokes are mine.
The Backstory
We realized soon that our setup at Digital Nation was
very flawed. We were having great difficulty administering the machines and
making changes. But the real problem was that all the SQL traffic was flowing
over the same switch. The decision was made to move to Exodus to solve these
problems, as well as to go to a provider that would allow us to scatter
multiple data centers around the world when we were ready to do so.
Meanwhile Slashcode kicked and screamed its way to v1.0 at the iron fists of Pudge (Chris Nandor) and CaptTofu (Patrick Galbraith). The list of bugfixes stretches many miles, and the world rejoiced, although Slashdot itself continued to run the old code until we made the move.
The Co-Loc
Slashdot's new co-location site is now at Andover.Net's own (pinky
finger to the mouth) $1 million dedicated datacenter at the Exodus
network facility in Waltham, Mass., which has the added advantage of being
less than a 30-minute drive for most of our network admins -- so they don't have to fly
cross-country to install machines.
We have some racks sitting at Exodus. All boxes are networked together through a Cisco 6509 w/ 2 MSFCs and a Cisco 3500 so we can rearrange our internal
network topology just by reconfiguring the switch. Internet connectivity
to/from the outside world all flows through an Arrowpoint CS-800 switch (which
replaced the CS-100 that blew up last week), which acts as both a
firewall and a load balancer for the front end Web servers. It also so happens that
Arrowpoint shares the same office building with Andover.Net in Acton so
whenever we need Arrowpoint tech support we just walk upstairs and talk to the
engineers. Like, say, last week when the 100 blew up ;)
The Hardware
- 5 load balanced Web servers dedicated to pages
- 3 load balanced Web servers dedicated to images
- 1 SQL server
- 1 NFS Server
All the boxes are VA Linux Systems FullOns running Debian (except for the SQL box). Each box (except for the SQL box) has LVD SCSI w/ 10,000 RPM drives. And they all have 2 Intel EtherExpress 100 LAN adapters.
The Software
Slashdot itself is finally running the latest release of Slashcode (it was pretty amusing being out of
date with our own code: for nearly a year the code release lagged behind
Slashdot, but my how the tables have turned).
Slashcode itself is based on Apache, mod_perl and MySQL. The MySQL and Apache configs are still being tweaked -- part of the trick is to keep the MaxClients setting in httpd.conf on each web server low enough to not overwhelm the connection limits of the database, which in turn depend on the process limits of the kernel, which can all be tweaked until a state of perfect zen balance has been achieved ... this is one of the trickier parts. Run 'ab' (the apache bench tool) with a few different settings, then tweak SQL a bit. Repeat. Tweak httpd a bit. Repeat. Drink coffee. Repeat until dead. And every time you add or change hardware, you start over!
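The balancing act above comes down to simple arithmetic: in the worst case every busy Apache child on every page server holds a database connection at the same time. Here is a minimal sketch in Python, assuming one connection per child (an assumption, not a documented Slashcode detail). The function names and the numbers are hypothetical, not Slashdot's actual settings.

```python
def max_clients_per_server(db_max_connections, web_servers, reserved=10):
    """Largest MaxClients value that cannot exhaust the database.

    Assumes each Apache child holds at most one database connection.
    `reserved` keeps a few connections free for admin and batch jobs.
    """
    if web_servers < 1:
        raise ValueError("need at least one web server")
    usable = db_max_connections - reserved
    if usable < web_servers:
        raise ValueError("database limit too low for this many servers")
    return usable // web_servers


def worst_case_connections(max_clients, web_servers):
    """Connections opened if every child on every server is busy."""
    return max_clients * web_servers


if __name__ == "__main__":
    # Hypothetical figures: 5 page servers, database capped at 260 connections.
    limit = max_clients_per_server(db_max_connections=260, web_servers=5)
    print("MaxClients per server:", limit)
    print("Worst case:", worst_case_connections(limit, 5))
```

The result is only a starting point for the 'ab' runs described above. Adding a web server or changing the database limit means recomputing it, which is the "start over" part.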
The Adfu ad system has been replaced with a small Apache module written in C for better performance, and that too will be open sourced When It's Ready (tm). This was done to make things consistent across all of Andover.Net (I personally prefer Adfu, but since I'm not the one who has to read the reports and maintain the list of ads, I don't really care what Slashdot runs).
Fault tolerance was a big issue. We've started by load balancing anything that could easily be balanced, but balancing MySQL is harder. We're funding development efforts with the MySQL team to add database replication and rollback capabilities to MySQL (these improvements will of course be rolled into the normal MySQL release as well).
We're also developing some in-house software (code named "Oddessey") that will keep each Slashdot box synchronized with a hot-spare box, so in case a box suddenly dies it will automatically be replaced with a hot-spare box -- kind of a RAID-for-servers solution (imagine... a Beowulf cluster of these? *rimshot*) Yes, it'll also be released as open source when it's functional.
Security Measures
The Matrix sits behind a firewalling BSD box and an
Arrowpoint Load balancer. Each filters certain kinds of attacks and frees up
the httpd boxes to concentrate on just serving httpd and allows the dedicated
hardware to do what it does best. All administrative access is made through a
VPN (which is just another box).
Hardware Details
Type I (web server)
VA Full On 2x2
Debian Linux frozen
PIII/600 Mhz 512K cache
1 GB RAM
9.1GB LVD SCSI w/ hot swap backplane
Intel EtherExpress Pro (built-in on moboard)
Intel EtherExpress 100 adapter
Type II (kernel NFS w/ kernel locking)
VA Full On 2x2
Debian Linux frozen
Dual PIII/600 Mhz
2 GB RAM
(2) 9.1GB LVD SCSI w/ hot swap backplane
Intel EtherExpress Pro (built-in on moboard)
Intel EtherExpress 100 adapter
Type III (SQL)
VA Research 3500
Red Hat Linux 6.2 (final release + tweaks)
Quad Xeon 550 Mhz, 1MB cache
2 GB RAM
6 LVD disks, 10000 RPM (1 system disk, 5 disks for RAID5)
Mylex Extreme RAID controller 16 MB cache
Intel EtherExpress Pro (built-in on moboard)
Intel EtherExpress 100 adapter
I'm astonished that nobody made a remark about the RAID setup so far:
Usually RAID 5 and a database is a pretty bad match, because RAID 5 read/write performance is lower than RAID 0, 1, 0+1 read/write performance...
Is it an exception to the rule or hasn't anybody in the slashdot crew taken a database course?
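The RAID concern above can be put into rough numbers with the usual rule-of-thumb write penalties (1 for RAID 0, 2 for RAID 1 and 0+1, 4 for RAID 5). The sketch below is a back-of-the-envelope estimate only: it ignores the controller cache entirely, and the per-disk figure of 120 IOPS is an assumed value for illustration, not a measurement of these drives.

```python
# Back-end disk operations needed per random write, by RAID level
# (rule-of-thumb values; real arrays with caches behave better).
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid0+1": 2, "raid5": 4}


def random_write_iops(level, disks, iops_per_disk):
    """Estimate random-write IOPS for an array, ignoring any cache."""
    if level not in WRITE_PENALTY:
        raise ValueError("unknown RAID level: " + level)
    return disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    per_disk = 120  # assumed IOPS for one drive, for illustration only
    print("RAID 5, 5 disks:  ", random_write_iops("raid5", 5, per_disk))
    print("RAID 0+1, 6 disks:", random_write_iops("raid0+1", 6, per_disk))
    print("RAID 0, 5 disks:  ", random_write_iops("raid0", 5, per_disk))
```

Reads carry no such penalty, so how much this matters depends on the read/write mix of the database.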
FWIW, I think a central NFS server is just pointless. Why on earth have multiple HTTP frontends and then make a huge bottleneck by forcing them to share a single filestore over NFS, a protocol which is less than impressive.
Batch pushes onto local storage on the HTTP machines wins hands down as far as I'm concerned.
And, as others have said, if you must use central storage for files, you may well be better off with SMB than NFS.
-----
- Their physical security is crazy (biometric scanners for the guards).
- They are in a non-descript building that doesn't look like it hosts millions of dollars in websites.
- They have UPS, generators, etc.
- As someone mentioned earlier, they have all kinds of connections to all kinds of places (not just a single T3 to one place).
- They save you the trouble of building all that shit yourself.
Duh.
Jordan
Ok, so now that we have established the fact that MySQL isn't beefy enough, why was the choice made to go with MySQL instead of Oracle or PostgreSQL... etc? Just because the code was written for MySQL, or for some other reason that eludes us all?
-S
Scott Ruttencutter
We Apprentice Developers and Designers
I thought it would be used for load balancing of the web servers?
Erik,
The site receives somewhere between 1 and 2 million page views a day. The mySQL server has to be 'beefy' to take that kind of load under mySQL while we wait for replication to be stable and complete.
If you ask "why not __X__" for the database.. our brave programmers are working on a general abstraction layer to allow different databases to be used with slash that is NotReadyYet(tm). When that is complete, people will be able to port slash to the database of choice. For now, the only db supported is mySQL.
Martin B.
The idea is if the system ever crashes, it will automatically boot off a CDROM, format the hard drive, and install a new copy of the operating system with your modifications to it. This is not a good solution for a database computer, but for any of the web servers that contain nothing dynamic themselves, it's a great solution. If someone hacks the internal server, just reboot.
We also did the same thing with the multiple datacenters. The computers talked to each other over a VPN network, backed themselves up to one computer that was ONLY accessible over the VPN network, etc. Then from the location of the "secure" servers (only on VPN) we would write and test new versions of our distribution (modified Slackware). Then just burn it to a CDROM and mail it out to the systems around the country. Put it in each system, and again reboot. Drop me an e-mail for more information.
Nicholas W. Blasgen
You don't need replication support for data partitioning, which is actually a better choice in most cases. Don't know how Slashdot works, so couldn't comment either way. (To the uninitiated, an example of data partitioning would be storing all the odd-numbered records on ServerA, and the even-numbered records on ServerB. This is clearly a pain for table joins, however) The big question is (and I'm not trying to be a troll here)... why not use a proper database in the first place?
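The odd/even example above can be sketched in a few lines of Python. The server names come from the comment; the function names are hypothetical, and the routing rule (record id modulo the number of servers) is just one simple way to partition.

```python
SERVERS = ["ServerA", "ServerB"]  # names taken from the example above


def server_for(record_id, servers=SERVERS):
    """Pick the server that owns a record, dealing ids out round-robin.

    With two servers, ids 1, 3, 5, ... land on ServerA and
    ids 2, 4, 6, ... land on ServerB.
    """
    return servers[(record_id - 1) % len(servers)]


def group_by_server(record_ids, servers=SERVERS):
    """Split a list of ids into one sub-list per owning server."""
    groups = {name: [] for name in servers}
    for record_id in record_ids:
        groups[server_for(record_id, servers)].append(record_id)
    return groups


if __name__ == "__main__":
    print(server_for(7))
    print(group_by_server([1, 2, 3, 4, 5]))
```

`group_by_server` hints at why joins hurt: any query touching several records has to be split into one query per server and the results merged by the application.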
Sounds great. Long live the slashdot empire! And may the force be with you! :-)
Always run = ON
They noted in the first article that Exodus does not allow cameras inside their facilities. Boo.
Is the BSD firewall a permanent fixture? The "what happened" page said that the BSD box was brought in because Pat could get it running quick.
Do you consider the PIX fundamentally flawed or just not correct for your environment?
There's more to it than this.
Posted by BSD-Pat:
It was more or less tweaking that, and the number of threads available to MySQL....
nothing real complicated... =)
"Switching requires a recompile of Apache"
No.
If you have compiled Apache to do dynamically loaded modules, it doesn't. That's how the Debian packages work; you have several versions of the mod_php package, each for a different database, and you install the one you want. It's easy, and it works (which is pretty much true of everything Debian; that's why I use it)
As another poster pointed out, they most likely do compile Apache themselves (which I did not find that difficult when I tried it to check out mod_ssl [wanted to use rsaref to stay legal], but then I wasn't using Mandrake). Even if you compile it yourself, with dynamic modules you can always compile more modules later without recompiling Apache.
Also, I don't understand your comment of "RH runs Postgres, not MySQL". Both of these should run on all Linux distros.
WMBC freeform/independent online radio.
How is SlashDot gonna deal with loss of accessibility when Exodus finally lands in the RBL because it will not deal with its chronic spam problem?
> 1. Cluster the SQL box for high availability and fail-over.
MySQL is non trivial to cluster, hence the $$$ Andover is providing to the project.
> 2. Switch from NFS to SMB - even the apache site recommends this for speed.
Depends on what they are using NFS for... But I think they are generating html pages from the db at fixed intervals and copying to the www servers locally at regular intervals.
> 3. Look into having local instances of SQL running on the web-servers - read-only copies that are replicated from the main DB... then the central DB would only be used for write (aka comments and postings...)
That's reliant on being able to use replication which mysql doesn't support - and it's a big nono to put db's on the same machine as the web servers.
I mean, if I had $ 1 million a year, I could run multiple redundant T3s to my own office and use them for my own personal/company use, too.
Why would people use Exodus instead?
D
----
Switch from NFS to SMB - even the apache site recommends this for speed.
I don't know the first thing about NFS but I am familiar with SMB, and all I can say to that line is GOOD GOD! If NFS is less efficient than SMB I would just HATE to see a packet analysis (I make it a point at home to use FTP for large transfers because SMB takes about three times as long).
Is this post not nifty? Sluggy Freelance. Worshi
I am sorry, but i do not know anything about web serving
Yes, but did you actually try to read /. before the new server? Unless you're sitting on a T1, it was actually painful.
Anomalous: inconsistent with or deviating from what is usual, normal, or expected
Anomalous: deviating from what is usual, normal, or expected
Canard: a false or unfounded repor
Green? Green!?! Everyone who knows anything about electronics knows that the blueberry iMacs are faster. Has something to do with the blue wavelength of light relaxing the electrons more than green light. Remember the Wizard of Oz? All that green light made people woozie. Same for electrons. That is why MS made the default background of W2K a light blue; to wring the last possible bit of speed out of the boxes.
Kindness is the language which the deaf can hear and the blind can see. - Mark Twain
while (<>) {
s/KURT/MARTIN/;
s/Andover/Adam/;
}
ROBLIMO: Not after we demonstrate the power of this station. In a way,
you have determined the choice of the web site that'll be slashdotted
first. Since you are reluctant to provide us with a URL, I have chosen
to test this station's slashdotting power...
on your home page on iVillage!
AC: No! iVillage is peaceful. We don't flame Linux on iVillage.
We only discuss travel and mystery novels. You can't possibly...
ROBLIMO: You would prefer another target? A commercial target? Then name the URL!
Roblimo waves menacingly toward AC.
ROBLIMO: I grow tired of asking this. So it'll be the last time. What is the URL?
AC: (softly) pcweek.com.
AC lowers her head.
AC: The FUD piece was posted on pcweek.com.
ROBLIMO: There. You see Lord Taco, she can be reasonable. (addressing
Hemos) Continue with the operation. You may post when ready.
I helped create an eCommerce environment for a large chemicals company. We used the following setup.
:)
* 4 - SUN Netra Web Servers
* 2 - SUN 3500 Application Servers
* 1 - SUN 4500 Database Server
The servers were running iPlanet and Oracle. This multimillion dollar setup was initially being used by 7 users making one transaction/week. Now the system has ramped up to a couple hundred users.
Now that's overkill!
P.S. Good work guys on setting up the new system.
>Didn't y'all read the article on here a few weeks ago, "Why not MySQL?"
Maybe because that decision was based on another one, made a couple of years before the article?
Well, if a site linked in a story gets slashdotted, you could assume that slashdot is always under that sort of stress.
Of course, slashdot hits are probably not as bad as other sites. (Because slashdot gets an unusually large amount of text-based browser hits.)
-- Thrakkerzog
INT EXODUS - MAIN SERVER ROOM
[MOFF KURT, a tall, confident technocrat, strides through the assembled geeks to the base of the shuttle ramp. The geeks snap to attention; many are uneasy about the new arrival. But Moff Kurt stands arrogantly tall.]
[The exit hatch of the shuttle opens with a WHOOSH, revealing only darkness. Then, heavy FOOTSTEPS AND MECHANICAL BREATHING. From this black void appears DARTH TACO, LORD OF THE SITH. Taco looks over the assemblage as he walks down the ramp.]
MOFF KURT:
"Lord Taco, this is an unexpected pleasure.
We're honored by your presence."
DARTH TACO:
"You may dispense with the pleasantries, Commander. I'm here to put you back on schedule."
[The commander turns ashen and begins to shake.]
MOFF KURT:
"I assure you, Lord Taco, my men are working as fast as they can."
DARTH TACO:
"Perhaps I can find new ways to motivate them."
MOFF KURT:
"I assure you, this station will be operational
as planned."
DARTH TACO:
"Andover does not share your optimistic appraisal of the situation."
MOFF KURT:
"But he asks the impossible. I need more geeks."
DARTH TACO:
"Then perhaps you can tell them when they arrive."
MOFF KURT: [aghast]
Andover's coming here?
DARTH TACO:
"That is correct, Commander. And they are most displeased with your apparent lack of progress."
MOFF KURT:
"We shall double our efforts."
DARTH TACO:
"I hope so, Commander, for your sake. Andover is not as forgiving as I am."
"This server is now the ultimate power in the universe. I suggest we use it!"
I imagine that there's a bit of good old fashioned vapour in the 10x assertion.
Also, for a site with such traffic as /. the much finer granularity of Postgres locks could make a difference.
MySQL is, AFAIK, open source, just not Free Software. Feel free to look up both definitions. The limitation to MySQL is available
Don't forget that GPL'd versions of MySQL (older releases) are always made available as well.
- Michael T. Babcock (Yes, I blog)
Posted by BSD-Pat:
nope, don't own us yet =)
in a few weeks though.....
-Pat
This is because Apache with mod_perl installed has a larger memory footprint than straight up Apache, and having dedicated image boxes makes it much more efficient.
--hunter
RateVegas.com - Vegas Reviews
Have you tested the setup under a variety of loads? Approximately what was its capacity? After the recent group of DDOS attacks, have you installed any safeguards to prevent similar attacks in the future?
Hopefully this setup will be able to handle Slashdot's growth and end the insufferable page loading delays that have become all too common. Kudos to the IT staff for successfully migrating to the new system.
ByteMyCode.com: A Web 2.0 code sharing community.
The images (except for choosing which ad to serve up) are static data. The rest of the pages are extremely dynamic data. Each page you read is lovingly built just for you. The optimizations you want on the two very different types of web serving are very different. Ergo, two types of servers with different tunings. The number of servers of each type is then determined by the relative loads. I was surprised at the ratio of 3 image to 5 page servers. I would have thought that fewer image servers would be required.
We're funding development efforts with the MySQL team to add database replication and rollback capabilities to MySQL
Wouldn't it be easier and cheaper to just use a real database? Oracle makes a nice one that runs under Linux.
DrLunch.com The site that tells you what's for lunch!
Yeah, but don't places like that usually have fire escape doors, and the folks in the inner sanctum just pop through them when they go on their smoke break? ;-)
I wonder how many peoplehours go into running slashdot each week?
tcd004
Here's my Microsoft Parody, where's yours?
I'm surprised that a 3-tier solution wasn't chosen. Maybe it was looked at and found to be too complex. Sometimes throwing hardware at a solution does work.
3-tier slashcode - now that would be interesting. Watch out CORBA - here comes /. :-)
Guess you'd need a bit of OO Perl for that
I believe you refer to the signal to noise ratio.
perl -e "print(pack('H37','4d65726b7572795a40676e7572642e6e6574'))"
-Tommy
------
"I do not think much of a man who is not wiser today than he was yesterday."
"I got a half gallon of Jack, and 2 dozen Ant Traps. I'm about to get wild." -me
You need to remember that almost everything slashdot serves is user content, which is constantly being added-to by thousands of rambling geeks. All this has to be ingested, stored, and then threaded/sorted and filtered (by threshold) on the way out - that takes a lot of disk, memory, and cycles. Caching helps, but each additional post on a thread will invalidate the cache and necessitate regeneration.
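The caching trade-off described above can be illustrated with a small sketch in Python: rendered thread pages are cached, and each new post throws away the cached page for its thread so the next view has to regenerate it. The class and method names are hypothetical and this is not how Slashcode itself is written.

```python
class ThreadCache:
    """Caches rendered thread pages; a new post drops the stale copy."""

    def __init__(self, render):
        self._render = render   # function: list of posts -> page text
        self._posts = {}        # thread id -> list of posts
        self._pages = {}        # thread id -> cached rendered page
        self.renders = 0        # how many times a page was regenerated

    def add_post(self, thread_id, text):
        self._posts.setdefault(thread_id, []).append(text)
        self._pages.pop(thread_id, None)  # invalidate only this thread

    def page(self, thread_id):
        if thread_id not in self._pages:
            self.renders += 1
            posts = self._posts.get(thread_id, [])
            self._pages[thread_id] = self._render(posts)
        return self._pages[thread_id]


if __name__ == "__main__":
    cache = ThreadCache(lambda posts: "\n".join(posts))
    cache.add_post(1, "first post")
    cache.page(1)
    cache.page(1)                      # served from cache
    cache.add_post(1, "second post")   # cache entry is now stale
    cache.page(1)                      # regenerated
    print("renders:", cache.renders)
```

On a busy thread the posts arrive faster than the views can amortize a render, which is why the cache helps less than one might hope.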
Add to that the ad serving, graphics, and so on. I don't know how much tracking they do, but ads usually only pay for each unique user - loading the same ad on the same user's screen ten times only counts for one impression. So most sites have to set a cookie just to avoid double-billing, and check it on each pageload to rotate the ads correctly. That adds cpu and memory too.
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
In MySQL you can decide per table if you want the table to be fast or take the speed penalty of making it transaction safe.
In MySQL, you do not have the choice of turning on transactions and atomicity.
You have the choice of turning on features that they mistakenly label transactions and atomicity, but let's call a spade a spade here.
You use MySQL if you care about speed a lot, and don't care much about data integrity. That's a perfectly valid position, but let's not pretend it's some other position.
If you do care about data integrity, you use something other than MySQL, and find another way to achieve the speed.
--
How would a multithreaded TCP/IP stack impact this setup? What sort of performance boost would that give Slashdot?
Unless you're going to colocate offshore, which I would guess carries its own risks, I'm not sure that any American colocation facility can protect a site against a search warrant (which I understand was issued in the Steve Jackson case, whether rightly or wrongly).
I would be interested in knowing, however, what contract terms Exodus and other colos offer with respect to warrantless requests for searches. (That is, what has Exodus agreed to do when Agent Foo shows up without a warrant and asks to have a look at the machines hosting website Bar? Is Exodus in breach of contract if it says, "Right this way, Agent?" Or has it agreed to take some other action?)
I sure hope you're not and that is what is coming! Then I'll finally be able to use it in production and make a good argument for its use at work.
======== In the future, everything will be artificial. ========
The story mentioned a hot spare system that is always synced with one of the production systems.
What is the benefit of doing this over just adding
this hot spare to the production cluster behind
the load balancer? That way you don't need any
intelligence. And if a box dies, you don't notice
it anyways because you always have 1 or 2 extra
boxes in the cluster anyways.
Is there a benefit?
Cor
A common myth about open source is that if you release a program with a description and a few source files...someone else will fill in the gaps. They would probably rather get their framework for their own needs put together before releasing it onto the world's eyes. Just because something says "open source" doesn't mean it has to be "open" from the beginning... I know I'd rather work on something and get the basics of what i want put into it, before having loads of people (and lots of non-programmers) look at it and bombard me with email saying "why not this, why not that..." when they are all things I had planned.
---
I could care less about the connection to the outside;
It's internal bandwidth that I care about. Also overhead on these cards is lower.
"think of it as evolution in action"
Why couldn't this have been posted last night?! I was at Exodus Waltham and I could have looked around for the cage. Hmph.
It'll never happen. Exodus doesn't allow photographic equipment in the facility. There's a nice little letter in the kevlar-lined entrance saying so.
cos i`ve set it to 0, but i keep getting all this -1 crap. SORT IT!!
(yes, i`ve saved, logged in , logged out, logged in again...sometimes its 0, sometimes its -1. )
I've got a huge old CGI script running under mod_perl (worked after a few kludges and lots of "my" statements) that causes the httpd processes to look like this:
5808 nobody 2 0 25848K 7964K accept 0:45 0.00% 0.00% apache
5807 nobody 2 0 30476K 15428K accept 0:44 0.00% 0.00% apache
5809 nobody 2 0 31736K 17108K accept 0:44 0.00% 0.00% apache
Ugly, eh? I had to put 512M in the webserver after getting more than 40.000 views/day for some time (the static content is served by a thttpd already).
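The process listing above gives enough to estimate how many such children fit in a given amount of RAM. This is a rough sketch in Python that assumes the sixth column of each line is resident memory in kilobytes (as in BSD-style top output; the column layout is an assumption) and ignores memory shared between children, so it errs on the pessimistic side. The 64 MB reserve for the rest of the system is an arbitrary illustrative figure.

```python
def resident_kb(top_line):
    """Extract resident memory in KB from one line of top-style output.

    Assumes columns: PID USER PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND.
    """
    fields = top_line.split()
    res = fields[5]
    if not res.endswith("K"):
        raise ValueError("expected a size like '7964K', got " + res)
    return int(res[:-1])


def children_that_fit(ram_mb, per_child_kb, reserved_mb=64):
    """How many children of the given size fit in RAM, ignoring sharing."""
    return (ram_mb - reserved_mb) * 1024 // per_child_kb


if __name__ == "__main__":
    lines = [
        "5808 nobody 2 0 25848K 7964K accept 0:45 0.00% 0.00% apache",
        "5807 nobody 2 0 30476K 15428K accept 0:44 0.00% 0.00% apache",
        "5809 nobody 2 0 31736K 17108K accept 0:44 0.00% 0.00% apache",
    ]
    worst = max(resident_kb(line) for line in lines)
    print("largest child (KB):", worst)
    print("children in 512 MB:", children_that_fit(512, worst))
```

A figure like this is a sensible upper bound for MaxClients on a mod_perl box; going past it means swapping.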
"I love my job, but I hate talking to people like you" (Freddie Mercury)
"need to buy a license if I want to do something with MySQL for a client. "
i haven't read the license for a while, but I remember reading that was only true if the product you developed for the client relied solely on MySQL. If your application used standard SQL and was not entirely MySQL dependent, then it would still fall under the "free" category. (Unless, of course, you tried to run it under NT)
---
Usually RAID 5 and a database is a pretty bad match, because RAID 5 read/write performance is lower than RAID 0, 1, 0+1 read/write performance...
IIRC the big hit is in write performance, but one would hope large RAID caches should help this somewhat. Still, a 0+1 rig would probably provide the best overall performance and redundancy (and cost the most :p)..
Your Working Boy,
We're funding development efforts with the MySQL team to add database replication and rollback capabilities to MySQL (these improvements will of course be rolled into the normal MySQL release as well).
Transactions in MySQL ??? That would be really wonderful, I really miss this feature (I could also use some kind of data coherency checks too).
Yeah it's a big colo facility with lots of speed but in my experience it's also a pain in the ass. As of right now they are fighting with PSI.net ( a fairly large provider ) about access to their peering point....Exodus wants PSI to pay and PSI wants Exodus to pay because they are not an ISP. My old company ( which I just quit ) had a lot of customers who were using Exodus and we were on PSI...frustrating as hell....
From their Methodology Page:
At the current time, the estimated population being reported includes U.S. households that are viewing the Internet using PCs with Windows 95/98/NT as their operating system.
So I think if they included real operating systems we would see much different figures. :)
One reason might be that mySQL is much faster than Oracle when it comes to building up and dropping connections.
This is really important for the web, because a typical web program will start by opening a connection and end by closing it. So you effectively have one connection for every hit that occurs.
Unless you do some fancy sharing of connections, this is going to be a big problem when you use Oracle. This forces Philip Greenspun to use TCL/AOLServer for his work, since it allows connections to stay up between CGI invocations.
In the mean time, I can open and close as many mySQL connections as I need to.
In addition, as I said in another post, he would probably have to rewrite the Slash engine to use another database; it's most likely very dependent on the mySQL API (as my programs are as well). We get a big payoff from this - far greater speed - so we pay the penalty of being stuck on one database unless we want to make a herculean effort to convert all the software we've already written.
D
----
Cool.... If they're working on Mars that would give Geeks in space a new meaning ;)
Hrm loving these
But seriously, your description of Exodus reminds me of The Bunker in the UK (I think this was featured on Slashdot a few months back), which is housed in a decommissioned nuclear bunker. Sadly, The Bunker has no kevlar walls (as far as I know), though it does have "blast proof" doors, a "high security" electric fence and HERF protection. The fact that there's a demand for this type of facility shows how important web servers are to some corporations.
Have a contest! throw some equipment out to the masses. If you like the idea I'll happily accept the quad as an honorarium or something ;-)
BTW if someone could please explain in what way the parent's parent is flamebait, I'm listening...
IIRC, Rob and Jeff started this from Michigan. Now, your servers are in Waltham. Does this mean a move in the future?
Also, while the source code may be available, the license that MySQL is under does not come anywhere close to meeting the open source definition - with the exception of older versions as you mention.
Do you even know anything about perl? -- AC Replying to Tom Christiansen post.
Pining for the fjords.
WWJD? JWRTFM!!!
Not only are rollbacks on the MySQL developers' TODO list... they have been implemented!
That's right, the latest versions of MySQL support a new DB file format (based upon the well recognized Berkeley DB format), and have FULL COMMIT/ROLLBACK capabilities. So no more bitching about that. They've also incorporated support for HEAP tables (eminently useful for *very* fast cache tables), added better stored function loading, and they almost have sub-selects at the (very fast) speed they want.
I like postgres, it's very powerful for more traditional DB work... but MySQL is still the DB to go to when you feel that need for speed. So check out the feature list of the 3.23.x series. You won't be disappointed.
Just wait till you try to add another arrowpoint for redundancy :)
Run one nic out to each arrowpoint (plus one or more to your internal net).
Now get ready for the configuration headache of getting those two to work together doing failover and redundancy. Fortunately once you do get it right those things will do failover and rules balancing to other units in other datacenters. Best thing to do is get one of the arrowpoint engineers down and chain them to the rack until they get it right.
We have also had troubles with loaner arrowpoints.
A single Arrowpoint replaces about 6 cisco boxes..
No wonder Cisco bought them
I wonder why there is a red hat box in the mix?
not necessarily the answer, but Kurt posted this previously:
The servers all came with Red Hat and we installed Debian on them, except the 3500, and I think that was because VA installed extra drivers and stuff we wanted to leave it as is.
michael
dude these guys are paying a million dollars a year for their physical space, I'm sure they could afford Oracle
ReadThe ReflectionEngine, a cyberpunk style n
Anyone know what kernel the beasts run? How often do the machines go down (due to a software fault)? I'm a linux user myself so this isn't some excuse to bash, just a genuine curiosity about the stability, especially of 2.2 linux kernels on highend hardware.
All that security and I bet many of the boxes are running NT... :) But, humor aside, I wonder how many of the protected boxes are more secure physically than operationally. They are subject to much more danger via the internet than a terrorist wanting to damage the machine. Look at the DDOS attacks against /. last week. Shucks, the terrorist would take more systems offline by trying to find the network cable entry points and destroying those instead of trying to gain access to the physical boxes.
It is obvious that money isn't a concern - so why not fork over the money for some Oracle licenses and a competent Oracle admin?
Well, I can't speak for the /. crew, but one good reason would be that oracle is a piece of shit. There are much faster rdbms's than oracle. Sybase and Microsoft (really) make products that completely spank Oracle on equivalent hardware. Oracle's sweet spot is large (Sun E1000 or HP V class) boxes used for data warehousing. Even there, oracle is an inferior solution to DB2 and Informix. Oracle is a lot like windows, it's a shitty product that just happens to be the market leader. The big difference between oracle and MS is that MS is run by Bill Gates and oracle is run by a guy who wishes he was Bill Gates.
--Shoeboy
(former microserf)
Cool, not only is Slashdot moving to better servers, but they will be in the same town as my office rather than 600+ miles away. Now if they would just start proofing the stories for grammar and authenticity before posting all would be well in the world.
Speaking of which, did anyone else notice that that ridiculous idOS story from the other day seems to be completely gone now? Hmmm, do I sense some revisionist history editing on Slashdot? ...
-JeremyH
Nope, it's definitely compiled in the kernel itself. I try to avoid using modules whenever possible. Mind you, this doesn't happen all the time, but since we have 12 servers, with over a million page views a day each, it happens often enough to be a hassle (like once every week or so).
A sentence you'll never see on an Internet discussion board: "You know what? You're right."
That's right. It's primarily so we can rearrange the network topology into front-end and back-end networks.
license, schmicense, who cares what kind of license it has? if they thought the license was objectionable, they wouldn't use it.
jon
-- http://www.cerastes.org
We have several Fullons ourselves, and under heavy load these cards cease to function. I have to log in via an internal network interface and ifconfig the outside IP down then back up.
A sentence you'll never see on an Internet discussion board: "You know what? You're right."
thank you... these 'expressions' are so useful in expressing oneself...
Natalie Portman wasn't in that episode. There go all my good jokes!
Got Rhinos?
and it was still painful...
Read The ReflectionEngine, a cyberpunk style n
Er, in my experience it's not very difficult at all to implement connection pooling on an Oracle database. Fact is, no matter what database you use, you shouldn't create and destroy a database connection every time a user wants to run a query (or for each user session, for that matter).
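For anyone who hasn't seen connection pooling spelled out, here is a minimal sketch of the idea in Python. It is not Slash code and not how any particular Oracle driver does it: the ConnectionPool class and its method names are made up for illustration, and sqlite3 stands in for the real database only because it ships with Python. The point is just that connections are opened once and borrowed per query instead of being created and destroyed each time.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool; all names here are illustrative."""

    def __init__(self, database, size=4):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection move between threads;
            # the pool ensures only one borrower holds it at a time.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks (up to timeout seconds) if every connection is borrowed.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Hand the connection back instead of closing it.
            self._idle.put(conn)

    def close(self):
        # Close whatever is idle; call this only at shutdown.
        while not self._idle.empty():
            self._idle.get_nowait().close()


pool = ConnectionPool(":memory:", size=2)
with pool.connection() as conn:
    answer = conn.execute("SELECT 6 * 7").fetchone()[0]
```

The setup cost is paid `size` times at startup rather than once per query, which is the whole argument for pooling regardless of which database sits behind it.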
Exodus sounds cool, but I'm a big fan of Level(3)!!! They just kikz azz.
-------
Oh shit! I forgot to click "Post Anonymously"...
Are you sure? I was under the impression that PostgreSQL has been around for quite a while.
-- Thrakkerzog
I'm supposed to be writing my PhD thesis at home after work, but for the last few weeks I've wasted most of the available time (=getting home->falling asleep at the terminal) reading Slashdot and day-dreaming about building my own Beowulf-cluster.
(HOMER VOICE):
Mmmm.... a dozen, black, rack-mounted 2U:s with dual 750 MHz 21264s inside...
Webservers: $22525
NFS servers: $21120
Database server: $25739
Being THE place for Natalie Portman and Hot Grits on the Web: priceless
There's some things money can't buy. For everything else, there's Slashdot.
---
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
I wish that the old slashdot server was still up...it would be interesting to visit every now and then, like going to a museum. :o)
Got Rhinos?
Yea, we all know he releases his code after every key-stroke.
The comment pages are actually static files that are loaded directly by the comment perl script to minimize the number of database lookups done. So the comment pages themselves are stored on the NFS server, and the personalized stuff on the right side of the page is slotted in by a relatively simple script.
D
----
Also, it's "reparenting" comments even if you tell it not to.
Funny how no MySQL fan here even mentions licencing terms. Does nobody care that the current releases of MySQL are nowhere near being free? I remember a time when free licences mattered. Maybe RMS is right when he warns us we are caring more and more about expediency, less and less about freedom.
I have nothing against the MySQL folks (they seem like nice people), and I will be genuinely happy if they relicence some day. But in the meantime, MySQL has the same fundamental problem as RealPlayer or Microsoft Office. No offense intended.
I would have loved to have read the FAQ, if the browser had stayed running that long. It gets about half the images loaded and then goes byebye. (Thanks, glibc.)
If I were as lazy as you suggest, I wouldn't have visited their page at all.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
They could use Interbase just as well; it has most of the features that MySQL lacks. Of course it's still in beta, but it's Open Source.
What's the rationale for that???
Daniel J. Peng
Be careful of MySQL documentation, and web site ... over the past couple of years, the PostgreSQL developers have tried to work with the MySQL folks to improve their various pages, especially their 'comparison' page ... we recently sent them a list of changes (the number of them eludes me, but it was a lot) on just one section of the comparison page that listed a bunch of their types as 'SQL-Compliant' that have nothing to do with the standard, and a bunch of 'SQL Non-Compliant' that are required by the Standard ... As for 'a magnitude speed difference' ... check out PostgreSQL now. I would imagine that their 'tests' were based on several releases behind ... v7.0, that was just released, is several times faster than our last release, which was noticeably faster than v6.4 ... The point I'm making is that if you are going to judge something, judge it on your experiences, not on a competing vendor's experiences... you might just be pleasantly surprised ... - scrappy@hub.org
PostgreSQL has been around since 1995.
-- Thrakkerzog
I don't think so. Any database driver worth its salt implements connection pooling. I haven't done mod_perl -> Oracle myself, but this is standard procedure for the MS ODBC -> Oracle and EJB/Servlet/JSP -> Oracle stuff I normally work on. The bottleneck is never the connection setup/teardown time.
Oracle may indeed be somewhat slower - these guys have the money to throw at more hardware! And they need the features that Oracle provides.
As for code modifications - make a fork that supports Oracle. There appear to be people working on Slash full time now. It isn't like back in the day when Rob had to make the changes himself.
My best guess is that this is a result of 'do what you know best'. Oracle isn't well known in the open-source world, and there may be nobody at Andover who has real experience running it. I think their money would be better spent on setting everything up to run Oracle and hiring an Oracle admin, but that is just my humble $0.02.
Do you even know anything about perl? -- AC Replying to Tom Christiansen post.
The horsepower behind your getup is smokin'. what'd'ya say to takin one of those bad boys from the matrix and ripin up a Q3 server with it...??
CODA!!
I think the replication of coda would be good and cool. Think: all the webservers are coda servers in the 'www' cluster, another 'image' cluster for the web-image servers. Now, you have another box which is also part of the cluster, so you make mods to that "Extra" box and when you're done, you just get coda to sync the other servers by doing 'ls -lR' and presto changeo you've replicated your website changes without hacky scp scripts or anything else.
Last time I looked, they were serving from 400,000-600,000 accesses per day.
The economics of life are a lot different when you're a major corporation. I was able to get my mid-sized 100-employee company to buy a $10,400 VA Linux FullOn 2x2 server even though it was overkill; the money simply wasn't worth thinking about in comparison to the costs associated with having a slow server.
D
----
Older versions of MySQL are open under the GPL: http://web.mysql.com/Downloads/MySQL-GPL/MySQL_GPL-3.20.32a.tar.gz . The latest version has a source RPM available at http://web.mysql.com/Downloads/MySQL-3.23/MySQL-3.23.16-1.src.rpm . One way or another they are open enough to be qualitatively better than MS SQL Server, even though they may not be as free as we'd like.
Yes, it's a spelling error -- my fault. We'll fix it in Service Pack 1, due out next year.
1. There are some commercial (under $500 though) products that will do this quite nicely, and easily.
2. Don't know...
3. Well, it all depends on what you are trying to achieve... it does speed things up... but it depends on load... if you're worried about security... well, don't punch those ports through the firewall...
BlackNova Traders
Phew! I read the above as "It's a big colonization facility". I thought the colonization had already begun...
[Too bad we can't link to poor spelling and let these comments expire when it's fixed...]
LOL!
"I'm looking through you, where did you go?"
VA has made tweaks to RH that make it better for running MySQL. How do I know this? A friend of mine works at a university and they just set up a machine that has basically two things running on it, MySQL and SSH. It's set to log the system events, and with the load they have, the VA RH MySQL setup was the best for the money.
a bunch of old, wrinkly, smelly violent fremen housekeepers are on their way now to help out the network admins.
So now we know the truth behind last week's attack: once Slashdot moved into the new Exodus facility, they found out that the Harkonnen had left behind a few hunter-seekers to cause havoc.... I wonder how long before the Baron gets to learn the hard way that one of Kurt's teeth has been replaced with a cyanide gas capsule?
My opinion only, IANAL.
MOO;IANAL.
There used to be a picture linked here.
Well, a typical Exodus facility isn't nuke-proof, but it's pretty damn close. I've toured one (in Herndon, VA) because our company is about to co-loc at it. Here's a brief rundown of the physical security:
You run into all this before you even see anything resembling a computer, apart from the terminals in the receptionist's enclosure. In the actual computer pens, you have the cages, and for the really paranoid, you can get a steel box with a biometric lock instead of a conventional cage.
To sum up...it would take a truly concerted effort to physically breach one of these facilities.
Aero
We can believe in you for 3 minutes, but beyond that, even the King of All Cosmos can't be expected to wait.
Here is your navigator : Mozilla/5.0 (Windows; N; Win98; en-US; m14) Netscape6/6.0b1
Just a security hole in Slashdot. You can find this kind of hole on any site that has a forum. I think that on a site like E-Trade you could make some people place orders for stocks.
You can contact me there : Krakus.Irus à voila.com
If you want to retry.
If you want to know more.
I'd buy a computer from VA Linux if they'd pre-install Debian. When I called to ask about it they said that they had no plans to do so. Any chance you could post the details of converting a VA box to Debian?
The reason MySQL is so fast is that they made design decisions optimized for a read-mostly database (table-level-only locking, ISAM table structures, no transaction support, etc.). They also left out other 'key' RDBMS features (views, sub-selects, select into, foreign keys, stored procedures...).
If you can live without these features (and a lot of website/web apps can), MySQL is a blazingly fast solution that scales reasonably well. However, if you need any of the ACID stuff or if you have a high percentage of writes or deletes, MySQL will absolutely suck. It is a special purpose tool and excels in doing the job it was designed for.
Oracle, MS SQL, Postgres, etc. are general purpose RDBMSs and they are reasonably good at all aspects of database management but will rarely keep up with any narrowly focused app.
I use MS SQL, MySQL and to a lesser degree Oracle all the time in my web development job. Each is appropriate for different clients depending on the client's individual needs (and the fatness of their wallet).
a bunch of old, wrinkly, smelly violent fremen housekeepers are on their way now to help out the network admins.
George
The code is here in PDF format for the page and JavaScript which is causing all this trouble. Sorry, but PDF is the only way I could get this readable in its original format without it being interpreted as HTML and continuing the problem.
Happy reading!
Check out www.steeleye.com. They make some great software for Linux that does just that. All the code originally came from NCR's LifeKeeper, which is rock solid.
Why would Larry Ellison wish he was Bill Gates? Larry has more money.
I'm wondering if you are thinking of moving to a db with ACID support? Also, how are you replicating data should you take a hit on your one SQL server?
Justen Stepka
They have been at exodus for quite awhile....a few months minimum.
- nick
It's a security thing; companies pay big money to remain anonymous... I guess the way Exodus sees it is that information about server layout, type, and location is confidential.
Exodus is really uptight about letting people into the building. I just worked for a company doing a buildout at Exodus. I was out with a friend when I got an emergency page. Some server was down or something. I was close to the colo so I swung by.
When I got to Exodus with my friend, she really needed to go to the bathroom. So I asked a security guard if she could use the facilities, which are located outside the main colo space (which requires a palm scan to enter). The security guard informed me that to use the bathroom would require the following:
They must hold on to her ID.
She must Sign in.
She must be issued a badge (a long procedure).
And a Security guard would have to be present at all times, waiting outside the stall.
That should give you an idea of how concerned Exodus is with security.
Alex B.
Actually, Exodus does not host ebay.com 100% anymore. ebay's machines might be still at Exodus' site, but ebay gets its bandwidth from Above.net, and will eventually move its machines over to their site.
For our company, we investigated both Exodus and Above.net, and found Above.net's connectivity was much better (more peering arrangements).
Do a traceroute to www.ebay.com for confirmation...
waiting
> but the mySQL API was designed to look just
> like the mSQL API. Most likely, they would have
> to change a lot of stuff around to make it
> compatible with any other database besides
> mSQL, and of course mSQL has been an inferior
> product to mySQL for quite some time now.
But, they're using Perl with the DBI interface, which is a completely uniform interface to various databases. For the most part, all they would have to do is change the DBI->connect() statement to a different DBD module and it'd be switched. Considering that, for the most part, MySQL's SQL language is a subset of other RDBMSs', they probably wouldn't have to change their SELECTs and such either.
I expect it's because MySQL is quite a bit faster, and its usage seems to be more common than PostgreSQL. (...therefore, possibly more stable?)
WWJD? JWRTFM!!!
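To make the "only the connect call changes" argument concrete, here is a rough Python analogue of it. This is not Perl DBI and not Slash code: the connect() helper, the "sqlite:" prefix, and the stories table are all invented for the example, and sqlite3 is used only because it is in the standard library. The idea is that the driver-specific part lives in one function and the query code never mentions the driver.

```python
import sqlite3


def connect(dsn):
    """Hypothetical helper: the single driver-specific spot in the program."""
    scheme, _, target = dsn.partition(":")
    if scheme == "sqlite":
        return sqlite3.connect(target)
    # Supporting another database would mean adding a branch here.
    raise ValueError(f"no driver configured for {scheme!r}")


def latest_stories(conn, limit):
    # Query code only touches the generic connection/cursor interface.
    cur = conn.cursor()
    cur.execute("SELECT title FROM stories ORDER BY id DESC LIMIT ?", (limit,))
    return [row[0] for row in cur.fetchall()]


conn = connect("sqlite::memory:")
conn.execute("CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO stories (title) VALUES (?)",
    [("first",), ("second",), ("third",)],
)
titles = latest_stories(conn, 2)
```

One caveat that applies to the original argument too: a uniform call interface does not make the SQL itself portable. Placeholder styles and vendor-specific syntax (LIMIT, for instance) can still differ between databases, so "just change the connect line" is the best case, not a guarantee.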
uh.
you believe that.
ACID transactions are the fundamental way of ensuring reliability in a distributed system. there is no "revolution" against transactions. in fact, they're making a comeback (see com+)
-Stu
... the amount of relevant information has remained the same ever since /.'s been on a dusty ol' 386 with a 14.4kb/s pipe...
(that is not to say the trollers and such are not fun, which they are.. they're just not useful outside the context of /. ...)
Forgive me if this has been asked elsewhere, but why did y'all choose those distributions for those servers? I'm genuinely curious; I'm unfamiliar with the large-scale differences between distributions. (My computer runs Mandrake... that decision was based on the single factor that my friend happened to have a Mandrake CD on him.)
Actually, Frontier Globalcenter hosts Yahoo. Just for the record.
:)
WWJD? JWRTFM!!!
Maybe this is a screwup, maybe it's because I prefer browsing with Windows IE -- so sue me.
--
--
fat lenny's gonna lick your brain today.
I doubt it. The alternative-OS movement is still incredibly small in comparison to the dominant Windows platform. Even Slashdot can't manage to get half its people running an Alt-OS; most of them use Windows. So I doubt including non-Windows machines would have much of an impact at all.
I want a co-location provider with a 24x7, hot swappable, fully redundant team of ACLU lawyers!
The web page servers run Apache+mod_perl+Slash+adsystem == one big memory pig. When serving images you don't need all that stuff in httpd, so it's a waste of resources. Most high traffic web sites run separate web servers with a stripped-down httpd so serving images does not drag down overall performance. The added advantage is that your web traffic logs are not polluted with image requests.
>the license that MySQL is under does not come anywhere close to meeting the open source definition
Exactly. So why not move to a product that has it, like Oracle/Informix/...., or, if you are going to spend the $, why not invest the $-time in PostgreSQL, a database that IS opensource?
Is there any reason beyond: MySQL is what we have been using, so now we will continue to use it?
MySQL has said:
On Roll-Back
"MySQL has made a conscious decision to support another paradigm for data integrity, "
Ok, fine, that is a design choice. If they wanted it (rollback), they would have designed it in.
PostgreSQL has rollback, and just needs database replication, and they would LOVE to see that feature.
So, why work with MySQL, other than "it is what we have always done" or "We didn't think of another option"? Are you hoping to have them change the licence?
If it was said on slashdot, it MUST be true!
Why don't you run BOA or something like that? I thought images were pretty static, so why use Apache for that??? /pyder.....
_
/
\_\ sig under construction
Who/what exactly is Exodus?
All I can see from their homepage is that they're vaguely ISP-looking. But I have a personal policy of Not Bothering With Flashy Graphics-Laden Web Pages, so I didn't push any further through the morass.
From the context of the previous play-by-play article, I take it Exodus provides physical storage space and connectivity for your machines, and not much else...?
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
Yes, my response is offtopic..
What chemical company?
--hunter
RateVegas.com - Vegas Reviews
Good points; I use the C API, so it would be a big pain for me to switch.
In the end, it's probably the speed, and since they're used to it and its various quirks.
D
----
Ignore Natalie, she moves as fast as a rock...just don't stand in front of her rocket launcher.
Only one goon in a room will shoot at you, but as soon as that one fails another will service you...Except that there's a second one also waving images at you.
You know the SQL Server is lurking in the middle, but you can't reach there yourself.
Yes, there should be a link to this story in the About page.
__
__
Men with no respect for life must never be allowed to control the ultimate instruments of death.
GW Bu
But I was just curious. Sometimes people do pull jokes like that :)
--Matthew
[eom]
"I thought I could organize freedom. How Scandinavian of me."
Unless you do some fancy sharing of connections, this is going to be a big problem when you use Oracle. This forces Philip Greenspun to use TCL/AOLServer for his work, since it allows connections to stay up between CGI invocations.
Actually, Apache's mod_perl module allows you to transparently cache database connections through its Apache::DBI module. I haven't actually looked at the Slash code, so I can't say that it actually does this, but that's one of the many big performance wins that come from using mod_perl rather than CGI.
--Mythos
Can I borrow the Quad Xeon? Maybe just one of the DUAL PII systems? Surely you have enough systems from old slashdot servers to put them together ;)
So when is the /. garage sale?
Or are you gonna ditch the old stuff on ebay, or asseenin.com?
How much for the ugly couch in the WiReD article?
George
Hmm, last time I checked, people like to bitch about software (especially complex software) that is released too early and needs to be patched by various bug fixes. Quake 2 is a great example of this, as are several other Q2 engine games. While we're talking about the slash code, I still think what happens in the gaming world is going to bleed over to the net' community.
moox. for a new generation.
Bill and Ted's Excellent Adventure, or Bogus Journey?
Bad things often happen to good people,
It is up to them to see that they remain good.
That's a good question, but I fear the answer would be performance. I think the Slashguys should be focusing their efforts in a different direction though.
Here's where I speculate cause I've not checked out the code itself.
The SQL server (whatever variety they want to run) needn't be so beefy if there was some sort of caching integrated into the code. Are we hitting the SQL server every time a page is generated? Why not generate the index (for each of the kind of views like different thresholds) once a minute? Same with Slashboxes, they can be cached like nobody's business.
Plus, if you base the page assembly around caching, you can have the assembly later spread around the internet, and only have a database in one location (I'm being simplistic here).
Caching gives you sooo much performance if you use it sensibly.
cheers,
-o
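The once-a-minute index idea from the comment above can be sketched in a few lines of Python. Like that comment, this is speculation about an approach, not a description of what Slash actually does: TTLCache, render_index, and the db_hits counter are hypothetical names, and the "page" is a placeholder string standing in for the expensive database-backed render.

```python
import time


class TTLCache:
    """Hypothetical time-based cache: rebuild a value at most once per TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, build):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no rebuild, no database hit
        value = build()  # expired or missing: rebuild once
        self._entries[key] = (now + self._ttl, value)
        return value


db_hits = 0


def render_index(threshold):
    # Stand-in for the expensive part (SQL queries plus page assembly).
    global db_hits
    db_hits += 1
    return f"<html>index at threshold {threshold}</html>"


cache = TTLCache(ttl_seconds=60)
for _ in range(1000):
    page = cache.get(("index", 1), lambda: render_index(1))
```

After the loop the stand-in render has run once, not a thousand times. Keying the cache on (page, threshold) gives one entry per view, which is the "index for each of the kind of views" part of the suggestion.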
Uh, why not just use Akamai for images? Then you could use all the machines for web servers, and judging by the normal speed of slashdot, this would be an advantage!
Why was the choice made to use one beefy-as-hell SQL server instead of multiple lesser powered systems?
Scott Ruttencutter
We Apprentice Developers and Designers
Can anybody tell me what purpose the NFS box serves? Multiple load-balanced webservers connecting to a single SQL server -- that's commonplace, but I can't figure out what is on the NFS box or which other boxes need that data. Am I missing something? How about a Slashdot topology diagram?
-- Liquor up front, poker in the rear.
Also, I bet they'd have a wonderful time rewriting all of slash to use Postgres instead of MySQL.
I looked over the setup for those servers and, frankly, they seem to be quite an overkill for this site. Maybe I am wrong, since I have no clue how many people have been hitting the site; if it handles that many, then the setup is not bad. But Quad Xeon processors for MySQL? What's wrong with a dual Pentium II/III or single Athlon MySQL server? :)
I can understand the need for failover for MySQL, which is a major requirement, but the computer itself is quite a major overkill. I do administration for over 16 different servers across 3 different clients, and we use Mandrake distros on those. The only problem I usually find is that MySQL can get really stressed when many people are using it on the same server as the Apache server. I am in the process of moving MySQL over to a new dedicated server so it can handle the load from several servers, and maybe that might call for a beefy processor.
If the site is set up to handle... let's say.... 100,000 hits a day, what server configuration is needed for this? With MySQL on its own server, of course.
-- Amazing how the Internet still hums along.... -- Despite all the flaws of Micro$oft in their software!
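A quick back-of-the-envelope for the 100,000 hits a day question, sketched in Python. The average follows directly from the number in the comment; the peak factor of 5 is purely an assumption for illustration, since real traffic is bursty and nobody in this thread has posted a measured peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(hits_per_day):
    """Average requests per second if hits were spread evenly over the day."""
    return hits_per_day / SECONDS_PER_DAY


def peak_rate(hits_per_day, peak_factor):
    """Rough peak estimate; peak_factor is an assumed burstiness multiplier."""
    return average_rate(hits_per_day) * peak_factor


avg = average_rate(100_000)
peak = peak_rate(100_000, peak_factor=5)  # 5x is an assumption, not a measurement
print(f"average: {avg:.2f} req/s, assumed peak: {peak:.2f} req/s")
```

That works out to a little over one request per second on average, so the sizing question is really about how many database queries each hit triggers and how spiky the load gets, not the daily total.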
Not really. Rollbacks, transactions, replication, subselects, etc. have all been on the todo list. MySQL developers just needed to think out how they were going to implement them well without sacrificing speed. There is a good article on this (interview with the main guy), but I don't have the URL on me. Check www.zend.com; it's in my thread from about 1 1/2 weeks ago. All the features whose absence keeps MySQL from being a full RDBMS will be implemented this year.
ok i see they have dual nics? is that for fault-tolerance or load-balancing, or what?
-If at first you don't succeed, call it version 1.0.
You laugh, but I hear that there's actually someone working on putting up a slashdot clone using just that setup.
Karma: Non-existent. Due mostly to the fact that you smell funny and nobody likes you.
Seriously, I've got a dual 9.1 GB SCSI hotswap from VA Linux (dual CPU boards up to 600MHz PIII, currently 450MHz PII (without the ID ...)) and another from Penguin Computing (my play machine, same basic config, but I got a CD/RW in this instead of a Zip drive).
... and for static serves, that's critical. heck, it's more important than dual CPUs or RAM!
What's with the single drive? You can't stripe or anything
Will in Seattle
While I haven't a clue why they're using Red Hat, Debian is an easy one to explain. Debian is the most stable distro right out of the gate. It's also the most "open-source": they won't accept a package unless it's Open Source.
Chris Hagar
"The price of freedom is eternal vigilance." - Thomas Jefferson
Everybody Chant ..
"We Want Pictures.. We want Pictures.. We want Pictures" :)
Seriously though: it would be nice to actually see this setup. Don't forget to have CowboyNeal give us an oh so sexy pose near the almost as sexy VA SQL server...
This series is very useful for people just starting to work in linux admin environments, even if it's just a personal server.
Lots of varying methods are discussed of how to properly protect or run a server, and now we get a real life scenario of what happens when the shit hits the fan.
Don't just publish the Hellfire series, package up this one too.
I wouldn't have known about the playboy thing, except that the cage I was working in was right next to it... and there was an admin there who couldn't get his 300GB image database to back up properly. I kept joking with him that I was going to plug my laptop in to his switch (which was close enough to the wall of his cage) and see how much "premium content" I could snag. =)
As far as costs and security... a service like Playboy.com would probably be paying about the amount they were paying for the cage at Exodus just for proper bandwidth. Exodus offers all that bandwidth, but with a lot of added warranties against any type of failure. I think that's probably worth the price being paid.
Sure, Playboy.com probably doesn't need kevlar lined walls, but I can imagine a certain amount of chaos would ensue if someone decided to go after, say, Etrade's physical setup.
Some other fun things to do at Exodus:
Make friends with the guy running the shipping area. While I was there, I needed an extra fiber channel cable - I was shorted one in shipping, and fiber channel cables aren't something you can just go buy at the local computer shop. I asked the shipping manager if maybe I missed a box or something, and told him what I was looking for... he had at least a couple dozen cables - along with several high-end disk drives and an entire Dell server, which had all become the property of Exodus, 'cause no one had come back and picked them up in a decent period of time.
Exodus is a good place to dumpster dive. Their facilities are very low key. I doubt that there's a listed phone number for any of them and the one I visited didn't even have a sign indicating what it was, but if you can identify one... there's lots of interesting stuff being thrown away there.
I picked up an entire APC rackmount enclosure while I was there, and a dead Compaq server (which had overheated inside the enclosure). I just happened to be outside when someone was bringing it to the garbage. That was a "kid at Christmas" moment for me. That Compaq had a working RAID controller and a dual-port Ethernet card as well (too bad they took all the disks out before they trashed it).
If you're ever working in one, or setting up a cage in one of their facilities, do yourself a BIG favor and bring some kind of chair or stool. There aren't any in the cages, nor any to loan out, and if you're there more than a couple hours, you'll probably want one. Also, a well appointed cage has a land-line phone. Cellular reception in the facility I visited was terrible (go figure), and if you need another phone, well, there was one in the ops area that customers could use, but that's not really a good thing if you need to talk to someone while you're working in a cage.
Walking among the cages at the facility I visited was extremely educational. I would guess that at least 60% of the computers I saw were either Compaq x86 servers or Compaq Alphas. Probably another quarter were Suns. Almost everything else was Dell. I only saw two SGIs (both in the same cage), one IBM machine (in one of those cute U6 form factors), and maybe a half-dozen VA Linux boxes. One cage I walked past had 42(!) Sun Enterprise 4500s, and 6 Storage Arrays. One has to wonder what was going on in there.
-- I wanna decide who lives and who dies - Crow T. Robot, MST3K
I am sorry, but I do not know anything about web serving. Anyway, why are all those SWEET systems needed when a 386 can fill a 10meg pipe? It is cool to know that everything at Slashdot is running on sweet systems like that, but why is so much horsepower needed?
Just goes to show how little I know about these things...
.,@
[..] through an Arrowpoint CS-800 (which replaced the CS-100 that blew up last week)
Do you really mean to say that one of those Arrowpoints is a piece of junk after a DDoS? Or did this have to do with something else?
Well, since my ignorance is now fully established, I'll just say I think you've got some nice stuff there...
xchg
xchg
jmp emailMe
What is the config of the BSD Firewall? Hardware/Software...
I wonder why there is a Red Hat box in the mix? What is the reason? Now, I am a Debian bigot, but my guess is that so are you guys. Is there something specific about Red Hat and MySQL that I don't know?
Second, why not Mylex cards in all the boxes? Mylex's new DAC110 SCSI cards are simply the fastest I have ever seen.
Why not Gigabit? I use it with Linux and it works; it makes all that heavy duty hardware sing. 100mbit is just passe :>
I am proud that you chose Potato for most of your boxes.
"think of it as evolution in action"
Hardware : Look at www.raidware.com and their web server director. For your NFS solutions, look at www.netapp.com and their network filers.
Topology : Think about putting some cache engines, proxy servers of some sort in front of your web servers. Think about mod_backhand for your load balancing at an Apache level (www.backhand.org).
Software : I am not qualified to say anything.
BTW : good job guys.
Eat suchi frequently.
Looking for a great online backup: Green Backup
Fault tolerance was a big issue. We've started by load balancing anything that could easily be balanced, but balancing MySQL is harder. We're funding development efforts with the MySQL team to add database replication and rollback capabilities to MySQL (these improvements will of course be rolled into the normal MySQL release as well).
If this was a big issue why did you choose MySQL over PostgreSQL? I don't know about replication, but Postgres does support transactions/rollback.
john
-- john
There's something weird with the EtherExpress Pro 100 cards with the Linux drivers. It's either a hardware thing (and the drivers haven't implemented a workaround), or a driver bug/race condition. In any case, the solution to this is to set the multicast_filter_limit option to 3 or less. Most people just set it to 0.
If the driver is compiled into the kernel, you just edit the source and make sure the following line is set:
static int multicast_filter_limit = 0;
Recompile the kernel after this, of course :).
If you've compiled it as a module, then it's a much easier task. Edit your /etc/modules file and make sure that your eepro100 line looks like the following:
eepro100 multicast_filter_limit=0
Of course, you can even do it at the command prompt with an "insmod eepro100 multicast_filter_limit=0". I will not be held responsible for those who initially try to remove the module from memory, via their telnet session.
Mmm. Slashquake. Imagine the possibilities.
Christopher A. Bohn
cb
Oooh! What does this button do!?
I mean, if I had $1 million a year, I could run multiple redundant T3s to my own office and use them for my own personal/company use, too.
I'm sure you could. But Exodus doesn't deal with anything as slow as multiple T3's. They advertise the fact that each of their data centers has multiple OC-12 lines. Each one of those is capable of 622.08 megabits per second. A T3 gets you only 44.736 Mbps.
They have huge battery systems, power conditioners, and multiple diesel generators at each site. They are usually connected to more than one power grid. They can function without commercial power indefinitely.
They have redundant everything. Network feeds. Routers. Switches. Power. Cooling.
They have very tight security. Armed guards. Biometric (e.g., hand print) locks. Cameras. Steel doors. Double-walls. Personnel locks.
You use Exodus if you absolutely, positively cannot afford downtime due to third-party service failures. If you have to ask, you can't afford it.
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
Hey, if you want to waste your money re-inventing the wheel, go right ahead.
DrLunch.com The site that tells you what's for lunch!
So we all know CmdrTaco is horrible at spelling, my question is, is "Oddessey" a knowing and humorous play on that, or is it an error? Given that the correct (or accepted as correct in English) spelling would be odyssey, is there any plan to correct the spelling?
I'm genuinely curious... with some people, it's difficult to tell the intentional from the unintentional spelling errors.
--Matthew
Hey, you're really close to my work. So why aren't I getting instantaneous page updates?
:-)
$ traceroute slashdot.org
[elided]
3 s1.dtn3.bos.ma.verio.net (199.103.133.29) 7.687 ms 8.224 ms 21.261 ms
4 d1-11-1-3.a00.bstnma04.us.ra.verio.net (129.250.118.41) 10.619 ms 9.470 ms 14.682 ms
5 d3-4-0.a00.bstnma05.us.ra.verio.net (129.250.118.29) 10.126 ms 10.057 ms 10.147 ms
6 fa-6-1-0.r00.bstnma05.us.bb.verio.net (129.250.30.113) 27.626 ms 10.447 ms 9.927 ms
7 p1-7-1-2.r01.nycmny01.us.bb.verio.net (129.250.3.177) 15.872 ms 16.028 ms 16.352 ms
8 nyc1-1.nyc2-1.verio.net (129.250.3.146) 156.702 ms 225.416 ms 188.964 ms
9 p1-1-0-1.r02.chcgil02.us.bb.verio.net (129.250.3.133) 55.075 ms 38.290 ms 39.869 ms
10 ibr01-s3-1-0.okbr01.exodus.net (216.32.132.185) 94.520 ms 39.250 ms 53.993 ms
11 bbr02-g1-0.okbr01.exodus.net (216.34.183.66) 47.932 ms 39.246 ms 39.315 ms
12 bbr01-p5-0.wlhm01.exodus.net (216.32.132.210) 51.079 ms 48.565 ms 48.641 ms
13 dcr03-g2-0.wlhm01.exodus.net (64.14.70.65) 74.841 ms 48.817 ms 50.115 ms
14 64.14.80.130 (64.14.80.130) 50.918 ms 50.388 ms 51.669 ms
15 64.28.66.204 (64.28.66.204) 51.335 ms 49.552 ms 51.074 ms
16 64.28.67.48 (64.28.67.48) 57.524 ms 53.040 ms 48.873 ms
Pity it doesn't actually work that way. All the way out to frickin' Chicago (at least) to go half a mile up the street?! I bet we have line-of-sight; think Exodus would be willing to set up an IR link?
I'm surprised that with all the money and equipment Slashdot has now, you guys still haven't implemented a caching server to prevent the /. effect on the rest of us who don't have the luxury of colocating at Exodus.
I know some have raised questions of copyright infringement and all that crap, saying it would be illegal to copy their content, but how is this any different from a real caching server doing this? Or Google's method of caching all the pages it indexes? As an administrator, I'd much rather see my copyrighted content cached than have my server or pipe go down because of all the hits.
Now that you have your pretty new toys at Exodus, how about you actually use them to cache sites?
You can have a great and powerful redundant system over there, it looks pretty, but a breaker blowing will take them down for days! Yes, it's happened to me! I know, it's somewhat off topic, but a much needed warning to anyone looking for colo space!
d1verse sr. sysadmin for a .com
Just out of curiosity, wouldn't it be easier to use something like PostgreSQL (which is just as freely available) that already has rollback & atomicity than to pay the MySQL people to develop it? Didn't y'all read the article on here a few weeks ago, "Why not MySQL?"
__________________________________________________ ___
rooooar
OK, so how many HTML hits are you serving, and how many image ones?
Spill it!
- Adam L. Beberg - The Cosm Project - http://www.mithral.com/
How do I contact you? I need to change the domain contact for horvitznewspapers.net from you to a role-base email address and you are not answering email sent to the address in the whois record.
3 load balanced Web servers dedicated to images
Why three image servers? Slashdot isn't exactly the most graphics intensive site I've seen. A few icons and the banner ads. Are you planning more graphics/art? Or, is this just to ensure that the ads are loaded quicker than the rest of the page (jab to the ribs!).
Exodus isn't all that bad, but Rob & the gang should check out upcoming Colocation offerings from IBM - the facilities are Qwest, with AT&T, UUNET, Qwest, and Sprint as ISPs.
Lots of redundancy, lots of security, lots of available bandwidth!
-octaene@yahoo.com
Now that's what I call sexy!
So, why isn't the system disk included on a RAID config, mirrored or something? Not that much more expensive (1 drive is less than a grand, LVD or not), and could save you quite a bit of downtime. I mean, until the super-secret Odessey is done, you'd still have to copy across a new system disk, or swap it out, etc. Since it is a co-lo'd server, and I don't know what kind of service you're getting from Exodus, RAID could keep you from having to do more than go "Drat, lost a disk, Hrm." Just a question why the tech decision was made like this, with all the whomp-ass hardware involved.
Since I'm still on the steep end of the learning curve and the top is not in sight... :-)
I'm curious why there would be 5 servers dedicated to serving pages and 3 more just to images? As opposed to 8 servers dedicated to serving up the complete page?
Or maybe a reverse-proxy arrangement?
Seriously. This stuff is great and I wanna know more!
It sounds like you've got some interesting software products in the works. Instead of releasing them When They're Ready, why not release them Now? I always thought that collaborative development was crucial to open source -- that's certainly a recurring theme on these pages, especially when Microsoft code blows up.
Why follow a "cathedral" development model when there are many folks out here willing and eager to help out with your code?
just curious
You could've saved yourselves a lot of money and got a nice Beowulf cluster of green iMacs..... You rich guys like to go overboard, don't ya? ;-)
Do you really need 1 Gig of RAM on the web servers? Wouldn't the extra memory be better used on the DB server? You're probably not using more than 150MB for the httpd's.
The new Slashdot setup rocks!
And why no Athlon boxen ... or is that for the future ????
In any case, looks like nice, solid technology, as opposed to bleeding-edge tech, and thus extremely reliable. My hat's off to your network architect: VERY nicely done. . . .
More importantly, how many people-hours are lost due to reading Slashdot each week? :-)
Ford Prefect
Tedious Bloggy Stuff - hooray?
That being said, the standard MySQL benchmark _still_ is 30 times faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off, _and_ nonstandard speed-up PostgreSQL features like VACUUM enabled, I might add). The main reason seems to be some sort of failure to use the index in the SELECT and UPDATE test loops on the part of PostgreSQL.
The benchmark, for the curious, works like this:
First it creates a table with an index:
Then it fills the table with 300,000 entries with unique id values. Then, it issues a query like this:
which causes the backend to do one thousand read() calls. For each query. No wonder it's slow. An EXPLAIN query states that it's using the index, though. I have no clue what happens here. I've sent this to the pgsql-general mailing list and have just reposted it to -hackers.
Oh yes, the benchmark also revealed that CREATE TABLE in PostgreSQL 7.0 leaks about 2k of memory.
Heck, I work in Waltham. Next time you're out this way, give me a call/email. I'll buy you a beer. You deserve it.
-Mark
-- Ever notice that fast-burning fuse looks exactly the same as slow-burning fuse? I didn't... (Edgar Montrose)
I am sure everyone wants to know how much everything here costs. Here are the calculations for the Linux boxes (info is off of the VA Linux custom configuration program):
webservers (type 1) = $4505 each
NFS server (type 2) = $7040
database server (type 3) = $25739
So the grand total is $68819. I haven't found the prices for the switches and firewall. I would suspect that the BSD box is close in price to the webserver (prob. a bit less).
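The total above only works out if you count eight web servers (the 5 page servers plus the 3 image servers from the hardware list), one NFS server, and one database server. That count is an inference from the arithmetic, not something the comment states. A short Python check, with made-up names for the dictionaries:

```python
# Prices as quoted in the comment (VA Linux configurator, US dollars).
PRICES = {"webserver": 4505, "nfs": 7040, "database": 25739}

# Assumed counts: 5 page + 3 image web servers from the hardware list,
# plus one NFS box and one database box.
COUNTS = {"webserver": 8, "nfs": 1, "database": 1}


def grand_total(prices, counts):
    """Sum price times quantity over every machine type."""
    return sum(prices[kind] * counts[kind] for kind in counts)


print(grand_total(PRICES, COUNTS))  # 68819
```

8 x 4505 = 36040, plus 7040 and 25739, gives the quoted 68819, so the switches, the firewall and the BSD box are indeed not in that figure.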
Come play Heroes of Might and Magic Mini online.
Who/what exactly is Exodus?
Exodus is one of the world's biggest (in terms of service capacity available) Internet Service Providers.
"We're going to need bandwidth. Lots of bandwidth."
Exodus specializes in having more bandwidth than most of the third world. They've got NAPs (Network Access Points, i.e., backbone connections) all over the continental United States, and a few outside the US as well. They link this all together using both external and internal networks. The end result is, most anywhere on the net that has a good connection, has a good connection to Exodus.
They provide servers. Do you need to host downloads for ten million users? Exodus can give you servers to do so.
They provide co-location space. If their standard server packages just won't cut it -- bring your own. They'll give you a rack, a dedicated co-loc cage, or a dedicated high security vault.
Their web page has a lot of graphics because they have a lot of pictures of their equipment and graphs of their capacity. It is actually justified. You may want to make a return trip.
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
You don't really talk a whole lot about where you are using the NFS box or what you are using it for. How is Linux's NFS working for you?
All I've ever heard were horror stories about its poor implementation. Water under the bridge? Righty right?
I assume you're sharing static content and image serving using the NFS shares. Perhaps you've even gone so far as to utilize qmail or the sendmail/procmail NFS mail hack in order to store your email on the NFS share?
What shied you away from the Linux Virtual Server Project's implementation of load balancing? Or the mon+heartbeat+fake+coda solution for HA? Or is your "Oddessey" work based on this in any way?
Are you loadbalancing your firewalling BSD box? Or have we reached a critical point of failure at the firewall?
Very interesting stuff, I'd like to know more details though.
http://windows.scares.us
Not to mention...
- any outward-facing glass (doors, security fishbowl) is 90-minute riot glass. That is, it would take a battering ram 90 minutes to break through the glass
- Kevlar-reinforced walls
- sensors detecting everything from humidity and temperature to biological agents
Yeah, pretty much the only thing getting through there is a nuke. :-)
--jeddz
Receptionists/security chimps sit in a bulletproof enclosure
When I was touring the same facility they had the coolest metric for that enclosure: 90-minute riot glass.
Apparently the average person (who is that, anyway?) could beat on it with a sledgehammer for 90 minutes before getting through :-)
But do they have a diesel powered kitchen sink?
I remember back when there was a cachedot.slashdot.org that maintained a replica in case the server was having trouble. Is there any chance that this could be implemented again but in a different location?
Of course such a thing would not need to be as powerful as the main slashdot systems, but would provide some additional backup in case of another DDOS or a network outage of some sort.
Sort of a "battle bridge" for those of you who remember the days when Star Trek was good. (startrek.version = ST:TNG)
Mycroft-X
There was some talk about this story in the debian-devel channel on IRC just a few minutes ago, and of course the big question was:
Why isn't the SQL server Debian as well?
If there's any problem with Potato's MySQL, I think Debian would be pleased to hear, whether it's a bug report in the BTS or whatever.
Thanks
Run 'ab' (the apache bench tool) with a few different settings, then tweak SQL a bit. Repeat. Tweak httpd a bit. Repeat. Drink coffee. Repeat until dead. And every time you add or change hardware, you start over!
I don't want to be a spoil-sport, but this kind of mechanic work sounds like a job for software!
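For what it's worth, the measuring half of that loop is easy to script. Below is a rough Python sketch, not anything Slashdot actually runs: it sweeps `ab` over a grid of request counts and concurrency levels and ranks the results by requests per second. The `-n` and `-c` flags and the "Requests per second" report line are standard `ab`; the function names, the URL and the injectable `run` hook are assumptions made for illustration. Tweaking httpd or SQL between sweeps is still up to you.

```python
import itertools
import re
import subprocess


def ab_command(url, requests, concurrency):
    """Build an ApacheBench command line (-n total requests, -c concurrency)."""
    return ["ab", "-n", str(requests), "-c", str(concurrency), url]


def parse_requests_per_second(ab_output):
    """Pull the 'Requests per second' figure out of ab's report, or None."""
    match = re.search(r"Requests per second:\s+([\d.]+)", ab_output)
    return float(match.group(1)) if match else None


def sweep(url, request_counts, concurrency_levels, run=None):
    """Try every (requests, concurrency) pair; return results, best first."""
    if run is None:
        # Default runner shells out to ab; needs ab installed and a server at url.
        run = lambda cmd: subprocess.run(cmd, capture_output=True, text=True).stdout
    results = []
    for n, c in itertools.product(request_counts, concurrency_levels):
        rps = parse_requests_per_second(run(ab_command(url, n, c)))
        if rps is not None:
            results.append((rps, n, c))
    return sorted(results, reverse=True)
```

Something like `sweep("http://localhost/", [1000], [1, 10, 50])` (a hypothetical test URL) would then give you one ranked list per configuration change, which at least takes the coffee out of the critical path.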
I for one think that the new setup kicks ass. It's much much faster than the old setup. great job guys!
Unfortunately the "highest score first" still doesn't work right, but I'd rather have faster loading pages than anything else. It's a pretty big feat when you think about how big the Slashdot pages really are, with 300 comments in an HTML page generated on the fly. wowza.
___________________________
Michael Cardenas
http://www.fiu.edu/~mcarde02
http://www.deneba.com/linux
hyperpoem.net
For the MySQL box you list:
Red Hat Linux 6.2 (final release + tweaks)
Is that only a reference to tweaking the max number of processes in the kernel or did you apply some alien-technology-from-outer-space-experimental kernel patches?
If so, details pleeze!
I strongly believe that trying to be clever is detrimental to your health. -- Linus Torvalds
How secure is the location where the servers are located?
Are they in a nuke-proof bunker?
You might think this is a joke, but what if someone posts government secrets to Slashdot and the Secret Service (or whoever...) decides to pull a Steve Jackson on Andover HQ?
How would Slashdot stay up? Is there a mirror at another location ready to go up at a moment's notice? Would any of the authors not arrested post notice to Slashcode.org?
I must sound really paranoid, but this is the kind of contingency plan that needs to be considered if Slashdot is to become the flag bearer for Free Speech around the world.
Reality has a liberal bias
This isn't flamebait. The correct term is shoutouts.
Prevent email address forgery. Publish SPF records for y
He did. Do you think that he tarred up and released the first Linux kernel the minute that it worked on his machine? I don't think so. Release Early is important, but so also is releasing code when it's ready - and if you KNOW that your code needs improvement before it's ready for consumption, why would you release an unpolished, buggy, incomplete package? Sometimes a little bit of work would push it above the usability threshold, below which people probably wouldn't take a second look at it.
1. Cluster the SQL box for high availability and fail-over.
2. Switch from NFS to SMB - even the Apache site recommends this for speed.
2a. Get rid of NFS and just sync all your web-servers from one server - hence having local copies of all the code.
3. Look into having local instances of SQL running on the web-servers - read-only copies that are replicated from the main DB... then the central DB would only be used for write (aka comments and postings...)
Just my less than humble ideas...
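Idea 3 boils down to routing by statement type. Here is a minimal Python sketch of that split, assuming replication already exists (the article says it is still being funded for MySQL). The class name, the list of read verbs and the two handles are all hypothetical; real code would hold database connections where this uses plain labels.

```python
# SQL verbs treated as read-only for routing purposes (an assumption, not a full list).
READ_VERBS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


class QueryRouter:
    """Send reads to the local replica and everything else to the central DB."""

    def __init__(self, primary, replica):
        self.primary = primary  # central database, takes all writes
        self.replica = replica  # read-only copy on the web server itself

    def target_for(self, sql):
        """Pick a backend based on the first keyword of the statement."""
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        return self.replica if verb in READ_VERBS else self.primary


router = QueryRouter(primary="central-db", replica="local-replica")
print(router.target_for("SELECT * FROM comments WHERE sid = 42"))  # local-replica
print(router.target_for("INSERT INTO comments (sid) VALUES (42)"))  # central-db
```

One catch with this design: until the replica catches up, someone who just posted a comment might not see it on the next page load.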
BlackNova Traders
You're using Win2000, IIS, and Active Server Pages. We all know it. Quit making stuff up.
I don't know about the other Slashdotters, but I for one would love to see how the Slashdot network is configured topologically.
We are considering using NFS as a backend storage facility for our web cluster nodes, but we are concerned with the performance of that type of setup as opposed to replicated data on the local web server nodes. Are you using the NFS server for this purpose?
Post your bug reports at the Slashcode Sourceforge page.
__
__
Men with no respect for life must never be allowed to control the ultimate instruments of death.
GW Bu
Rather than the retail price that you quoted?
GNOME or KDE on the RH6.2 box?
What? Servers don't need a GUI? Don't let Redmond find that out...
So, how much traffic and hits/sec does this setup do?
I haven't looked at PostgreSQL specifically, but the MySQL API was designed to look just like the mSQL API. Most likely, they would have to change a lot of stuff around to make it compatible with any other database besides mSQL, and of course mSQL has been an inferior product to MySQL for quite some time now.
That's the most likely reason he did it; that's also why I've been sticking to MySQL. I don't have time to learn something new when I'm asked to take on 3,000 new projects every day :-).
D
----