Wikimedia Simplifies By Moving To Ubuntu
David Gerard writes "Wikimedia, the organization that runs Wikipedia and associated sites, has moved its server infrastructure entirely to Ubuntu 8.04 from a hodge-podge of Ubuntu, Red Hat, and various Fedora versions. 400 servers were involved and the project has been going on for 2 years. (There's also a small amount of OpenSolaris on the backend. All open source!)"
In related(ly boring) news, Sun Microsystems replaced 200 old worn-out keyboards on their office workstations. Also, a handful of Microsoft employees patched their OSes, and some guy in Phoenix got a paper cut on his finger.
8.04 is an LTS release, which is obviously the reasoning behind the version choice.
I did not know that ubuntu was a player in the server market.
http://www.ubuntu.com/products/whatisubuntu/serveredition
Summation 2
For such a large effort, it seems wild they had so many different distros running in their environment.
What do you guys think?
ACK
>I did not know that ubuntu was a player in the server market.
THIS is what makes it "news that matters".
A Pirate and a Puritan look the same on a balance sheet.
So it's unlikely the decisions were influenced heavily from a budgetary standpoint. If they wanted to stay with a free RHEL derivative linux that's essentially identical to the one you pay for, they'd be using CentOS.
They chose Ubuntu. Maybe they just like it better? I think you can factor cost out of the equation.
With Gentoo, you have to be much more careful about what you update and when. They probably went to Ubuntu because it is based on Debian, and they can obtain support from Canonical directly if needed.
How is this news?
Well they either should have stuck with 7.10 or waited for 8.10.
That's news...
8.04 is a long-term release. In the world of servers, that counts for something. Also, there were changes from 7.10 to 8.04 that were probably things Wikimedia wanted to take advantage of.
Bearded Dragon
Ubuntu server is not the same as ubuntu desktop... do some research before you make claims like this. Ubuntu server is SERVER centric appealing to enterprise class deploys. They have a very good pricing and support model in place. RHEL and CentOS are great distros with good support as well... but they are not perfect. Look at RHEL's recent incident with their RPM servers. My point is, just because Ubuntu has a great desktop linux os, does not mean that their server OS is fruity.
But as a server distro, I'm not so sure. I'm surprised that Wikimedia didn't go with a distribution that's more established for server needs.
As a server distro, it rocks. I've migrated from Gentoo to Ubuntu Server for my home server and I've never looked back. As for enterprise-level distros, I'd have to go with Debian. There's not a whole ton of differences between Debian and Ubuntu Server, but I would trust Debian's 'stable' repositories over Ubuntu's repositories in a mission-critical setting, as the packages in Debian's repositories seem to be more hardened as opposed to Ubuntu's packages, which tend to be more cutting-edge.
My finger hurts too. You know those bits of skin just above and behind your nails? Part of that on the left side of my left index finger has gotten torn a little and now it's like a flap. The problem is, I don't need to alter the aerodynamics of my finger because I can't fly. It's really just painful, instead of useful, like on an aeroplane.
Actually, does anyone know how that happens?
Ubuntu server edition is stripped down and customizable, as well. I assume they didn't use the desktop edition.
This may be an outdated experience, but...I ran a single server with Gentoo for a while - until updating became such a tremendous pain. Manually merging configuration changes and such is simply not a good way to spend time, and neither is reading release notes to see whether I can simply use the old config and ignore new changes. Ubuntu is nice because installing and updating apps is easy, there is a wide variety of apps available for it, and it's quick and easy to install. Gentoo distro installation was a very lengthy, manual process - has this changed?
I'd agree with others that say that CentOS may have been a better choice, but in my eyes the choice between the two comes down to preference of package management systems rather than any difference in security or performance.
You are wrong there. A homogenous environment (up to a certain point) is MUCH better for scalability. Need more power? Get a new box, apply the standard customizations, throw it in the mix.
I agree that a cookie cutter approach like this does not yield the greatest performance per box, but it does allow for a better performance/administration ratio.
>But as a server distro, I'm not so sure. I'm surprised that Wikimedia didn't go with a distribution that's more established for server needs.
If you have an argument to make about the OS's merits as a server then make it based on facts. Tell us why you don't think it's a perfect fit on the server. Don't just say "I'm not so sure" and leave it hanging there. Support your position with something that can be argued.
The cuticle doesn't properly detach itself from the nail as it grows. The nail's growth slowly tears your skin apart.
Under the influence of Post-Cyberpunk Gonzo Journalism
I'm sure Xorg and KDE4 are high on their priority list for their web servers.
You wouldn't believe how much nicer Squid and MySQL look in Compiz.
http://rocknerd.co.uk
Uh, whuh? You've obviously never had to herd a large number of machines. Most stuff running the same OS is the only way to live - jumpstart/kickstart, standard patch clusters, one local package repository server, that sorta thing.
(I do in fact do this for a living. Standardised Solaris 10 servers with Blastwave for the open-source toys, CentOS 4 when we need Linux, local repository servers for both. A few Windows boxes with a locally-served copy of Cygwin on them. May I heartily recommend Cygwin on any Windows servers you may be stuck with - it makes life so much saner.)
I second your comments on Gentoo. I had originally thought it would be a good choice for my MythTV box on older hardware... and it was fine, at first. After about two years of occasional updates, the updates got really painful. Updating config files sucked, and often packages wouldn't compile, I spent hours googling various compiler messages. At one point the mythtv package got upgraded to version 0.22, then it wanted to backdate it to 0.20. Unfortunately the desktop system I was using as a frontend was running Ubuntu, and I had version 0.21 on there, so things were all screwed up. I finally went to Ubuntu server, and it's been smooth sailing ever since.
If someone has the time to invest in understanding the whole portage system and knowing how to get exactly what they want out of it, and if they don't mind managing all of their config files after each update, then Gentoo is probably fine, and I'm sure it is ideal in some situations. But it's definitely not for most people.
[standard Gentoo complaint]
Also, it takes a long time to compile stuff.
[/standard Gentoo complaint]
I put the story in the queue as an insight into how a top-10 free content site run by a severely under-resourced charity does its stuff. And it's all over the press this morning, fwiw.
Actually, the only difference between "Ubuntu Server Edition" and the "regular" Desktop version is which packages get installed by default.
That's one of the things we like about Ubuntu -- the 'supported' version (should you want a support contract, or even just security updates for a longer period!) isn't a totally separate distro from what folks use at home.
When Red Hat split "Red Hat Linux" into "Red Hat Enterprise Linux" (supported, but for $ only) and "Fedora" (free, fast-changing, no long-term security updates), they lost the benefit that techs would likely be running the same version of the software on their desktops and servers.
Chu vi parolas Vikipedion?
Strangely enough, none of the things that bother you are an issue for us. Either they were fixed over two years ago, or they don't affect us.
Overall monoculture is bad; consistent setup and administration in a single buildout is good.
Mass installation of a customized distro can do better than mass installation of a general distro (eg, the kernel and software can be optimized for your use case).
And indeed, we use a slightly customized Ubuntu, in that we have our own patched versions of some packages (PHP, Squid, MySQL, some custom PHP extensions, etc) tweaked for performance or features we need, plus custom meta-packages to install the configurations we require on different server sub-types.
This is pretty easy to do on any distro with a decent package manager. I still like apt better than yum, though!
Just tear it off...
;-)
Warning: Animated grossness, requires QT/QT equivalent, maybe NWS depending on your work environment, but funny as hell nonetheless. And also COMPLETELY offtopic, I'll see you all in -1, Offtopic HELL!!!
These are on our new image/media-upload fileservers. We're trying out the wonders of ZFS (snapshotting for consistent backups and "rm -rf oops" protection, potentially filesystem-level replication, etc).
Since they're an isolated service type it's not a *huge* burden to have them be a little funky (eg, we don't randomly have an OpenSolaris box in the middle of the Apache/PHP cluster), though if we could do ZFS on Linux without jumping through scary hoops we'd happily do that instead!
We'll try it out for a while, and if we're happy with it we'll keep using it, if not we'll migrate to something else eventually (the machines should as happily run Ubuntu as they do OpenSolaris)
According to their web site, it's only supported for five years. You must have some bizarro-world definition of "long term."
Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
[citation needed]
Just another person who's dealt with Ubuntu in a large enterprise setting. I don't mean for these comments to be flamebait, but it may come off that way. I'd just like to see more attention put toward them.
1. Incomplete automated installer. You can do nearly anything from Redhat's kickstart, but working with d-i doing partitioning, especially more advanced lvm and software raid setup is nearly impossible without some custom scripting hacks outside of d-i. Also, don't even ask what happens when you have a usb disk (or even just a card reader) plugged into the machine at automated install time, guess what gets recognized as /dev/sda... Speaking of which, since Ubuntu has their own installer, they don't support, fix, or use d-i, which means a lot of the time you will run into other random d-i installation bugs.
2. Ldap/krb5 stability. It's quite obvious that Ubuntu doesn't put a priority on testing or stability patching any of this, and in large scale deployments it just falls over on the server and client side.
3. Nobody in the enterprise uses cds or dvds to install, everything is automated from PXE, which means creating a local mirror to install from. Guess how difficult it is to mirror the "pool" directory without also getting the packages from every other version of Ubuntu. Yes you could use a script that parses the Packages file and only downloads the packages you need, but that just leaves more room for errors. Why can't I just have a single directory I can rsync?
4. When doing large scale automated apt-get update; apt-get upgrade tasks, ask what happens to apt-get/dpkg when a postinstall script fails, or there were file conflicts. Yes, the machine never fetches updates again. dpkg --configure -a and dpkg --purge --force-remove-reinstreq and apt-get -f install are your manual cleanup friends. Also don't ask what happens when a user wants to install a local package with dpkg -i. Yes it prints an error, but unbeknownst to the user the package actually gets half installed and breaks the automated update jobs. Why isn't there a --force flag to prevent this from happening?
5. When patching packages, there are at least 8 different ways a diff could be included in the sources. Here's an incomplete list of a few different schemes I've found over the years:
- Just drop the patches into patches/
- Just drop the patches into a non-standard patches/ directory
- Drop the patches into patches/ and add it to 00list
- Drop the patches into patches/ and manually patch the source yourself
- Edit the rules file and add in the patches manually
They really need to adopt a single patching format, rather than quilt, dpatch, dbs, cdbs, and a bunch of other minor ones.
The sad part about this, is nearly all of these issues also exist in upstream Debian. I'd love to see these get fixed. I'd like more choices that I can run in the enterprise.
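A minimal sketch of how an automated update job might spot the half-installed state described in point 4 before it runs, assuming the standard dpkg status database format (a `Package:` field and a `Status:` field carrying three words: want, flag, state). The function name, the set of states treated as broken, and the sample data are illustrative assumptions, not anything the poster described using.

```python
# States that usually mean a package was left mid-operation (assumed set,
# chosen for illustration; dpkg knows a few more transitional states).
BROKEN_STATES = {"half-installed", "half-configured", "unpacked"}


def find_broken_packages(status_text):
    """Return (package, flag, state) for entries that look half-finished."""
    broken = []
    package = None
    for line in status_text.splitlines():
        if line.startswith("Package:"):
            package = line.split(":", 1)[1].strip()
        elif line.startswith("Status:") and package is not None:
            parts = line.split(":", 1)[1].split()
            if len(parts) != 3:
                continue  # malformed entry, skip rather than guess
            _want, flag, state = parts
            if flag == "reinstreq" or state in BROKEN_STATES:
                broken.append((package, flag, state))
    return broken


# Made-up sample data in the status-file layout, for illustration only.
SAMPLE_STATUS = """\
Package: squid
Status: install ok installed

Package: localtool
Status: install reinstreq half-installed
"""

if __name__ == "__main__":
    for name, flag, state in find_broken_packages(SAMPLE_STATUS):
        print(f"{name}: flag={flag} state={state}")
```

On a real machine the input would presumably be the contents of /var/lib/dpkg/status, and anything flagged could be handed to the manual cleanup commands listed in point 4 before the unattended upgrade proceeds.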
A nail clipper works better to remove the "flaps". Apply cocoa butter or shea butter to the area afterwards, as well as to the area between the nail and the finger.
Your drooling over Gentoo kind of ignores the fact that the Gentoo developers are a bunch of screaming morons and can't seem to get straight which one's their ass and which one's their elbow.
As-is, I wouldn't even use Gentoo on a desktop. (How long has Nethack been masked because of their stupid-ass games policy?)
"You can either have software quality or you can have pointer arithmetic, but you cannot have both at the same time."
I'm actually pretty surprised. I know Ubuntu == Debian in a lot of aspects, but... To go to a distro that is *mainly* geared toward the desktop market (I know they have a server version, blah) for something as huge as Wikimedia, I'd think they'd rather go to Debian since it's considered more stable (although maybe more outdated as well). I have been a Debian zealot since the mid 90's and moved my DESKTOP to Ubuntu later on - but still think Debian is a best fit for servers.
Of course, there's always the whole "Ubuntu offers real support contracts" thing. That, in itself, is enough for any larger company to make the choice, right there.
It is pitch black. You are likely to be eaten by a grue.
Seriously, every "Off-topic"-modded post I've seen is only -1.
Perhaps it's for the best, as a -5 offtopic-mod would surely catch the attention of everyone. Oh look at this (_(_) (_|_) (_)_) Da Buttdance!
NO! NOOO! DON'T MOD ME -5 OFF-TOPIC!
Disclaimer: Been drinking too much Chimay tonight.
If you quote this signature there'll be 72 copies of Windows ME waiting for you in Heaven.
I need to overwhelmingly emphasize that OS X Server is *barely* suitable for a production environment.
I'm a big fan of Apple, and do appreciate the nice GUIs that they provided with OS X Server. However, it's not particularly stable, tends to break at odd intervals, and ignores many common Unix conventions, making it a huge pain to perform certain tasks, or do things not supported by the GUI.
It's a nice start, but I'd be very cautious about adopting it across your entire server infrastructure. Using it to host certain Apple-y apps might be fine, though I'd rely upon Linux/BSD for serious server tasks, especially if you already have the staff/experience to do so.
-- If you try to fail and succeed, which have you done? - Uli's moose
It is a lot longer than 1 year
Climate Progress - Hell and High Water
Try this for an idea... The whole concept of "installation" is wrong.
Build your own distributions. One per purpose.
Use something like RockLinux to build a ramdisk image which contains all of the software and configuration required for a particular application. By "all" I mean "only". You end up with a single file which you put on a tftp server, you boot your servers over dhcp, they pick up the OS image and boot to the image on a ramdisk.
e.g. You might have one squid image, one PHP app server image, one MySQL RDBMS server image etc. When the image boots it does whatever is required to run the app successfully, e.g. putting a filesystem on the hard disk.
The benefits:
2 admins can run 500-1000 systems in a site easily because there is really only one machine; the network. Logarithmic increase in effort with the number of systems.
FTR, make sure your ZFS pools don't get above 80-85% full. Our 24T pool went from "pretty good" to "abysmal" when we jumped to 91% capacity. I freed up a bunch of snapshots and got us back to 81% and the performance came back.
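A rough sketch of a check for that fill level, assuming the input is the tab-separated output of `zpool list -H -o name,capacity` (pool name, then a percentage such as 91%). The function name and the 80% default are illustrative assumptions; the right cutoff depends on the workload.

```python
def pools_over_threshold(zpool_output, threshold_pct=80):
    """Return (pool_name, percent_full) for pools at or above the threshold."""
    flagged = []
    for line in zpool_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # blank or unexpected line
        name, capacity = parts
        digits = capacity.rstrip("%")
        if not digits.isdigit():
            continue  # e.g. "-" when the value is unavailable
        percent = int(digits)
        if percent >= threshold_pct:
            flagged.append((name, percent))
    return flagged


# Made-up sample output, for illustration only.
SAMPLE_ZPOOL = "tank\t91%\nbackup\t42%\n"

if __name__ == "__main__":
    for name, percent in pools_over_threshold(SAMPLE_ZPOOL):
        print(f"pool {name} is {percent}% full")
```

Run from cron against the live command output, something like this could warn well before a pool creeps past the point where performance falls off.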
LOAD "SIG",8,1
LOADING...
READY.
RUN
>new Xorg ditching its config file
You can run xorg without a config file now, but you don't have to (I believe that was also true in xorg 7.3). And every version recently has been making more of the old config file redundant or unneeded. Instead it relies more on autodetection and sane defaults, which is a good thing. But you can still use the config file to override, if needed.
nice, but you forgot the big one.
"the neutrality of this article has been disputed"
https://www.gnu.org/philosophy/free-sw.html
There is always risk involved when upgrading or deploying systems. Businesses don't upgrade just for the sake of upgrading. They will weigh the risks against the benefits and proceed if there is a clear advantage to upgrading. Like the saying goes, if it ain't broke, don't fix it. The cost of licenses can be minuscule compared to deployment costs, so much so that many licenses might as well be $0.00. Deployment costs can be some of your largest costs. How many people will it take to upgrade? What is their cost per hour to the business? Multiply that by the number of people involved. Have you deployed on an identical test system and tested your software to ensure that it will continue to function as required on the new production system? Do you have test scripts so that you can validate that it performs as required? Will you have to make changes to software or hardware to accommodate the upgrade? Will you need to update your documentation? What is your contingency plan should the upgrade fail? What will be the cost to the business if the system is unavailable outside of the deployment window?
Some systems, like SAP, may take years to be deployed throughout an organization. Your favorite distro might reach the end of support before deployment even completes. For other systems, your time line for product upgrades and support may not be entirely within your control. What if your system is part of a product that needs approval from the FDA? With five years of support you may have eaten up three years of that during product development and FDA approval, leaving only two years of support for the OS on your products. That could leave you with a short product lifecycle or mean that you have to perform significant upgrades in the field.
Other operating systems, such as Solaris, Windows, AIX, and HP-UX are supported for 10 and sometimes 12 years. The only saving grace for these enterprise Linux distros is that the source is available. But when the five years are up, then what? Will you still be able to pay Red Hat or Canonical to support your end-of-life Linux distro? What if they have made a business decision not to support end-of-life distros no matter what? If they will support it, it's safe to assume that your support contract will cost more than it did during the previous five years. And if you go somewhere else and hire some linux experts to support your distro, they won't have access to the information that the distro creators have. They won't have the documentation about why certain patches were applied, or specific changes were made, or other internal decisions. You better hope that your new support company is very careful and thorough.
So then, would it have been a better investment to pay for Solaris and 10 years of support, pay for 10 years of Linux support, or pay to upgrade your systems every three to five years? I don't know. It depends on your goals. Clearly Wikipedia likes to move faster than the average business. They seem to be continually upgrading their wiki software and like staying on the leading edge. From reading about their server setup, they appear to have a lot of redundancy and can reduce their risk when upgrading. Three to five years of support for their operating systems is probably sufficient for their needs. But don't let that lull you into thinking that five years is long term.
A lot of folks seem to fail to realize that Linux has distributions. The kernel is the core of every linux system. From there, various organizations, Canonical being one of them, package the userland, a package manager, and an update service together, and call it their own. It's how Linux has worked for many years.
That being said, what you're really shopping for when seeking a Linux distribution is all the stuff around the Linux kernel. That is where Wikimedia found the benefit. Regardless of the timeline, Canonical offered them a pro-bono support contract, there is evidence of long-term update availability, and an overall 'good' package set.
Also, for the record, Canonical does offer a server-edition of Ubuntu. See their website for more information.
>Guess how difficult it is to mirror the "pool" directory without also getting the packages from every other version of Ubuntu.
Not too hard; you just have to use the right tool: https://help.ubuntu.com/community/Debmirror
>Why can't I just have a single directory I can rsync?
IIRC the main reason Debian introduced the pool structure was to allow packages to be shared between versions (particularly testing and unstable) and therefore reduce the archive size.
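For illustration, a minimal sketch of the Packages-file approach mentioned in the complaint: read one release's Packages index and collect the pool paths from its `Filename:` fields, so only those files need fetching. The function names and sample stanzas are made up for this example; Debmirror itself is the more complete tool.

```python
def parse_packages_index(text):
    """Split a Debian-style Packages index into a list of field dicts."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line: belongs to the previous field.
                fields[key] += "\n" + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas


def pool_paths(text):
    """Return the pool-relative file paths listed in one Packages index."""
    return [s["Filename"] for s in parse_packages_index(text) if "Filename" in s]


# Made-up sample stanzas, for illustration only.
SAMPLE_INDEX = """\
Package: squid
Version: 2.6.18-1
Filename: pool/main/s/squid/squid_2.6.18-1_amd64.deb
Description: proxy cache
 a longer description line

Package: mysql-server
Version: 5.0.51-1
Filename: pool/main/m/mysql/mysql-server_5.0.51-1_all.deb
"""

if __name__ == "__main__":
    for path in pool_paths(SAMPLE_INDEX):
        print(path)
```

Because the pool is shared between versions, the index is what ties a file to a given release, which is why a plain rsync of the pool directory pulls in everything.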
note: i'm known as plugwash most places but i screwed up registering that here somehow in the past and now can't register
For home computer users, yes. Not for businesses.
Compared to microsoft? Server 2003: EOL 2010.
Besides which, you're forgetting this is linux we're talking about. Support runs out? You can open the hood and support it yourself (or pay someone else to do so). It's not like ubuntu would turn down paid support beyond the 5 year lifecycle of an LTS release.
"I think it would be a good idea" Gandhi, on Western Civilisation
His drooling over Gentoo? It's abundantly clear from his post and yours that he is many times more insightful and rational than you. His post says something meaningful: he describes the uses of various operating systems and concludes that Gentoo is not a good choice for servers, while you're just bashing Gentoo's policies and crying about Nethack.
Of course, this is slashdot, so you get the 'insightful' moderation. Congratulations.
This author takes full ownership and responsibility for the unpopular opinions outlined above.
Interesting. You've played with Gentoo on the desktop, and you think you have a clue about its potential for use on a system for "actual work" online. Right. Sorry if I'm not entirely convinced, especially considering that many others of us are completely up to the task of administering Gentoo to do real work.
Congrats on your choice to run Ubuntu servers. I'm sure it will prove to be a solid platform for your needs. But don't presume to tell us which distros are or are not fit for "actual work" unless you have a clue.
It's not apt vs yum or rpm vs deb - it's how well the repository's maintained. apt has a good reputation because Debian's repository is superbly well maintained. But Fedora's yum repos are much better maintained than Fink's apt repos.
It's not the software, it's the repository quality. Actual humans making sure everything plays nicely.