Ask Slashdot: Best Linux Distro For Computational Cluster?
DrKnark writes "I am not an IT professional, even so I am one of the more knowledgeable in such matters at my department. We are now planning to build a new cluster (smallish, ~128 cores). The old cluster (built before my time) used Redhat Fedora, and this is also used in the larger centralized clusters around here. As such, most people here have some experience using that. My question is, are there better choices? Why are they better? What would be recommended if we need it to be fairly user friendly? It has to have an X-windows server since we use that remotely from our Windows (yeah, yeah, I know) workstations."
Yellow Dog Linux ftw!
Redhat Enterprise Linux.
If you need something cheaper (no licenses), you can always go CentOS. Or you can mix both, having some RHEL and some CentOS machines.
morcego
Built for that very purpose.
"To those who are overly cautious, everything is impossible. "
It isn't about the OS, it is about the tools to manage it. Rocks is based on Centos, and helps you run the cluster.
http://www.rocksclusters.org/
NPACI Rocks is probably your best bet. http://rocksclusters.org/
How about Scientific Linux?
now we need to go OSS in diesel cars
Scientific Linux. http://www.scientificlinux.org/ Has the benefit of RHEL: a stable OS environment without some of the headaches of CentOS. If you have money (you probably don't) RHEL is good.
--I hate people when they're not polite -"Psycho Killer", Talking Heads
Fedora has components to help manage large deployments. https://fedorahosted.org/spacewalk/ It also has FreeIPA to help with a secure and scalable means of managing authentication/authorization/resources within the cluster. http://freeipa.org/page/Main_Page
I think an important question here is why was Red Hat chosen for the other clusters? Your requirements aren't very specific; there are hundreds of distros that could meet your criteria.
MAC! wait, what?
Now we have such a clear winner on the choice of distro, perhaps we can discuss which would be the best editor on the cluster?
Centos is modified to be the base OS for the ROCKS Cluster.
http://www.rocksclusters.org/wordpress/
I've worked with various clusters over the past year.
The distro doesn't really matter, mostly it's what you feel most comfortable with. I'd slightly favor RedHat Enterprise or a respin of it, since it's easiest in terms of drivers for commercial cluster hardware and commercial software support, but Debian would be just as fine. I would choose a 'stable' distro though, so no Fedora, no Ubuntu (even their LTS isn't exactly enterprise grade compared to RedHat / Suse or even Debian stable). You don't want to have to update every week since this usually requires quite some work (making new images and rebooting all nodes).
What I found out matters a lot more is the scheduler you will use; Sun Grid Engine, PBS, Torque or slurm to name a few. Every scheduler comes with its strong and weak points, be sure to look at what matters most to you.
If you are unfamiliar with all of these things, pick a complete bundle like Rocks (it's based on RedHat Enterprise Linux), which makes setting up a cluster quite easy and still allows you to choose which components you want. That'll greatly improve your chance of success. But be warned; it's still a steep learning curve building and especially configuring a cluster. The most time is spent tuning queuing parameters to maximize the performance of your cluster.
Imagine a Beowulf cluster of BeOS, beotch.
My comprehension of this question is roughly 'please have a flamewar about the different flavours of Linux.'
Is 1563649 a prime number?
"It has to have an X-windows server since we use that remotely from our Windows (yeah, yeah, I know) workstations."
So what? On one hand, in order to run Linux graphic apps on Windows you need an X-Window server... on the Windows machine, not the Linux one. On the other hand, how is it that you *must* use GUI-based apps? Are there *really* no operational alternatives? (I've been administrating Linux and Unix systems for almost two decades and I never needed, as in "must", GUI-based apps for that.)
If you are OK to go with RHEL, you also can look for SLES: SUSE Enterprise Linux Server. They also have SUSE Studio where you can make your own appliances. If you are large enterprise, they will even give you SUSE Studio appliance to be hosted in-house in your company for your own needs. They also have SUSE Manager — same as Spacewalk, but has more features in it (and is backward compatible with a Spacewalk).
RH support is phenomenal and that's why a lot of businesses use it. If you want it on the cheap, go with what you're comfortable with and have your specific calculation packages built in (Debian if you like apt and open source packages, RPM if you use a lot of commercial packages). If you're looking for performance and specific hardware enhancements, go Gentoo or one of its brethren. Go with something that you can easily re-image if you're looking for lots of changes in software lineups or conflicts.
Custom electronics and digital signage for your business: www.evcircuits.com
Scientific Linux 6.0 is built on Redhat Enterprise Edition 6, which is highly tested and tuned for server throughput, power management, and stability compared to a stock vanilla kernel. The performance will be much better than a stock debian stable kernel or Ubuntu, for example. Redhat has a bunch of hackers. Scientific Linux includes apps used by scientists, which may be your target market if you are a university too. If your old cluster has scripts and tools optimized for Redhat and RPMs, then it makes sense to use a Redhat distribution base.
If the scientific apps with Scientific Linux are not being utilized then just buy a license for RedHat Enterprise Edition 6.1. The licensing fees are affordable if you have the budget for a large cluster and switches. With Redhat Enterprise edition you have support too if something goes down.
Remember, trying to save a few bucks by going free is silly in an expensive project like this.
http://saveie6.com/
NPACI Rocks without a doubt. Red Hat centric, you need to put in some work to understand how it ticks; once you do and set up your cluster properly, it is very solid and reliable.
So what user should do? Use substitute of Debian or accept no substitute.
I was just pricing 2U database servers that had 32 cores each. A 128 core cluster is now just four small off-the-shelf servers in a rack for less than a hundred grand.
Period.
Are you sure that you know? You run local x window server on your windows machine when you use x window programs.
http://www.returninfinity.com/baremetal.html
I've got 10+ years experience managing a large (2000 core, 1+ PB storage) compute cluster. If you're using one of those annoying commercial apps that assume Linux = Red Hat Linux (Matlab, Oracle, GPFS,etc.), then CentOS or Scientific Linux are the way to go.
If you don't have that constraint, consider Ubuntu or Debian. apt-get is my single favorite feature in the history of Unix-dom. Plus, there are often pre-built packages for several common cluster programs (Torque, Globus, Atlas, Lapack, FFTW, etc.) which can get you up and running a lot faster than if you had to build them yourselves.
Debian -- easy to manage, easy to create new packages for, least amount of nonstandard, distribution-specific stuff (except configuration files management, but that is a result of having to keep individual packages' configuration tied to packages).
Contrary to the popular belief, there indeed is no God.
1. What types of computation is the cluster going to be used for? MD, CFD, ???
2. What software will be used on the nodes? CHARMM, GAMESS, LAMMPS, NWChem, etc.
3. Do you have a preference for a Linux distro? If not, it really doesn't matter that much if you are rolling your own cluster and software stack. It will just determine what things are used for package management and what services in the distro you might want to turn off in order to get the most memory for apps and not the base OS.
4. You should be using SSH as the main interface for the actual compute nodes and maybe (big maybe) have an X server on the login/compile head nodes, but NOT the compute nodes. You want the compute nodes to be as bare as possible to conserve as much RAM and scratch disk space for apps as possible.
Having said all that, CentOS, Fedora, SuSE and RHEL are probably the most popular on distributed memory clusters today. You will also want to make sure that whatever compilers you are using are compatible with the Linux distro you want to use, unless you are relying completely on gcc or binary applications. I have built many clusters from scratch and can be a point of contact should you have additional questions.
we run our 320 core cluster on debian squeeze. infiniband support out of the box. the gridengine is a matter of apt-get install. comes with tons of scientific software.
Scientific Linux is totally awesome, but a project of this size, especially with the IT knowledge on hand, needs the support and first-rate product which RedHat provides.
If your cluster is going to be closed circuit (no internet access) I would recommend RHEL5, as finding and installing RPMs is generally easier when you're not able to use the default distro package utility. If your cluster will have access to the internet (you'll be able to use the distro package app) I'd recommend Ubuntu 10.04, as the distro repository is up to date and constantly growing.
For building and maintaining a small cluster, especially to anyone whose main job is not going to be maintaining the cluster, you should take a look at Rocks. It actually builds on top of a regular Linux distro, although only certain distros work. Redhat Enterprise, CentOS, and Scientific Linux are mentioned in the documentation as being compatible.
What Rocks does is add a bunch of cluster-specific tools to the underlying distro. It helps take care of networking and setting up the compute nodes for easy maintenance and configuration. You basically configure your front end, and then it is extremely simple to manage the compute nodes (including installation; installation by default is done over the network between the compute node and the head node). I have also found the Rocks mailing list to be extremely helpful even to folks who are new to building clusters.
Hi,
I work at a Supercomputing Institute. You can run many different OSes and be successful with any of them. We run SLES on most of our systems, but CentOS and Redhat are fine, and I'm using Ubuntu successfully for an Openstack cloud. Rocks is popular though ties you to certain ways of doing things which may or may not be your cup of tea. Certainly it offers you a lot of common cluster software prepackaged which may be what you are looking for.
More important than the OS are the things that surround it. What does your network look like? How are you going to install nodes, and how are you going to manage software? Personally, I'm a fan of using dhcp3 and tftpboot along with kickstart to network boot the nodes and launch installs, then network boot with a pass-through to the local disk when they run. Once the initial install is done I use Puppet to take over the rest of the configuration management for the node based on a pre-configured template for whatever job that node will serve (for clusters it's pretty easy since you are mostly dealing with compute nodes). It becomes extremely easy to replace nodes by just registering their MAC address and booting them into an install. This is just one way of doing it though. You could use cobbler to tie everything together, or use FAI. XCAT is popular on big systems, or you could use system imager, or replace puppet with chef or cfengine... Next you have to decide how you want to schedule jobs. You could use Torque and Maui, or Sun Grid Engine, or SLURM...
Or if you are only talking about like 8-16 nodes, you could just manually install ubuntu on the nodes, pdsh apt-get update, and make people schedule their jobs on google calendar. ;) For the size of cluster you are talking about and what I assume is probably a very limited administration budget, that might be the best way to go. Even with something like Rocks you are going to need to know what's going on when things break, and it can get really complicated really fast.
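The "register the MAC address and boot into an install" step described above could look roughly like the sketch below. It only renders an ISC dhcpd `host` stanza as text; the node name, addresses, and boot file name (`pxelinux.0`) are placeholder assumptions, and the function `dhcpd_host_entry` is hypothetical, not part of any tool named in this thread.

```python
import re

# Accepts colon-separated, lowercase MAC addresses after normalisation.
MAC_RE = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")


def dhcpd_host_entry(name, mac, ip, next_server, bootfile="pxelinux.0"):
    """Render one ISC dhcpd 'host' stanza for a network-booted node.

    All arguments are illustrative; adapt them to your own network layout.
    """
    # Normalise Windows-style "00-1A-..." to dhcpd-style "00:1a:...".
    mac = mac.strip().lower().replace("-", ":")
    if not MAC_RE.match(mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"  next-server {next_server};\n"
        f'  filename "{bootfile}";\n'
        f"}}\n"
    )


if __name__ == "__main__":
    # Example values only: a replacement node on a private cluster subnet.
    print(dhcpd_host_entry("node01", "00-1A-2B-3C-4D-5E", "10.1.0.11", "10.1.0.1"))
```

Replacing a dead node then comes down to generating a new stanza with the new MAC address, reloading the DHCP server, and powering the machine on. The kickstart and Puppet parts of the workflow are not covered here.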
Doesn't gain you a thing. Drivers are loaded on demand as needed for local hardware. Unused drivers are not loaded at all, and do not impact performance or memory usage.
All custom kernels can do at most is to reduce the size of the initrd file used during boot.
The initrd is a compressed cpio file containing the contents of a memory resident root filesystem used during hardware initialization. Once hardware is identified, then required drivers (disk/video/keyboard/mouse) are loaded and the real root filesystem is used (using the driver from the initrd).
Once the real root is mounted additional drivers (if any) may be loaded as directed by configuration.
The only gain in a custom kernel is reducing the time to compile a kernel...
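To see the "compressed cpio file" claim above for yourself, here is a small sketch that guesses an initrd's outer format from its first bytes. The magic numbers are the standard ones for each format; the function names are made up for illustration, and the initrd's path varies by distro, so none is hard-coded. Some images concatenate several archives, in which case the first bytes only describe the first one.

```python
# (magic bytes, human-readable name) pairs, checked in order.
MAGICS = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"070701", "cpio (newc, uncompressed)"),
]


def identify_initrd(header: bytes) -> str:
    """Return the likely outer format given the first few bytes of an image."""
    for magic, name in MAGICS:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_initrd_file(path: str) -> str:
    """Read just enough of the file at 'path' to identify its format."""
    with open(path, "rb") as f:
        return identify_initrd(f.read(8))


if __name__ == "__main__":
    import sys

    # Usage: python identify_initrd.py /path/to/your/initrd
    for path in sys.argv[1:]:
        print(path, "->", identify_initrd_file(path))
```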
I built one with Debian Lenny, plus I developed a scheduling system in perl using kernel containers + CGroups. So, the research team would think they own a "real" linux and I can share resources in a better way. Also I used perl-cgi to make a containers design; this way no one touches my real OS. The research leader just uses my container design to create and deploy new containers, then the new container is booted in a testing machine so the person can install whatever he needs, and it is cloned later to deploy on the cluster, 400 core + 1.8TB ram.
but that is ok! rather than asking quickie questions and expecting quickie answers, you can start by learning the difference between a system administrator and IT professionals.
"user friendliness" is proportional to sysadmin's abilities or proportional to $$$ for commercial tech support
1. A (stale) link to get you going on hpc clusters: http://www.hpccommunity.org/section/kusu-45/
2. http://www.platform.com/ - Dell's/Redhat official hpc cluster (at least a couple of years ago) which was based on kusu (see previous link). In other words, RH was(is?) using a third party for their RH HPC - correction needed if things have changed. - a great yo-yo system (DellRedHatPlatform) in case you have issues.
3. http://www.caoslinux.org/
A lot of this depends on what you're doing with your cluster and what apps you're running. However, Scientific Linux is used by quite a few large clusters, and all of the US ATLAS and CMS clusters run on it. As others have mentioned, you probably want to be more interested in how the cluster is managed and how nodes are set up and kept up to date. I'd recommend something like cobbler and puppet or some other change management system so that you can set up profiles and have them propagated to the various nodes automatically. This is preferable to, and easier than, going through and making the same configuration changes on 5-10 machines.
"When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
Wrong too. Use the distro you work better with.
Am I eval()? - http://www.monst3r.com.br
I'd have to agree with the Debian/Ubuntu route if you want user friendliness. I've always found Debianesque systems much more manageable than other distros. If I have to provide most of the IT myself, I prefer Debian/Ubuntu. There are some science Debian distros as well (and repositories).
Scientific Linux would likely be faster overall for computationally heavy tasks but it really depends on what you are planning on doing. Debian wouldn't be slow, just not quite as fast as Scientific Linux; but again, that might not matter very much in the big picture.
I run a 730-core cluster on debian/gridengine. We're a debian shop, and keeping the cluster platform the same as our desktops is an advantage. Configuring gridengine takes some effort, but so far we're pleased with the result.
Simple question. The OP asked WHY you feel that is a solution for large-cluster HPC.
It looks like so far your only reason is "i liek it!" - I personally have no opinion or experience with HPC clusters, but so far nearly all of those who do are recommending something that is either RHEL or RHEL-based (Rocks or Scientific Linux), if only because it allows you to leverage commonality with the big cluster operators with installations in the Top500.
Disclaimer: I'm an Ubuntu user, and I greatly enjoy it, but I have not seen many examples of actual scientific clusters running it.
retrorocket.o not found, launch anyway?
Why? I know Ubuntu is the standard recommendation for grandma these days, but what makes you think it's particularly appropriate for a computational cluster? For instance, do you really need GNOME on a high performance cluster?
Give me Classic Slashdot or give me death!
Disclaimer, I worked on the product for a number of years. I now work at a different Linux company...
Scyld is built on top of Red Hat EL, can also run with CentOS, but uses a custom Kernel. It has a lightweight provisioning mechanism that makes maintenance of compute nodes very easy, and the single system image approach makes job management significantly easier than a traditional Beowulf cluster. I don't know if they test it out with Scientific Linux these days.
Open Source Identity Management: FreeIPA.org
We use RHEL/Scientific Linux & Perceus (http://www.perceus.org/). It is solid and easy to add new nodes.
http://bccd.net/faq - Debian based Bootable Cluster CD. 'nuf said.
--You're BOTH right. It's a floor wax AND a desert topping!
Correction: the X11 server runs on your glass; eg: your Windows system. All you need then are X11 clients on the Linux cluster nodes.
So yeah, you'll need the libs and other support files for X11, but not the server itself. You'll save a bit on disk space by not installing the server. If it's just a single X11 client you need to run, then you can figure out exactly what it needs and not have a bunch of other crap (fonts, *GL, window managers, libs you're not using...) installed. Plus, you won't have a daemon running that takes resources despite being idle, and is an attack vector since it manages user logins.
I built a small (32 node) Beowulf cluster for an informatics group at the University of Bonn. We started off with SuSE, discovered that it was hard to get some drivers compiled, then went to Debian, discovered that some of the boot up scripts were a bit troublesome to keep up high availability, then went to Gentoo (http://www.gentoo.org/) and were quite pleased how *everything*, including rebuilding a node up from the boot loader, could be scripted. Of course every situation has its unique hazards, but if you want a tight system with everything under your control, it's hard to beat Gentoo (however, good ol' Debian came very close).
I believe you can successfully build a computational cluster from any linux distribution. I am sure you could go wild and use slackware if you want.
But I guess the question is who will administrate the cluster? From what you say, I feel like you will, and you say yourself you don't know much about that. Then I would recommend keeping the distribution installed by the vendor, because they will probably give you software support. But if you change it, they probably won't.
Important things have already been said. But in summary, the question is what you are going to do with the cluster. What applications are going to run on it? Are you going to develop applications to run on it, or are you using premade applications? If you are developing with it, you probably want more up to date software. If you are using some premade applications, you probably want the best compatibility...
I'm not really an Ubuntu fan, but with the cluster I manage (120 physical cores, 960GB RAM) we've ended up going with Ubuntu 10.04 running Sun Grid Engine for a couple of reasons:
- The LTS support: by the time the support period ends we should be replacing the hardware anyway.
- It provides a Grid Engine package by default (might not be the latest, but it's good enough) for distributing the workloads.
- A lot of people are already familiar with Ubuntu.
- Most third party apps provide support for it.
- It's very stable.
- It's free.
Note: if your users are heavy R users, have a look at installing the Revolution R package from the third party repositories. It can provide some massive speed ups for matrix work and a number of other jobs.
With some chance of being modded down, I suggest Gentoo Linux. With Gentoo you can compile your kernel and everything else, which might give you some arguable performance increase. Because Gentoo is a source-based distribution, it might help you with scientific development because all the library (boost, itpp, lapack, etc) headers (and source) are immediately available. There is support for scientific libraries like atlas, ACML, etc., and you can easily change the default library for blas/lapack using a simple command line. You can also find up to date scientific software in the official Gentoo repository.
I don't know about you, but I find it very useful to be able to inspect the code of core libraries and patch it for my needs, if needed.
Just my 2 cents.
Just to clear out a misconception that arises from time to time: you do not need an X server on a server exactly in the same way you don't need a web browser on your HTTP server. To understand that, you can think of an X server as a "browser" for the X protocol. On the server you just need some support libraries (which help applications in talking the X protocol).
I've read through the suggestions, and many good ones have been posted.
But, we live in the cloud age. I'll suggest taking a look around to see if your compute requirements could be met by using an available resource in the cloud. The opportunities there are exploding, and hard to gather info on as it's fast moving, but *if* your compute can be made to fit in and around something like cloud foundry, Azure, or perhaps Hadoop or other number crunching cloud ops, the advantage is they only charge for what you use. So you can ramp up or down (at least this is the theory) your compute power to a greater degree and with more flex than you can by building all your own nodes.
(And I know, the question poser asked Linux, but with compute and cloud sometimes it's good to move away from platforms, and focus on the actual compute needed. You care about the calc, and not about which flavour it crunches on.)
Just my tuppence in this complex question..
We`re all equal
I have to disagree. Ubuntu has a nasty habit of letting non-mainstream, non-desktop related bugs pass through several release cycles. We've just this last week spent 3 full days trying to figure out why my perfectly working NFS boot over PXE cluster broke when we did a safe upgrade. Turns out there's been a bug in portmap since lucid, which still exists in natty, which causes the NFS rootfs mount to fail. We had to recreate the filesystem from scratch and install lucid without updates, then hold portmap back manually (after much trial and error to find out which package was breaking). I've had other issues with Ubuntu server too, so to me this is not an isolated incident. I wouldn't recommend Ubuntu for any scientific work, and especially not something as 'unusual' (read -> not desktop oriented) as a cluster.
I'll leave the clustering distro advice to others, but if I understand your needs regarding X-windows, what you need is an X server running on your windows (or other ) client machine so that the program running on the cluster can display on your desktop/laptop. The X programs may need appropriate libraries, but you don't need an X server running on the cluster.
See Xming for a good, free, open source X server for windows. There are other options available, but that's what I use, and find it to be stable and reliable. (For a Windows program... )
Then use putty to SSH to your cluster, with X11 forwarding to your locally running X server.
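Once you are logged in with forwarding enabled, a quick sanity check on the cluster side is to look at the `DISPLAY` environment variable, which X clients use to find the server. A minimal sketch, assuming the usual `host:display.screen` form (with OpenSSH forwarding it typically looks like `localhost:10.0`); the function names are made up for this example:

```python
import os


def parse_display(value):
    """Split an X11 DISPLAY string 'host:display.screen' into its parts."""
    host, sep, rest = value.rpartition(":")
    if not sep or not rest:
        raise ValueError(f"not a DISPLAY value: {value!r}")
    display, _, screen = rest.partition(".")
    # The screen number is optional and defaults to 0.
    return host, int(display), int(screen or 0)


def describe_display(env=os.environ):
    """Summarise where X clients started from this shell would connect."""
    value = env.get("DISPLAY")
    if not value:
        return "DISPLAY is unset: X11 forwarding is not active in this session"
    host, display, screen = parse_display(value)
    where = host or "local socket"
    return f"X clients will connect to host={where} display={display} screen={screen}"


if __name__ == "__main__":
    print(describe_display())
```

If this reports that `DISPLAY` is unset on the cluster, the usual suspects are X11 forwarding not being enabled in the SSH client or no X server running on the Windows side.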
WALSTIB!
you will be much more comfortable with debian squeeze, RPM sux
It's a FAQ there, but you really should be asking this on the beowulf list, after skimming the list archives for any of the eight and a half million answers (in gory detail) that have been posted there in response over the years. Slashdot has plenty of nerds and I'm sure a lot of cluster geeks (who are likely on the beowulf list) but the beowulf list is sort of distilled cluster geekery/wisdom.
http://www.beowulf.org/mailman/listinfo/beowulf
rgb (Google "rgb duke beowulf" if you like -- I used to help answer this question once a month a few years ago on list, although I'm too busy and less active now.)
Even when the experts all agree, they may well be mistaken. --- Bertrand Russell.
The smallest glibc distro I know. Doesn't come pre-configured with cluster tools, doesn't even have prebuilt packages for them. But, it'll easily compile most of the software you require (C++ is one exception, I had to rebuild the compiler), and, most importantly, has a build system you can use to put together your own .iso which can be installed in under 5 minutes, probably even less.
Has recent 2.6 kernel and latest glibc, which means it'll also run executables built in other equivalent distros. I've run the Sun (oh, Oracle....) JVM with it, no modifications required.
I'll preface this by saying that I'm an HPC admin for a major national lab, and I've also contributed to and been part of numerous HPC-related software development projects. I've even created and managed a distribution a time or two.
There are two important questions that should determine what you run. The first is: What software applications/programs are you expecting the cluster to run? While some software is written to be portable to any particular platform or distribution, scientists tend to want to focus more on science than on code portability, so not all code works on all distributions or OS flavors. Small clusters like yours often focus on a few particular pieces of scientific code. If that's the case for you, figure out what the scientists who wrote it use, and lean strongly toward using that.
The second question is, who will run it? Many small, one-off clusters are run by grad students and postdocs who work for their respective PI(s) for some number of years and then leave. In this scenario, it's important to make sure things are as well-documented and industry-standard as possible to ease the transition from one set of student admins to the next. (And yes, PI-owned clusters have a surprisingly long lifespan. Usually no less than 5 years, often longer.) To that end, I strongly recommend RedHat or Scientific Linux.
We, and most large-scale computational systems groups, use one of two things: RHEL and derivatives, or vendor-provided (e.g., AIX, Cray). We run CentOS but are moving away from it ASAP. The Tri-Labs (Livermore, Sandia, and Los Alamos) use TOSS, which is based on CHAOS (https://computing.llnl.gov/linux/projects.html), which is based on RHEL. Many other sites use Scientific or CentOS. Older versions of Scientific deviated more from upstream, which caused sites like us to use CentOS instead. That's no longer true with SL6, and since CentOS 6 doesn't even exist yet (and RHEL6.1 is already out!), there are strong incentives to move to SL6.
Let me address some other points while I'm at it:
Why RHEL? If you can run RHEL itself, do so. RHEL isn't built with the same compilers it ships with; the binaries are highly optimized. Back when we were working on Caos Linux, we did some benchmarks that showed RHEL (and Caos, FWIW) to be as much as twice as fast as CentOS running the exact same code. So if performance is a consideration, and you can afford a few licenses, it's definitely worth considering. The support can be handy as well, particularly if this is a student-run cluster.
Why Scientific Linux? If you need a free alternative to RHEL or are running at a scale that makes RHEL licensing prohibitive, SL is the way to go, without a doubt. It's maintained professionally by a team at Fermilab whose fulltime job is to do exactly that. They know their stuff, and they're paid for it by the DOE. Other rebuild projects suffer from staffing problems, personality problems, and lack-of-time problems that SL simply doesn't have.
Why not Fedora? Stability and reliability are critically important. Fedora is essentially a continuous beta of RHEL. It lacks both the life-cycle and life-span of a long-term, production-quality product.
Why not Gentoo? Pretty much the same answer. The target audience for Gentoo is not the enterprise/production server customer. Source-based distributions do not provide the consistency or reproducibility required for a scale-out computational platform. You'll also have a hard time getting scientific code targeted at Gentoo or other 2nd-tier distributions.
Why not Ubuntu or Debian? Ubuntu is a desktop platform, not a server platform. Again, it boils down to their target market. There's really no value-add in the server space with Ubuntu, so why not just run Debian? If Debian's what your admins know best, it's worth considering, but keep in mind that very, very few computational resources run Debian, so you may have to do a lot more fending for yourself if you go that route.
Why not SLES? Mostly a pers
Michael Jennings | HPC Systems Engineer, Lawrence Berkeley National Lab | Author, Eterm (eterm.org)
I will shout out some random distro and not say anything else to back it up!
The problem with that suggestion is that the people maintaining the code don't have a clue what QA means. And before people whine - I used Gentoo as my primary distro for around three years. The emerge system is great - but the data inside is crap.
If you want to build your stuff from source and actually have a working system, look at the Debian-based distros. There's this nifty "apt-build" thing that lets you build software with whatever compile options you want (so you can still do -O3 -funroll-loops on everything if you really hate memory), just like Gentoo does. And there are packages for just about everything; partially because Debian's been around forever, and partially because "just about everyone" uses Ubuntu now. Gentoo does have a few "hacking" apps which are hard to find on other systems, but that's irrelevant to this discussion (and BackTrack is the way to go for that stuff anyway, IMHO). The primary difference is that you can build with source code that will actually work, and probably won't blow your system up when you just do a routine update. Whereas with Gentoo, some random kid who's too 'leet for testing might just promote to stable a new version of Xorg or Apache (both real examples from experience) which works fine on his system but breaks everyone else's in the world. And by "might" I mean "will". :)
I'm posting that mostly because quite a few Gentoo users think that only Gentoo (and maybe some of the BSDs) can easily rebuild a system from source, so they put up with atrocious quality assurance (which is admittedly extremely difficult given the Gentoo user base, and supposedly has gotten better) because they don't know that there are quite usable alternatives that are also more mainstream.
SuSE Linux Enterprise Server is a proven HPC solution if management kicks up a stink and demands commercial support.
I've worked with SLES and have been fairly happy with it.
Scientific Linux is probably the optimal choice though ;)
While I'd probably still recommend RHEL/CentOS/Rocks/whatever, to answer this specific question...
Ubuntu is an easy-to-use polished layer on top of Debian's unbeatable history of Doing Shit Right. Yes, there are some mistakes in their history like everyone else, so skip the "but in 1996 Debian did some obscure thing wrong" and "one time some boob screwed up the random number generator in ssh" - but overall, Debian is an incredible base for just about everything. Ubuntu takes Debian's inherent coolness and then makes releases more than once a decade. :) With the Ubuntu route, you get a number of different kernels which would benefit HPC applications - like a 32 bit kernel with the large memory support enabled, a kernel with RTC support, etc. You get the ability to install the ubuntu-minimal version (use the alternate installer) which is smaller than the minimal version offered by the other popular distros, and then install the packages you want. You get the QA benefits of using a distribution that has a *lot* of eyes upon it. And you get apt-build so you can recompile and fairly easily package things up.
Someone supporting HPC clusters shouldn't just pop the graphical install disk in and take what's installed; there is a fair amount of customization which should be done (ideally through Cfengine). So, while Ubuntu does have a nice, really easy install process for Grandma, there's an incredibly powerful and configurable architecture underneath that unassuming front end. If you have the deep knowledge required to understand why one distro really is better than others, it's actually worth taking the time to read through the documentation in the Ubuntu wiki, learning how all the different Debian things work together, and generally spending the time it takes to seriously use and inspect Ubuntu behind-the-scenes. It's nicely architected because the Debian people - weird as they may be - have spent decades building a very well designed platform that people like Canonical can extend.
It's really not as bad a choice as one might think if all they know about Ubuntu is that it's easy to install and use out of the box. :)
So, there's been a bug for years, but you just hit it recently? Sounds like a new bug. ;)
(I'll pretend I haven't seen all sorts of problems with NFS root on different Ubuntu releases for the last several years; the bug seems to relate to the way the mounting and detecting-of-mounting works; my name's probably in a few of the bugtracker threads)
128 cores isn't enough to worry about - just install a distro you like and feel comfortable maintaining. although 128 cores isn't many, you should probably think about the style of install you want. lots of people seem to like diskful installs - afaict purely because it's familiar. most significant clustering sites use diskless (NFS root) though, because it's so much easier to maintain. there's never any question of nodes getting out of sync. traffic due to NFS root is trivial. another best-practice is to configure 1-2 admin nodes (no users, provides NFS, scheduling and monitoring services), one or more dedicated login nodes, and discourage users from touching the compute nodes directly (among other things, give them non-routable addresses.) get or make a ticket system to keep track of user and system issues. monitor the heck out of your systems.
I'm an HPC center admin and system programmer, 10+ years. I think we've been in the top 50 several times.
Quantian live cd has backported Openmosix kernel and beaucoup science and math goodies.
http://dirk.eddelbuettel.com/quantian.html or, if the website is down, just google "quantian" and check out the cached page.
I found it not hard to use, when I last used it a few years ago, which was a nice feature when Openmosix was in progress.
Frankly I miss it.
*Repent!Quit Your Job!Slack Off!The World Ends Tomorrow and You May Die!
Just imagine a Beowulf cluster of bullcrap!
Can you imagine the licensing costs of Windows bullcrap(tm)? At least with Linux it's free...
There's no place like
I maintain multiple ~100 core clusters and made extensive use of a few other clusters each with 10k-100k cores. RedHat, CentOS, Debian, Suse, Cray Linux, etc. They all use something slightly different.
What it really boils down to is this:
(a) Is it Linux?
(b) Are you, as the primary maintainer, comfortable with it?
If your distro of choice answers "Yes" to both of those, then you've made a decent choice.
If you end up going to tens of thousands of cores, the choice will make more of a difference. But at your scale, it's really just what you're comfortable with.
Actually, in the realm of biomedical supercomputing, "none of the above" has already been done. Check out the Anton supercomputer designed and built by D.E. Shaw Research. The entire supercomputer, right down to the processor cores themselves, was designed and built specifically for molecular dynamics research. The system has no operating system and, as such, no overhead. Every processor cycle goes straight into the calculations. It is capable of churning out simulations of 150,000+ atom protein complexes several microseconds long in a few days of wall-clock time.
If you are in a Debian shop, use Debian.
If you are in a RedHat shop, use RedHat.
The main reason is that if you already have other large clusters running $distro then someone already figured out deployment, maintenance, package management, drivers and a hardware vendor. Picking up a new distro throws aside major piles of work that have already been done and gives you the pleasure of re-inventing the wheel.
I'm guessing you'd rather get your cluster up and running, doing Real Work, rather than spending a bunch of time getting user authentication working correctly. Especially when somebody already did that.
Also, keep in mind that you do things like go on vacation and get sick, so the other people who are intimately familiar with whatever you already have can help out (and you can help them, too)
As an admin of a cluster that has evolved and changed over the last 7 years, I think I can help a bit. That being said, you truly failed at defining your real needs. I understand liking to stay with what people know, but from an end user point of view with interactions with a cluster, the only interface that they use is the only thing that needs to stay similar, and you can almost certainly use the same interface on any other linux distribution. So that said, what are you currently using to submit, queue, and schedule jobs? There are a few proprietary solutions out there and several open ones. There is Grid Engine (or Sun/Oracle Grid Engine), PBS, Maui, Torque, and several others out there. That should be the only real interface that an end user should have to the cluster.
Now comes the second question I have. Why do they need to be running X? I can understand having the X server installed and all the libraries, but you absolutely should only be running your servers at run level 3 (i.e. command line). You can still run graphical applications if you set your display to a remote X server as the output device, in this case one that you run on a Windows desktop, like Cygwin/X or Xming. All you do by running X.org on the cluster nodes is waste about 1 gig of memory and 5-10% CPU resources, which could be utilized by your end users' jobs/applications.
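To make the remote-display idea above a bit more concrete, here is a minimal Python sketch that builds the environment for an X client running on a node while its windows open on a workstation's X server. The host name `ws42.example.edu`, the helper names and the `xclock` client are all made up for illustration, and the sketch assumes the workstation's X server accepts the connection. Many sites would use `ssh -X` forwarding instead, which sets DISPLAY for you.

```python
import os
import subprocess

def remote_display_env(host, display=0, screen=0, base_env=None):
    """Return a copy of the environment with DISPLAY aimed at a remote X server."""
    env = dict(os.environ if base_env is None else base_env)
    # Standard X11 display syntax: host:display.screen
    env["DISPLAY"] = f"{host}:{display}.{screen}"
    return env

def launch_x_client(argv, host):
    """Start an X client on this node; its windows appear on the remote server."""
    return subprocess.Popen(argv, env=remote_display_env(host))

# Hypothetical usage (needs an X server listening on the workstation):
#   launch_x_client(["xclock"], "ws42.example.edu")
```

The node itself never starts X.org here; it only needs the client libraries, which matches the run level 3 setup described above.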
Third, what kind of applications are your end users running? Are they real parallel environment applications using some sort of MPI (LAM/MPI or OpenMPI), or off-the-shelf products like Clustered Matlab (which actually uses MPI, but it is built into the product already, you just need to configure it properly)? Or are you really just running lots of batch jobs which may or may not be multithreaded applications, but do not do any inter-node communication?
Fourthly, how are you monitoring your existing cluster? Are you using something like Ganglia?
Finally, what kinds of third party software do you need to be able to run/use? Is there anything commercial whose support may be limited to specific Linux distributions?
All of those things are questions that you need to really answer in order to recommend a distro.
All things being equal, personally, I would deploy a cluster using the "Rocks Cluster" distro. It is designed from the ground up to be an easy to maintain and deploy cluster distribution. There are plenty of HPC-specific packages/applications/libraries available to be deployed on the nodes. "Rolls" are available, which basically contain a group of packages/applications/tools that are typically used together, or that otherwise make it easy to configure/install software required on each system, possibly with some complex interactions. For instance, there is a "Ganglia" roll, which installs the Ganglia cluster monitoring software and automatically sets it up based on your Rocks installation. There is a "BIO" roll, which contains many open source tools and libraries useful on biological research clusters, like ClustalW, Glimmer, and NCBI BLAST, just to name a few. Then there is the HPC roll, which is just some basic things like MPICH, MPICH2, OpenMPI, iozone, and iperf. There is also a roll for PVFS, for setting up a quick Parallel Virtual File System cluster.
It is designed from the ground up to be a cluster, not just a bunch of nodes running Linux with high speed interconnects. It has management utilities to deploy applications across all nodes at once, to quickly install the OS on all your compute nodes via PXE boot, and to flash/upgrade the BIOS of compute nodes remotely via PXE boot. Basically it is designed to be managed and maintained as a cluster, not as "x" number of individual systems. Seriously consider something like it.
http://www.rocksclusters.org/wordpress/
We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
Admittedly, my experience is a few years out of date, but it used to be that the immediate answer to this question was Scyld, the direct descendant of the original "Beowulf" cluster created at Goddard Space Flight Center by Donald Becker. We used it for 3d rendering and video processing and it was really slick, and being based on RHEL it was easy to get people who knew how to work on it/software updates/support in forums, etc.
I've only seen one comment in support of Scyld here, has it fallen out of favor for some reason?
If you are planning a cluster for numerical or combinatorial computations, the best option is using an nVidia card. You will have about 300 cores for at most $3,000. And there are no special cooling or space requirements.
There are commercial nVidia drivers for Linux but Red Hat may be the best option due to its support.
I really think the cluster epoch is over; GPU computing is now the most economic and efficient way to do massive computation.
You can also get AMD cards but nVidia has better technology (CUDA)
not years, months... The same bug has existed in portmap since maverick and was, I assume, back ported to lucid, which would be why my lucid update broke. The system hadn't been updated since January, which is why I only picked it up now.
The problem with that suggestion is that the people maintaining the code don't have a clue what QA means.
Gentoo developers don't maintain code, they maintain software packages. That means our main objective is to distribute the software to the users with minimal modifications, so that you get pretty much what upstream developers distribute. The only exceptions are when a patch that fixes a bug, a security vulnerability or even a build problem is available; then we try to integrate it earlier than upstream. We also ensure the build system works, and we have documented policies on how to do that. Besides the regular stabilization process, there is also a QA team responsible for checking minimal ebuild (the portage recipes on how to compile software) quality. I had some commits reviewed by them so believe me when I say they are quite picky:
http://www.gentoo.org/proj/en/glep/glep-0048.html
And before people whine - I used Gentoo as my primary distro for around three years. The emerge system is great - but the data inside is crap.
For someone using Gentoo for around three years, that doesn't seem like a very insightful answer, does it? What is the "data inside"? Are you referring to ebuilds? :)
If you want to build your stuff from source and actually have a working system, look at the Debian-based distros. There's this nifty "apt-build" thing that lets you build software with whatever compile options you want (so you can still do -O3 -funroll-loops on everything if you really hate memory), just like Gentoo does
It is not the same thing. I was told by some people that they needed to be constantly asking the sysadmins to install the development packages of scientific libraries in Suse Linux (same applies to debian), depending on the use case, that can take weeks.
And there are packages for just about everything; partially because Debian's been around forever, and partially because "just about everyone" uses Ubuntu now.
Debian is not very famous for up to date software or is it? Sure you can add alternate repositories but you don't need to do that on Gentoo..
The primary difference is that you can build with source code that will actually work, and probably won't blow your system up when you just do a routine update. Whereas with Gentoo, some random kid who's too 'leet for testing might just promote to stable a new version of Xorg or Apache (both real examples from experience) which works fine on his system but breaks everyone else's in the world. And by "might" I mean "will". :)
Obviously Gentoo's policy on package stabilization can't catch every single package problem out there. The policy was made to allow a fair trade-off between stability and availability. We could increase stabilization times but software would be available in a less timely manner...
Major breakages are not only the Gentoo developers' fault. Sure sometimes a Gentoo developer messes up and makes users rebuild all of their installed software, but most of the times are either bad decisions from upstream developers or because a major change (which breaks stuff) is really needed. If you understand how library linking and versioning works, I don't think I have to explain further..
Oh and by the way, latest versions of Portage have a nice feature called "preserve-libs" which prevents breakage if the API of a library changes..
I'm posting that mostly because quite a few Gentoo users think that only Gentoo (and maybe some of the BSDs) can easily rebuild a system from source, so they put up with atrocious quality assurance (which is admittedly extremely difficult given the Gentoo user base, and supposedly has gotten better) because they don't know that there are quite usable alternatives that are also mor
... because then you can take advantage of the cluster. It would take seconds (or minutes if installing KDE) instead of days and weeks for the compilation... don't forget MAKEOPTS="-j129" as the manual states ...
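For anyone wondering where the 129 comes from: it looks like the cores-plus-one rule of thumb applied to the ~128 cores in the question. A minimal Python sketch of just that arithmetic follows; the function name is made up, and note that plain make -j only runs jobs on one machine, so actually spreading a build over cluster nodes would need something like distcc, which is not shown here.

```python
import os

def makeopts(cores=None):
    """Format a MAKEOPTS line using the cores-plus-one rule of thumb."""
    if cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        cores = os.cpu_count() or 1
    return f'MAKEOPTS="-j{cores + 1}"'

# The ~128-core cluster from the question:
print(makeopts(128))  # prints MAKEOPTS="-j129"
```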
Hello,
Thank you all for the informative replies, this will help us in deciding what to use.
It seems that Redhat or a variant thereof is what most of you agree is good, so we will probably go with one of those. Especially since that is what we have used in the past.
The reason for having X is that we work in X; some of the software we use needs it for various reasons, such as plotting. This will only be used on one node. Since this will be a small cluster (probably 4 boxes with 32 cores each) we do not intend to build a separate box for running X. We might use one of the old boxes for X, but I think we would still want the same distro on all of them for simplicity. (Oh, and to those who asked: these will be in racks and not used as desktops.)
Answer to another question that came up: This is for use at a university, we will be using it mainly for (nuclear physics) simulations/calculations based on Monte Carlo methods.
Again, many thanks!
Currently supports RHEL, spotty support for Fedora, Scientific Linux, CentOS, SuSE Linux Enterprise 11, Windows 2008 and up, and ESXi 4 and up.
Debian and Ubuntu have made appearances in trunk, but I haven't tried it out personally yet.
XML is like violence. If it doesn't solve the problem, use more.
After 7 years working at CERN on the Grid project for the LHC, I would recommend Scientific Linux. The target of this distro is to run the largest grid in the world (gLite) on a free Red Hat. The EPEL repository also has many tools for large-scale computing, maintained by the grid teams at CERN and Fermilab.
Can you imagine the licensing costs of Windows bullcrap(tm)? At least with Linux it's free...
Until you need technical support or drivers for new devices.
Pigskin-Referee
Linux: Yesterday's technology, tomorrow
If you do not have cutting edge devices on your system, FreeBSD might be a good choice. It is quite stable, although the number of devices it supports is somewhat limited. It also offers a fairly good support system.
Pigskin-Referee
Linux: Yesterday's technology, tomorrow
The distro does matter, often in ways not particular to being a cluster, but perhaps in ways making it easy to manage in general. For example, I'm moving away from Ubuntu (server) because it is too hard to selectively upgrade a single package or group of packages without imposing an upgrade on other packages. This is where "hand holding" has turned into "wrist crushing". So I'm moving to Slackware (which is getting a lot more capability through the SlackBuilds community).
now we need to go OSS in diesel cars
Bah.
Any statement about 'up to date' software immediately shows a glaring lack of comprehension about code stability.
Debian is only behind, if you like to use beta quality software. People with server farms, managing large quantities of data, don't WANT the latest and greatest, they want STABILITY. Stability is thousands, yes thousands of times more important than new features in code.
Gentoo has its place. However, that place is not anywhere near a data center, not anywhere near a corporate office, not anywhere near a server farm. Anyone with any competence in the real world won't use Gentoo for serious work, for the reasons listed above. Frankly, if you show me a resume with the word "Gentoo" on it, you're not going to get hired.
Gentoo's very nature ensures that it will *ALWAYS* be a BETA or even ALPHA quality build product. That's not because it's compiled from source, that's because of the way Gentoo manages packages, and because of a dozen different things that are in other projects to work towards stability. Gentoo seems to think that nothing is more important than the latest and greatest.. and as a result....
Well... instability is what you get.
(before people get all silly about this, that doesn't mean Gentoo doesn't have its place. However, stop trying to tell me that a home-built car should be deemed street worthy -- without even having to abide by the current legislation for street-worthy cars!)
(Lastly -- comments from the above post, such as "We could increase stabilization times but software would be available in a less timely manner..." and "Gentoo developers don't maintain code, they maintain software packages." and "but most of the times are either bad decisions from upstream developers or because a major change (which breaks stuff) is really needed." show how stability is the last thing on a Gentoo package maintainer's mind...)
Competitive market pricing for technical support or drivers (hardware vendors often provide) for new devices is available for Linux, BSD....
MS, Apple, Oracle... and hardware vendors will at their discretion provide the same as L/FOSS at a higher non-competitive price and bug-fixes or crap-design current WinOffice toolbar when/if they want.
Closed-crap software is never competitive, but is customer-hostage focused for gross-profits and low MOTSSS overhead.
IOW: If you want pfuck yourself, but don't phuck US, EU, or others.
Unaccountable leaders are masters, and unrepresented people are slaves. How do US and EU fare?
If you have experience with a particular distribution, go with that. I set up a 512 core cluster using Debian about five years ago. If you go that route, I suggest using FAI for installation. That way you can re-image your systems on reboot, easily keep things up-to-date, and make config changes system-wide just by rebooting your nodes. Many software packages, both commercial and open source, are RedHat focused. I had to create my own deb packages for many of them. This trend is not as strong as it was before, but RedHat still dominates the software world. Take that into consideration and know what you're using it for. As for building a RedHat cluster, I can't comment on that like all of the others who have never built one. I don't have enough experience to give any thoughts towards it.
I am sorry but your reply is sliding a little into a flame war and possibly out of context, so I'm going to stop right here.
You're specifying a solution before you seem to have really articulated your requirements. For example, you have identified the following:
1) ...new cluster (smallish, ~128 cores)
2) It has to have an X-windows server
3) Implied use of a Linux distribution
These are all different aspects of the solution. What you should do first is identify and document some use cases and some performance requirements. The closest you come to this is:
A) User familiarity
B) Remote access
But these alone are insufficient to justify any expense implementing a new system. Therefore, I suggest you don't upgrade and instead use the current system.
Yes. Next question?
Never ascribe to malice that which can adequately be explained by tenure.
That wouldn't be true for one app running a few very taxing queries, but in our case it's getting hammered by dozens of apps running hundreds of smaller queries per second, which parallelizes rather well.
I'm sure I'll get hate for pointing this out but it's true: Linux is free if your time is worthless. I've looked into offering Linux as an alternative OS in my little retail shop for years, and every year I find nothing has changed. Until Torvalds either retires or someone fires his irritating ass so that Linux can FINALLY, after everyone else (Solaris, BSD, OSX and Windows, hell even OS/2) has had one for over a decade, get a stable kernel level hardware ABI so drivers don't shit themselves and die every time Linus gets an itch to fuck shit up in the kernel, Linux will remain a black hole of time wasting where you have to spend days or even a week or more every six months doing the "forum hunts" trying to find "fixes" for the multitude of drivers Torvalds breaks constantly.
Which makes sense if anyone would look at it from an engineering instead of religious dogma perspective, as you are talking about literally tens of thousands of drivers nearly all of which have to interact with a kernel that Torvalds treats as his personal plaything and with little regard to the thousands of man hours he is pissing away, not only by all the developers that have to go in and fix what he has broken but in all the hours users waste with forum hunts.
Not to mention how many Linux users it ultimately ends up costing because mom&pop retailers like me, the kind of guys YOU NEED to get on board Linux, as we have NO real ties or support from MSFT and in our position could really help Linux with sales and after sale support, stay away from your OS because with all the man hours forum hunts suck up it makes Linux literally MORE expensive than Windows. My time is a minimum $35 an hour; at that rate it only takes 2.5 hours to make Linux more expensive than Windows 7 HP, and I can easily waste half a day on a forum hunt, what with searching for, tweaking, and multiple attempts to get said fix working.
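The break-even claim above is simple arithmetic, sketched here in Python using only the $35/hour rate and the 2.5 hour figure from the post. The Windows 7 HP price is not stated in the post, so the number computed below is only the price those two figures imply, and treating "half a day" as 4 hours is an assumption.

```python
HOURLY_RATE = 35.0        # dollars per hour, from the post
BREAK_EVEN_HOURS = 2.5    # hours, from the post

def troubleshooting_cost(hours, rate=HOURLY_RATE):
    """Cost of time spent hunting for driver fixes."""
    return hours * rate

# License price implied by the post's break-even point (not a quoted price).
implied_license_price = troubleshooting_cost(BREAK_EVEN_HOURS)

# Assumption: "half a day" on a forum hunt means 4 hours.
half_day_cost = troubleshooting_cost(4)

print(implied_license_price, half_day_cost)  # prints 87.5 140.0
```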
So until the day comes I can sell a box with Linux on it and be confident that the drivers will continue working for at LEAST three years minimum, preferably five, then "Linux is free if your time is worthless" is simply the truth for all those that do this as something other than a hobby. Home users aren't gonna learn CLI to apply fixes, neither are SMBs and SOHO, and they sure as hell aren't gonna sit around with a list of make/model/rev of every piece of hardware they have in order to do forum hunts.
I want Linux to succeed in the desktop and retail markets, I really really do. I grew up in the days of GEM and Commodore and having lots of choices, and I believe lots of choice makes for a healthy and vibrant ecosystem. But someone is gonna have to face the fact that Torvalds is a douchebag. It is all well and good he invented the kernel, but it ain't 1991 anymore and Linux isn't just some plaything for Torvalds to futz with and share his changes over IRC. The kernel is the heart of a multi-billion dollar OS, being counted on by millions, yet Torvalds treats it NO differently than he did at the beginning.
And before anyone says "LTS" let me say LTS is a bad joke. As long as much software is tied to which kernel you are using LTS is a codeword for "run out of date and possibly insecure software" and it is just ridiculous. It ain't 1991 folks, having drivers shit themselves and die is simply unacceptable in this day and age, especially when your competition gives on average a decade of support for their OS. Frankly this problem would be trivial to fix with a stable ABI for drivers, but Torvalds and his ego won't admit he made a mistake. The current way was fine when it was a hobbyist OS, it simply isn't anymore. Now you either shell out for expensive enterprise gear (which negates any savings by going Linux) where a team of developers have to constantly fix drivers for the life of the contract, or you are SOL, because you'll be wasting time on forum hunts. Sorry but that is just unacceptable.
ACs don't waste your time replying, your posts are never seen by me.
Until you need technical support or drivers for new devices.
I know math is hard--but let's see if I can break it down for you. Microsoft is $259 per incident. Ubuntu is $320 per year.
I know the Ubuntu number looks bigger until you realize that you can call 50 times for that same price. Calling 50 times for Windows would cost you just under $13,000.
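The comparison above can be checked with a few lines of Python. The two prices are the ones quoted in the post; the function names are made up for illustration, and the sketch assumes the annual plan really does cover any number of calls.

```python
PER_INCIDENT = 259   # dollars per support incident, as quoted in the post
PER_YEAR = 320       # dollars per year of support, as quoted in the post

def per_incident_total(calls):
    """Total cost of paying per incident for a given number of calls."""
    return calls * PER_INCIDENT

def calls_until_flat_rate_wins():
    """Smallest number of calls at which per-incident pricing exceeds the yearly fee."""
    return PER_YEAR // PER_INCIDENT + 1

print(per_incident_total(50))        # prints 12950
print(calls_until_flat_rate_wins())  # prints 2
```

So with these numbers the yearly plan is already cheaper from the second call onward.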
But who calls for tech support for Linux anyways? I build and maintain ubuntu-based firewalls, spam filters, mail servers, virtual servers, and VoIP servers. I've never had to call for support.
There's no place like
I'm sure I'll get hate for pointing this out but it's true: Linux is free if your time is worthless.
You misunderstand time then. It costs me nothing but time to set up a Linux workstation for my wife. I spend under an hour, and she has a clean, non-virus-infected netbook. If I went the Windows route (because as you say, my time isn't worthless), I have to go shell out ~$150ish for Windows for her netbook...and I have to go work for almost a full day in order to pay for that. So not only do I waste an hour of time installing it for her, I waste a day working on my day job to pay for it. No thank you.
As for the business world, would you rather pay someone $1,000 for a mail server install or $5,000 for a mail server install? In case you're confused, the $1,000 install is entirely paying for my time to set up whatever mail options you want. In the case of the $5,000 install, $3,500 is for Microsoft software licensing and $1,500 is for my install time with the options Microsoft lets you have.
But whatever--keep telling yourself that linux sucks because of ABI breakage instead of the real reason: you want to keep your drivers closed source so you can lock users in, and you're too slow or stupid to recompile your drivers.
My wife has been running Ubuntu for 4 years now, the only time she called me was when her SSD died. She did the upgrades too. So where's the problem?
There's no place like
First, thanks for a rational response to a topic you probably find marginally offensive, given your implied role with the Gentoo project.
I reject the idea that packagers are only responsible for making a package compile with minimal changes. Someone needs to be looking at the whole picture instead of focusing on their small slice of the world, and packagers are in the best place to do that (or at least play a huge role in that). I see that constantly in my day job (enterprise security) where every business area only cares about their piece of the pie, completely ignoring (or just not understanding) how their slice fits in to the whole picture. It's frustrating there, and frustrating in my OS. :)
I thought it somewhat contextually obvious that the "data inside" referred to the primary data source for portage, but yes, "bundles of source code and supporting files known as ebuilds". :)
I don't know what you're talking about with the reference to Scientific Linux (RedHat-based) and SuSE (which is, I suppose, SuSE-based); neither of those are debian-derived; both are RPM-based distros that I dislike. :) Debian and Ubuntu are common Debian distros. Here's the first useful Google result related to apt-build - https://nigibox.wordpress.com/2009/10/01/apt-build-%E2%80%94-optimize-your-debian/ - I'd suggest reading about it more, and about apt in general. At a high level, you can pin package versions from multiple repositories with apt, and you can rebuild everything from just one package and its dependencies up to the whole darned system with apt-build. Portage is a cool system, but if you look in-depth, the apt/dpkg world has a very comparable feature set. It does not suck nearly as much as rpm (even with yum/yast wrapped around it), or other package systems like pkgtool, or whatever it was that Stampede used (it's been a while since an i686-native distro was a novel idea), or HP's POS swtool, or AIX's lpp format, or...
Debian Stable isn't known for up-to-date code, as that branch's goal is somewhat obviously "stability". You can use "unstable" and get very up-to-date code, or you can use a derivative like Ubuntu for a pretty good compromise in between. :)
I will wholeheartedly embrace the idea that many (in fact, most) Gentoo problems are user problems. But there are still way more problems than acceptable which are issues which maintainers should have caught. I'm willing to grant that it's way hard to catch the problems I find unacceptable - between upstream changes and downstream Stupid Users(R), there are just too many variables for anyone to manage. Ultimately it comes down to the distro user's personal level of tolerance; my tolerance is pretty low, but just slightly higher than Gentoo could previously reach. Other people have different tolerance levels, and I don't think they're stupid for using Gentoo. Heck, I support RHEL during my day job, and I *hate* the way RedHat does things (both in the distro and as a business) - but I don't for a moment think my employer is stupid for wanting to use RHEL. I endorse the variety of distros and people's choice to use the distro which best suits their needs. I do think that Gentoo fills a pretty narrow niche, though, and that it's a poor choice for environments where stability or reliability are the top priorities. Based on previous experience which may no longer be completely valid - Gentoo only fills a stability need well through the use of a mostly-binary install, and at that point, Gentoo's primary benefits are very much diminished.
I do like the Gentoo philosophy, though, and I've heard that things have turned around after the initial turmoil after Daniel Robbins left. But honestly, Gentoo offers me zero benefits over Ubuntu at this point. I get acceptable stability, and a very flexible build environment in the rare case that I need that. The only decen
Whatever you do, you will have to work to learn more about your Linux system, or subcontract someone (or buy support) to run your cluster and help your users. The point is then: do you have a budget (time and money) for that? Are you yourself interested in learning more about Linux systems (and hence in spending less time on numerical codes or science)? If not, you'll need to pay someone to do the work. If yes, you need to learn a lot.
And here comes the religious dogma I was talking about! Isn't it funny that the argument against having stable functioning drivers always comes down to IDEOLOGY, with the rant most people link to going so far as to call those that refuse to hand over source "leeches" and hope the kernel futzing breaks their drivers?
I mean WTF is it to you if some do and some don't? Is that ANY different than right now? Nope, as you still have companies like Nvidia that makes binary blobs, only now you get to watch them break every six months. Does having open drivers keep Linux from breaking? Nope again as the open drivers break just as often thanks to Linus and his kernel fucking, because if you could look at it logically instead of a faith based perspective you'd see that there are only so many devs, and there are fewer of them than drivers to fix so drivers will ALWAYS be broken when Linus gets a wild hair up his ass, every. single. time!
And allow me to say that if your way "worked" in any kind of reasonable fashion retailers wouldn't avoid your OS like the clap which I can assure you we most certainly do. Not just all the thousands of mom&pop shops, dotting the entire country, but big names like Best Buy, Staples, Walmart, do you think they avoid your OS because of its "quality construction" or a secret conspiracy? NO! It is because they take the box home, run updates when the little icon tells them to and get a broken machine like it is 1993 all over again, and promptly take that broke ass shit back! And since we retailers can't sell used as new that means we take a hit on every return making Linux even MORE expensive!
As a final word allow me to give you proof, undeniable proof like a slap to the face your current way is broke ass shit. Now I'm sure you'll find some excuse, like "Use Distro X" or "You should buy hardware Y" but in the end all you will have is excuses because this proof should make even YOU take note! Now you and your fellow converts think we retailers are just full of it, that it should "just work" right? Well, when one of the biggest OEMs on the planet has to DISABLE the repos and spend considerable money and man hours keeping a badly out of date "corporate repo" just for their customers because if they don't the drivers WILL break then I'm sure you can see why both little guys like me and big guys like Walmart and OEMs like ASUS have washed their hands of your OS. I mean when fricking netbooks, a class of machine built around Linux strengths and which started out more than 30% Linux end up completely obliterated by a decade old Windows OS it is high time to ask yourself "What are we doing wrong?" and I'd say basing your OS on religion instead of sound design practices and trusting your customers to make purchases that will benefit them (such as choosing FOSS drivers where possible so they have LTS) is a good example of why Linux is so far behind everyone else, and why even free you are getting hammered by an OS with a $100 barrier to entry.
ACs don't waste your time replying, your posts are never seen by me.
Until you need technical support or drivers for new devices.
I know math is hard--but let's see if I can break it down for you. Microsoft is $259 per incident. Ubuntu is $320 per year.
I know the Ubuntu number looks bigger until you realize that you can call 50 times for that same price. Calling 50 times for Windows would cost you just under $13,000.
But who calls for tech support for Linux anyways? I build and maintain ubuntu-based firewalls, spam filters, mail servers, virtual servers, and VoIP servers. I've never had to call for support.
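For anyone who wants to check the arithmetic above, here is a rough sketch in Python. It uses only the two prices quoted in this post ($259 per incident, $320 per year) and assumes the annual plan covers any number of calls, which is how the post treats it. The function names are made up for illustration, and license costs and refunds for unresolved incidents are ignored.

```python
# Figures quoted in the post above; treat them as the poster's numbers, not official pricing.
MS_PER_INCIDENT = 259   # USD per support incident
UBUNTU_PER_YEAR = 320   # USD per year, assumed to cover any number of calls


def per_incident_total(incidents, price=MS_PER_INCIDENT):
    """Total yearly cost when every incident is billed separately."""
    return incidents * price


def flat_total(incidents, price=UBUNTU_PER_YEAR):
    """Total yearly cost under a flat annual plan (independent of call count)."""
    return price


def break_even_incidents(per_incident=MS_PER_INCIDENT, flat=UBUNTU_PER_YEAR):
    """Smallest number of incidents at which per-incident billing costs more than the flat plan."""
    return flat // per_incident + 1


if __name__ == "__main__":
    for n in (1, 2, 50):
        print(n, per_incident_total(n), flat_total(n))
    print("break-even at", break_even_incidents(), "incidents")
```

With these inputs, one incident is cheaper on the per-incident plan, and from the second incident on the flat plan is cheaper.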
Actually, support starts at $195 per incident and there are several different plans.
There is no charge if Microsoft is unable to rectify the complainant's problem.
You have conveniently failed to address what I was stating in my original post; i.e., the "free" factor evaporates once support is required. In effect, for both Microsoft and any allegedly "free" OS, the cost of support is the same as long as it is not required.
In 20 years I have never called MS for any technical support. I am able to read and comprehend technical manuals, etcetera rather well. Plus, when a new device is released I do not have to wait for months (years, never) for support for that device with Microsoft. I have customers that demand high quality service.
Now, I do have a FreeBSD server at home. It is a nice hobbyist toy and I do enjoy playing around with it from time to time. However, I would never use it in a mission critical environment.
Pigskin-Referee
Linux: Yesterday's technology, tomorrow
Scientific Linux still hasn't put out the 5.6 release, they instead went for the 6.0, while RedHat is at 6.1 because 6.0 is so very buggy.
Please provide a list of all the supercomputer clusters running Debian in this world? I know the weather services in the Philippines and Germany operate a Debian cluster for forecasting, but are there any others? The cluster Sysadmins of the world seem to go with RedHat or derived
And here comes the religious dogma I was talking about!
Yes, your ability to foresee that someone might disagree with you makes you correct.
Isn't it funny that the argument against having stable functioning drivers always comes down to IDEOLOGY,
Funny--I remember some bitching about ABI, but there was a whole ton of other crap you put in there about your ideology. I remember you bitching about having to hunt through forums (ever had to wade through a forum full of Windows noobs? "Uh, I rebooted and it fixed everything."), bitching about your time being oh so important, your retail woes, and other things completely irrelevant.
with the rant most people link to going so far as to call those that refuse to hand over source "leeches" and hope the kernel futzing breaks their drivers?
Nope--I don't call you a leech. I think you have a chosen business model (not to release the source because you want to lock people in), and that's fine. Just don't keep bitching about your inability to keep up. You seem like the kind of person who would have a business model around sending morse code via telegraph and then bitch that the internet costs way too much because the protocols keep changing every few decades (IPv4/IPv6) and you have to upgrade your router and switch your thinking from morse code to SMTP...all the while the telegraph becomes more and more obsolete.
I mean WTF is it to you if some do and some don't? Is that ANY different than right now? Nope, as you still have companies like Nvidia that make binary blobs, only now you get to watch them break every six months.
Yup--and I don't buy their crap. The machines that I do work with that have nvidia work well though because they are using the open source driver made by the community. And while it has occasional issues too, it gets fixed faster than the Nvidia blob.
Does having open drivers keep Linux from breaking? Nope again as the open drivers break just as often thanks to Linus and his kernel fucking, because if you could look at it logically instead of a faith based perspective you'd see that there are only so many devs, and there are fewer of them than drivers to fix so drivers will ALWAYS be broken when Linus gets a wild hair up his ass, every. single. time!
And allow me to say that if your way "worked" in any kind of reasonable fashion retailers wouldn't avoid your OS like the clap which I can assure you we most certainly do.
Yeah--Amazon really *hates* linux. It's constantly fscking up their retail business... </sarcasm>
Not just all the thousands of mom&pop shops, dotting the entire country, but big names like Best Buy, Staples, Walmart, do you think they avoid your OS because of its "quality construction" or a secret conspiracy? NO! It is because they take the box home, run updates when the little icon tells them to and get a broken machine like it is 1993 all over again, and promptly take that broke ass shit back!
And since we retailers can't sell used as new that means
There's no place like
Actually, support starts at $195 per incident and there are several different plans.
I went to microsoft.com/support and clicked on server support. It starts at $259 everywhere I looked. The same server support for Ubuntu is more expensive for your first incident--but if you have two incidents, you're ahead of Microsoft.
There is no charge if Microsoft is unable to rectify the complainant's problem.
You have conveniently failed to address what I was stating in my original post; i.e., the "free" factor evaporates once support is required. In effect, for both Microsoft and any allegedly "free" OS, the cost of support is the same as long as it is not required.
The free factor doesn't evaporate. Something like Ubuntu still costs $0 while Microsoft's offerings do not start at $0.
In 20 years I have never called MS for any technical support. I am able to read and comprehend technical manuals, etcetera rather well. Plus, when a new device is released I do not have to wait for months (years, never) for support for that device with Microsoft. I have customers that demand high quality service.
Really? You've been able to fix your own bugs with Windows ME, Windows Vista, Exchange 5.5, Sharepoint, etc? You're telling me in 20 years, you've never run into a developer-created bug or that you've magically prayed to the Ballmer and it wasn't an issue? I haven't even done that in Linux. The difference is I can fix most of my own Linux bugs. With Microsoft, you must call them--even if they end up acknowledging it and reversing the support charge at the end.
Which new devices have you used in Windows that weren't already available in Linux?
USB was supported first in Linux.
IPv6 was supported first in Linux.
I know wireless sucks in Linux, but that's because communication with a wireless card isn't a standard like IPv6 or USB is a standard.
Plugging crap into my windows box generates endless popups and disk thrashing while it searches for drivers and usually fails. In Linux, I plug it in and by the time I look back up at the screen, I have a camera or USB drive mounted, or even a bluetooth device ready to use...
Now, I do have a FreeBSD server at home. It is a nice hobbyist toy and I do enjoy playing around with it from time to time. However, I would never use it in a mission critical environment.
Funny--I talked with a guy yesterday who runs a site used heavily by the insurance business. He said when he launched the site he chose BSD because the Microsoft option was prohibitively expensive. He had 5 servers and two load balancers. He said other than the hardware, it cost him nothing. Contrast that with whatever server version of Windows does clustering and load balancing. I remember running it back in the 2000-era (iirc) and it was tens of thousands of dollars.
There's no place like
Do you even hear yourself? The amount of logical hop jumping and plain denial is just astounding! I guess denial isn't just a river in Egypt huh? Because the simple fact that you honestly believe that I should tell customers to learn to recompile their own drivers which BTW won't do SHIT when it comes to some of Linus's serious kernel fucking, is simply beyond ridiculous. How can you stand here with a straight face and claim your OS is ready for the masses if they need to compile their own drivers?
And WTF do Windows forums have to do with shit? Or Lowes? You NEVER need Windows forums after simply updating the OS whereas you BETTER be ready to spend an assload of time at your distro's forums with make/model/rev thanks to updates breaking shit left and right, which again I linked to. This is a classic case of "moving the goalposts" as you refuse to acknowledge, even though I provided links rubbing your nose in it, that even Dell can't keep the drivers working which is beyond insanity! And who gives a shit what some enterprise, which BTW has these things called "admins" that get paid big bucks to deal with broken shit like drivers and has about as much to do with retail PC sales as a car does with an F-16, has to do with this discussion? Did I MENTION anywhere enterprise? Or say in any place that we were talking about, in no particular order, enterprise deployments, servers, routers, cell phones, or any other damned thing that isn't a retail Linux sale? Nope don't think so.
In the end the numbers don't lie. No retail B&M store will touch your OS, and after 20 years Linux is so far behind /. has an article congratulating Linux on reaching a whole 1%! Woo Hoo, and it only took 20 damned years! If the "community" continues like you with elitism, refusing to see problems and correct them, refusing to make things easy for the user, and most importantly refusing to keep Linus from constantly breaking shit, then don't be surprised that it takes Linux another 20 damned years to reach that magical 2%. The simple fact is it isn't 1993, and users aren't gonna jump through flaming hoops simply for "free as in freedom, fight teh power!" bullshit. You gotta be better or at the very least as good, and frankly with the kernel futzing Linux doesn't even rank as high as Windows 98 in my book, MAYBE Win 3.1. Because with Windows 98 I could actually take a RTM and update it to the last patch and the drivers still worked whereas the Linux update notifier may as well be a "Break Linux NOW!" button, for all the broken drivers. It is pretty God damned sad when you can't even run updates without your OS shitting itself, and if the choices are 1.-Giving them a broken OS and telling them "RTFM Noob LOL!" 2.-Turning off ALL updates and leaving them as vulnerable as any other unpatched OS, or 3.-Installing Windows and at least having it run until EOL without having broken drivers? Well at $35 an hour it really only takes a single forum hunt to make your OS more expensive than Windows. But you pretend it is all a conspiracy, that we "just don't understand" your OS. It reminds me of that old joke "I have no friends, Linux has no friends, maybe I can be Linux's friend!" because the public sure as hell ain't touching it!
ACs don't waste your time replying, your posts are never seen by me.
Haha
That was an accurate, yet ballsy post to mention on Slashdot of all places. I got flamed and modded down when stating the obvious with Linux myself, yet I still like it as a server OS. Linux usage has gone down according to StatCounter from nearly 1% to 0.7%. A very big drop thanks to Windows 7. I used to use Linux but finally gave up on it for the reasons you described. I am contemplating installing it today in a VM so I can run a LAMP stack with PostgreSQL as well as Joomla. But I am in the small small minority of users.
Windows has its weaknesses but being a consumer OS for business and home users is certainly not one of them. Compiling a kernel is ridiculous. I did PC support as a contractor on the side and only mentioned Linux to a tech shop because the user needed a server for 10 users and didn't want to pay for Windows 2003 Server Small Business Edition and only needed a file server, domain controller, and a simple email and internet site. Linux fit the bill and hosted all 4 nicely, but that was supported server hardware and not for John in the Office to play his games or run Office on his Toshiba laptop with strange/cheap hardware. Dell does not even make good drivers for Windows in my experience and I hate them with a passion. However, since Michael Dell returned the quality has improved tremendously.
I read your posts and you know your stuff. I used to charge $75/hr when I lived in Alaska, and it sounds like the $35/hr might be a little low for your expertise. I never heard of that app you mentioned that kills adware infected with Flash. I will give it a try since I use music on YouTube and prefer not to live without it.
http://saveie6.com/
Dell and Asus once sold Linux briefly at BestBuy before they pulled it. Walmart did too. Why?
Because Joe Six Pack became furious as to why MS Office wouldn't work or why his resume created with OpenOffice looked like crap when a potential employer opened it with Word. Or why little Timmy's PC games with DirectX couldn't run on them? Etc.
Not to mention BestBuy realized that consumers buying these cheap Linux books would not provide any profit margins by buying antivirus software and printers. They lose money on every machine sold and only make it back by spamming you with accessories and software. Bad for retail ...
This is why Windows is here to stay. If you hate it, save up for a Mac. That is a consumer OS as well, expensive but higher quality than Linux or Windows. Or get one of those tablets running Android. There are options.
http://saveie6.com/