Domain: linuxvirtualserver.org
Stories and comments across the archive that link to linuxvirtualserver.org.
Comments · 57
-
Re:1000+ a day is trivial have you thought of amaz
You forgot to mention LVS
-
Re:1000+ a day is trivial have you thought of amaz
No mention of Linux Virtual Server?
It's not exactly easy to set up, but it provides all possible types of load balancing and even the load balancer itself is HA'd by heartbeat (in a 2-node LB cluster).
Downtimes can be reduced to single seconds.
(My LoadBalancer cluster switched if the master LB didn't check in for 5 whole seconds.)
Webserver reply times can be just as tight.
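The five-second check-in window mentioned above can be sketched as a small dead-time monitor. This is only an illustration of the idea, not heartbeat's actual implementation; the class name `PeerMonitor`, its methods, and the injectable `clock` parameter are all assumptions made for the example.

```python
import time


class PeerMonitor:
    """Hypothetical standby-side monitor: take over when the master goes quiet."""

    def __init__(self, deadtime=5.0, clock=time.monotonic):
        self.deadtime = deadtime      # seconds of silence tolerated (assumed 5)
        self.clock = clock            # injectable so the logic can be tested
        self.last_seen = clock()      # treat start-up as the first check-in

    def heartbeat_received(self):
        """Record that the master just checked in."""
        self.last_seen = self.clock()

    def should_take_over(self):
        """True once the master has been silent for longer than deadtime."""
        return self.clock() - self.last_seen > self.deadtime
```

A standby balancer would call `heartbeat_received()` for every check-in it hears and poll `should_take_over()` on a timer; a shorter `deadtime` means faster failover but more risk of a false takeover on a briefly congested link.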
Client sessions can be bound to the answering webserver and the bindings can disappear when the designated webserver dies.
-
LVS - back end web server
Grab a crappy old athlon tbird box with a gig of ram and set it up as a router/firewall running LVS (Linux Virtual Server) to forward web requests to your back end web server. You can start out with one web server and gauge the load. If you want to scale the system, add more backend web servers and configure LVS with the new backend ip addresses.
For redundancy on the athlon router, trunk a couple nics for network, and boot from cdrom (knoppix) if you are worried about system disk failure. You could also buy a 3ware 2 lane raid card for a couple bills and sata raid a couple hard disks if cdrom boot doesn't work for you. It's cheaper to keep a couple cdrom drives on hand, and spare knoppix cds, than to set up a bootable hard drive raid system.
Figure out if you want a shared filesystem for the web servers, or just rsync the important stuff between them for starters. Software raid on another crappy athlon box will work well for backend storage in the beginning. If you have high disk load, you may need to upgrade the fileserver if transfer rates exceed bus bandwidth. The point is, you are non-profit and running on a shoestring budget. Start out cheap and dirty. Spend money on hardware later when you find out where your bottlenecks develop.
If you lose a backend webserver, LVS can be configured to handle it in different ways.
-
Some solutions
Others have already covered the "1000 users isn't much" aspect. Benchmark, and verify what each server can handle of your anticipated load, but they're probably right.
Option 1: Don't do it yourself. Look into renting servers from a hosting company. They will often provide HA and load balancing for free if you get a couple servers. Also, having rented servers makes it much easier to scale. If you find that you have 100,000 uniques per day, you can order up a bunch more servers and meet the load within minutes to hours. If you overbought, you can scale back down just as fast.
Option 2: http://www.linuxvirtualserver.org/ plus http://www.linux-ha.org/ . You use LVS to load balance out to a cluster (including removing failed servers from the pool). You use HA so that two LVS machines can fail over to each other. Note that you can run LVS on the same machines as your load, for a small environment. This is much more DIY than the Windows setup, of course... But honestly, if the setup requirements of this scare you away, then you're not ready to run a fault-tolerant network, regardless of OS.
Option 3: http://www.redhat.com/cluster_suite/ . Less DIY, more money. Perhaps that's better for you.
Option 4: Buy a commercial solution. Every major network vendor sells a HA/LB product. I've used them from most of the big players... I'm not going to write a review here, but it'll suffice to say that while they each have their good and bad points, any of them will get the job you've outlined done.
As for the network: The general rule is to reduce your single points of failure to the minimum you can afford. Common ones are: The ISP (BGP is a pain); the routers (Each ISP goes to its own router); the switches between (you need to full-mesh links from the two routers to two switches, down through the line as many layers as it goes; your switches need to run STP or be layer 3 switches running OSPF or another routing protocol; don't forget to plug the load balancers into different switches); the power (Servers, switches, and routers on separate UPSes such that losing one will leave a fully functioning path); and depending on how far you want to take this, the data center itself (in case of fire/meteor/EPO mishaps).
Note that all of this is required even for your Windows solution. Are you sure you don't want option 1?
:)
-
Re:Reference Materials
http://www.linuxvirtualserver.org/ or anything about F5 BigIPs. Most of understanding load balancing is about understanding (a) how to fool layers above you in the OSI stack (switching on layer 4 through 7 -- particularly 7 -- can take a while to wrap your head around) and (b) the algorithms to pick which physical server gets the next connection (round robin, least connections, predictive, whatever).
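Two of the selection algorithms named above, round robin and least connections, can be sketched in a few lines. This is a toy model for intuition only; the server names and the `active` connection-count mapping are made up for the example, and a real balancer keeps these counters itself.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in a fixed rotation."""
    return cycle(servers)


def least_connections(active):
    """Pick the server with the fewest active connections.

    `active` maps server name -> current connection count; on a tie the
    server listed first in the mapping wins.
    """
    return min(active, key=active.get)
```

Round robin ignores load entirely, which is fine when requests cost about the same; least connections adapts when some requests hold a server much longer than others.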
-
Re:Use a forwarding server on the front end
Or go open source: http://www.linuxvirtualserver.org/ has a load balancing agent.
-
You need to start looking at server redundancy
Drives die. Fans die. Power supplies die. Motherboards die. Having a RAID array is not enough. There are plenty of other things in a single system that can go wrong that can take the system down for a period of time. The biggest issue I have with the limited description here is the fact you are talking about one system. If you want availability, you need to be looking at scaling by the machine.
Now you can just have other machines waiting to take the load with a quick reconfig, or you can start doing things automatically (people have mentioned using things like Nagios for monitoring, but monitoring doesn't give you uptime . . . it gives you response to minimize downtime . . .)
If you want to look at solutions that don't cost, check out LVS. It'll allow you to balance your ports across multiple systems (you can even balance win32 and linux systems if you for some reason wanted to) with a couple different methods (I prefer the DR method myself). The setup isn't that bad, with several of the recent kernels in the major distros including all the ipvs options turned on by default, so you may not even have to recompile a kernel on your balancer systems.
Now, of course you can't depend on a single balancer any more than you can depend on a single web server; there is support using the HA linux stuff to allow you to have backup LVS systems to take over as a balancer if your primary balancer bites it (heartbeat and ldirectord are your friends here).
With a pair of fairly low end systems and some monkey work at the keys, you can have a system that will balance your tcp traffic (or setup an automagic failover from a system) that can be as good as some of the commercial balancing products out there.
I currently use LVS with heartbeat/ldirectord to balance the following:
- Win32 Apache Servers
- Linux Apache Servers
- Win2k3 IIS Servers (the LVS system balanced better than the built-in WLBS from MS . . . and there were a lot fewer broadcasts)
- Postfix
- Amavisd-new
And as others have mentioned about setting up some good monitoring (a la Nagios if you want), we monitor the virtual services on the LVS systems in addition to the real servers' services so that we can know if we are still delivering service externally even though real server B is down...
When you get bigger, then you should even start looking at having datacenter redundancy . . . deploying the meteor net never seems to be the right answer to the 'Force Majeure' question . . .
-
Postfix + MySQL
Look at Postfix + MySQL:
http://www.sweeney.demon.co.uk/pfix_imap_virtual.html
Mostly, you will need a cluster for everything.
If you are looking for an all-around open source solution, start with this link. Later, to use LVS, the tool for making load-balancing clusters, go here:
http://www.linuxvirtualserver.org/
And if you really are looking for open source with cheap software costs (not very cheap TCO), you can also build your OWN SAN with ATA over Ethernet:
http://sourceforge.net/projects/aoetools/
And for webmail, a useful but also light interface:
http://www.squirrelmail.org/
With all the licence cost savings, you can invest a lot of time, and have a fair amount of flexibility.
Sendmail Inc. has high availability solutions:
www.sendmail.com
Also, you can spend a lot of money and buy a very big IBM machine with lots and lots of Lotus Notes licenses. With that kind of money spent, you can put IBM on its knees if a lawyer makes a good contract.
Also, to complete the solution you can set up Nagios and MRTG for monitoring.
http://www.nagios.org/
http://people.ee.ethz.ch/~oetiker/webtools/mrtg/
I think, to set up the whole thing, you will need about 50 good servers (maybe you can try IBM OpenPower with virtualization, it IS a RISC CPU), and, hmm, a month of technical tests...
The MySQL backend will give you centralized administration, LVS will provide scalability and good servers will give you uptime...
And if you like, you can even make Linux routers using Sangoma hardware:
http://wwww.sangoma.com/
Everything can be done with Linux by now... The question is how much responsibility you want to have regarding the stability and overall functionality of the solution.
IBM, HP, Red Hat, SuSE, and ANY Linux consulting firm would be interested in having you as a success story.
Good luck, and may the Source be with you
-
Re:Free, but not better
This is also possible with lvs and is called DR http://www.linuxvirtualserver.org/VS-DRouting.html
-
Stepwise Deployment + advances in virtualization
Firstly: Agile Methods in System Administration == Stepwise Deployment.
Incremental adding of features, upgrading, transitions etc. One step at a time, evaluating the value of each step.
It is not as easy as it sounds, although recent advances in virtualization at the OS level and service level are going to make it a lot more interesting if not easier.
-
Very nice
Installing and administering the various open source tools can be tedious work, especially without documentation of how to put things together.
A quick Google search though reveals a lot of free papers and manuals on this very topic.
-
Re:SFW
Why I'm responding to a troll is beyond me, but I'll point out 2 reasons why even the trolls should RTFA.
According to SCO's own release and the review, a maximum of 8 processors are supported, not "scaling to hundreds of CPUs" as the parent states. Also, the review actually said more about SCO's products than I've ever gotten from SCO themselves, even back in '95 when I was looking for a UNIX for Intel (I chose Linux mainly because I couldn't find enough info on SCO, and the BSD documentation was something I wasn't able to make sense out of at the time). Admin GUIs are not something I expected from SCO, but apparently they're there. Their clustering technology is intriguing, and is another thing I didn't know they were even capable of.
If for no other reason than to "know your enemy" a good "technical" review of their product speaks more than any press on either sides of the lawsuits can for the company in the long run.
For those that must know, I run a number of servers, mostly Red Hat ES 3.0 servers (including a 3-tier LVS cluster), with some Win 2k/2003 mixed in, and am writing this from a PowerBook running OS X. I'm glad to know that it doesn't sound like SCO has made any jumps that would make me consider their product for work, so I need not fear the dark side.
-
Re:yes, that's actually the basic idea
What you've said is true only for the case of web sites with static or almost static content, where you could have the content in the local drives of each webservers, and use rsync to distribute new content (web site changes) to all the servers.
But it's a very different situation when your webservers handle very dynamic content, especially when the content is uploaded by the users. In this case, you have three alternatives:
1) Content in the database. It is up to you to use a clustered database to provide High Availability and Load Balancing.
2) Content in a NAS (NFS, etc.). You have the same content for all the webservers, and with drbd you achieve High Availability... but you don't have Load Balancing.
3) You use GFS or another distributed file system (I don't know the issues with this option).
btw, for load balancing at the IP level I would recommend Linux Virtual Server, and Heartbeat to achieve High Availability in the balancers.
-
Opentext Livelink
Don't re-invent the wheel. Get a customizable product and an expert that can customize it.
I suggest Livelink. Well, it's not free. It costs money. It may cost lots of money if you want all those nice features. It's not open source. But I have enough Karma to burn.
;-)
Web page: http://www.opentext.com/
The consulting company I work for is based on knowledge. Fast, reliable and secure (permission-based) access to archived knowledge is mission critical. So there never was a problem buying the software we need for business, no matter what it costs.
My job is not Livelink. But I work in the same room as our Livelink expert. So I collect a little bit of knowledge about Livelink. I'm the one he asks for Unix and network tricks.
Livelink has document management (that's the main part), team rooms, workstreams, and a lot of other nice features. For details, have a look at the web page. Livelink is a core server, extended by a lot of scripts (in a custom language named Oscript), and a tiny CGI that passes requests from the webserver to the core server. If you own a development kit, you can customize nearly every aspect of Livelink, and you can see lots of code written by Opentext. So if you have the money, you can at least see most of the sources.
We use three dual-CPU W2K machines with Apache 1.3.x as Web and application servers, a fourth dual-CPU W2K machine for the indexer and search engine, a Sun 420 running Solaris 9 for the database (Oracle), and Linux Virtual Server (LVS) as load balancer for the webservers. We have about 1500 users all around the world.
Why so many servers? Most of the time, one web server is completely idle. Opentext would recommend a single server setup, and that would be sufficient. But we have demanding consultants, our problems are response time and availability. We have some queries that block a server for a while. So we need at least two servers. The third server is for load peaks and for downtimes of one of the other servers. Index and search also need a lot of power that would block a single machine, so it's placed on the fourth server.
Why W2K? The most recent version of Livelink requires it.
Why Sun? Oracle on Windows simply sucks. The raw CPU power of the previous multi-CPU x86 database machine was larger than that of the Sun machine, but Oracle runs much faster on the Sun. (Now all corporate databases are switched to an Oracle/Sun cluster, but that's a different story.)
Why LVS? Simple: It works. We tried a load-balancing software called Resonate, a really fitting name for a piece of software that should implement a control loop. We kicked it because it was hard to maintain and did not work reliably on our machines. We tried LVS on a really old desktop and it worked great, even if we tried really hard to confuse it. Now it has its own x86 server running Slackware, and we did not have a single second of trouble with it.
Why Apache? We used Netscape Enterprise Server / iPlanet. It had a pretty web-based config tool and much bloat, and it costs money. Apache does the same job for free, and its configuration is a simple text file that can be copied to the various servers. MS IIS has bugs. Lots of bugs. It's mouse-controlled. We did not even think about a test system with IIS.
Tux2000
-
Re:My question is....
Even I don't think you are trolling, so just to start you off in your quest:
- China - Wensong Zhang, LVS Project
- Japan - Kunihiro Ishiguro, Zebra Project
- India - Naba Kumar, Anjuta Project
-
Re:try grep -i (TRY UPPERCASE SEARCH)
It didn't take very long to find out where the IP Virtual Server code came from.
http://www.ussg.iu.edu/hypermail/linux/kernel/0011.1/0909.html
> > I noticed that the ip_vs.h include is not in the main kernel tree or ip
> > virtual switch support while I was attempting to buid the pirahnna web
> > server. Is this module a patch located somewhere else on
> > ftp.kernel.org.
>
> Jeff,
> Red Hat started included the IPVS patches from
> http://www.linuxvirtualserver.org/ starting with RH6.1 (I believe). You
> can find the patch they use in the kernel src.rpm, or go get the patch
> from the URL listed above.
Dax,
Thanks. I noticed the pirahna web server rpm would rebuild unless the
kernel had this patch. I was wondering why it wasn't in the stock kernels
since it's GPL. We may want to consider including it.
A list of all the patches for IP Virtual Server can be found here: http://www.linuxvirtualserver.org/software/ipvs.html
-
What broken ass load balancer...
are you using that it doesn't have Persistence (Affinity in Cisco-speak)...
Especially when you can get it for free with Linux Virtual Server. Yes, you could pass along all your values in hidden form fields. But then you could also write a C++ compiler in COBOL.
-
Re:Damn - nearly got excited
You mean like LVS?
-
Re:what about N1?
Actually I think Tivoli and N1 are very similar products. Both of them allow a bunch of hardware to be managed as a single resource rather than as individual servers/equipment. This is done by grouping hardware into a single resource, and then running an agent to monitor it.
How is that different from the Linux Virtual Server project?
-
UltraMonkey is LVS
I said it last time this came up in 1999 and I'll say it again. UltraMonkey is a combination of LVS (for balancing) and other tools (for fail detection, weighting, etc.).
It doesn't make very much sense to say "Should I use UltraMonkey or LVS?" as the latter is a piece of the former. There are other combinations of LVS+other stuff that you might put into that sentence: "Should I use Piranha or UltraMonkey?" or "Should I use UltraMonkey or Joe Mack's LVS config scripts?" or even "Should I build my own LVS scripts or use an existing framework?"
There are other HTTP load balancing options out there. Squid has a new branch in CVS called rproxy that handles multiple backend web servers very effectively with failure detection and other fun stuff (not to mention caching). Pound is a reverse proxy that does load balancing of HTTP traffic and SSL wrapping (most everything Squid can do for reverse proxying minus the caching features).
Balance is a generic TCP load balancer with some nice features. The best features are that it is simple, works on more platforms than just Linux, and handles more than just the HTTP protocol. It probably has some disadvantages for some situations because it operates at a lower level than the HTTP proxies above, though it can probably do lots of the same things LVS does (I don't know very much about Balance).
Eddie is a neat framework written in Ericsson's Erlang language. It seems to be dormant, but I think it is in pretty widespread use, so it is probably pretty stable.
-
Implementations for Linux and FreeBSD
The daemons exist, I'm not sure about their legal status however:
-
Re:how to build a high performance/reliable webser
1)/5) For the front end, you might be better off with a weighted load balancer (or LVS on the cheap). Also consider a specialized HTTP multiplexer like NetScaler/Redline (these typically give content encoding, SSL acceleration for free).
3) This is probably a bad idea
-
Re:"Three times the power?"
At a website I used to work at, they decided they needed to use Windows 2000 Advanced Server for web clustering. That is, quite possibly, the worst decision they ever made (aside from going with Windows 2000; trust me on this one.)
Win2k AS Load Balancing (aka WLBS: Windows Load Balancing Service) works by detecting other computers on the network with the same service, and they decide who will handle what request. They both have a primary IP, which is unique, in addition to a "virtual" address, which is the same on all of them. They also have a fake MAC address which is identical on both (makes for interesting ping responses.)
An interesting thing we noticed about WLBS is that, unless a computer is off the network, it will still be in the cluster. I.e. if IIS fails on one machine, as long as you can ping it, it will still get traffic.
When we moved from WLBS to LVS, we noticed a 50% drop in average CPU usage. This is probably due to the fact that now the clustering horsepower was moved off the web servers, but still, a free product versus a rather expensive one. And we've had better uptime now than ever before.
-
Re:But any web server is high-performance
Aside: Is there any open source software that manages session affinity yet?
Yes. Linux Virtual Server is an incredible project. You put your web servers behind it and (in the case of simple NAT balancing) you set the gateway of those computers to be the address of your LVS server. You then tell LVS to direct all IPs of a certain netmask to one server (i.e. if you set for 255.255.255.0, 192.168.1.5 and 192.168.1.133 will connect to the same server).
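The netmask grouping described above can be illustrated with Python's standard `ipaddress` module. This is a rough model of the idea rather than LVS's own code: `persistence_key` and `PersistentPicker` are hypothetical names, and the sketch leaves out the timeout after which a real persistence binding expires.

```python
import ipaddress


def persistence_key(client_ip, netmask="255.255.255.0"):
    """Return the network address a client falls into under `netmask`."""
    network = ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False)
    return str(network.network_address)


class PersistentPicker:
    """Bind each client network to one server, handing out servers in rotation."""

    def __init__(self, servers, netmask="255.255.255.0"):
        self.servers = list(servers)
        self.netmask = netmask
        self.bindings = {}   # network address -> bound server
        self._next = 0       # rotation counter for networks not yet bound

    def pick(self, client_ip):
        key = persistence_key(client_ip, self.netmask)
        if key not in self.bindings:
            self.bindings[key] = self.servers[self._next % len(self.servers)]
            self._next += 1
        return self.bindings[key]
```

With the default mask, 192.168.1.5 and 192.168.1.133 both reduce to the key 192.168.1.0 and therefore land on the same server, which is the behaviour the comment describes.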
The only problem I had with it was that it does not detect downtime. However, I wrote a quick script that used the check_http program from Nagios to pull a site out of the loop when it went down (these were Windows 2000 servers: it happened quite frequently, and our MCSE didn't know why :)
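A script of the kind described might look roughly like the sketch below. It is not the poster's script: it swaps the Nagios plugin for a plain `http.client` probe, the addresses in the usage note are placeholders, and it only builds the `ipvsadm -d` (delete real server) argument lists instead of running them, so it can be tried without root.

```python
import http.client


def http_ok(host, port=80, path="/", timeout=3.0):
    """True if the server answers an HTTP GET with a status below 500."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status < 500
    except (OSError, http.client.HTTPException):
        # Refused, timed out, or garbled reply: treat the server as down.
        return False
    finally:
        conn.close()


def removal_commands(virtual_service, real_servers, is_healthy):
    """Build one `ipvsadm -d` argument list per unhealthy real server.

    `is_healthy` is any callable taking a "host:port" string; in practice it
    could wrap http_ok. Nothing is executed here.
    """
    return [
        ["ipvsadm", "-d", "-t", virtual_service, "-r", server]
        for server in real_servers
        if not is_healthy(server)
    ]
```

For example, `removal_commands("10.0.0.1:80", ["192.168.1.10:80", "192.168.1.11:80"], probe)` yields the delete command for whichever real servers `probe` reports as down; a cron job or loop could pass each list to `subprocess.run` and re-add servers once they recover.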
There are higher performance ways to set up clustering using LVS, but since I was lazy, that's what I did.
-
Re:Coda?
See High Availability for more information.
Coda is the best present option for fs-dependent data storage on mostly open-source platforms. We are using Coda for our MySQL table files, ZODB files and logs.
Coda may still be beta software, but if Open Source software like Coda is considered beta code then Windows 2000 + sp2 must have been alpha code. :-) (And yes, I have Win2000 on my machine and even occasionally am forced to boot into it, so I speak from experience).
-
Re:i have....
But 'sticky' isn't 'zero affinity', is it? So what you really want is what the original poster suggested, an SSL-speaking proxy (e.g. Squid in SSL accelerator mode) that terminates the SSL session and forwards the request inside it to a cluster of non-SSL webservers (using RRDNS perhaps, or LVS if you want a 'smarter' solution). The downside there is your Squid proxy is doing a lot of work, so you probably want to have a backup one and use something like heartbeat to fail over to it if there's a problem with the first one.
-
More LVS info for those interested...
For those interested in using LVS for software routing, it's fairly simple. Basically, you patch a stock Linux kernel and use a tool similar to ipchains to establish virtual services. These services forward requests to your back-end real servers according to a flexible ruleset that you design.
You can use NAT to hide the real servers from the Internet if you like. This allows you to use most any web server you like (such as IIS), but more fancy routing tricks can be done with Unix or Linux servers for even better results. We use NAT at our site (university EE department) and it can handle more load than we will ever receive -- our objective is high-availability. Also, you can use different methods for different server clusters on the same director (e.g. tunneling tricks for Linux apache servers, and less magic for IIS).
And LVS can be set up such that once a user connects to a particular server, his subsequent connections go back to the same server.
-
Re:The problem here is....
One of the projects I'm involved in is the JANET Web Cache Service which is a top level proxy service for the UK academic community.
We use LVS code to load balance our squid boxen at layer 4, and have successfully shifted some 120Mbps through one of our nodes using direct routing on the backends, rather than NATing the system - this configuration barely loaded the frontend (which was only a 500MHz machine) and load balanced some 15-16 backend machines.
ISTR there is some early layer 5-7 code on the LVS site somewhere, but I've not used it, so I don't know how stable it is, or what the performance is like.
-
Re:Linux Virtual Server Project
LVS itself doesn't have the capability of looking into the contents of the packets it is directing. It is a layer 4 load balancer, hence it has no understanding of the HTTP protocol. You will need to look into KTCPVS or DRWS for a layer 7 balancer that can inspect URLs. This web page should give you all the details.
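The layer 4 versus layer 7 distinction above can be made concrete with a toy comparison. Neither function reflects how LVS, KTCPVS or DRWS is implemented; the CRC hash merely stands in for whatever scheduler is in use, and the route table and server names are invented for the example.

```python
import zlib


def layer4_pick(servers, client_ip, client_port):
    """Choose a server from addressing information only; the payload is never read."""
    key = f"{client_ip}:{client_port}".encode()
    return servers[zlib.crc32(key) % len(servers)]


def layer7_pick(routes, default, request_bytes):
    """Choose a server by reading the URL path out of the HTTP request line.

    `routes` is a list of (path_prefix, server) pairs checked in order.
    """
    request_line = request_bytes.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split(" ")
    path = parts[1] if len(parts) >= 2 else "/"
    for prefix, server in routes:
        if path.startswith(prefix):
            return server
    return default
```

The layer 4 function can decide as soon as the first packet arrives, while the layer 7 function has to accept the connection and wait for request data before it can route on the URL, which is why URL-aware balancing needs a different kind of tool.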
-
You want to use lvs
Use Linux Virtual Server. I have 15 LDAP/web servers being load balanced across 3 sites (each site consists of 2 LVS servers in a hot/standby config), with each HA pair of LVS systems load balancing 5 servers. If 2 or more of one site's servers go down, then the site's LVS system will begin to route 20% up to 100% of the traffic to the other 2 sites. You will need to read a ton of docs, but it's pretty easy to set up once you get the hang of it. It's rock solid so far. I am planning on implementing them all over our company network.
-
Re:A great site
High availability should not be confused with handling load. High availability ensures uptime for a server. Load balancing distributes a load across multiple servers, allowing the handling of larger loads. Linux-HA is for the former.
Here are some links to some load balancing projects I'm aware of:
- lbnamed - A load balancer written in Perl
- Super Sparrow - A Linux-based load balancer
- Ultra Monkey - A high-availability and load balancer solution based on Linux (it looks like Super Sparrow may be Ultra Monkey's load balancer)
- LVS - A high-availability and load balancer solution based on Linux
-
Linux Virtual Server Project
We have recently done just this using the Linux Virtual Server Project, and it has turned out very well. Just be prepared to read a lot of documentation.
Basically, you patch a stock Linux kernel and use a tool similar to ipchains to establish virtual services. These services forward requests to your back-end real servers according to a flexible ruleset that you design.
You can use NAT to hide the real servers from the Internet if you like. This allows you to use most any web server you like (such as IIS), but more fancy routing tricks can be done with Unix or Linux servers for even better results. We use NAT at our site (university EE department) and it can handle more load than we will ever receive -- our objective is high-availability. Also, you can use different methods for different server clusters on the same director (e.g. tunneling tricks for Linux apache servers, and less magic for IIS).
And LVS can be set up such that once a user connects to a particular server, his subsequent connections go back to the same server.
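The virtual service, forwarding method and persistence described above map onto a handful of commands. The sketch below assumes the ipchains-like tool is `ipvsadm` and only assembles argument lists using its `-A`/`-a` (add service / add real server), `-s` (scheduler), `-p` (persistence timeout) and `-m`/`-g`/`-i` (NAT / direct routing / tunneling) options; the function name, addresses and the 300-second timeout are illustrative assumptions, and nothing is executed.

```python
# Packet-forwarding method -> ipvsadm flag for a real server entry.
FORWARDING_FLAGS = {
    "nat": "-m",     # masquerading: real servers hidden behind the director
    "dr": "-g",      # gatewaying, i.e. direct routing
    "tunnel": "-i",  # IP-in-IP encapsulation
}


def virtual_service_commands(vip, real_servers, method="nat",
                             scheduler="rr", persistence_seconds=None):
    """Build ipvsadm argument lists for one TCP virtual service.

    `vip` and each real server are "address:port" strings. The first list
    creates the service; the remaining lists attach the real servers.
    """
    service = ["ipvsadm", "-A", "-t", vip, "-s", scheduler]
    if persistence_seconds is not None:
        # Returning clients go back to the same real server within this window.
        service += ["-p", str(persistence_seconds)]
    flag = FORWARDING_FLAGS[method]
    commands = [service]
    for rip in real_servers:
        commands.append(["ipvsadm", "-a", "-t", vip, "-r", rip, flag])
    return commands
```

Calling it once per cluster with a different `method` mirrors the point about mixing methods on one director, for example `"tunnel"` for a group of Linux Apache servers and `"nat"` for IIS machines.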
Also, you can use freely-available third-party tools like Mon to watch your real servers for failure and dequeue them, page you, etc. etc. The bottom line is, since you are using Free tools to do this project, you are limited by your imagination as to what you can do with your cluster.
I have been very happy with the result. And so have many others. If you want to hear big names, LVS is used by linux.com, Sourceforge, zope.org, VA Systems, and RealNetworks, according to their deployment page.
-
Linux Virtual Server Project
We have recently done just this using the Linux Virtual Server Project, and it has turned out very well. Just be prepared to read a lot of documentation.
Basically, you patch a stock Linux kernel and use a tool similar to ipchains to establish virtual services. These services forward requests to your back-end real servers according to a flexible ruleset that you design.
You can use NAT to hide the real servers from the Internet if you like. This allows you to use most any web server you like (such as IIS), but more fancy routing tricks can be done with Unix or Linux servers for even better results. We use NAT at our site (university EE department) and it can handle more load than we will ever receive -- our objective is high-availability. Also, you can use different methods for different server clusters on the same director (e.g. tunneling tricks for Linux apache servers, and less magic for IIS).
And LVS can be set up such that once a user connects to a particular server, his subsequent connections go back to the same server.
Also, you can use freely-available third-party tools like Mon to watch your real servers for failure and dequeue them, page you, etc. etc. The bottom line is, since you are using Free tools to do this project, you are limited by your imagination as to what you can do with your cluster.
I have been very happy with the result. And so have many others. If you want to hear big names, LVS is used by linux.com, Sourceforge, zope.org, VA Systems, and RealNetworks, according to their deployment page.
-
Thanks for telling us!
I run two server farms and have been asked to provide High Availability for them. I was also asked to do public nameserver and virtual hosting for nearly twenty corporate domains, not to mention another hundred-or-so portals. I was asked to provide failover and redundancy, Content Management, Source Code Control, Document Management, Workflows, LDAP, scheduling and reporting.
All on a budget less than the cost of a Sun 4500.
There was only one solution on the market: linux. I used the IPVS heartbeat + mon + fake + coda layout with Apache for virtual hosting and front-end, Weblogic for the java backend, Zope for my CMS / Document Management, daemontools for process monitoring, Checkpoint firewalls (not my choice mind you) and last but not least linux on every single machine in the farm(s). I have multiple NICs with bonded channels between the servers providing me with near-Gb Ethernet speeds between my data servers and hosts.
Linux took our server farm from 100% M$ and literally constant system crashes and reboots to 100% (so far) uptime except for scheduled outages. AT&T is our telco and they only give us 99.96% uptime.
At least here, M$ is dead. We are evaluating linux on the desktop to see if we can use Wine with Lotus Notes and Office. If so then we might start switching desktops for some groups. -
Re:Whoa...
Or instead of a really expensive hardware load balancer you could use Linux Virtual Server, a Linux-based software load balancer. It works really well. I've been using it together with NFS to balance a fairly high-traffic website.
-
Rock solid and highly recommended.
We are running Zope 2.5.0, and it is rock solid. The performance is excellent and the utility is amazing. It allows a totally modular setup and content management is a breeze, which is useful when there's no central administrator for all aspects of the site (graphics, logic and content can all be managed separately, totally securely, all through a web-based interface or via WebDAV or FTP).
The setup starts with an LVS server, connected to an OpenBSD firewall, backended by three ZEO servers running on FreeBSD 4.4, one DB server (PostgreSQL 7.1.2) running on FreeBSD 4.4, and one central webserver running Apache 1.3.22 on Slackware 8.0, with OpenSSL 0.9.6 and Mod_ssl, with web proxying through the ZServer to the Apache box via virtual hosts. (Proxy Pass Reverse in Apache).
This combination of Linux, FreeBSD, OpenBSD, Apache, Postgres, Zope, and various other open source software packages, has been rock solid and a box has only ever gone down for hardware upgrades (RAM, HDD, etc) and software updates (kernel updates, etc).
Overall, I recommend Zope 100%, but be aware that a lot will depend on your total setup, particularly if you have high-demand sites that you want to implement.
-
Re:They don't compensate for downtime?!
For a cheap fail-over look at this:
http://www.linuxvirtualserver.org/~julian/nano.txt
It is far from perfect (read the bad news section) and it can take a while to get it working, but you will have "fail-over". -
Re:Why iptables (Linux 2.4 Firewalling) Sucks
I was eagerly waiting for Linux 2.4 from the day I heard somewhere it would support ipfilter.
Oh, this is the same kind of confusion as Swansea NET-2 versus BSD NET/2. Linux never attempted to support ipfilter. The core framework in the kernel is called netfilter. iptables is built on top of netfilter (as is Linux Virtual Server, etc.).
I don't know much about ipfilter, but I think it is (at least partly) a user-space solution. Because we already had a fast kernel-space solution, I see no point in moving back to user space.
With ipchains (2.2 kernel), I run a router with four 100Mbps Ethernet interfaces (on an old Celeron 266) and over 250 rules, and it works at full bandwidth. This would be impossible with a partly user-space solution.
-Yenya
-- -
Re:Not a beowulf cluster
this is NOT a high-performance (beowulf) cluster.
Umm, sorry? Are you implying that the only high performance clustering solution available is beowulf? Stop smoking the cheap crack, it's messing with your head. Most clusters are not focused on computation, but providing services. Often the bottleneck that's being overcome is not the CPU.
Just because it's not plotting weather patterns doesn't mean that it's not high performance. The performance is just measured in other ways (HTTP req's /sec, FTP throughput, etc. etc.) This is an infinitely more common, and to most, more useful type of cluster. Sometimes I think the /. crowd is a little _too_ fixated on beowulf clusters...
That linux lacks this has been one of Microsoft's marketing points, so this is a really good thing.
That you _believe_ that Linux lacks this is testimony to the success of Microsoft's marketing...
It's here, and it's been around for a while. http://www.linuxvirtualserver.org is where you can read about it. Not only does LinuxDirector exist, but it scales farther than MS's offerings, is GPL'd, and in typical Linux fashion, doesn't require fancy, expensive, matching hardware... -
High Availability Clustering.
I won't touch on PVM or MPI clustering, but as far as High Availability clustering goes, most of the distributions will use some form of lvs. Since it uses nice command line utilities you can write your own scripts, or you could use the GUI they offer as well. Slap that software on any distro (make sure that the kernel's patched right) and you're ready to go.
I've done this myself, and without starting a flame war, I've found that the easiest setup was achieved using RedHat. Their piranha tools make things easier and since the servers came with RedHat, I didn't have to waste too much time, nor did I have to drop a couple thousand dollars for their cluster distro, it all comes in the general distribution. During research for this project I read quite a bit about the TurboLinux distribution. The internals aren't much more than lvs, but the price tag scares you away (not that you couldn't do it with a stock TL and LVS, but to use their special distro it costs
... just like RedHat's. You're not really paying for the software, but rather the tech support). Whatever you decide, keep in mind a few things ..
1. Any distro can do it.
2. When you get the cluster up, do what you can to keep the distro/OS in the cluster the same. You'll save yourself a good many headaches in administration and make using the weighted algorithms a reality (ex: NT won't respond to the uptime or ruptime polling requests, so you're stuck with the static weight that you assigned; read the HOWTO for more).
3. If you are using lvs, use direct routing. It's fast. -
Clustering software or management software?
If you are looking for software to create a cluster, there are several options, depending upon what type of cluster you are trying to create. If you are creating a service-based cluster, check out TurboLinux Cluster Server, Linux Virtual Server, PolyServe Understudy, and Legato. There are many others available, including hardware solutions from Cisco, F5, and Alteon. I'm not too familiar with Beowulf-type clusters.
If you are looking for software to manage groups of systems, that's a whole different story. You might look into Enlighten DSM, Tivoli, or OpenNMS. I'm sure there's a lot of competition in that field as well, but I don't have any experience with those products. -
Re:Obligatory Beowulf
Hmmmm. No. I don't think clusters are suited to the type of application in question. Quake wouldn't really benefit from massively parallel processing. From more processing power, yes, as long as it's on a local bus and not distributed over a network into separate nodes. The main factors in Quake speed would be RAM and the quality of the graphics adapter in the computer running the game.
I think Beowulf is an excellent technology for many applications though, especially back-end services which need the extra oomph and can be distributed cleanly, as well as, of course, some academic processes, such as analysis, etc.
A good example of how Linux can be made into a model of distributed service handling is the impressive HA Linux (High Availability Linux). The team is working closely with the Linux Virtual Server project, and the technology looks impressive. In a few years it could even compete with Sun's high end technology.
-
Sun Certified Programmer for the Java Platform
- Sun Certified System Administrator for Solaris
-
Fail-over
In addition to linux-ha, which includes links to Linux Virtual Server, Piranha, Ultramonkey, you can also find organizations that do this for a living. One (the company I work for, to be honest) is Mission Critical Linux. Specify what your needs are, exactly (web service, database failover, file system, etc), then look around.
By the way, is your consultant a reseller of Solaris (since I see he suggested that)?
jeff -
LVS and Linux-HA
We have just been involved in creating a high availability clustered solution built entirely on Linux for a client.
For this we used the Linux Virtual Server Project and also The Linux High Availability project.
This provides a great, resilient service. The project is live and running like a dream !!!!
Don't believe what you hear from these overpriced consultants.
-
Linux does support this...
You could go with an expensive commercial solution like BigIP from F5, but those will run you at least $30k or so. You could also use Polyserve Understudy, which does pretty much the same thing only under Linux, and it's only about $400 or so. If you have all this expensive Cisco equipment and a Cat6000, you can run Local Director on that without buying additional hardware.
However, I suggest:
http://www.linuxvirtualserver.org or
http://linux-ha.org or
http://www.eddieware.org
It all depends on the application you're running. If it's just http, any of these will work, but if it's something else, you're stuck with linux-HA or Linux Virtual Server. Eddie will only do http as far as I know. Plus Eddie uses Erlang, which may affect performance.