Best Practices For Infrastructure Upgrade?

Why? by John Hasler · 2009-11-21 11:04 · Score: 2, Informative

Why virtual servers? If you are going to run multiple services on one machine (and that's fine if it can handle the load) just do it.

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.

Re:Why? by MeatBag PussRocket · 2009-11-21 11:14 · Score: 4, Funny

redundancy.

--
i wage a holy war against the apostrophe.
Re:Why? by John Hasler · 2009-11-21 11:23 · Score: 2, Insightful

> redundancy.
+5 Funny.

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
Re:Why? by lukas84 · 2009-11-21 11:24 · Score: 2, Informative

Virtualization does not automatically imply redundancy, and VM-level high availability will not protect you against application failures.
Re:Why? by nabsltd · 2009-11-21 13:19 · Score: 2, Informative

> Just p2v his entire data center first,
This brings to mind one other big advantage of VMs that helps with uptime issues: fast reboots.
Some of those old systems might have to be administered following "Microsoft best practices" (reboot once a week just to be safe), and older hardware might have issues with that, plus it's just slower. Add in the fact that VMs don't have to do many of the things that physical hardware has to do (memory check, initialize the RAID, etc.), and you can reboot back to "everything running" in less than 30 seconds.
Although you never want to reboot if you can avoid it, this one factor gives you some serious advantages. If you have to apply a patch that requires a reboot, you can do so just by making sure the server isn't being used right now, and it's likely that people won't even notice. Of course, you don't do this until after you have done the same thing on the test server, and know that the patch won't cause issues.

> then work on 'upgrades' from there.
And the test environment is a big thing that VMs can provide to help those upgrades. Just p2v the system, then clone it to create the test version. Use snapshots and torture the test system as much as you want.
Re:Why? by mysidia · 2009-11-21 16:27 · Score: 2, Insightful

It creates a configuration nightmare. Apps with conflicting configurations.
Changes required for one app may break other apps.
Also, many OSes don't scale well.
In a majority of cases, when your apps are not actually CPU-bound or I/O-bound, you get greater total aggregate performance out of the hardware by divvying it up into multiple servers.
Linux is like this. For example, when running Apache, after a certain number of requests the OS uses the hardware inefficiently and can't answer nearly as many requests as it should be able to. By dividing it into 4 virtual servers instead, for your 4 CPUs, you can multiply the number of requests that can be handled by 10 or 20 fold.
You may even think you're CPU-bound on Linux when you are not: the load average may be high because of the number of Apache processes contending with each other, which can create a false impression of high CPU or I/O usage when in fact you have a bottleneck in the app's or kernel's parallel processing capabilities.
Exchange is also like this: better to scale out to 2 virtual machines, each with 32 GB of RAM and 4x3 GHz CPUs dedicated to it, than one server with 64 GB of RAM and 8x3 GHz CPUs. The latter is a beefy server but doesn't gain much from the extra resources. The two servers virtualized on one box may have much better performance than 1 physical server, if you are using Intel Nehalem CPUs and properly configure your VMs (i.e. you actually do it right, and perform all recommended practices including LUN/guest partition block alignment, and don't just use default settings).
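A rough way to tell the two cases apart is to compare the load average per CPU with how busy the CPUs actually are. The sketch below is only an illustration: the function name, the labels, and the 0.8 threshold are assumptions, and the inputs would come from something like os.getloadavg() plus whatever CPU utilisation figure your monitoring reports.

```python
def diagnose(load_avg, cpu_count, cpu_busy_fraction, busy_threshold=0.8):
    """Classify a host from its load average and measured CPU utilisation.

    load_avg          -- 1-minute load average (runnable + waiting processes)
    cpu_count         -- number of CPUs/cores in the box
    cpu_busy_fraction -- measured CPU utilisation, 0.0 to 1.0
    busy_threshold    -- assumed cut-off for calling the CPUs "busy"
    """
    load_per_cpu = load_avg / cpu_count
    if load_per_cpu <= 1.0:
        # Fewer runnable processes than CPUs: nothing is queuing.
        return "not overloaded"
    if cpu_busy_fraction >= busy_threshold:
        # Processes are queuing and the CPUs really are saturated.
        return "cpu-bound"
    # Processes are queuing, yet the CPUs are partly idle: they are
    # waiting on each other (locks, I/O, scheduler), not on CPU time.
    return "contention"
```

A 4-CPU box with a load average of 16 and CPUs only 35% busy would come back as "contention", which is the situation the parent comment describes.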
Re:Why? by mysidia · 2009-11-21 16:56 · Score: 2, Informative

That's where Windows 2008 MSCS, HAProxy, or Red Hat Cluster Suite comes in.
For example, if you want a highly available web service, you would have two VMware hosts, each running a web server VM.
Then you would have a diskless load balancer running HAProxy to feed incoming web requests to a working web server.
For database services, you'd have a MySQL or MSSQL VM on each host, and a SAN or shared block storage with a GFS-formatted LUN, and a quorum disk (Linux) or witness file share on a third physical host (for Windows 2008 MSCS), with clustering services configured so the SQL process is active on only one host at a time, and only when quorum is met; if failure of another node is detected, a remaining node that can meet quorum will fence (KILL) the other VM and then take over.
So in this manner, you can meet HA in a virtualized environment.
There are some considerations, though, like guest system clock accuracy, reliability of network connections (so that a failure isn't erroneously detected during times of high load), and supported configurations for OS vendors' clustering capabilities.
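The quorum-and-fence rule is easier to see written down. This is a minimal sketch assuming two database nodes plus one quorum disk or witness share, so three votes in total. The function names and return strings are made up for illustration and are not how MSCS or Red Hat Cluster Suite expose any of this.

```python
def has_quorum(votes_seen, total_votes):
    """A node has quorum when it can see a strict majority of the votes."""
    return votes_seen > total_votes // 2


def next_action(node_is_active, peer_alive, votes_seen, total_votes=3):
    """Decide what one cluster node should do on each heartbeat.

    votes_seen counts the node's own vote plus every other vote
    (peer node, quorum disk or witness share) it can currently reach.
    """
    if not has_quorum(votes_seen, total_votes):
        # Isolated node: it must not run the SQL service, or both
        # sides could write to the shared LUN at once.
        return "stand down"
    if node_is_active:
        return "keep running"
    if peer_alive:
        return "stay passive"
    # Peer looks dead and we hold quorum: kill it for certain first,
    # then start the service here.
    return "fence peer, then take over"
```

The ordering in the last branch is the important part: the survivor fences before it takes over, so a peer that was merely slow cannot come back and write to the same storage.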

Re:Cloud Computing(TM) by lukas84 · 2009-11-21 11:04 · Score: 5, Insightful

No, the budget question comes later.

The first questions are: What are your business's requirements regarding your IT infrastructure? How long can you do business without it? How fast does something need to be restored?

Starting with those requirements, you can work out possible designs that fit them. For example, if the requirement is that a machine must be operational again within a week of a crash, you can build computers from random spare parts and hope that they'll work. If the requirement is that it should be up and running in two days, you will need to buy servers from a Tier 1 vendor like HP or IBM with appropriate service contracts. If the requirement is that everything must be up and running again in 4 hours, you'll need backups, clusters, site resilience, replicated SAN, etc.
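Put as a lookup, the three examples above look something like this. It is only a sketch: treating each example as the upper edge of a band (4 hours, 48 hours, anything longer) is an assumption, and the function name is made up.

```python
def design_for(max_hours_down):
    """Map the longest tolerable outage (in hours) to a class of design.

    The bands come from the three examples in the comment; where
    exactly one band ends and the next begins is an assumption.
    """
    if max_hours_down <= 4:
        return "backups, clusters, site resilience, replicated SAN"
    if max_hours_down <= 48:
        return "Tier 1 vendor servers with service contracts"
    return "machines built from spare parts"
```

The input is a business requirement, not a price, which is why the budget only enters once the band is known.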

The question of Budget comes into play much later.

I'd say by pele · 2009-11-21 11:04 · Score: 5, Informative

don't touch anything if it's been up and running for the past 7 years. if you really must replicate then get some more cheap boxes and replicate. it's cheaper and faster than virtual anything. if you must. but 150 users doesn't warrant anything in my opinion. I'd rather invest in backup links (from different companies) between offices. you can bond them for extra throughput.

Re:I'd say by The -e**(i*pi) · 2009-11-21 11:14 · Score: 2, Insightful

I doubt that with only 150 people they would want to spend the money to have a server at every office in case that office's link went down. I agree wholeheartedly that the level of redundancy talked about is overkill. Also, will WWW, mail, DNS, ... even work if the line is cut, regardless of whether the server is in the building?

Think about the complexity of duplication by El Cubano · 2009-11-21 11:07 · Score: 4, Insightful

> there's hardly any fallback if any of the services dies or an office is disconnected. Now, as the hardware must be replaced, I'd like to buff things up a bit: distributed instances of services (at least one instance per office) and a fallback/load-balancing scheme (either to an instance in another office or a duplicated one within the same).

Is that really necessary? I know that we all would like to have bullet-proof services. However, is the network service to the various offices so unreliable that it justifies the added complexity of instantiating services at every location? Or even introducing redundancy at each location? If you were talking about thousands or tens of thousands of users at each location, it might make sense just because you would have to distribute the load in some way.

What you need to do is evaluate your connectivity and its reliability. For example:

How reliable is the current connectivity?
If it is not reliable enough, how much would it cost over the long run to upgrade to a sufficiently reliable service?
If the connection goes down, how does it affect that office? (I.e., if the Internet is completely inaccessible, will having all those duplicated services at the remote office enable them to continue working as though nothing were wrong? If the service being out causes such a disruption that having duplicate services at the remote office doesn't help, then why bother?)
How much will it cost over the long run to add all that extra hardware, along with the burden of maintaining it and all the services running on it?

Once you answer at least those questions, then you have the information you need in order to make a sensible decision.
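The two cost questions can be compared with very little arithmetic. The sketch below assumes each option boils down to an upfront cost plus a monthly cost; the option names and every figure in the usage note are invented for illustration.

```python
def long_run_cost(upfront, monthly, months):
    """Total cost of one option over the planning horizon."""
    return upfront + monthly * months


def cheaper_option(options, months):
    """Return the name of the cheapest option over the given horizon.

    options maps a name to an (upfront, monthly) pair.
    """
    return min(options, key=lambda name: long_run_cost(*options[name], months))
```

With a made-up link upgrade at 400 a month and duplicated servers at 12000 upfront plus 150 a month in upkeep, the link is cheaper over 36 months and the servers over 60, so the answer depends on the horizon you pick. Neither figure includes the cost of the outage itself, which is what the third question is for.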

Re:Think about the complexity of duplication by psych0munky · 2009-11-22 05:54 · Score: 2, Interesting

Maybe this is asked elsewhere in these threads, but the one thing that seems not to be asked here is not just "What are the business requirements?", but also "What are your business application requirements?". While it may seem implied in the former question, IME it is usually not addressed enough by simply asking the former. In asking the former, it seems that you get nice "businessy" answers like "we need Y application to be back up and running in X time". What it doesn't answer is: what are the requirements for Y application? Does it need to have internet connectivity, connectivity to a central database, or is it completely stand-alone? In the second case, unless you have a sufficiently advanced application (most aren't), simply putting an instance of Y application locally in case your link goes down may not cut it if it does not have a suitable "caching" mechanism to store data until the link comes back and then forward it on to the central DB.
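For what such a "caching" mechanism means in practice, here is a minimal store-and-forward sketch. It assumes the application hands over a send callable that raises ConnectionError while the link to the central DB is down; the class and method names are hypothetical, and a real version would keep the backlog on disk rather than in memory.

```python
from collections import deque


class StoreAndForward:
    """Queue records locally and forward them, in order, when the link is up."""

    def __init__(self, send):
        self._send = send          # callable that delivers one record
        self._pending = deque()    # records not yet delivered

    def submit(self, record):
        """Accept a record and try to deliver everything queued so far."""
        self._pending.append(record)
        return self.flush()

    def flush(self):
        """Deliver queued records oldest first; stop at the first failure."""
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break              # link is down, keep the backlog
            self._pending.popleft()
            delivered += 1
        return delivered

    @property
    def backlog(self):
        return len(self._pending)
```

An application without something like this simply loses or rejects work while the link is down, no matter how many local servers you give it.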
I have seen many hardware upgrades "fail" even though the upgrade was technically successful. This was usually caused by the project team asking the right business questions, but forgetting to drill down and ask the right questions of the application providers (vendors or internal development staff).
I was actually involved in an Active Directory "upgrade" project where the project team wanted not simply to upgrade AD to the latest version, but also to refactor the directory structure (due to some really poor choices in the initial implementation, which were causing daily grief for the maintainers of the information), without considering the impact on the applications we had built in-house that were using AD for authn and authz (most would likely have been able to handle the changes, since they were fairly configurable in this regard). I raised this concern many times and almost every time it was ignored, or it was "yeah, we will consider that", and then it got dropped on the ground. Fortunately, just prior to implementation, the project got "put on the back-burner" and the project manager (a contractor) was let go due to "budget cuts". Hopefully when/if this gets traction again, we will actually look at what else besides the network and people's workstation logins will be affected.
I still struggle to understand what causes this rift between infrastructure people and development people (I have been on both sides, but mostly on the development side), as a poor application choice can severely restrict what can be done with a company's hardware, and inversely, a poor infrastructure choice can unexpectedly break an application.
However, if you are only a company of 150ish employees, hopefully you are still small enough to deal with issues quickly and efficiently (it seems to get worse as corporations get bigger).

Get someone experienced on the boat! by lukas84 · 2009-11-21 11:09 · Score: 5, Insightful

You know, you could've started with a bit more detail: what operating system are you running on the servers? What OS are the clients running? What level of service are you trying to achieve? How many people work in your shop? What's their level of expertise?

If you're asking this on Slashdot now, it means you don't have enough experience with this yet, so my first advice would be to get someone involved who does: ideally a firm with many people who have lots of experience and knowledge of the platform you work on. This means you'll have backup in case something goes south, and your network design will benefit from their experience.

As for other advice, make sure you get the requirements from the higher-ups in writing. Sometimes they have ridiculous ideas regarding the availability they want and how much they're willing to pay for it.

Re:Get someone experienced on the boat! by TakeyMcTaker · 2009-11-21 17:05 · Score: 2, Insightful

The main piece of missing information that annoys me is that part of the network service list that says "-- and some more." Half the services that were listed could be easily outsourced to any decent ISP, with cost depending on security, storage, and SLA requirements. ISP hosting or even colocation services give you cheap access to better redundant Internet links than your office will ever touch.
The other half could be done with a cheap firewall/VPN box at each site. In the age of OpenWRT, these boxes often have services like Multi-WAN, DNS, DHCP, SSL, VPN, and IDS built-in. Buy two of those, sync configuration, hook them up to a networked power switch, and script the power to shut off one and power up the other whenever a network service test fails. All that equipment is still less than the cost of a single 1U+ server with equivalent services, and any custom scripting would be for minor convenience functions -- not a service requirement. I find specialized hardware/firmware solutions are far more reliable than software/server solutions. They are also often cheap enough to keep an offline spare handy for emergency replacement.
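The "test the service, flip the power" script could look roughly like this. It is a sketch under assumptions: set_power stands in for whatever interface your networked power switch offers (hypothetical here), and the check is a plain TCP connect rather than a real protocol-level test.

```python
import socket


def service_is_up(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def failover_step(active, standby, check, set_power):
    """Run one monitoring pass and return the box that should be active.

    check(box)         -- returns True while the box answers its service test
    set_power(box, on) -- hypothetical hook into the networked power switch
    """
    if check(active):
        return active
    # Power the failed box off first so both never answer at once.
    set_power(active, False)
    set_power(standby, True)
    return standby
```

Run it from cron or a loop with something like check=lambda box: service_is_up(box, 53), and feed the returned box back in as the new active one.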
Even a low-power retail NAS box could be used for complete network authentication, SSH, and SSL data services. It could probably serve an office up to 250 users, depending on simultaneous load -- 50 easy. Slap some cheap (less than $0.10/GB!) TB+ SATA drives in there, and you have multi-TB RAID storage per site, that can be rsync replicated to all nodes. Give each site their own cheap master storage node, replicated to each other. The rsync script(s) could be scheduled or event triggered, as needed. Netgear ReadyNAS boxes can also run Subversion/WebDAV/Autocommit/svnsync.
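The replication script itself can be very small. This sketch only builds the rsync command lines for pushing one site's master directory to each peer; the host names and paths in the usage note are invented, and it leaves out --delete on purpose so a mistake at one site cannot erase the copies elsewhere.

```python
def rsync_commands(local_dir, peers, remote_dir):
    """Build one rsync command per peer to replicate local_dir to it.

    The trailing slash on the source makes rsync copy the directory's
    contents rather than the directory itself.
    """
    src = local_dir.rstrip("/") + "/"
    return [["rsync", "-az", src, f"{peer}:{remote_dir}"] for peer in peers]
```

Each list can be handed to subprocess.run(cmd, check=True) from a scheduled job, e.g. rsync_commands("/srv/site-a", ["nas-b", "nas-c"], "/srv/replica/site-a").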
I'm betting the meat of these services are in that nebulous "and some more" area, and that those service requirements change everything.
Some brand names that carry one or more of the products mentioned above, and can be found in any Fry's or decent online store, without even having to deal with a sales rep:
Netgear
Linksys (now sometimes Cisco rebranded)
Dlink
Cradlepoint (3G/4G wireless backup!)
Apple (AirPort are surprisingly good routers!)
Qnap
Thecus
Sans Digital
Digital Loggers, Inc.
APC
I wouldn't ever recommend Buffalo, and 3Com might be on the list if HP had not bought them recently.

Re:And the Key Factor is.... by lukas84 · 2009-11-21 11:10 · Score: 2, Informative

Again, wrong approach. Ask the higher-ups what kind of availability they want. The cost is derived from their wishes.

Take your time by BooRadley · 2009-11-21 11:13 · Score: 4, Insightful

If you're like most IT managers, you probably have a budget. Which is probably wholly inadequate for immediately and elegantly solving your problems.

Look at your company's business, and how the different offices interact with each other, and with your customers. By just upgrading existing infrastructure, you may be putting some of the money and time where it's not needed, instead of just shutting down a service or migrating it to something more modern or easier to manage. Free is not always better, unless your time has no value.

Pick a few projects to help you get a handle on the things that need more planning, and try and put out any fires as quickly as possible, without committing to a long-term technology plan for remediation.

Your objective is to make the transition as boring as possible for the end users, except for the parts where things just start to work better.

--

-- lk t lv ll th vwls t f wrds. T svs lts f tm t wrt bt ts pn n th ss t rd nd mks m lk lk cmplt dpsht.

Re:Cloud Computing(TM) by Anonymous Coward · 2009-11-21 11:15 · Score: 2, Funny

I disagree. When you have a budget of 800$ and some shoestrings, it eliminates a lot of questions ;)

Affordable SME Solution by foupfeiffer · 2009-11-21 11:15 · Score: 2, Interesting

I am still in the process of upgrading a "legacy" infrastructure in a smaller (less than 50) office but I feel your pain.

First, it's not "tech sexy", but you've got to get the current infrastructure all written down (or typed up - but then you have to burn to cd just in case your "upgrade" breaks everything).

You should also "interview" users (preferably by email but sometimes if you need an answer you have to just call them or... face to face even...) to find out what services they use - you might be surprised to find something that you didn't even know your Dept was responsible for (oh, that Panasonic PBX that runs the whole phone system is in the locked closet they forgot to tell you about...)

Your next step is prioritizing what you actually need/want to do... remember that you're in a business environment so having redundant power supplies for the dedicated cd burning computer may not actually improve your workplace (but yes, it might be cool to have an automated coffee maker that can run on solar power...)

So now that you know pretty much what you have and what you want to change...

Technology-wise, virtualization is definitely your answer... and there's a learning curve:
VMware is pretty nice and pretty expensive.
VirtualBox (which I use) is free but doesn't have as many enterprise features (e.g. automatic failover).
Xen with Remus or HA is the thinking man's setup.

All of the above will depend on reliable hardware - that means at least RAID 1, and yes you can go with SAN but be aware that it's a level of complexity you might not need (for FTP, DNS, etc.)

Reading what you've listed as "services" it almost sounds like you want a single Linux VM running all of those things with Xen and Remus...

Good luck, and TEST IT before you deploy it as a production setup.

openVZ by RiotingPacifist · 2009-11-21 11:16 · Score: 3, Funny

For services running on Linux, OpenVZ can be used as a jail with migration capabilities instead of a full-on VM.

DISCLAIMER: I don't have a job so I've read about this but not used it in a pro environment yet

--
IranAir Flight 655 never forget!

Don't do it by Anonymous Coward · 2009-11-21 11:18 · Score: 5, Insightful

Complexity is bad. I work in a department of similar size. Long long ago, things were simple. But then due to plans like yours, we ended up with quadruple replicated dns servers with automatic failover and load balancing, a mail system requiring 12 separate machines (double redundant machines at each of 4 stages: front end, queuing, mail delivery, and mail storage), a web system built from 6 interacting machines (caches, front end, back end, script server, etc.) plus redundancy for load balancing, plus automatic failover. You can guess what this is like: it sucks. The thing was a nightmare to maintain, very expensive, slow (mail traveling over 8 queues to get delivered), and impossible to debug when things go wrong.

It has taken more than a year, but we are slowly converging to a simple solution. 150 people do not need multiply redundant load balanced dns servers. One will do just fine, with a backup in case it fails. 150 people do not need 12+ machines to deliver mail. A small organization doesn't need a cluster to serve web pages.

My advice: go for simplicity. Measure your requirements ahead of time, so you know if you really need load balanced dns servers, etc. In all likelihood, you will find that you don't need nearly the capacity you think you do, and can make do with a much simpler, cheaper, easier to maintain, more robust, and faster setup. If you can call that making do, that is.

Trying to make your mark, eh? by GuyFawkes · 2009-11-21 11:25 · Score: 3, Insightful

The system you have works solidly, and has worked solidly for seven years.

I, personally, am TOTALLY in agreement with the ethos of whoever designed it, a single box for each service.

Frankly, with the cost of modern hardware, you could triple the capacity of what you have now just by gradually swapping out for newer hardware over the next few months, and keeping the shite old boxen for fallback.

Virtualisation is, IMHO, *totally* inappropriate for 99% of cases where it is used, ditto *cloud* computing.

It sounds to me like you are more interested in making your own mark, than actually taking an objective view. I may of course be wrong, but usually that is the case in stories like this.

In my experience, everyone who tries to make their own mark actually degrades a system, and simply discounts the ways that they have degraded it as being "obsolete" or "no longer applicable"

Frankly, based on your post alone, I'd sack you on the spot, because you sound like the biggest threat to the system to come along in seven years.

These are NOT your computers, if you want a system just so, build it yourself with your own money in your own home.

This advice / opinion is of course worth exactly what it cost.

Apologies in advance if I have misconstrued your approach. (but I doubt that I have)

YMMV.

--
http://slashdot.org/~GuyFawkes/journal

Re:Trying to make your mark, eh? by bertok · 2009-11-21 11:57 · Score: 4, Interesting

> I, personally, am TOTALLY in agreement with the ethos of whoever designed it, a single box for each service.
> ...
> Virtualisation is, IMHO, *totally* inappropriate for 99% of cases where it is used, ditto *cloud* computing.
I totally disagree.
Look at some of the services he listed: DNS and DHCP.
You literally can't buy a server these days with less than 2 cores, and getting less than 4 is a challenge. That kind of computing power is overkill for such basic services, so it makes perfect sense to partition a single high-powered box to better utilize it. There is no need to give up redundancy either: you can buy two boxes, and have every key service duplicated between them. Buying two boxes per service, on the other hand, is insane, especially for services like DHCP, which in an environment like that might have to respond to a packet once an hour.
Even the other listed services probably cause negligible load. Most web servers sit there at 0.1% load most of the time, ditto with ftp, which tends to see only sporadic use.
I think you'll find that the exact opposite of your quote is true: for 99% of corporate environments where virtualization is used, it is appropriate. In fact, it's under-used. Most places could save a lot of money by virtualizing more.
I'm guessing you work for an organization where money grows on trees, and you can 'design' whatever the hell you want, and you get the budget for it, no matter how wasteful, right?
Re:Trying to make your mark, eh? by GuyFawkes · 2009-11-21 12:00 · Score: 3, Interesting

Get real, for 150 users a WRT54 will do DNS etc....
Want a bit more poke, VIA EPIA + small flash disk.
"buy a server".. jeez, you work for IBM sales dept?

--
http://slashdot.org/~GuyFawkes/journal
Re:Trying to make your mark, eh? by pe1rxq · 2009-11-21 12:06 · Score: 2, Insightful

Is it so hard to not mix up dhcpd.conf and named.conf? Do you need virtualization for that?
Let me give you a hint: YOU DON'T

--
Secure messaging: http://quickmsg.vreeken.net/
Re:Trying to make your mark, eh? by dbIII · 2009-11-21 13:30 · Score: 2, Funny

There's two ways of looking at these things.
To me a room full of dedicated machines each running a single simple thing, due to the 1990s approach of replacing a server with a dozen shit windows boxes that can't handle much but are cheap, screams "a dozen vulnerable points of critical failure".
Even MS Windows has progressed to the point where you don't need a single machine per service anymore in a light duty situation. Machines are going to fail, you may be lucky and it could be after they have served their time and been sold off, but fans, power supplies or a pile of other components that will stop the machine delivering the service will fail someday. A couple of half decent machines with redundant power supplies, which will give you the option to have all of your services back within a decent timeframe if one goes down, is a far better option than a pile of critical points of failure depending on the reliability of $5 fans.
Such things are cheaper now than a roomful of crap boxes.
Now if I was the story submitter I'd put together a plan to have a box or two that can take over any of those required services at short notice. Someday something will break, and it's better to have a box ready or a plan you can read at 2am instead of bumbling through. Of course, GuyFawkes would fire me for that while if he was doing it his way I'd simply try to talk him out of his NT3.51 philosophy. Where is he going to buy a WRT54 at 2am on a Sunday morning in 2015 anyway?
Re:Trying to make your mark, eh? by bertok · 2009-11-21 13:56 · Score: 2, Interesting

> Years ago the Microsoft DNS implementation had a very nasty memory leak and used a lot of cpu - you really did need a dedicated DNS machine for small sites and to reboot it once a week.
> I think that's why people are still thinking about putting it in a virtual box so it can't eat all the resources, even for a pile of trivial services that a sparcstation 5 could handle at low load.
In practice, everyone just builds two domain controllers, where each one runs Active Directory, DNS, DHCP, WINS, and maybe a few other related minor services like a certificate authority, PXE boot, and the DFS root.
I haven't seen any significant interoperability problems with that setup anywhere for many years.
Still, virtualization has its place, because services like AD have special disaster recovery requirements. It's a huge mistake to put AD on the same OS instance as a file server or a database, because they need to be recovered completely differently. The last thing you want to be doing during a restore is juggling conflicting restore methods and requirements!
Re:Trying to make your mark, eh? by bertok · 2009-11-21 14:05 · Score: 2, Insightful

> Get real, for 150 users at WRT54 will do DNS etc....
> Want a bit more poke, VIA EPIA + small flash disk.
> "buy a server".. jeez, you work for IBM sales dept?
I'm responding to your comment:

> I, personally, am TOTALLY in agreement with the ethos of whoever designed it, a single box for each service.
I recommended at least two boxes, for redundancy. He may need more, depending on load.
For a 150 user organization, that's nothing; most such organisations are running off a dozen servers or more, which is what the original poster in fact said. With virtualization, he'd be reducing his costs.
One per service is insane, which is what you said. If you wanted dedicated boxes for each service AND some redundancy, that's TWO per service!
Backpedaling and pretending that a WRT54 can somehow host all of the services required by a 150 user organization is doubly insane.
Re:Trying to make your mark, eh? by BitZtream · 2009-11-21 20:35 · Score: 2, Interesting

No, you need separate servers for when the DHCP upgrade requires a library that conflicts with the DNS server's, and you don't want to upgrade both at the same time.
THIS is where virtualization becomes useful.
On the other hand, my solution is a couple of FreeBSD boxes with jails for each service. You could do the same with whatever the Linux equivalent is, or Solaris zones if you want. No need to actually run VMs.
Just run a couple of boxes and separate the services into different jails. When you need to upgrade the core OS, do it on your backup box first, get all the services upgraded, switch it to your primary and repeat on the other.
It's not a matter of config files, it's a matter of dependencies. If you've never run into a dependency conflict, you don't have much experience. Upgrading every service at the same time isn't always an option; sometimes newer versions in repositories are broken with regard to something you use or need.
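The upgrade order described above, sketched as code. The three callables are hypothetical hooks for whatever tooling does the real work, and the health check before switching is an added assumption, not something the comment spells out.

```python
def rolling_upgrade(primary, backup, upgrade, healthy, promote):
    """Upgrade two boxes one at a time and return the new primary.

    upgrade(box) -- upgrade the core OS and the service jails on box
    healthy(box) -- return True if every service on box passes its checks
    promote(box) -- switch traffic so that box becomes the primary
    """
    upgrade(backup)
    if not healthy(backup):
        # Leave the untouched primary in service and stop here.
        return primary
    promote(backup)
    # The old primary is now the spare, so it can be upgraded safely.
    upgrade(primary)
    return backup
```

At every step one box is still running the known-good versions, so a broken package in the repository costs you the spare, not the service.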

--
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager

What 150 users? by painehope · 2009-11-21 11:26 · Score: 5, Insightful

I'd say that everyone has mentioned the big-picture points already, except for one: what kind of users?

150 file clerks or accountants and you'll spend more time worrying about the printer that the CIO's secretary just had to have, which conveniently doesn't have reliable drivers or documentation, even if it had that neat feature that she wanted and now can't use.

150 programmers can put a mild to heavy load on your infrastructure, depending on what kind of software they're developing and testing (more a function of what kind of environment are they coding for and how much gear they need to test it).

150 programmers and processors of data (financial, medical, geophysical, whatever) can put an extreme load on your infrastructure. Like to the point where it's easier to ship tape media internationally than fuck around with a stable interoffice file transfer solution (I've seen it as a common practice - "hey, you're going to the XYZ office, we're sending a crate of tapes along with you so you can load it onto their fileservers").

Define your environment, then you know your requirements, find the solutions that meet those requirements, then try to get a PO for it. Have fun.

--
PC moderators can suck my White pierced, tattooed dick. If you think pride == hate, s/dick/Aryan meat mallet/g.

P2V and consolidate by snsh · 2009-11-21 11:26 · Score: 4, Interesting

The low-budget solution: buy one server (like a PowerEdge 2970) with something like 16GB RAM, a combination of 15k and 7.2k RAID1 arrays, and 4hr support. Install a free hypervisor like VMware Server or Xen, and P2V your oldest hardware onto it. Later on you can spend $$$$$ on clustering, HA, SANs, and clouds. But P2V of your old hardware onto new hardware is a cost-effective way to start.

Re:P2V and consolidate by masdog · 2009-11-22 04:39 · Score: 2, Informative

VMware Converter is free, and it works with ESXi.

Check it out here.

--
My Sysadmin Blog

Re:Cloud Computing(TM) by lukas84 · 2009-11-21 11:27 · Score: 2, Insightful

Yes, but for example management wanting 24/7 2 hour up&running SLA and having hired a single guy with a budget of 800$ will not work - this is important to get sorted out early. Management needs to know what they want and what they'll get.

Simple and straightforward = complex by sphealey · 2009-11-21 11:41 · Score: 4, Insightful

So let's see if I understand: you want to take a simple, straightforward, easy-to-understand architecture with no single points of failure that would be very easy to recover in the event of a problem and extremely easy to recreate at a different site in a few hours in the event of a disaster, and replace it with a vastly more complex system that uses tons of shiny new buzzwords. All to serve 150 end users for whom you have quantified no complaints related to the architecture other than it might need to be sped up a bit (or perhaps find a GUI interface for the ftp server, etc).

This should turn out well.

sPh

As for the "distributed redundant system", I strongly suggest you read Moans Nogood's essay "You Don't Need High Availability" and think very deeply about it before proceeding.

Re:Google(tm) Cloud by jabithew · 2009-11-21 11:43 · Score: 2, Insightful

It is if you recommended outsourcing everything to the cloud.

--
All intents and purposes. Not intensive purposes.

Re:Cloud Computing(TM) by mabhatter654 · 2009-11-21 11:59 · Score: 3, Insightful

Except of course that management ALREADY HAS that because they've been very lucky for 7 years. Why spend money for what works (never mind we can't upgrade or replace any of it because it's so old)

I think what the article is really asking is what's a good model to start all this stuff. You're looking at one or two servers per location (or maybe even network appliances at remote sites). We read all this stuff on Slashdot and in the deluges of magazines and marketing material... where do we start to make it GO?

Maybe this is really a uni project by natd · 2009-11-21 12:15 · Score: 3, Interesting

What I see going on here, as others have touched on, is someone who doesn't realise that he's dealing with a small environment, even by my (Australian) standards where I'm frequently in awe of the kinds of scale that the US and Europe consider commonplace.

If the current system has been acceptable for 7 years, I'm guessing the users' needs aren't something so mindbogglingly critical that risk must be removed at any cost. Equally, if that was the case, the business would be either bringing in an experienced team or writing a blank cheque to an external party, not giving it to the guy who changes passwords and has spent the last week putting together a jigsaw of every enterprise option out there, and getting an "n+1" tattoo inside his eyelids.

Finally, 7 years isn't exactly old. We've got a subsidiary company of just that size (150 users, 10 branches) running on Proliant 1600/2500/5500 gear (i.e. 90's) which we consider capable for the job, which includes Oracle 8, Citrix MF plus a dozen or so more apps and users on current hardware. We have the occasional hardware fault which a maintenance provider can address same day and bill us at ad-hoc rates, yet we still see only a couple of thousand dollars a year in maintenance, leaving us content that this old junk is still appropriate no matter which way we look at it.

--
Only big ligs use sigs.

Re:Cloud Computing(TM) by mabhatter654 · 2009-11-21 12:32 · Score: 2, Insightful

Why would you buy a cluster that's not the same architecture? You don't know what you're talking about. VMs generally aren't used to change architecture like that. In a Virtualized Cluster the "OS" is just another data file too! Just point an available CPU to your file server image on the SAN and start it back up... that's smart, not lazy!

Most people need virtualization because managing crappy old apps on old server OSes is a bitch. The old busted apps are doing mission critical work, customized to the point the manufacturer won't support them and management doesn't want to pay out for the new version... or the new version doesn't support the old equipment. The leading purpose for VMs is to get new shiny hardware with a modern OS and backup methods to segregate your old, hard-to-maintain configurations to instances. Then the old and busted doesn't crash the core services anymore. Instances that used to be on dedicated, busted hardware that used to require a call-out can be rebooted from your couch in your jammies! (I vote VNC on iPhone as thee killer admin app!) VMs include backup at the VM level, so those old machines that refused to support backup can be backed up "in spite of" the software trying to prevent it.

Re:Cloud Computing(TM) by lorenlal · 2009-11-21 13:10 · Score: 3, Interesting

I think what the article is really asking is what's a good model to start all this stuff. You're looking at one or two servers per location (or maybe even network appliances at remote sites).

I totally agree with your premise. In my experience taking something that appears to work (when you realize you've really just been lucky) requires some time to bring about the change that the business really needs.

Now, as for having two servers per location, that heavily depends on how those sites are connected. Are they using a dedicated line or a VPN? That's important since that'll affect what hardware needs to be located where. It's possible (even if unlikely) that some sites would only need a VPN appliance... But since the poster seems to want general advice:

VMWare ESXi is a pretty good starting place for getting going on virtualization. I've had a great experience with it for testing. When you feel like you've got a good handle, get the ESX licenses.

If SAN isn't in your budget, I still recommend some sort of external storage for the critical stuff... Preferably replicated to another site... But you can run the OS on local storage, especially in the early stages. But you'll need to get everything onto external storage to implement the VMotion services and instant failover. Get a good feel for P2V conversion. It'll save you tons of time when it works... It doesn't always, but that's why you'll always test, test and test.

As for the basic services you stated above (www, ftp, email, dns, firewall, dhcp):
Firewall (IMHO) is best done on appliance. Which should be anywhere you have an internet connection coming in. I'm sure you knew that already, but I'm trying to be thorough.
Email is usually going to be on its own instance (guest, cluster, whatever)... But I find that including it in the virtualization strategy has been quite alright. In fact, my experience with virtualization has been quite good except when there is a specific hardware requirement for an application (a custom card, or something like that). USB has been much less of a headache since VMWare has support for it now, but there are also network based USB adapters (example: USBAnywhere) that provide a port for guest OSes in case you don't use VMWare.

Most of the posters don't 'get it' by plopez · 2009-11-21 14:12 · Score: 2, Interesting

The question is not about hardware or configuration. It is about best practices. This is a higher level process question. Not an implementation question.

--
putting the 'B' in LGBTQ+

Linux Vserver by patrick_leb · 2009-11-21 14:13 · Score: 2, Informative

Here's how we do it:

- Run your services in a few vservers on the same physical server:
* DNS + DHCP
* mail
* ftp
* www
- Have a backup server where your stuff is rsynced daily. This allows for quick restores in case of disaster.
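The daily rsync step in the list above could be sketched roughly as below. This is only an illustration: the `/vservers/<name>` layout, the `backup.example.com` host and the `/backup` destination are made-up names, not the poster's actual setup, and the choice of `-a --delete` is one common way to keep a mirror, not necessarily what they run.

```python
import shlex


def build_rsync_command(source_dir, backup_host, dest_dir, delete=True):
    """Build the rsync argument list that mirrors source_dir onto backup_host."""
    # A trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    src = source_dir.rstrip("/") + "/"
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps perms and times)
    if delete:
        # --delete removes files on the backup that no longer exist at the source.
        cmd.append("--delete")
    cmd += [src, f"{backup_host}:{dest_dir}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical vserver names matching the services listed above.
    for name in ["dns-dhcp", "mail", "ftp", "www"]:
        cmd = build_rsync_command(
            f"/vservers/{name}", "backup.example.com", f"/backup/{name}"
        )
        # Dry run: print the commands a daily cron job might execute.
        print(shlex.join(cmd))
```

Running the printed commands from cron once a day would give the "rsynced daily" behaviour; whether `--delete` is wanted depends on whether the backup should also keep files that were removed from the live vserver.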

Vservers are great because they isolate you from the hardware. Server becomes too small? Buy another one, move your vservers to it and you're done. Need to upgrade a service? Copy the vserver, upgrade, test, swap it with the old one when you are set. It's a great advantage to be able to move stuff easily from one box to another.

Astroturfing.. by Junta · 2009-11-21 14:19 · Score: 2, Insightful

If MS is going to astroturf, you need to at least learn to be a bit more subtle about it. That post couldn't have been more obviously marketing drivel if it tried. Regardless of technical merit of the solution (which I can't discuss authoritatively).

The post history of the poster is even more amusingly obvious. No normal person is a shill for one specific cause in every single point of every post they ever make.

To all companies: please keep your advertising in the designated ad locations and pay for them, don't post marketing material posing as just another user.

--
XML is like violence. If it doesn't solve the problem, use more.

Re:Cloud Computing(TM) by Anonymous Coward · 2009-11-21 14:55 · Score: 2, Informative

We've probably dropped ~20K (w/o licensing) in our VMWare ESX cluster. Basically it's the "poor man's version" because of all of our purchasing restrictions, but here's about what it is:

Basically a box with some 15K SAS drives in RAID1+0. Cost ~$5k
Server with some SATA 1TB drives again in RAID1+0. Around 5K as well
3x cluster nodes. Dual 771 with 8 or 16GB of RAM
Management node running Win Server 08

It's not a super config, and a lot of people will argue that it's not a true setup, but it's sufficient for our needs. I think we hit 4% CPU utilization across all the nodes the other day.

With VMWare, watch the 2TB filesystem limit. We ran into that with our SATA array. Basically you have to slice it into 2TB chunks to get VMware to accept it as a datastore.
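As a rough illustration of the slicing described above, the following sketch splits an array of a given size into chunks that each stay at or under a ceiling. The 2048 GB default and the 5000 GB example array are assumptions for the sake of the arithmetic; the exact limit depends on the VMware version, so check the documented maximum for the release in use.

```python
def slice_into_datastores(total_gb, max_datastore_gb=2048):
    """Split total_gb into chunk sizes that each fit under max_datastore_gb."""
    if total_gb < 0 or max_datastore_gb <= 0:
        raise ValueError("sizes must be positive")
    sizes = []
    remaining = total_gb
    while remaining > 0:
        # Take as much as the ceiling allows; the last chunk is the remainder.
        chunk = min(remaining, max_datastore_gb)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


if __name__ == "__main__":
    # Hypothetical 5000 GB array; the post does not give the real size.
    print(slice_into_datastores(5000))
```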

As far as networking goes, we have a couple of gigE switches running the traffic. Our SANs are redundant, as we clone all of the machines from our SAS "SAN" to our SATA. If the "production" SAN goes down we can start up the clone from the SATA box in minutes. After the primary SAN comes back up we can VMotion it across to the other data store.

Re:Cloud Computing(TM) by turbidostato · 2009-11-21 15:12 · Score: 2

"The tension between budget and business requirements can be useful but it is largely a paper tiger."

Yes indeed, but not for the reasons you highlight. There is no tension between budget and requirements, since the budget is just a natural outcome of the requirements themselves: you don't "need 24x7 services"; you lose XXX dollars per hour when the service is down. Once you factor in the risk management is willing to take, your budget is just a matter of multiplication: it's XXX dollars per downtime hour multiplied by the risk you are accepting. You lose 10.000 per downtime hour and you don't want to lose more than 100.000 on a risk you measured to have a 10% chance (a ten-hour downtime)? Then your allowed up-front cost for this is 30.000 (for iron under three years' amortization).

I'm used to hearing "I want uber-redundancy and 24x7 availability" "well, that'll cost you XXX" "But I can't pay that!" That means that you don't earn that much from that system. It's never "I can't afford it" but "it doesn't get me so much".
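The arithmetic in that example can be written out as a small sketch. It assumes the 10% chance is a per-year probability, which the post does not say explicitly but which is the reading that reproduces the 30.000 figure; the function and parameter names are made up for illustration.

```python
def allowed_upfront_budget(loss_per_hour, outage_hours,
                           annual_probability, amortization_years):
    """Expected downtime loss over the amortization period.

    Assumes annual_probability is the chance per year of one outage
    lasting outage_hours (an interpretation, not stated in the post).
    """
    # Loss if the outage happens, weighted by how likely it is in a year.
    expected_annual_loss = loss_per_hour * outage_hours * annual_probability
    # Spending more than this on hardware costs more than the risk it removes.
    return expected_annual_loss * amortization_years


if __name__ == "__main__":
    # Figures from the post: 10.000 per hour, ten hours, 10%, three years.
    print(allowed_upfront_budget(10_000, 10, 0.10, 3))
```

With the post's figures this comes to 10.000 x 10 x 0.10 = 10.000 expected loss per year, or about 30.000 over three years of amortization.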

Re:Cloud Computing(TM) by digitalchinky · 2009-11-21 16:22 · Score: 4, Insightful

That's a little harsh don't you think?

There are untold numbers of us in this guy's position. Asking slashdot is a damn good start at finding a new methodology. Everyone has an opinion, some of them quite intelligent, a few might even work. It's ok for the Fortune 500 cube dwellers to jump on the phone and call in a long standing contractor to 'handle it' - the rest of us have to slog through the marketdroid crap and translate the latest buzzword infestations to human speak - then just hope we don't screw it up or waste money.

So far the best suggestions appear to be to figure out how critical things are first (which will shape the hardware requirements), budget second. All the while this is encompassed by the usual core job functions that still need to get done.

So rather than point out the redundant, how about using your fingers to provide a potential solution.

Insurance... by magusnet · 2009-11-21 16:56 · Score: 4, Funny

1) Buy a comprehensive insurance policy
2) Write a detailed implementation plan that you copied from a Google search
3) Wait the 3-6 months the plan calls out before actual "work" begins
4) Burn down the building using a homeless person as the shill
5) Submit an emergency "continuity" plan that you wanted to deploy all along
6) implement the new plan in one third the time of the original plan
7) come in under budget by 38.3%
8) hire a whole new help desk at half the budgeted payroll (52.7% savings)
9) speak at the board meeting: challenges you overcame in saving the company
10) Graciously accept the position of CIO

(send all paychecks and bonuses to numbered bank account and retire to a non-extradition country) :)

Re:Cloud Computing(TM) by magarity · 2009-11-21 17:20 · Score: 2, Insightful

Except of course that management ALREADY HAS that because they've been very lucky for 7 years

Whoa there - so using this logic we can assume the company has no fire insurance, etc, because they've been lucky and not had their building burn down in 7 years? Managers might not understand technical issues, but one thing managers worth the title CAN do is manage risk, i.e. balance the cost of risk mitigation against the risk. I can well imagine a company of 150 people that actually doesn't have any mission critical servers worth spending a lot on redundancy, etc. I can also imagine a company that has gotten lucky while at the same time, the IT person(s) haven't explained IT risks/costs in proper terms because they assume the managers just aren't technical.

The original questioner definitely needs to do a proper risk / cost analysis and present it to the managers. (But right now his "ideas" are WAY too vague and not business need driven.) A prompt, proper analysis and plan/alternate plan(s) for risk and risk avoidance is seriously wanted. It will CYA for that magic moment any day now when these 7 year old systems start failing.

Re:Latest Trends by Z00L00K · 2009-11-21 19:29 · Score: 2, Informative

Any server that can offer a RAID disk solution would be fine. Blade servers seem to be overkill for most solutions - and they are expensive.

And then run DFS (Distributed File System) or similar to have replication between sites for the data. This will make things easier. And if you have a well working replication you can have the backup system located at the head office and don't have to worry about running around swapping tapes at the local branch offices.

Some companies tend to centralize email around a central mail server. This has its pros and cons. The disadvantage is that if the head office goes down everyone is without email service. But the configuration can be more complicated if each branch office has its own.

It's also hard to tell how to best stitch together a solution for a specific case without knowing how the company in question works. There is no golden solution that works for all companies.

The general idea is however that DNS and DHCP shall be local. If they aren't then the local office will be dead as a dodo as soon as there is a glitch in the net. Anyone not providing local DNS and DHCP should be brought out of the organization as soon as possible. And DNS and DHCP don't require much maintenance either, so they won't put much workload on the system administration.

There are companies (big ones) that run central DHCP and DNS, but glitches can cause all kinds of trouble - like providing the same IP address to a machine in Holland and in Sweden simultaneously (yes - it has happened in reality, no joke) - and the work required to figure out what's wrong when multiple sites are involved in an IP address conflict can cost a lot. And if you run Windows you should have roaming profiles configured and a local server on each site where the profiles are stored.

Local WWW and FTP servers - can work, but watch out too since you have to check out if it's for internal or external use. Do you really need a local WWW and FTP server for each site? I would say - no. And those servers should be on a DMZ. It can of course be one server servicing both WWW and FTP. The big issue with especially FTP servers if they are for dedicated external users is the maintenance of the accounts on those servers. Obsolete FTP server accounts are a security risk.

And if you run Windows I would really suggest that you do set up WDS (Windows Deployment Services). This will allow your PC clients to do a network boot and be reinstalled from an image. Saves a lot of time and headache.

And today many users have laptop computers, so hard disk encryption should be considered to limit the risk of having business critical data going into the wrong hands. Truecrypt is one alternative that I have found that works really well. But don't run it on the servers.

--
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.

Re:Cloud Computing(TM) by lukas84 · 2009-11-21 23:56 · Score: 2, Insightful

Well, it's not like i've not got any clashes with our management (or even that of some of our customers), but i've found it to be the better approach to actually talk things through in the hope of getting a better understanding of both parties.

From my technical standpoint, it's very much important that management _exactly_ knows what they're getting for their money. This also means saying "No". Yes, you can lose customers if you don't promise them 99.999999999% availability for 50$ a month, but the real question is whether you actually wanted those customers in the first place - they may find some idiot who agrees to work with them, but that's their loss.

Even internally, as we also run our internal infrastructure, it's important to say "no" to unreasonable tasks and stupid ideas. Either management trusts you to actually do the job they've hired you to do, or they don't - then you'll need to find a new job.

Re:Cloud Computing(TM) by gmccloskey · 2009-11-22 01:40 · Score: 2, Insightful

mod parent up.

The first step is to find out what the business wants, and how much it is willing to pay. THEN you go out to find out what tech is appropriate/affordable to do it.

Ask the heads of each office, and the main business managers what they want the tech to do now, in a year and in three years. Do you have a business continuity plan that has to be allowed for. If you don't have a BC plan, now's a good time to have one done, before you buy a load of kit that may not do the job.

Once you have a list of business needs, and put them in a prioritised list (again the managers set the priority), you go out and look at what can do the job. Assuming you find a reasonable solution within budget, you need to plan the migration.

Protip: do not attempt to migrate everything in one go. Do it in steps, with breaks in between.

Proprotip: whatever your migration, be able to revert to the original solution in less than 8 hours - ie one working day.

Migration is the biggest gotcha - plan, plan and plan again. Do a dry run. Start with the least critical services. You do have backups, right? Fully tested backups, from ground zero? You do have all your network and infrastructure accurately and completely mapped out, and all configuration settings / files stored on paper and independent machines?
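To make the ordering advice above concrete, here is a minimal sketch that sorts services from least to most critical and refuses any plan containing a step that cannot be rolled back within a working day. The `Service` fields, the criticality scores and the rollback estimates are all hypothetical; in practice the managers set the priorities and the rollback times come from a dry run.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    criticality: int       # 1 = least critical; set by the business, not by IT
    rollback_hours: float  # estimated time to revert to the old setup


def plan_migration(services, max_rollback_hours=8.0):
    """Return service names in migration order, least critical first."""
    # Refuse to plan anything that could not be reverted in one working day.
    blocked = [s.name for s in services if s.rollback_hours > max_rollback_hours]
    if blocked:
        raise ValueError("rollback too slow for: " + ", ".join(blocked))
    return [s.name for s in sorted(services, key=lambda s: s.criticality)]


if __name__ == "__main__":
    # Made-up example values for the services mentioned in the thread.
    services = [
        Service("email", criticality=5, rollback_hours=6),
        Service("ftp", criticality=1, rollback_hours=1),
        Service("www", criticality=3, rollback_hours=2),
    ]
    print(plan_migration(services))
```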

Both arguments for VM and KISS have their place - only you can decide. But when you do decide, make sure it's based on evidence, and will end up making the business better.

Don't forget Total Cost of Ownership - the shiny boxes may run faster, but will you have to hire two more techs to keep them running, or a new maintenance contract?

Don't forget training - for you, your staff and the end users. If you're putting shiny newness in place, people will need to know how to use it, and do their jobs at least as quickly as on the old solution. No use putting in shiny web4.0 uber cloud goodness, if the users end up spending an hour doing a job that used to take 5 minutes, because they don't know how to use it properly, or the interface doesn't easily work with their business processes.

good luck

Re:Cloud Computing(TM) by TheLink · 2009-11-22 04:16 · Score: 2, Informative

I have vmware machines on one server at home. There are still benefits even though it's not a cluster. So it's not that stupid.

It is easier to move the virtual servers to another machine or O/S. This is useful when upgrading or when hardware fails or when growing (move from one real server to two or more real servers). There's no need to reinstall stuff because the drivers are different etc.

You can snapshot virtual machines and then back them up while they are running. Backup and restore is not that hard that way. So even if you have a single point of failure, if you have recent image backups, you could buy a machine with preinstalled O/S, install vmware, and get back up and running rather quickly.

And when power fails and the UPS runs low on battery, I have a script that suspends all virtual machines then powers the server down. That's more convenient too than setting up lots of UPS agents on multiple machines and hoping they all shutdown in time.
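A script along those lines might look something like the sketch below. The post does not say which tool the script uses; this version assumes VMware's `vmrun` command-line utility is installed and on the PATH (some products need extra host/credential options), and that the host is shut down with the usual Linux `shutdown -h now`. The sample output and paths are invented for the dry run.

```python
import subprocess


def parse_running_vms(vmrun_list_output):
    """Extract .vmx paths from the text printed by `vmrun list`."""
    lines = [line.strip() for line in vmrun_list_output.splitlines()]
    # The first line is a count header; VM entries are paths ending in .vmx.
    return [line for line in lines if line.lower().endswith(".vmx")]


def build_shutdown_plan(vmx_paths):
    """Suspend every running VM first, then power the host down."""
    plan = [["vmrun", "suspend", path] for path in vmx_paths]
    plan.append(["shutdown", "-h", "now"])
    return plan


def run_plan(plan):
    """Execute the plan for real; call this only from the UPS low-battery hook."""
    for cmd in plan:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run on made-up sample output; nothing is suspended or shut down here.
    sample = "Total running VMs: 2\n/vm/mail/mail.vmx\n/vm/www/www.vmx\n"
    for cmd in build_shutdown_plan(parse_running_vms(sample)):
        print(" ".join(cmd))
```

In a real setup the UPS monitoring daemon would call something like `run_plan(...)` with the live output of `vmrun list` when the battery runs low, which is the convenience the post describes: one agent on the host instead of one per guest.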

DB performance sucks in a vmware guest though, so where DB/IO performance is important, use "real" stuff. Things may be better with other virtualization tech/software.

Are blades really such a good idea? by TheLink · 2009-11-22 04:36 · Score: 2, Informative

In my uninformed opinion, blades are mainly a way for hardware vendors to extract more money from suckers.

They probably have niche uses. But when you get to the details they're not so great. Yes the HP iLO stuff is cool etc... When it works.

Many of the HP blades don't come with optical drives. You have to mount CD/DVD images via the blade software. Which seemed to only work reliably on IE6 on XP. OK so maybe we should have tried it with more browsers than IE8, but who has time? Especially see below why you don't have time:

So far I haven't seen any mention in HP documentation that the transfer rate of the mounted CD/DVD image (or folder) between your laptop to the iLO software to a blade that you're trying to install stuff on is a measly 500 kilobytes per second. But that's what we encountered in practice.

Yes you can attach the blade network to another network and install it over the network, but if you can do that, doesn't that make the fancy HP iLO stuff less important? You might as well just get a network KVM right? That KVM will work with Dell/IBM/WhiteBoxServer so you can tell HP to fuck off and die if you want.

Which brings us to the next important point: Fancy Vendor X enclosures will only work with current and near future Vendor X blades. In 3-5 years time they might start charging you a lot more to buy new but obsolete Vendor X blades. Whoopee. What are the odds you can use the latest blades in your old enclosure? So you pay a premium for vendor lock-in and to be screwed in the future.

I doubt Google, etc use blades. And they seem to be able to manage hundreds of thousands of servers. OK so most of the servers might be running the same image/thing... So that makes it easy.

BUT if you have very different servers, do you really want them in a few blade enclosures? Then when you need to service that enclosure you'd be bringing down all the different blades...

Many datacenters can't build out bladecenters by Colin+Smith · 2009-11-23 03:00 · Score: 2, Insightful

The biggest problem I've found with blades is that you can't fill a rack with them. Several of the datacenters I've come across have been unable to fit more than one bladecenter per rack. Cooling and power being the problem.

At the moment, a rack full of 1U boxes looks like the highest density to me.

--
Deleted

Slashdot Mirror
