What Do You Look For in a Big Iron Review?
ValourX writes "We're starting to write more reviews of enterprise-class hardware and software and although we've done pretty well with our reviews, the high-end products are a lot trickier when it comes to testing and evaluation. Obviously it is not possible to build an enterprise-grade 'your neck is on the line' production environment just for writing reviews, but maybe we can do something smaller, just for testing purposes. What do you as an IT professional want to read in a review for a server OS or a high-speed switch, or a big iron server or proprietary workstation? What tests should we run? What results and feature comparisons are going to be most meaningful to you?"
Well, the two main issues with Big Iron equipment are how well it handles load and how well it scales. For load, they should max out the system slightly above the recommended specs and see how well it handles it. Most people don't care about overall benchmarks so much as the issues that affect the user. Say it was a web server: we don't care how many pages/second it can handle, but how well we get the web pages when the system is maxed out. Do we wait 5 minutes and then the page just pops in? Or do we wait 5 minutes for the page to load but see the results coming in as it goes? When working above the rated load, how much does the system heat up (possibly causing failures in the future)? Second is how well it can scale. Can extra processors be added? Can you add or hot-swap processors on the system? What is the max RAM it can hold, and is there room to add more? How compatible is it with competitors' stuff (say an IBM server with a Sun storage array)? How well does it follow the standards, so you are still able to use the server even if the company that produced it died?
Speed (which is what a lot of people test their Big Iron for) is really not that important a detail. A PC with a 3 GHz processor will outperform a Sun Fire 15K with multiple processors for any single task. But when it starts handling load, the Sun Fire will handle it better than the PC. When companies decide to buy Big Iron they want it to be an investment that can last them at least 3-4 years, preferably 4-10 years. And all they should need to do is add stuff to it so that it scales with the times.
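One way a reviewer could report "how well we get the pages when the system is maxed out" instead of raw pages/second is to record per-request latency during the load run and publish the tail percentiles. This is only a rough sketch: the function name and the choice of p50/p95/p99 are my own assumptions, not anything a vendor or benchmark prescribes.

```python
import statistics


def latency_summary(samples_ms):
    """Summarise per-request latencies (in milliseconds) collected under load.

    Hypothetical helper for illustration; the percentile choices are assumptions.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two latency samples")
    ordered = sorted(samples_ms)
    # quantiles(n=100) returns 99 cut points: index 49 is the median,
    # index 94 is the 95th percentile, index 98 is the 99th percentile.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Made-up numbers: most requests are quick, a few take minutes.
    samples = [120] * 95 + [300_000] * 5
    print(latency_summary(samples))
```

Two boxes with the same median can look very different at p99, which is the part the user sitting in front of the browser actually notices.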
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
If you knew our operations guy, you would test resistance to physical attacks.
Pictures. I like hot chicks standing next to big servers. Big servers in action shots are good too.
Real-world numbers from some industry-standard benchmarks would be good. You can get TPC-C and SPECint from most vendors, but those are run after weeks of tuning by their internal experts.
I would like to see what they get in a regular user's hands.
Phil
I guess today is a passable day to die.
Bossman Compatibility: Verifies that the hardware vendor has taken my boss's boss out to dinner and purchased suitably expensive drinks. Rating based on the number of stars the restaurant received, although points may be docked if the filet mignon was a little overdone. This one is related to the...
CYA Verification: Vendor must have a name recognizable to people who read periodicals such as "CTO Magazine" so, when it breaks down, I can say "who ever heard of XVY Company's gear being bad?" If the vendor is a company like Dell which also sells home PCs, this metric should also include going to my boss's boss's house and verifying that his Dell is running okay so I don't have to hear shit like "I don't know why we got Dell, my desktop at home has problems all the time, too, and it's only six years old!"
Sweetness Factor: Not as much of a factor as it once was, depending on how big of iron we're talking about. But if the thing, say, requires a cooling tower that happens to have a waterfall built into it, that's a point right there. May conflict with....
The Under-Desk Operation Profile: Since it'd take at least a month and a dozen SRs and books of useless paperwork just to get the beastie screwed into a rack at our NOC, the server must both fit nicely under the desk in my cube with all the other machines and not be too loud. Generation of excess heat is a plus since the facilities people have set 61 degrees as a reasonable temperature for my office in the winter.
Extra-App Capacity Testing: For when some moron in another department comes in and convinces my boss's boss that "all that server is doing is running the backend for our entire operation, so can we put our incredibly messy half-working app on it too and treat it like QA?" If this server can alert a Terminator unit to go to the aforementioned coworker's home in the middle of the night and slay him and his family, this requirement can be waived (oh, I wait for the day this will be waived....)
I'm sure there are a few other benchmarks you could run, but honestly these are the Big Five that I decide on.
Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
Make sure it has a good number of phaser arrays and photon torpedo banks.
when i'm choosing a big iron, i try to find one which can get the big creases out of my big pants
As an IBM employee, I want to see brain-washingly favorable reviews of IBM hardware. Especially the ones that will make me money. :-)
Cheers,
Matt
Terrorist, bomb, al Qaeda, nuclear, yellowcake, kill, assassinate. Carnivore is dead... long live Echelon.
Can the system be expanded without rebooting? Can you manage it using computer operators that you wouldn't trust to determine which end of a mop should be applied to the floor?
In many a large setting, a big concern is "does it play nice with XYZ." [Insert cliche about certain-hardware manufacturer that set the "random" retry ethernet window to minimum, rather than minimum+random, to achieve better performance for its cards, intentionally mucking with interframe spacing....] XYZ is going to be: Specific app or other (hardware) product. If the apps are internal (as some of ours are) then you can't help us -- but there are some fairly customizable out-of-the-box apps that you could test against....
Basically, none of these purchases happen in a vacuum. The merits of the technology matter, but "playing nice" is a dealbreaker. If this causes ANYTHING to break, forget it for now. et cetera.
When in doubt, parenthesize. At the very least it will let some poor schmuck bounce on the % key in vi. (Larry Wall)
Can it survive a good /. ing ?
I'd be interested in how well it works after the following:
...while the system is on. ...while the system is off.
Coffee spilt in one of the CPU PSUs.
Coffee spilt on the keyboard (if present).
Coffee spilt in one of the disk system PSUs.
Swapping two of the disks in a pack...
More seriously, it would be handy to know the ratio of workload handled to watts consumed. Workload:cooling required would also be handy.
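For what it's worth, both ratios are simple arithmetic once a review publishes a sustained workload figure and a measured power draw. A minimal sketch follows, assuming essentially all the electrical power ends up as heat the room has to remove; the function and field names are made up for illustration.

```python
WATTS_TO_BTU_PER_HOUR = 3.412  # 1 W dissipated is roughly 3.412 BTU/hr of heat


def efficiency(workload_per_sec, measured_watts):
    """Return workload-per-watt and workload-per-cooling figures.

    workload_per_sec: whatever unit the review measures (transactions, pages, ...).
    measured_watts:   power drawn at the wall during that workload.
    Hypothetical helper; assumes all input power becomes heat to be cooled.
    """
    if measured_watts <= 0:
        raise ValueError("measured_watts must be positive")
    btu_per_hour = measured_watts * WATTS_TO_BTU_PER_HOUR
    return {
        "work_per_watt": workload_per_sec / measured_watts,
        "cooling_btu_per_hour": btu_per_hour,
        "work_per_btu_per_hour": workload_per_sec / btu_per_hour,
    }


if __name__ == "__main__":
    # Invented example: 5000 transactions/sec at 2000 W measured at the wall.
    print(efficiency(5000, 2000))
```

The numbers only compare fairly between machines if the same workload and the same measurement point (at the wall, under load) are used for each.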
Phil
I guess today is a passable day to die.
Who asked for it and more importantly did anyone pay for it either directly or indirectly.
Help fight continental drift.
I have had problems in the past looking at various hardware and comparing the true costs of it, especially support. With the third party support companies out there (we use Terix, amongst others), there are so many options, and with yearly support contracts in excess of $100,000, for our relatively small company, mis-calculating these in a recommendation can be a very big deal.
Just my $.02... oh, also just plain reviews of support companies on different hardware would be good also.
"If voting could really change things, it would be illegal. " - Revolution Books, NY
A good cost analysis is worth a lot. Say you look at a new and shiny server system; it has the latest OS, servers, and features. But what is that worth?
If the cost of this "new" server is 5X more expensive (as a package) than another system that gives you the same functionality and comparable performance then knowing that this alternative exists and what the performance / price difference is would be valuable.
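A crude way to put a number on that comparison is cost per unit of delivered performance over the expected service life, with the yearly support contract folded in. This is just a sketch under my own assumptions: the function name is invented, the figures in the example are made up, and a real analysis would add power, floor space, staff time and so on.

```python
def cost_per_unit_performance(purchase_price, yearly_support, years, performance):
    """Total cost over the service life divided by a performance score.

    performance: any benchmark figure, as long as the same one is used for
    every system being compared. Hypothetical helper for illustration only.
    """
    if performance <= 0 or years <= 0:
        raise ValueError("performance and years must be positive")
    total_cost = purchase_price + yearly_support * years
    return total_cost / performance


if __name__ == "__main__":
    # Invented figures: the "new" package versus a cheaper comparable system.
    shiny = cost_per_unit_performance(500_000, 100_000, 4, 1000)
    plain = cost_per_unit_performance(100_000, 20_000, 4, 900)
    print(f"shiny: {shiny:.0f} per unit, plain: {plain:.0f} per unit")
    print(f"shiny costs {shiny / plain:.1f}x as much per unit of performance")
```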
:)(smile)
I've worked with too many companies whose products *do not* scale the way they claim, or whose products will technically scale, but are at that point virtually useless. Use bogus data, who cares, but test the data volume, throughput, storage, archival, etc. to the limits and make sure the product is still useful. This is the single biggest problem I've had with enterprise installations, and the problem as an architect is that it's difficult to test on a very tight timeline for product evaluation. I've had egg on my face more than once because I had to take the vendor's word for it.
Second, install the application yourself. Don't let the vendor do it for you. And when you install it, install it as an enterprise would. That is, if it's an n-tier application, or has multiple components, don't take the "default" installation and put all of the components on one system. Of course this will work. Try distributing the components over multiple systems like an enterprise would. Often this is where the complexity comes in and products falter.
One company I worked for purchased some software from Tivoli. After 6 months, and a team of engineers onsite from the vendor, they still couldn't get the components to talk for more than a day without problems (after weeks of installation), and still couldn't get useful data out of the database due to its size, so we took our $500mil back and bought something else. Having an evaluation that would've tested this would've saved us a bundle.
akad0nric0
This sentence no verb.
...IN SOVIET KOREA, old people iron YOU!
As someone who has to build, integrate, then deliver systems to other people's server floors, I have some things that would be nice to know. How much power does the thing ACTUALLY use? Not what the manual says, but real-world usage (all you need is a clamp ammeter and a split extension cord). This test helps us determine power requirements if we deliver 100 of these, and cooling requirements.
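Turning one clamp reading into a floor-planning number is short arithmetic. A sketch, with the caveat that amps times volts gives apparent power (VA); real watts also depend on the power factor, which a plain clamp ammeter does not tell you, so it is a labeled assumption here. All names and the example figures are invented.

```python
WATTS_TO_BTU_PER_HOUR = 3.412  # 1 W dissipated is roughly 3.412 BTU/hr of heat


def fleet_power(measured_amps, line_volts, units, power_factor=1.0):
    """Scale one unit's clamp-ammeter reading up to a whole delivery.

    power_factor=1.0 is an assumption (worst case for heat); substitute the
    supply's real figure if known. Hypothetical helper for illustration.
    """
    if not 0 < power_factor <= 1:
        raise ValueError("power_factor must be in (0, 1]")
    va_per_unit = measured_amps * line_volts        # apparent power, sizes the circuits
    watts_per_unit = va_per_unit * power_factor     # real power, becomes heat
    total_watts = watts_per_unit * units
    return {
        "va_per_unit": va_per_unit,
        "watts_per_unit": watts_per_unit,
        "total_amps": measured_amps * units,
        "total_watts": total_watts,
        "cooling_btu_per_hour": total_watts * WATTS_TO_BTU_PER_HOUR,
    }


if __name__ == "__main__":
    # Invented reading: 4.0 A at 120 V per box, 100 boxes delivered.
    print(fleet_power(4.0, 120.0, 100))
```

Take the reading under realistic load, not at idle, or the 100-unit estimate will come out low.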
How many shells does it hold, and is it a revolver or clip. Does it come with a silencer to avoid pesky street light cameras?
God spoke to me.
** For a server OS
How easy is it to install? How easy is it to upgrade? How easy is it, if it's a different architecture (i.e., Windows, Linux, Mac), to migrate big programs (Exchange, databases) from one to another? How well does it gel with existing servers? Do they recognize one another? Do they acknowledge? Can they fit into existing Active Directory-type listings effectively?
Most to all shops are not created overnight. They are built on mistakes or tried-and-true methods that are (usually) quickly outdated. The problems arise when you try to "fix" the existing problems by bringing in more robust OS's and capabilities. It is the meshing of these that is more important to Network Admins than tales of how well this server did on a single machine in a non-network environment.
** High-speed switch
Does it scale (how easy is it to add one to five or more on a single chain)? How is the admin interface? Is it web-based? Console (i.e., serial port) based? Does it have both in case console is all that's available? Can you break it or overrun it with traffic?
** Big iron server or proprietary workstation?
Someone else has mentioned scale so let me throw in something different: How easy is it to recover? Does it have RAID? (Well, it should obviously.) Break it, remove a disk and see if you can recover from it easily. "Lose" a driver and see how quickly you can recover.
Something I'd love to see is a review that includes a call to the tech support of that server. Don't tell them you're a reviewer, just tell them you got a problem. See how quickly they respond, how informative they are, and how far it has to go before they call in reinforcements (i.e., higher level support). Will they call on-site repair? If so, how long did you have to troubleshoot before they determined it? Sometimes a card or piece will break and front line support will make you bleed through their ignorant manuals step-by-step when it's clear that Piece A is broken and you need an on-site tech with experience with that hardware to come and replace it.
** What tests should we run?
Stress, along with installing/upgrading hardware.
** What results and feature comparisons are going to be most meaningful to you?
I believe that over the course of this comment writing and thinking back over my dealings on big iron hardware, that comparisons in regards to tech support, informativeness, and responsiveness are something that can immediately be added to the review process.
Something more long-term would be how long did the server run before downtime, problems, burnouts, or hardware failures.
+ doom fps
+ Gentoo compile time
+ Overclocking possibilities
+ Case mods, preferably with blue neon lights
10 ?"Hello World" life was simple then
Reviews for this sort of equipment are pretty much meaningless. I might buy a 16-way server to run Oracle, you might buy the same system to run large scale data analysis. PCs are easy to review and evaluate; they are a commodity and can be used for any of multiple purposes. When I buy a large SMP system, I am buying it for a specific purpose, and the chances are it will never be re-purposed. So before spending uberbucks on a system I want to talk to the vendor's other customers who are running similar workloads on the same tin. If the vendor gives me a long list of folks who use their systems for similar applications, that is usually a good sign; if they can't, then I move on.
Large scale SMP systems require a slightly different mentality than PC systems, as anyone who has managed a P690 or E10k will attest. You expect performance, you expect reliability, you expect service, and for what you pay you better get it!
*narf!*
Nothing is more annoying than if you buy a big frame and then you find out that a silly little piece of software is no longer maintained. Or like HP announced today, that they are once again changing their HP-UX roadmap and once again proved that they can't be taken seriously if they predict anything further out than 3 months.
It all comes down to the simple fact that in the end, almost all of the big boxes are the same to the application. Sure, some have hard and some soft partitioning. Sure, you have different CPUs, memory latencies and whatever - to the app it is just a bunch of system calls. But in the end, if you can't run your app on it, it's useless, no matter how fast, redundant or whatever it is. We have completely moved away from selecting the box by its hardware properties. They are all sufficiently redundant and whatever. We go purely by how well the software we need to run is supported on the OS and if they have a roadmap that can be trusted.
Peter.
This is going to be harsh, but you need to hear it.
Obviously it is not possible to build an enterprise-grade 'your neck is on the line' production environment just for writing reviews
In order for the review to be accurate, that's how it has to be tested. Evaluating enterprise equipment in a non-enterprise environment with people who have no enterprise experience is pretty much worthless...and you're not going to fool anyone.
There's also no market for this sort of thing. Equipment on that level is bought because of high level executive briefings, price negotiations, migration options, and politics. Why? Because the market is so cutthroat and all the features that matter are there. The decisions are not made on whether or not a power cord was included, it was easy to unpack, the manuals were clear, how well built it looks, and how it did on SysMark SuperServerSimulator 2005...which is about the only thing all you 2-guys-with-a-webserver "hardware review" sites know how to do.
Further- often when a hardware vendor wants to get a contract, they provide a unit for evaluation.
On top of that, the major analyst firms already fill what little niche there is, and they have really big names 90% of the important people with Nice Shoes will recognize, which means even if that analyst is wrong, the decision to go with their recommendation is justifiable and won't get the Nice Shoes person fired. You'd be lucky if .01% recognized your name, much less trusted it. "Jones! Why does our website keep crashing?" "Well, we're having a lot of hardware problems." "Why did we go with ABC for our servers?" "Oh, XYZhardware.com said they were the best." "Jones, clean out your desk."
So...sorry, there's no market for what you're trying to do, and you don't have the means to do it.
Please help metamoderate.
When I hear "Big Iron" I think mainframes.
In particular, big IBM mainframes (s/3x0) running something like MVS (maybe VM at a push).
Anyone else think the term "Big Iron" is used inappropriately to describe a bunch of piddling little boxes that don't even need an air-conditioned datacenter equipped with an automatic Halon fire extinguishing system?
Simple one-day, weekend, or even weeklong reviews are meaningless in the corporate IT environment. Hell, the merits of any particular vendor's gear aren't truly relevant either. I've worked in an institutional IT environment and a corporate one, and this is how purchasing works:
1. Requirements solicitation - figure out what needs we need to fill, be it wifi net access, a file server, etc
2. Vendor research - contact the usual suspects in the field (networking, big iron servers, etc) and arrange for consultation and formal bids to be made. NOTE: this step is skipped ENTIRELY if the company/institution already has a corporate account with a vendor that provides the appropriate services that you require.
3. Formal bidding process - pit the vendors against each other; it's fun when you get them onsite to demo their gear. Generally vendors will lower prices to sweeten their bid.
4. Award the contract to one of the vendors, or (more likely) have funding denied to you by the beancounters and end up doing a half-assed implementation of what any of the vendors was going to do.
Individual machine or software reviews are a *tiny* part of the process for securing enterprise level hardware/software services.
------- "From bored to fanboy in 3.8 asian girls" ----------
1. How much redundancy is available
a. Are there multiple fans or fan trays?
b. Are there multiple power supplies?
i. How many are needed to power the system?
ii. Can they be powered on and off individually?
c. Are there multiple CPUs?
i. Can they fail independently, without outage?
ii. Can they be partitioned or dedicated?
d. How about multiple storage controllers?
2. How maintainable is it?
a. Hot-swapability
i. CPUs?
ii. Fans?
iii. Power supplies?
b. Manufacturer longevity
c. Product line stability
d. Off-the-shelf parts?
3. Physical specs
a. It's gotta be rack-mountable, right?
b. How many U high?
c. How deep?
d. Are there pluggy bits on the front, back, both?
e. How much does it weigh?
f. How bloody annoying are the rack rails?
g. Can you open and close it with things mounted directly above and below?
h. Can you swap out any and all parts without unracking?
i. How much heat does it generate?
j. How much power does it require?
k. Is there a maximum rack density specified?
l. Is it loud enough for OSHA to require ear plugs?
4. Expandability
a. How many net ports minimum/maximum?
b. What kind of net ports can it have?
c. How many storage thingies (hard drives, etc)?
d. Is there an upgrade path for the CPU(s)?
5. Serviceability?
a. Is there a "lights out" management board available?
b. Does it require dedicated management software?
c. Does it support SNMP?
i. Standard MIBs?
ii. Custom MIB(s)?
iii. Can it send traps?
d. Are you forced to connect a monitor/keyboard?
e. Is it supported by the obnoxious management/monitoring software of my choice?
6. Miscellaneous
a. Can it run Linux?
b. Does it force me to run Microsoft software?
c. Ok then, what the hell O/S does it run?
d. Can I have the source?
e. Please?
f. There's no SCO crap in there, right?
g. If I fill a whole rack with them, will it impress the chicks?
h. Ok, then how do I impress chicks?
i. What the hell's a chick, anyway?
I'm sure I've left out a ton of stuff, but those are some quick thoughts.
I always ask for an FMEA - Failure Mode and Effects Analysis - for typical and HA deployments. Big, expensive equipment tends to fail in big, expensive ways, and I want to know all the ways it can fail, all the potential effects of those failures, and what impact they have on my enterprise. Then, I want to know the recommended mechanisms and patterns that can be employed to minimize failure impact.
Honestly, when it comes to purchasing very expensive machines, I don't think IT departments should be looking at a journalist's review. They need to be doing the research and testing themselves.
Vendors will bend over backwards to get you to buy their big-ticket items. They will generally give you test machines and allow your engineers to hammer away. Those making the purchasing decisions will talk to their engineers, and value their opinions much higher than those of a magazine.
At least that's how it should work, and that is how it does work in top companies who rely on these machines for their entire business.
Overload the hardware as badly as you can, see how it copes (Experience: practically all OS's have a "breaking point" after which you need to restart the machine to recover fully).
Try to install faulty components, see what happens (Experience: even if the manufacturer claims failure tolerance, this is seldom the case).
Check if the iron really runs at the manufacturer's reported maximum temperature and what happens at that temperature plus a couple of degrees (Experience: Sun boxes keep running, HP-UX boxes immediately shut down).
Check if the system runs itself down gracefully when UPS reports power is out. Cut power entirely, see what happens.
Check if you can administer everything without touching the iron, including shutting the box down and starting it (Lights Out Management).
"Although it is not true that all conservatives are stupid, it is true that most stupid people are conservative."
Big Iron usually means redundancy and scalability. Like, how IBM mainframes really don't ever have processor faults or crashes, and don't lose data, even in the event of natural disasters (if you set up your system right). Plus, you can just plug whatever into the system, and it will all work with minimum configuration.
VMS is nearly as good; some argue it's better.
--TheOrangeSquid Is it any wonder things seem so awry? We swim in a sea of confusion and don't have to think to survive
are the manufacturers specifications.
Product cross comparison of specifications using an identical test suite rather than manufacturers' 'tuned' suites.
Real world test comparison. How well does the box do its job when it's doing everything it will do in deployment at once?
Clear breakdown of cost so that all the 'gotchas' are visible: proprietary cards or code that is not included, warranty, spare parts turnaround, ease of diagnosis, actual electric consumption, etc.
Now I'm the grandest Tiger in the Jungle!
To capture the essence of the enterprise, you need to hire four newly graduated students and have them write the worst program possible in Java. Don't worry, the "worst" part comes automatically. Then, apply this program to several brands of servers and see which one actually survives. That is the one you recommend in the review.
-- "Makes Little Debbie look like a pile of puke!" - Moe Szyslak
Get whatever webserver the vendor recommends, throw /. on it, find the biggest firehose you can and throw the IP of the test system on the /. homepage.
Measure the amount of sweat from the marketoids' foreheads.
Dismantle the system. Without powering it down. How many components can you remove, following all procedures, before the system becomes unavailable?