Your NetApp!!! Then you are already paying through the nose for storage.
*shrug* It costs the same as other storage systems in its class.
As for IOPS, *very* few machines ever get close to pushing the IOPS for one array, let alone the storage system.
The OS drive for a Linux VM "idles" at about 7-8 IOPS. 200 of them, then, basically uses up a shelf of 10k disks.
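A rough sketch of the arithmetic behind that claim. The 7-8 IOPS per idle VM and the 200 VMs come from the comment above; the figure of about 125 IOPS per 10k disk is an assumed rule of thumb, not a measured value, and RAID parity, spares and caching are ignored.

```python
import math

def disks_needed(vm_count, iops_per_vm, iops_per_disk):
    """Return (total_iops, disk_count) needed to serve idle VM load."""
    total_iops = vm_count * iops_per_vm
    return total_iops, math.ceil(total_iops / iops_per_disk)

# 200 VMs idling at ~7.5 IOPS each, on disks assumed to deliver ~125 IOPS
total, disks = disks_needed(vm_count=200, iops_per_vm=7.5, iops_per_disk=125)
print(f"{total:.0f} IOPS needs about {disks} disks")
```

Under those assumptions the idle load alone needs about a dozen spindles, which is in the region of a full shelf before any real work is done.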
NetApp is about the only storage solution that provides dedup in the box at no extra cost. However per GB of storage it is one of the most expensive solutions out there.
They're a top-tier vendor to be sure, but no more expensive than alternatives like IBM or EMC.
What a typical lack of empathy you've displayed.
No. Intolerance for stupidity and incompetence is not a "lack of empathy". This is particularly true when multiple tons of metal and people's lives are involved. Someone who does not grasp that neutral disconnects the engine from the wheels should not be driving a car, because they are a hazard to themselves and, more important, everyone around them.
Consider this - in all the times you've driven in an automatic, how many times have you shifted to neutral - not through neutral, but to neutral?
Hundreds.
For most people that would be nearly zero - neutral in an automatic is nearly useless, you can't roll-start an automatic and unless you need a tow or a push, you aren't going to use it. Take it one step further - how many times have you shifted to neutral while in motion? For the vast majority of people that is zero.
Why would the situation be any different in a manual car ? Unless you want to make the disingenuous argument that passing through neutral in the process of changing gears is being "in neutral while the car is in motion" ?
But you know what? That's all moot, the point still stands that if he needed to shift to neutral something was seriously broken to begin with.
It's not moot. Mechanical failures happen - rarely on modern cars, to be sure, but they do happen - and not reacting appropriately to a stuck accelerator on a *freeway* is inexcusable.
Having a remote control instead of an actual human driver as somebody suggested is a bit of an overboard comment. There's still no such sophisticated form of remote control that would allow such a stunt.
What aspect of the technology do you think is deficient ?
With the same analogy in mind, we should be seeing planes take-off and landing, not only being autopiloted at high altitudes at pretty stable conditions.
Modern autopilots can do landings and takeoffs.
So far, only military drones have similar control and I believe, even not knowing technical details, limitations exist. That's why actual humans still pilot military planes.
Humans are still piloting because even "cutting edge" military aircraft were all designed 20-30 years ago. This is almost certainly the last generation of military planes that will have humans riding shotgun. Human pilots are still in passenger planes because of the psychological factor, so they'll probably be around for a lot longer.
Basically it's a bunch of innuendo, like he *might* have been late on payments on the car (since proven false) or that he should have shifted it to neutral (not an intuitive action for someone who has never driven a manual transmission - and certainly a last resort that does not negate the existence of a problem to begin with).
Every automatic transmission I've ever seen has neutral. Most of them don't even require pressing the release button to move from Reverse or Drive into neutral. Anyone who doesn't understand, at the very least, that "N" means the car doesn't go, shouldn't be driving.
Who does 2GB OS installs especially in a 200+ VM environment? That's insane.
We certainly do. Why wouldn't we ? Trying to shave a few 10s or hundreds of MB off installation sizes is wasted time when your storage system can deliver similar (and more benefits) without the (expensive) human overheads.
I agree that deduplication is a nice addition to the virtual tool-set but it only seems to really add a benefit to very specific environments.
"Very specific" ? You mean anyone doing non-trivial virtualisation ?
If I have optimized OS installs and the VMs run completely different data-sets from different organizations then the cost (both money and system resources) of deduplication seems to outweigh the benefit of saving a few G especially in a world where HDs come in 2TB sizes.
Firstly, to get sufficient IOPS, drive sizes are ~500GB, not 2TB (or even smaller for SSDs).
Secondly, the OS part of a large proportion of your VMs is always going to be identical, allowing for large savings.
Thirdly, savings aren't just in raw disk space. You also save IOPS (since the dedupe should be carried through to the cache layer) and bandwidth if you're replicating over a WAN.
Now the million dollar question to ask is how much does your dedupe solution cost?
Nothing. Our NetApp has it by default (who charges extra for dedupe these days ?).
The reason being any dedupe that is supported against a virtualization solution we have looked at costs more than just buying the frigging disk.
Except it doesn't cost any more and it saves IOPS, meaning we need to buy less disk not only for space, but for performance as well.
The level of dedupe in bulk storage is likely to be low as well, besides which the cost of dedupe on a couple hundred TB of disks is ridiculous. Even for backup one has to wonder as well, tape is again really cheap, and dedupe for hundreds of TB is bloody expensive.
If your dedupe solution has differing costs depending on how much data you have, you've got the wrong solution.
All sales literature, mind you. My personal experience with it will begin in a few months, when we get our new Celerra installed :-)
As far as I know, Celerras only do file-level dedupe.
I wonder how much this approach really buys you in "normal" scenarios, especially given the CPU and disk I/O cost involved in finding and maintaining the de-duplicated blocks. There may be a few very specific examples where this could really make a difference, but can someone enlighten me how this is useful on, say, a physical system with 10 CentOS VMs running different apps or similar apps with different data? You might save a few blocks because of the shared OS files but if you did a proper minimal OS install then the gain hardly seems to be worth the effort.
Assume 200 VMs at, say, 2GB per OS install. Allowing for some uniqueness, you'll probably end up using something in the ballpark of 20-30GB of "real" space to store 400GB of "virtual" data. That's a *massive* saving, not only disk space, but also in IOPS, since any well-engineered system will carry that deduplication through to the cache layer as well.
Deduplication is *huge* in virtual environments. The other big place it provides benefits, of course, is D2D backups.
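To make the saving concrete, here is a minimal sketch of block-level dedupe accounting. It is an illustration only, not how NetApp or any other vendor implements it: the 4 KB block size, the SHA-256 fingerprinting and the toy "images" (16 shared OS blocks plus one unique block per VM) are all assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for the illustration

def dedupe_stats(images, block_size=BLOCK_SIZE):
    """Return (logical_bytes, physical_bytes) across a list of disk images."""
    seen = set()
    logical = physical = 0
    for image in images:
        for offset in range(0, len(image), block_size):
            block = image[offset:offset + block_size]
            logical += len(block)
            digest = hashlib.sha256(block).digest()
            if digest not in seen:  # only the first copy of a block is stored
                seen.add(digest)
                physical += len(block)
    return logical, physical

def make_block(tag):
    """Build one 4096-byte block whose content is determined by an integer tag."""
    return tag.to_bytes(4, "big") * (BLOCK_SIZE // 4)

# Hypothetical fleet: every VM shares the same 16 OS blocks and adds 1 unique block.
base_os = b"".join(make_block(i) for i in range(16))
vms = [base_os + make_block(1000 + k) for k in range(200)]

logical, physical = dedupe_stats(vms)
print(f"logical {logical} bytes, physical {physical} bytes, "
      f"ratio {logical / physical:.1f}:1")
```

The same ratio applies to whatever sits above the block store, which is why carrying dedupe through to the cache saves IOPS as well as space.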
Note that you can't *change* the file (because that would just split the files up again), but being able to read the file (when you couldn't before) or knowing that another copy exists elsewhere can be very useful knowledge.
If you can "generate a file" that can be deduplicated, then by definition you already know about the data in that file.
Unless you "deduplicate" the CPU work, that's not going to happen. ^^
Sure it does. CPU power is generally the _last_ thing you run out of in virtualised environments, and that's been true for years.
On a modern, Core i7-based server, you should be able to get 10+ "virtual desktops" per core on average, without too much trouble. IOPS and RAM are typically your two biggest limitations.
I don't know much about the subject, so forgive me if this is a dumb question, but in that scenario, if the data for a file becomes corrupted on the hard drive, say a critical system file, doesn't that mean that all VMs using it are pooched?
Yes, but a) this is something inherent to anything using shared resources, and b) there's not a lot of scope for such corruption to happen in a decent system (RAID, block-level checksums, etc).
I'd think it would also be easier to carry around a small netbook than a huge stack of papers on a bus or plane.
Of course, you'd have to weigh that against spending hours squinting at a tiny screen...
The answer would be "yes", to both questions. I even have a fancy-schmancy .ppt (it's all these people respond to) that shows the cost of ink vs. toner (all I have to do is fill in the make/model of the inkjet and laser printers), with yields for both. I put it in front of some "mountain" pictures (to help re-enforce the mountain of cash they'll be spending on the inkjet). Hard to put into words. Maybe there was too much info in the slides, but all they could focus on was "But, this one is cheaper right now!".
Out of sheer curiosity - since you've obviously run the numbers - what's the break even point between an inkjet and a laser ? How many pages ?
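For what it's worth, the break-even point is just the extra up-front cost divided by the per-page saving. A quick sketch follows; every figure in it is hypothetical (the thread never gives the real prices or yields), and amounts are in whole cents to avoid rounding surprises.

```python
def break_even_pages(inkjet_price, laser_price, inkjet_cost_per_page, laser_cost_per_page):
    """Pages after which the laser's total cost drops to or below the inkjet's.

    All amounts are integers in cents.
    """
    saving_per_page = inkjet_cost_per_page - laser_cost_per_page
    if saving_per_page <= 0:
        raise ValueError("laser is not cheaper per page, so there is no break-even point")
    extra_upfront = laser_price - inkjet_price
    if extra_upfront <= 0:
        return 0  # laser is cheaper from the first page
    return -(-extra_upfront // saving_per_page)  # ceiling division

# Hypothetical numbers: $60 inkjet at 10c/page versus $200 laser at 3c/page
pages = break_even_pages(6000, 20000, 10, 3)
print(pages)
```

Plug in the real purchase prices and cartridge price divided by rated yield for the two models being compared.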
(I've never been to Germany, but in other European countries, I've often gotten funny looks for asking for tap water.)
It's a snobbery thing. It's like going to a restaurant and asking for cask (box ?) wine.
Actually - in Server 2008 R2, there's no other way to do certain options within the machine. MS has tried to force you to have your system as part of a domain, and be managed by all the gooeyness that implies, but sometimes there are needs that require that a machine not be a domain member and not be accessible via a GUI. (I know!!! Shocker!!!)
What are you trying to do ? How are you trying to do it ?
MS has actually paid lip service to this with Server Core, but it's merely lip service. There's a splattering of shells available to accomplish certain tasks which can no longer be easily done through APIs (since they seem to have vanished in some cases, or become largely unusable in others - try changing network configuration, for instance).
I struggle to believe there isn't a programmatic way to change the network configuration. There certainly is a commandline way (using netsh).
It's pretty simple - Because you cannot truly run "Least Privileges" nor Audit a particular user's actions in a service that runs as SYSTEM or any of its elevated cousins. This is how worms wreak havoc. Get a buffer overflow exploit on any windows service running under one of these elevated accounts, and the machine is completely owned.
Firstly, that explanation has little relevance to your original comment. Secondly, how is this worse than root on any UNIX system, or any highly privileged account on any system ? If some process running at high privileges is taken over, then the system is more vulnerable than it would be otherwise. This is a basic fact of any multiuser design, not some design flaw only present in Windows.
Why are you running your services as SYSTEM in the first place ?
Actually, that brings up a point. Since this is about security flaws in their distribution, wouldn't this make them liable if something happened to your server? "They gave me faulty software which THEY KNEW WAS FAULTY because they wanted to charge me $xx to get the fix"...?
Only if they knew that specific fault existed and would impact you before selling it - and even that assumes the standard "no liability" disclaimers could be circumvented.
Design your user interfaces with the same care and diligence as you define your application's architecture and you won't need to fiddle with it every week.
So since the fundamental Windows UI has remained basically unchanged since 1995, I guess that means Microsoft did a good job ?
In fact the ONLY OS I have used that has remained stable in its UI has been OSX
The OS X UI has changed significantly since its first release. Expose is the most obvious example, but there are many more.
Windows has changed radically every release.
Your bias is laughably obvious when you say that Windows has "changed radically" between each release, while also saying neither OS X nor Linux have changed.
To anyone using Windows (XP, Vista or 7) right now, go ahead and open up an Explorer window, and type in ftp:// followed by any url.
I just tried this on a Windows 7 PC. An unresolvable name returns an error in a few seconds. A resolvable name, with no FTP server on the other end, produces the wait cursor, but I can click on another drive or folder and it responds in a few seconds (other Explorer windows are instantly responsive while this happens). A working FTP site (eg: ftp.microsoft.com) opens with a listing pretty much instantly.
Where's the problem here ?
The old paradigms from the 20th century do not work anymore because they were not designed for parallel processing.
The first multiprocessor machine appeared in 1961. Do you really think computer science hasn't changed since then ?
I only wish I could find an i7 MB with a SAS controller, I may have to buy a used one :(
Here.
1.) Microsoft decides it's finally time to re-design Windows from the ground up. They hold 95% of the market, so it is they who must make this change.
Why would they do this ? They already have an OS that was designed from day 1 for multiprocessor systems and is, at worst, on par with every alternative platform.
Before too long you'll end up with a hypervisor-OS that merely tells the applications which CPU it gets and where its memory is. The OS will be shoved further into the background, as it should be, and everything will be just plain cooler.
What do you think happens _now_ ?
If we want efficient code, we have to figure out ways to reward the programmers that write it. I don't see any sign that people anywhere are interested in doing this. Anyone have suggestions for how it might be done?
Defining what you mean by "efficient" (and why that definition should be the primary goal of software development) would probably be a good place to start.
Completely wrong on the first two counts, and the third is irrelevant. Here is why: Grand Central Dispatch is not a "thread manager" in the sense you mean. It makes and manages threads on its own, rather than relying on the programmer to handle them inside the application. That is what it is all about and what makes it different, as you would know if you knew even the slightest thing about it.
So you're saying GCD can take existing, unmodified applications and inherently unparallelisable problems and make them work across multiple processors ?
Therefore, my comment has everything to do with what you wrote. No, the programmer does NOT have to design the program around threading. That's the point.
Yes, they do. GCD can't magically make arbitrary code multithreaded and parallel. It can make it easier for the developer to make their code multithreaded, but it can't do it for them. Ie: the developer still has to know what they're doing, and whatever it is they're trying to do has to actually benefit from multithreading.
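A small sketch of that distinction, using Python's standard `ThreadPoolExecutor` as a stand-in for GCD or the Windows thread pool (it is not either of those APIs, and in CPython the GIL means this particular CPU-bound example will not actually run faster; the point is the structure). The function names are made up for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_squares(chunk):
    """Independent unit of work: needs nothing from any other chunk."""
    return sum(n * n for n in chunk)

def parallel_sum_squares(numbers, workers=4):
    # The developer decides how to split the problem; the pool only
    # creates the threads and runs the pieces it is handed.
    size = max(1, len(numbers) // workers)
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_squares, chunks))

def iterate_map(x, steps):
    # Every step needs the previous result, so there are no independent
    # pieces to hand to a pool, however clever the pool is.
    for _ in range(steps):
        x = (x * x + 1) % 1_000_003
    return x

numbers = list(range(1000))
print(parallel_sum_squares(numbers) == sum_squares(numbers))
```

The pool removes the thread bookkeeping from `parallel_sum_squares`, but someone still had to write the chunking, and nothing comparable can be written for `iterate_map`.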
Irrelevant.
It's completely relevant. The best application design and threading in the world won't see much benefit if the OS scheduler is crap.
GCD is NOT just a "thread scheduler". It creates and manages threads on its own, even in applications that are not written to be threaded. Does Windows do that? Any flavor of Windows? No. They do not.
Vista's (and later) "Thread Pool" encompasses some of the functionality of GCD. Parallel Extensions is another part of Microsoft's similar technology. Or there are third-party libraries.
That is simply not true.
It's very true. However, this:
In fact, that is what Grand Central Dispatch (Snow Leopard, OS X 10.6) is all about. The OS handles the threads, not the programmer.
Has nothing to do with what I wrote.
Not only does it work, it is the wave of the future. Eventually, all machines and OSes will work that way because no programmer wants to jump through outrageous hoops to deal with 128 cores. Or even 4.
The programmer still has to design his programs to do useful things with those threads. That was my point. The best scheduler in the world is useless when confronted with a single-threaded application (or one that is effectively so), or a non-parallelisable problem.
Windows does just fine scheduling across multiple CPUs, has been doing it since before OS X even existed, and was designed from day 1 for it. It can't do anything about poorly written applications, however, and neither can any other OS.