RAID is toast. I don't care WHAT RAID you are running, none of them can withstand a loss of 50% of the drives.
Really? I used to do that as a routine acceptance test for clusters. The only times it failed for real were when we'd screwed up something.
For that to work, you have to rigorously separate RAID mirrors into their own trays so that a whole tray failure (or cable, as you said) only takes one mirror down. For something like 10, 50, 60 you just make sure all of one side is on one array and all of the other on another (or if you have more than 2 arrays, that you separate them out into pairs with one used for one side and one for another).
Physical separation helps as well, so that you don't accidentally unplug A while starting servicing on B. That exact scenario is one of the canonical HA oopses.
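The tray separation rule above can be checked mechanically. This is only a rough sketch: the layout format, the tray names and both function names are made up for illustration, not taken from any real RAID tool.

```python
def mirrors_sharing_a_tray(mirrors):
    """Return the mirrors that have two or more sides in the same tray.

    `mirrors` maps a mirror name to a list of (tray, slot) tuples,
    one tuple per side of the mirror (hypothetical format).
    """
    bad = []
    for name, sides in mirrors.items():
        trays = [tray for tray, _slot in sides]
        if len(set(trays)) < len(trays):
            bad.append(name)
    return sorted(bad)


def survives_tray_loss(mirrors, dead_tray):
    """True if every mirror keeps at least one side outside the dead tray."""
    return all(
        any(tray != dead_tray for tray, _slot in sides)
        for sides in mirrors.values()
    )


# Hypothetical layout: md2 has both sides in trayA, so losing trayA kills it.
layout = {
    "md0": [("trayA", 0), ("trayB", 0)],
    "md1": [("trayA", 1), ("trayB", 1)],
    "md2": [("trayA", 2), ("trayA", 3)],
}

print(mirrors_sharing_a_tray(layout))
print(survives_tray_loss(layout, "trayA"))
print(survives_tray_loss(layout, "trayB"))
```

Run something like this against the real slot map before the "pull a whole tray" acceptance test, not after.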
I can't stress that enough. Software and semi-software RAID is a joke.
Not until the hardware fails and you need the data that was on there but not on the backup (or realized the backup failed a long time ago...).
For performance, yes, hardware is fastest. For reliability though, software RAID is better (hardware RAID can have interesting firmware version issues).
Old SAN / Cluster folks believe in belt+suspenders. I.e., often, use both.
Use Software RAID 1 across a couple of LUNs (or separate controllers / drive array stacks, for non-SAN environments). Build the LUNs with internal RAID (5, 6, hot spares, figure out your rebuild times, etc.)
Also - hugely common failure is that the operators aren't properly monitoring the underlying hardware RAID drive status. You need to know immediately when a drive fails even if there's RAID6 and a couple of hot spares in the array. When I worked for a VAR on clusters, I can't count the number of times I arrived and found that they'd had 2, 3, 4 failures nobody noticed, and were one more failure away from catastrophic data loss...
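Hardware RAID status has to be pulled with whatever tool the controller vendor ships, so there is no generic example for that. As a stand-in, here is a rough sketch of the same "notice the first failure" check for Linux software RAID, parsing text in the /proc/mdstat format. The sample text and the function name are invented for illustration, and a real check would read the live file and page someone.

```python
import re

ARRAY_RE = re.compile(r"^(md\d+)\s*:")
# Matches e.g. "[2/1] [U_]": 2 devices configured, 1 active, "_" marks a missing one.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

# Made-up sample in /proc/mdstat format.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sdb2[1](F) sda2[0]
      976630336 blocks [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that are missing at least one member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            total, active = int(status.group(1)), int(status.group(2))
            if active < total or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


print(degraded_arrays(SAMPLE))
```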
There is a very slight bathtub type curve - all numbers rounded, it's about 3% AFR in the first quarter (i.e. about 0.75% failures in the first quarter) and 2% for drives in the 3-12 month range (i.e. about 1.5% failures over those nine months). If I read the statistics presentation there right, 33% of first year failures look to happen in the first quarter, which is a detectable but minor initial higher rate. That's dwarfed by 1-2 year AFR (about 8%) and 2-3 year AFR (about 9%), but drops slightly after that.
They presented the AFRs rather than the cumulative losses in an initial cohort per quarter/year, which would be slightly clarifying, but whichever way they did the analysis it's about like that.
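To make the AFR arithmetic above concrete, here is a small sketch that turns the rounded AFRs into per-period failure fractions and cumulative cohort losses. It assumes the simple "AFR times period length" approximation and uses only the rounded figures quoted above; the function names are mine.

```python
# (label, annualized failure rate, period length in years), rounded figures from above
periods = [
    ("0-3 months", 0.03, 0.25),
    ("3-12 months", 0.02, 0.75),
    ("1-2 years", 0.08, 1.0),
    ("2-3 years", 0.09, 1.0),
]


def period_failure_fraction(afr, years):
    """Approximate fraction of surviving drives that fail during the period."""
    return afr * years


def cumulative_losses(periods):
    """Fraction of the original cohort lost by the end of each period."""
    surviving = 1.0
    out = []
    for label, afr, years in periods:
        surviving *= 1.0 - period_failure_fraction(afr, years)
        out.append((label, 1.0 - surviving))
    return out


first_quarter = period_failure_fraction(0.03, 0.25)
rest_of_year = period_failure_fraction(0.02, 0.75)
share_in_first_quarter = first_quarter / (first_quarter + rest_of_year)

print(f"first quarter: {first_quarter:.2%}, rest of year one: {rest_of_year:.2%}")
print(f"share of first-year failures in first quarter: {share_in_first_quarter:.0%}")
for label, lost in cumulative_losses(periods):
    print(f"cumulative loss through {label}: {lost:.1%}")
```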
I have worked for an OEM that installed about 30,000 drives a year and for end users with 10,000-drive environments, and built out new 1,000 HDD and 600 SSD environments in the last year. I know all about static, having had the manufacturer-level training on how not to zap.
It's not just static. Some drives come with SMART errors (or bad blocks that matter), despite $MFGR assurances. Some of the failures develop in the factory and get shipped anyways as unlikely to get worse, some develop while being packaged or shipped or unpackaged. Run SMART data collection across hundred-drive collections (or thousands or more) and you get a lot of useful and scary info.
Also, there are well documented runs of drives - specific models, time ranges, factories involved etc - which all just blew up. Also happens to chips sometimes - I've been seriously bitten by bad CPUs from Sun and Intel, and by support chips from several vendors. Also RAM going bad.
One prototype CPU literally melted the system down: all the plastic nearby inside the casing melted and puddled on the bottom of the case, and the CPU label plastic was carbonized.
Doubling lifespan that way requires that you only use half the disk capacity.
I have burned out a Major Name Brand SLC SSD with a high traffic OLTP DB in eight months. I have heard the same from Large Internet Companies which tested these for internal use. There are ongoing independent reliability expert studies in FAST, HOTDEP, other conferences which are uniformly highly skeptical of vendors' claims on SSD lifetime.
If you have not actually tested the drive out to six years service, run an accelerated pilot test unit out ahead of your main prod usage, to give you the canary warning.
I've tried to do large database server farm tests on modern enterprise SSDs with TRIM, the best wear load leveling, SLC, etc. They go "poof" at moderate (few months, for my loads) lifetimes.
IOPS x Lifetime / price is a metric I find useful. Unfortunately, it makes SSD look even worse than it does just on a price basis 8-(
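A minimal sketch of that metric, in case anyone wants to plug in their own numbers. The device names and figures below are placeholders chosen to come out equal, not measurements; substitute measured IOPS, the service life you actually observed under your load, and real prices.

```python
def iops_lifetime_per_dollar(iops, lifetime_years, price_usd):
    """IOPS x lifetime / price. Higher is better."""
    if price_usd <= 0:
        raise ValueError("price must be positive")
    return iops * lifetime_years / price_usd


# Placeholder inputs only (hypothetical devices, not real products or data).
candidates = {
    "device_a": {"iops": 1000, "lifetime_years": 2.0, "price_usd": 500.0},
    "device_b": {"iops": 200, "lifetime_years": 5.0, "price_usd": 250.0},
}

scores = {name: iops_lifetime_per_dollar(**spec) for name, spec in candidates.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.1f} IOPS-years per dollar")
```

The point of the metric is that lifetime is the observed lifetime under your write load, not the number on the vendor data sheet.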
Not really improved. I burned out a REALLY GOOD (best available) SLC SSD in 7 months with a mirrored production workload at a previous jobsite not that long ago.
Poof. All gone.
At the FAST conference, there was yet another presentation on SSD lifetime burnout mechanisms, with the news not actually improving in the slightest so far on life. SLC is not good enough; MLC is toast in write-intensive apps.
Phase-change memory or one of the others, with millions of write cycles per bit, may pull this out, but Flash is not proving good enough for enterprises.
The basic architecture should be cheap to fabricate in bulk. It's lines of wires, a layer running in one direction, a thin film of the memristive material, then a layer of wires on top running at right angles. Every intersection point is a bit.
DRAMs involve all sorts of careful operations to create a trench or stack, fill it with a capacitor, run the lines in and out, etc. Much more complicated on a per-bit basis. Many more things can go wrong. Memristors are pretty much the simplest to implement circuit element I've seen come along in a long long time.
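As a toy illustration of the crossbar layout described above (two layers of wires at right angles, one bit per intersection), here is a purely logical model. It says nothing about the electrical behavior, and the class and method names are invented for this sketch.

```python
class Crossbar:
    """Toy model: word lines are rows, bit lines are columns, each crossing is one bit."""

    def __init__(self, word_lines, bit_lines):
        self.word_lines = word_lines
        self.bit_lines = bit_lines
        self.cells = [[0] * bit_lines for _ in range(word_lines)]

    @property
    def capacity_bits(self):
        # One bit per wire intersection.
        return self.word_lines * self.bit_lines

    def write(self, word, bit, value):
        self.cells[word][bit] = 1 if value else 0

    def read_word(self, word):
        # Selecting one word line reads every bit line crossing it at once.
        return list(self.cells[word])


xbar = Crossbar(word_lines=4, bit_lines=8)
xbar.write(2, 5, 1)
print(xbar.capacity_bits)
print(xbar.read_word(2))
```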
The key question is performance. How many write cycles can the fabbed chips survive before bits start going bad / getting stuck? Typical MLC flash is 10-100 thousand, very good SLC flash 100k to 1m cycles. This is not enough that you can ignore the write lifetime issues, and today's SSDs will wear out if written very actively over long periods of time.
Memristors (and Phase-Change RAM, and some of the other options out there for new non-volatile RAM) offer potentially very long life. But it's not clear if the produced chips will be 1m and up, 10m, 100m, or what.
At some point the wearout time is longer than the device's overall service life and you stop caring about wear leveling, etc. You just detect bit errors and map around them, and a few bit errors happen over device lifecycles. The wear leveling now used is a big deal on SSDs and a major factor in their performance (or not).
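A back-of-the-envelope version of that comparison, as a sketch only. It assumes perfect wear leveling, so it gives an idealized upper bound; the capacity, write rate, service life and write amplification figures are made-up inputs, and only the 100k to 100m cycle counts come from the ranges discussed above.

```python
def ideal_wearout_days(capacity_bytes, cycles_per_cell, bytes_written_per_day,
                       write_amplification=1.0):
    """Days until every cell has used up its write cycles, assuming perfect wear leveling."""
    total_writable = capacity_bytes * cycles_per_cell
    return total_writable / (bytes_written_per_day * write_amplification)


def can_ignore_wear(service_life_days, wearout_days):
    """True once the device would be retired before it could wear out."""
    return wearout_days > service_life_days


# Hypothetical device: 64 GB, written at 2 TB/day, planned five year service life.
CAPACITY = 64e9
WRITES_PER_DAY = 2e12
SERVICE_LIFE_DAYS = 5 * 365

for cycles in (100_000, 1_000_000, 100_000_000):
    days = ideal_wearout_days(CAPACITY, cycles, WRITES_PER_DAY, write_amplification=10.0)
    print(f"{cycles:>11,} cycles: {days:,.0f} days, ignore wear: "
          f"{can_ignore_wear(SERVICE_LIFE_DAYS, days)}")
```

Real devices die earlier than this bound, since leveling is never perfect and controllers fail too, so treat the output as a ceiling rather than a prediction.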
Also very important is how fast the chips are. Should be fast - you fire a short AC pulse down one word line, read the bits out the bit lines. Either the resistor resists or it doesn't. Word line enable transistor delays and read amp sensing delays of less than 10x transistor cycle time at a given fab size/process are likely, which is pretty good. Potentially this is faster than DRAM, more like SRAM, but not all fab / design approaches would get there (and not all potential fab processes).
Secondarily, how fast is a write cycle. SRAM writes very very quickly. DRAM reasonably quickly. Memristor? Should be fast, but there are current and material breakdown concerns.
Fundamentally, we need to see the chips. When we see chip spec sheets, it tells us how useful these are.
It could range from "replaces FLASH at certain densities or write life requirements" to "replaces all FLASH completely" to "replaces a lot of DRAM" to "becomes the only memory in use between CPU caches and hard disks". Potentially, it could be cheap enough to even replace hard disks.
We've had computers in recent memory (1980s, early 1990s) which were operating without all the data cache tiers we now have to deal with in computer architecture. Large chunks of computer architecture now are nearly all about efficiently managing the tiered data storage - CPU registers to CPU cache, CPU cache to main memory, main memory to disk. There are factors of 10 speed difference or more between each tier (more from DRAM to disk). Fast reliable nonvolatile RAM could flatten that all out a lot. FLASH isn't good enough due to write lifecycle limits. Memristors, if the performance comes in near the top of the possible range, could. Will they? I'm not working for HP or Hyundai, I don't know what they've got. I'm preparing to design some systems which could flatten things; who knows if we'll actually get there with this tech. It could be a game changer, or it could be just another technology on the block.
The sad part is, datacenter power people are now on the "avoid stranded power" trip trying to increase power efficiency (UPSes and PDUs running at 80% are much more efficient than those running at 50%). They don't seem to understand or be willing to provision to support one leg actually failing completely.
They're handling the "one server out of tens has a power supply failure on one leg" failure, but not the "the whole rack flips to only using B power due to X"...
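The arithmetic behind that complaint, as a quick sketch. It assumes a two-leg (A/B) feed with equal capacity per leg and all load moving to the surviving leg; the kW figures and function names are illustrative only.

```python
def surviving_leg_utilization(load_a_kw, load_b_kw, leg_capacity_kw):
    """Utilization of the remaining leg if the other leg fails and all load moves over."""
    return (load_a_kw + load_b_kw) / leg_capacity_kw


def max_safe_utilization(legs=2):
    """Highest per-leg utilization that still survives the loss of one whole leg."""
    return (legs - 1) / legs


# Hypothetical rack: two 10 kW legs.
for per_leg_load in (8.0, 5.0):
    after = surviving_leg_utilization(per_leg_load, per_leg_load, 10.0)
    verdict = "overloaded" if after > 1.0 else "ok"
    print(f"{per_leg_load / 10.0:.0%} per leg -> {after:.0%} on the survivor ({verdict})")

print(f"max safe per-leg utilization with 2 legs: {max_safe_utilization():.0%}")
```

So running both legs at 80% to avoid stranded power means a whole-leg failure puts 160% on the survivor, which is exactly the case described above.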
For backed up to tape storage? Storage replicated to another, remote datacenter? Snapshotted at regular intervals?
SAN storage? NAS? Direct attach? On arrays with 10 drives, 100 drives, or 1000 drives?
Fast SAS or FC drives? SATA arrays? 5400 RPM? 7200? 10k? 15k?
If you're paying $360/GB/yr for low end storage that sucks. For very high end, with replication and snapshots and the fastest drives and so forth, that's pretty high, but not an order of magnitude high.
Standard interfaces are great. DIMMs are standard interfaces, usually only 1 device hop from the CPU, as opposed to drives which are often 8 or more, and several orders of magnitude of slowness away. Close enables fast in computer architecture.
Regarding the moving drives/dead system question... What's the hard part? You just move the "drive DIMM"...
The hot new solid state non-volatile memory technologies are phase-change memory (PRAM), memristors, ferroelectric RAM, resistive RAM.
Some of these technologies are much more area-efficient than Flash, and will stack in pseudo-3D chips reasonably well (memristors in particular should stack in full 3-D arrays very efficiently...).
The general observation that disks have the lead right now is true, but the other technologies close a lot of the gap, and the growth curves look very similar after that. Who knows if it ever gets cheap enough to completely replace disks in our lifetimes, but there is hope of seeing that.
That does entirely change the game on system architecture. Disks are slow and far away from the CPU. Solid state memory can be as close or nearly as close as DRAM, and if it doesn't require a lot of handholding on lifecycle management (wear rates etc - Flash is horrible here) then can be used and managed as a simple byte or block array rather than the whole "filesystem" crap we now use. We still may want POSIX like abstractions for parts of storage management, but life is so much easier if the back end store is just a block array we read/write than if it's really a spinning disk, behind a cache, behind a controller, behind a SATA/SAS bus, behind a controller, behind a PCI bus, behind a southbridge,....
One of my parents was a government contracts attorney, who did it professionally and taught it part time at local law schools.
Nobody likes the US Government as a customer. It's by far the most annoying customer for any tech company. The contracts will be 2-10x as hard to administer, 2-10x as much overhead as commercial contracts, and you get sued a lot more (usually over the proposals/bidding, but it's still a suit).
But, assuming it is, can you realistically charge someone with manslaughter for deaths caused by a natural disaster?
Sure, under some circumstances, if the disaster was predictable (flood, earthquake, landslide, hurricane) and someone didn't take normal or minimum non-negligent steps to avoid putting others in danger, or lied about being ready.
Lying about your building being seismically upgraded, for example, and then having it fall down.
The scientists' correct response is "There's a 100% chance of multiple earthquakes in this location over the next 1000 years. We have no way of knowing when or how many at this time. Live in unreinforced masonry buildings at your own risk."
Mmmm.... 40+ years after going out of style as "Hopelessly Obsolete", Delay Lines return to the cutting edge.
The Great Zero Challenge rules specifically exclude disassembly of the drive; all the bit-recovery mechanisms discussed in the literature require you to disassemble the drive and use custom heads to scan the surface magnetism map.
I.e., the contest is totally missing the point on what data recovery pros (i.e., the NSA and so forth) said they'd do if they had to scan disks to recover overwritten data.
It's hard to think of a less useful contest.
Oh? A plane with a single fuselage, fuselage front engine intakes, canards, a delta wing, resembles an aircraft with separate engine pods on a flat center section, underwing engine intakes, a V-tail?
There's nothing configurationally similar between those aircraft. Nothing.
There's a passing similarity with the FB-22 bomber proposal, but that didn't have canards, just a delta, and was never more than a paper proposal (no detailed design or prototype).
Far more likely it's based on stolen US plans.
For what?
There is no US stealth fighter design with that size or characteristics.
The technology they used to get to space was 90+% Russian
Common fallacy - they bought a Soyuz and a lot of engineering time, and the vehicles are similar in configuration and concept, but the Chinese vehicles are essentially a whole new design and used nearly no Soyuz components other than the docking mechanism and imported space suits (I think that was it).
Looks similar doesn't mean design stolen from. Chinese engineers did most of the hard work on all of the hardware with those two noted exceptions.
The launch vehicle was all theirs.
There are plenty of tax havens to go off to and live in, if you feel that way.
Problem is, none of them are a large, expanding, dynamic economy.
They exist for a reason, but modern economics does as well - it works, and it wins out over time at producing the most benefit for the most people (including the rich, who at times object to how it works, but who are far far FAR richer in the west than elsewhere...).
The current system is not entirely fair or reasonable by any one group's definitions of those terms, and certainly sucks in many ways. Welcome to the Real World. It sucks, but obviously less so than any other ideas we've tried so far. See similar observations about western democracy as a government model.
When you have a model that you can adequately explain and defend as holistically better, you'll get converts. I have yet to see any critic who can explain an alternate model in detail, because most of the critics don't understand economies well enough to design and engineer one. So give it your best shot. Perhaps you have the cojones that all the professional issue radicals and far-stream economics professionals lack, new ideas and the brains to link them into a system and the communications skills to explain it. Go for it!
But not on /.
As far as I can tell, the "science" of economics has predicted exactly zero major economic events over the course of human history. Not a great track record. Not a source of confidence. Not a science, really.
Prediction of really dynamic events - the long term weather, economics, etc - is really hard.
What you can do, scientifically, is analyze how different factors affect each other over time. You can predict that conditions are ripe for a type of event (inflation, unemployment spike, a market bubble, recession, etc). You can predict the course of an economic shift based on inputs (bailouts, government investment, policy changes, money supply changes, consumer confidence and employment, etc).
Being able to say "The bond market? It's going to collapse on Tuesday," is really hard.
Chastising economists because the economy is too complicated for us to do mid to long term projections accurately yet is unreasonable. They understand at micro, intermediate, macro, and international levels. They can show interactions and trends and make useful predictions. But they can't model the whole thing on an ongoing basis.
The Wikipedia article is intentionally not useful for designing anything.
However, we do have an online textbook (at roughly upper-division engineering/physics college student difficulty level) on the subject:
http://www.nuclearweaponarchive.org/Nwfaq/Nfaq0.html
In terms of what's been published online -
* There's a book with precise dimensional drawings and measurements on the Little Boy type Uranium gun type bomb. Not online, but purchasable at Amazon. It's not "a blueprint" but any competent draftsman / mechanical engineer could produce blueprints to build from, given the book.
* The dimensions and materials of all the layers of the Fat Man / Mark 1 type nuclear weapons are published in numerous sources. The precise shape of the lens in the outer layer has not been, though a rough back-of-the-envelope version of the equation for the lens shape is published. A precise and buildable lens shape would require someone with a fair talent in explosives engineering and shockwave engineering, especially someone aware of what the published equation left out, but the Fat Man design is fundamentally so brick-solid-simple that one could get the lens fairly imprecise and still have a functional weapon.
Some effort has gone into not actively publishing newer weapon design details in public. But that's not nearly the same as "they're not out". A number of more modern weapons are understood at least close to the level Fat Man and Little Boy are. There are accurate internal component photos declassified for some weapons and parts. There are detailed hands-on descriptions of some parts, by people who worked on them. Check out the Wikipedia article on the B61 bomb, for example; the fission and fusion components were shown in a declassified film (but not the explosives to compress the fission parts).
Not for nuclear weapon design information. That's "Restricted Data", see DOE classification rules. Accessed with a DOD TS-CNWDI (Top Secret - Critical Nuclear Weapon Design Information) or DOE Q-clearance plus appropriate Sigma compartment clearances for the specifics you're looking at.
No, it's not.
Differential backups take a single filesystem and see what changed, either at the file level (whole changed/updated/new files) or block level (changed blocks within files).
Block level deduplication is noticing that the storage appliance on which you back up 100 desktops and 10 servers has 50 copies of the same version of each data block in each Microsoft OS file from XP, 25 from Win 7, and 35 from Fedora, and only storing 1 copy of each of those blocks rather than 100 separate ones. It's returning those blocks to the usable storage pool and remapping without having to "compress" anything, not having to rewrite the backup data images, etc. It's just saying "This is block 3 of the binary for Internet Explorer 8, and I already have a copy of that", for each and every common block out there.
You still have to upload the blocks, and the system still needs to scan them to notice the duplication, but it's a lot more than "oh, compression".
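For anyone who wants to see the difference in code, here is a minimal sketch of block level dedup across several backup images. It assumes fixed-size 4 KB blocks and SHA-256 as the block fingerprint, which is one common way to do it and not necessarily what any particular appliance does; the class name and the sample data are invented.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size


class DedupStore:
    def __init__(self):
        self.blocks = {}  # fingerprint -> block bytes, each unique block stored once
        self.images = {}  # backup name -> ordered list of fingerprints

    def put(self, name, data):
        """Scan every uploaded block, but only store the ones not seen before."""
        fingerprints = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            fingerprints.append(digest)
        self.images[name] = fingerprints

    def get(self, name):
        """Rebuild a backup image from its block map."""
        return b"".join(self.blocks[d] for d in self.images[name])


# Made-up data: four "OS" blocks shared by every machine, plus one unique block each.
os_blocks = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
store = DedupStore()
backups = {}
for n in range(3):
    backups[f"desktop{n}"] = os_blocks + bytes([10 + n]) * BLOCK_SIZE
    store.put(f"desktop{n}", backups[f"desktop{n}"])

logical = sum(len(v) for v in store.images.values())
print(f"logical blocks: {logical}, stored blocks: {len(store.blocks)}")
```

Note that every block still gets uploaded and hashed, which matches the point above: the savings are in what gets stored, not in what gets scanned.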
Thirded. Data Domain (now part of EMC) really started the commercial use of this...
The basic architecture should be cheap to fabricate in bulk. It's lines of wires, a layer running in one direction, a thin film of the memristive material, then a layer of wires on top running at right angles. Every intersection point is a bit.
DRAMs involve all sorts of careful operations to create a trench or stack, fill it with a capacitor, run the lines in and out, etc. Much more complicated on a per-bit basis. Many more things can go wrong. Memristors are pretty much the simplest to implement circuit element I've seen come along in a long long time.
The key questions are performance. How many write cycles can the fabbed chips survive before bits start going bad / getting stuck? Typical MLC fash is 10-100 thousand, very good SLC flash 100k to 1m cycles. This is not enough that you can ignore the write lifetime issues, and today's SSDs will wear out if written very actively over long periods of time.
Memristors (and Phase-Change RAM, and some of the other options out there for new non-volatile RAM) offer potentially very long life. But it's not clear if the produced chips will be 1m and up, 10m, 100m, or what.
At some point the device's overall service life is shorter than the time it would take to wear the cells out, and you stop caring about wear leveling, etc. You just detect bit errors and map around them, and a few bit errors happen over device lifecycles. The wear leveling now used is a big deal on SSDs and a major factor in their performance (or not).
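A rough back-of-envelope version of that argument, as a sketch only. It assumes perfect wear leveling (every cell absorbs an equal share of writes), and the capacity and write-rate figures in the usage note are invented for illustration; `years_to_wear_out` is a hypothetical helper, not from any vendor tool.

```python
def years_to_wear_out(capacity_gb, cycles_per_cell, write_gb_per_day,
                      write_amplification=1.0):
    """Idealized years until every cell has used up its rated write cycles.

    Assumes perfect wear leveling. write_amplification is an assumed
    multiplier for extra internal writes the device performs.
    """
    total_writable_gb = capacity_gb * cycles_per_cell
    days = total_writable_gb / (write_gb_per_day * write_amplification)
    return days / 365.0
```

For a hypothetical 100 GB device written at 1000 GB/day, 10k cycles per cell gives under three years, while 100m cycles per cell gives tens of thousands of years, far past any plausible device lifetime. That second case is the point where wear leveling stops mattering.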
Also very important is how fast the chips are. Should be fast - you fire a short AC pulse down one word line, read the bits out the bit lines. Either the resistor resists or it doesn't. Word line enable transistor delays and read amp sensing delays of less than 10x transistor cycle time at a given fab size/process are likely, which is pretty good. Potentially this is faster than DRAM, more like SRAM, but not all fab / design approaches would get there (and not all potential fab processes).
Secondarily, how fast is a write cycle? SRAM writes very very quickly. DRAM reasonably quickly. Memristor? Should be fast, but there are current and material breakdown concerns.
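As a purely logical sketch of the crossbar layout described above (one bit per word-line/bit-line intersection), something like this captures the addressing model. It deliberately ignores everything that matters electrically: sneak paths, analog sensing, pulse timing, write currents. The `Crossbar` class is made up for illustration.

```python
class Crossbar:
    """Toy model of a crossbar array: one bit per wire intersection."""

    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols
        # True = low-resistance state (conducts, reads as 1).
        self.cells = [[False] * cols for _ in range(rows)]

    def write_bit(self, row, col, value):
        """Set the resistance state at one intersection."""
        self.cells[row][col] = bool(value)

    def read_word(self, row):
        """Pulse one word line and sense every bit line at once."""
        return [1 if low_r else 0 for low_r in self.cells[row]]
```

A read returns a whole row in one operation, which is why the read latency argument comes down to word line enable plus sense amp delay.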
Fundamentally, we need to see the chips. When we see chip spec sheets, they will tell us how useful these are.
It could range from "replaces FLASH at certain densities or write life requirements" to "replaces all FLASH completely" to "replaces a lot of DRAM" to "becomes the only memory in use between CPU caches and hard disks". Potentially, it could be cheap enough to even replace hard disks.
We've had computers in recent memory (1980s, early 1990s) which were operating without all the data cache tiers we now have to deal with in computer architecture. Large chunks of computer architecture are now nearly all about efficiently managing the tiered data storage - CPU registers to CPU cache, CPU cache to main memory, main memory to disk. There are factors of 10 speed difference or more between each tier (more from DRAM to disk). Fast reliable nonvolatile RAM could flatten that all out a lot. FLASH isn't good enough due to write lifecycle limits. Memristors, if the performance comes in near the top of the possible range, could. Will they? I'm not working for HP or Hyundai, I don't know what they've got. I'm preparing to design some systems which could flatten things, but who knows if we'll actually get there with this tech. It could be a game changer, or it could be just another technology on the block.
The sad part is, datacenter power people are now on the "avoid stranded power" trip trying to increase power efficiency (UPSes and PDUs running at 80% are much more efficient than those running at 50%). They don't seem to understand or be willing to provision to support one leg actually failing completely.
They're handling the "one server out of tens has a power supply failure on one leg" failure, but not the "the whole rack flips to only using B power due to X"...
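The arithmetic behind that complaint, as a minimal sketch. It assumes a dual-fed (A/B) rack where both legs have equal capacity and the entire load of a failed leg lands on the survivor; the function names are made up for illustration.

```python
def load_after_leg_failure(load_a_pct, load_b_pct):
    """Load on the surviving leg, as a percentage of its capacity.

    Assumes equal-capacity legs and that all load from the failed
    leg transfers to the survivor.
    """
    return load_a_pct + load_b_pct


def survives_leg_failure(load_a_pct, load_b_pct, limit_pct=100.0):
    """True if the surviving leg stays at or under limit_pct of capacity."""
    return load_after_leg_failure(load_a_pct, load_b_pct) <= limit_pct
```

With both legs at 50%, losing one puts the survivor at 100%, which just holds. With both at the "efficient" 80%, the survivor is asked for 160% and trips. Lower `limit_pct` if your breakers or PDUs are derated below their nameplate rating.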
For storage that is backed up to tape? Replicated to another, remote datacenter? Snapshotted at regular intervals?
SAN storage? NAS? Direct attach? On arrays with 10 drives, 100 drives, or 1000 drives?
Fast SAS or FC drives? SATA arrays? 5400 RPM? 7200? 10k? 15k?
If you're paying $360/GB/yr for low end storage that sucks. For very high end, with replication and snapshots and the fastest drives and so forth, that's pretty high, but not an order of magnitude high.
Standard interfaces are great. DIMMs are standard interfaces, usually only 1 device hop from the CPU, as opposed to drives which are often 8 or more, and several orders of magnitude of slowness away. Close enables fast in computer architecture.
Regarding the moving drives/dead system question... What's the hard part?
You just move the "drive DIMM"...
The hot new solid state non-volatile memory technologies are phase-change memory (PRAM), memristors, ferroelectric RAM, resistive RAM.
Some of these technologies are much more area-efficient than Flash, and will stack in pseudo-3D chips reasonably well (memristors in particular should stack in full 3-D arrays very efficiently...).
The general observation that disks have the lead right now is true, but the other technologies close a lot of the gap, and the growth curves look very similar after that. Who knows if it ever gets cheap enough to completely replace disks in our lifetimes, but there is hope of seeing that.
That does entirely change the game on system architecture. Disks are slow and far away from the CPU. Solid state memory can be as close or nearly as close as DRAM, and if it doesn't require a lot of handholding on lifecycle management (wear rates etc - Flash is horrible here) then it can be used and managed as a simple byte or block array rather than the whole "filesystem" crap we now use. We still may want POSIX-like abstractions for parts of storage management, but life is so much easier if the back end store is just a block array we read/write than if it's really a spinning disk, behind a cache, behind a controller, behind a SATA/SAS bus, behind a controller, behind a PCI bus, behind a southbridge, ....
One of my parents was a government contracts attorney, who did it professionally and taught it part time at local law schools.
Nobody likes the US Government as a customer. It's by far the most annoying customer for any tech company. The contracts will be 2-10x as hard to administer, 2-10x as much overhead as commercial contracts, and you get sued a lot more (usually over the proposals/bidding, but it's still a suit).
Sure, under some circumstances, if the disaster was predictable (flood, earthquake, landslide, hurricane) and someone didn't take normal or minimum non-negligent steps to avoid putting others in danger, or lied about being ready.
Lying about your building being seismically upgraded, for example, and then having it fall down.
The scientists' correct response is "There's a 100% chance of multiple earthquakes in this location over the next 1000 years. We have no way of knowing when or how many at this time. Live in unreinforced masonry buildings at your own risk."