IBM Promises More Memory In The Same Space
dcallaghan was among the many readers to write with news of IBM's announcement of new memory technology. The upshot seems to be on-the-fly compression in hardware, taking the tack of RamDoubler and other software compression utilities, but moving the actual data squashing into dedicated (fast) chips. I hope this leaks out of "server only" land soon; I'd love to have 256MB for the price of 128 -- this would be especially nice with pricey notebook memory.
Compression CANNOT guarantee anything better than 1:1 ratio - it is ENTIRELY dependent on the data.
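To make the point concrete, here is a minimal sketch using Python's zlib as a stand-in compressor. It is an assumption for illustration only; nothing in the announcement says MXT uses deflate, and the 4 KB block size and sample data are made up.

```python
import os
import zlib

def ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size (default level)."""
    return len(data) / len(zlib.compress(data))

PAGE = 4096  # hypothetical block size, chosen only for the demo

zeros = bytes(PAGE)                                              # best case: all zero bytes
text = (b"the quick brown fox jumps over the lazy dog " * 100)[:PAGE]  # repetitive text
noise = os.urandom(PAGE)                                         # worst case: random bytes

for name, data in (("zeros", zeros), ("text", text), ("noise", noise)):
    print(f"{name:5s} ratio = {ratio(data):.2f}:1")
```

The zero-filled and repetitive blocks shrink dramatically, while the random block comes out slightly larger than it went in, so a ratio below 1:1 is entirely possible.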
For data compression in memory to succeed, you MUST have an option to cache the "extra" memory to a swapfile in case the prediction logic fails and you run out of physical RAM. If you do not, you will tank your system, big time.
Sorry, but I'm very leery of any "memory compression" - it requires OS support to function. Period. You aren't going to just plug in a miracle DIMM and make it work. I hope IBM is opening the spec (it looks like they are) and that OS development people quickly embrace this, or their hardware will take a nosedive in the market.
IANAHardware Engineer, but it seems to me that RAM is already designed to do one simple thing (okay, two things: peek and poke) and to do it absolutely as fast as possible. This technology inevitably will degrade RAM performance by a finite amount. Is their chip fast enough that this degradation will be negligible? If so, then this will be Extremely Cool. If not, then no thanks, I'll just shell out for the extra RAM. Of course, the economics on a huge server with 100GB of RAM are most likely completely different.
Yo dawg, I heard you like the Ackermann function, so OH GOD OH GOD OH GOD
Strike 1: "IBM Memory eXpansion Technology" BiCapitalization is the first sign of bad tech - it means the marketing people got to it before engineering could get it out the door. It also boils down to yet another meaningless TLA to impress PHBs: MXT.
Strike 2: fake numbers. "as memory comprises 40 to 70 percent of the cost of most NT-based server configurations" Er, gee, not only is that an absurdly large error margin, but most servers cost, oh, we'll say $2000 and up. 40% of that is $800. $800 of PC133 right now is about 640MB of RAM. Most systems in that price range have 256-384MB. Oops.
Strike 3: Stating the obvious "and millions of tiny transistors" Oh, and how else would you do it? An analog circuit, perhaps?!
Strike 4: Not promising: "The new technology is seamless to the end-user because the compressed data can be uncompressed in nanoseconds when needed." Call me a pessimist, but memory right now is around 6ns for PC133. Now, assuming a very conservative 2ns to decode the data, that's 8ns, which is a 25% performance hit. How many admins do you know that would take a 25% hit on performance on their servers to save a couple hundred bucks?
In short, this new tech is gonna tank.
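For anyone checking the Strike 4 arithmetic: the 6ns and 2ns figures are the poster's own guesses, not measured numbers. A small sketch (the function name is made up) shows that the "25%" is the drop in accesses per unit time, while the latency of each access actually rises by a third.

```python
def slowdown(base_ns: float, extra_ns: float):
    """Return (total latency, fractional latency increase, fractional throughput loss)."""
    total_ns = base_ns + extra_ns
    latency_increase = extra_ns / base_ns      # how much longer each access takes
    throughput_loss = 1.0 - base_ns / total_ns  # how many fewer accesses fit per unit time
    return total_ns, latency_increase, throughput_loss

# The poster's assumed figures: 6ns memory access plus 2ns to decompress.
total, lat, thr = slowdown(6.0, 2.0)
print(f"total = {total:.0f}ns, latency +{lat:.0%}, throughput -{thr:.0%}")
```

This assumes every single access pays the decompression penalty, which is the worst case.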
It's a caching system. From the article:
MXT is a hardware implementation that automatically stores frequently accessed data and instructions close to a computer's microprocessors so they can be accessed immediately -- significantly improving performance. Less frequently accessed data and instructions are compressed and stored in memory instead of on a disk -- increasing memory capacity by a factor of two or more.
Note two things: They are not compressing everything. They are not replacing the actual memory.
Most of the criticisms here are based on misunderstandings of those two things.
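If only the less frequently accessed data is compressed, the penalty applies to cache misses rather than to every access. A rough sketch of the usual weighted-average estimate follows; the hit rate and latencies are hypothetical numbers for illustration, not anything IBM has published.

```python
def average_latency_ns(hit_rate: float, fast_ns: float, slow_ns: float) -> float:
    """Weighted average access time: hits take fast_ns, misses take slow_ns."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * fast_ns + (1.0 - hit_rate) * slow_ns

# Hypothetical: uncompressed access 6ns, compressed access 8ns.
for hit_rate in (0.50, 0.90, 0.95, 0.99):
    avg = average_latency_ns(hit_rate, 6.0, 8.0)
    print(f"hit rate {hit_rate:.0%}: average {avg:.2f}ns")
```

With a high enough hit rate on the uncompressed portion, the average cost drifts toward the uncompressed figure; with a poor hit rate it drifts toward the full penalty.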
(Note that I'm guiltless. I posted a number of times before getting around to reading it.)
The cake is a pie
Five major points about system memory compression:
1. Why do you want to do it?
Is it because RAM is expensive? Okay, but RAM prices would have to climb much higher to make it worth the new boards, new architectures, and intrinsic problems.
Is it because your system will 'run faster with more RAM'? Don't count on it. Trading RAM latency for apparent RAM size means that a given apparent RAM size will run slower than the same size uncompressed (i.e. 64->128MB compressed is slower than 128MB uncompressed), and the performance gain would be variable for a given *true* RAM size (= larger apparent size), and may disappear in certain settings (Is 64MB->128MB compressed faster than 64MB uncompressed? Depends.)
Remember, CPUs are data hungry critters, and feeding them at one end (and emptying them at the other) is already one of the biggest challenges of modern system design.
2. Transparency is not enough. We need ultra-transparency
Remember: any general purpose compression yields variable results with different data (and changing a bit will change the 'actual size' of a block, and hence the physical location of the bytes within the block). Compression confounds the 1:1 correspondence between the physical and logical memory address, and the relationships between different memory addresses, so we'll need to de/compress entire blocks and cache them (more on this later).
Without ultra-transparency, optimizing low level code becomes fraught with emergent effects. The most important thing about RAM is the capability for RANDOM access. A lot of people have forgotten there was any other kind -- serial memory, bubble memory, etc. -- bucket-brigaded bits demanded very different algorithms for efficiency!
Think of the pitfalls of straight CPU/mobo caching designs. 'Cache thrashing' can bring some fast algorithms to a molasses crawl, precisely because caching disrupts the relationship between contiguous bytes: a slower algorithm that reuses bytes in cache is preferred over a 'mathematically faster' one that relies on massive sequential reads. Compression thrashing can do the same, and will have (multiple) cache problems on top of this.
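For anyone who hasn't seen cache thrashing in action, here is a toy direct-mapped cache simulator. The cache geometry (64 lines of 16 bytes) and both access patterns are invented for the demo and model no real CPU.

```python
def count_misses(addresses, num_lines=64, line_size=16):
    """Count misses for a direct-mapped cache with the given (hypothetical) geometry."""
    tags = [None] * num_lines          # which memory block each cache line currently holds
    misses = 0
    for addr in addresses:
        block = addr // line_size      # memory block containing this address
        index = block % num_lines      # the one cache line this block may occupy
        if tags[index] != block:       # miss: evict whatever was there
            tags[index] = block
            misses += 1
    return misses

CACHE_BYTES = 64 * 16  # 1 KB total

# Pattern A: walk a 1 KB buffer four times, 4 bytes at a time (1024 accesses).
friendly = list(range(0, CACHE_BYTES, 4)) * 4

# Pattern B: alternate between two addresses exactly one cache size apart (1024 accesses).
thrashing = [(i % 2) * CACHE_BYTES for i in range(1024)]

print("friendly misses: ", count_misses(friendly))
print("thrashing misses:", count_misses(thrashing))
```

Both patterns issue the same number of accesses, but the second touches only two addresses and still misses every time, because both map to the same cache line and keep evicting each other.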
3. Where did I put that block?
There is no assurance that you will be able to return a block to the same DRAM location you got it from -- change a few bytes, and it may be larger (in physical length) than it was, despite having the same virtual length. This implies RAM fragmentation -- and all the associated housekeeping. And where are you going to store the record keeping? In *another* local cache? In RAM?
You'll need lots of hardware housekeeping here. It's do-able, but without a level of sophistication that approaches predictive branching and pipelining, you can count on extensive unexpected 'emergent effects' -- code that's slow for non-obvious reasons, or bugs.
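A quick way to see the "where did I put that block" problem, again using zlib purely as a stand-in compressor (an assumption; the real hardware algorithm may differ) and a made-up 1 KB block:

```python
import random
import zlib

BLOCK = 1024  # hypothetical logical block size

block = bytearray(BLOCK)                      # starts out as all zeros
before = len(zlib.compress(bytes(block)))     # physical (compressed) size before the write

rng = random.Random(0)                        # seeded so the demo is repeatable
block[100:132] = bytes(rng.getrandbits(8) for _ in range(32))  # overwrite 32 bytes in place
after = len(zlib.compress(bytes(block)))      # physical size after the write

print(f"logical size: {len(block)} bytes (unchanged)")
print(f"compressed size before: {before} bytes, after: {after} bytes")
print("fits in its old slot:", after <= before)
```

The logical block is still 1024 bytes, but its compressed form no longer fits in the space it came from, so something has to find it a new home and remember where that is.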
4. Strangely, on-chip cache may be the best place to use hardware RAM compression!
a) Hardware compression would be a small addition to the CPU circuitry, and can be run at the full multiplier speed.
b) No need for separate chips or mobo redesign
c) on-chip cache costs hundreds of dollars a meg (price a 512K PIII vs a 2MB PIII Xeon of the same clock speed) so extraordinary measures can be taken. It will probably improve chip yield, too.
d) integrating compression with the prediction/pipelining/cache management/etc. of the CPU can make it more transparent
e) L1/L2 is where you will get *huge* payoff, by 'keeping baby fed'.
f) there's much less latency on-chip vs. off-chip (chipset, PCB traces, L1, L2, L3 cache), so adding an off-chip layer adds more latency than adding an on-chip layer
5. The fundamental performance rule is: trade excess performance in one place to improve inferior performance elsewhere
RAM is no longer a source of 'excess performance', it's a pressure point. Every extra 10MBps in RAM throughput shows up in the benchmarks (unlike, say, HDD busses like ATA33/66/100, where doubling or tripling speed does little for system performance).
Where's the excess speed in modern systems? It's inside the CPU -- which runs at a multiplied clock speed, has vast optimization, and is always starved for data. It's also the place where adding RAM does the most good.
However (!) using compressed on-chip cache will require an intensive study and redesign of cache theory, unless this possibility has already been explored in conjunction with the development of current CPU features like prediction/pipelining/VLIW. 2-, 4-, or 8-way associative simply won't hack it when the cache doesn't have 1:1 virtual/physical data correspondence!
If you can go to bed, knowing you did a valuable thing today, you're very lucky. If you can't... it's not bedtime