IBM Promises More Memory In The Same Space
dcallaghan was among the many readers to write with news of IBM's announcement of new memory technology. The upshot seems to be on-the-fly compression in hardware, taking the tack of RamDoubler and other software compression utilities, but moving the actual data squashing into dedicated (fast) chips. I hope this leaks out of "server only" land soon; I'd love to have 256MB for the price of 128 -- this would be especially nice with pricey notebook memory.
Ask yourself: how many people do you know that double their disk drives? Probably not very many, if any at all.
:-)
A few years ago that was all the rage. Now drives are cheap and large. I expect the same to happen with memory. With all the new technologies coming down the pipeline, would you really want to hassle with a "ram doubler"?
Even if it's in memory, you KNOW it's going to cause a bug in some program somewhere
So the compression is done in hardware, cool.
But since data that is already compressed can't be compressed any further, does this mean that software compression to increase memory space is no longer usable?
I like extra MB, but let's hope that they actually increase the physical storage space instead of just increasing "logical" storage.
Mode (3) smart-aleck mode. Press * to return to main menu.
This would rock on the PC, where our bus speeds are slowly reaching 133MHz. If you could send the data compressed across the bus, that is.
Avoiding the low PC bus speeds is what 3D cards do best. You only upload the textures, and then you just send the vertices of the polygons across every time. Hell, some newer cards are even doing the calculations on some of the vertices once they've made the jump across the bus.
This also holds true for DVD decoders.
I wonder how viable hardware decompression is? Would it be a catch-all solution for (low end) replacements for all these avert-the-PC-bus hardware cards? Admittedly, I'm not in touch with any relevant benchmarks for this sort of stuff these days.
This is kind of a tangent, but would there be any issues behind this if they were using, say, LZW compression? Yes, I know they're not. But if they did, would it be kosher? Would we be allowed to use it while still feeling ethically clean?
This is similar to an issue that popped up when RMS spoke in Cincinnati. Someone asked if it was ok to use Transmeta's chips if the conversion layer was proprietary. The answer was that he had not heard anything about it (it was a few days after they made their big announcement) but he guessed (I think) that it might be ok, since hardware costs money to distribute (while software does not). My memory may be slightly flawed on this; don't quote me on it.
As free software gains, we will start encountering these questions more often, with the original, simple principle of sharing software moving into a more general ethical realm dealing with intellectual property in general.
Friends don't let friends misuse the subjunctive.
Compression CANNOT guarantee anything better than 1:1 ratio - it is ENTIRELY dependent on the data.
For data compression in memory to succeed, you MUST have an option to cache the "extra" memory to a swapfile in case the prediction logic fails and you run out of physical RAM. If you do not, you will tank your system, bigtime.
Sorry, but I'm very leery of any "memory compression" - it requires OS support to function. Period. You aren't going to just plug in a miracle DIMM and make it work. I hope IBM is opening the spec (it looks like they are) and that OS development people quickly embrace this, or their hardware will take a nosedive in the market.
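To put rough numbers on the "entirely dependent on the data" point, here is a minimal Python sketch. It uses zlib as a stand-in compressor (IBM's actual algorithm is not described in the announcement), and the 4096-byte page size and the three sample pages are assumptions chosen only for illustration.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data, 9))

PAGE = 4096  # assumed page size, purely for illustration

# Three hypothetical pages of memory contents.
zero_page = bytes(PAGE)  # an untouched, zero-filled page
text_page = (b"the quick brown fox jumps over the lazy dog\n" * 100)[:PAGE]
random_page = os.urandom(PAGE)  # stands in for encrypted or already-compressed data

for name, page in [("zeros", zero_page), ("text", text_page), ("random", random_page)]:
    print(f"{name:>6}: {compression_ratio(page):.2f}:1")
```

The zero-filled and text pages shrink a lot, while the random page comes out slightly larger than it went in, which is why the "extra" memory has to be able to spill somewhere when the data stops cooperating.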
IANAHardware Engineer, but it seems to me that RAM is already designed to do one simple thing (okay, two things: peek and poke) and to do it absolutely as fast as possible. This technology inevitably will degrade RAM performance by a finite amount. Is their chip fast enough that this degradation will be negligible? If so, then this will be Extremely Cool. If not, then no thanks, I'll just shell out for the extra RAM. Of course, the economics on a huge server with 100GB of RAM are most likely completely different.
Yo dawg, I heard you like the Ackermann function, so OH GOD OH GOD OH GOD
"as memory comprises 40 to 70 percent of the cost of most NT-based server configurations"
That's because NT is bloatware. Now if everybody would run Linux, there would be no need for this technology, now would there..
I'm sorry, but I just had to post this.
There is only so much that you can work with, before you need to start throwing out the old paradigm. Silicon is dying so countless engineers develop life support for it, while claiming that things are fine.
.1 micron, you can't outrun the quantum boogeyman. Minimize, dis-consolidize, but realize that our future is not that of Silicon.
Parallel computing, genetic algorithms, and this will not solve the problem. They will only prolong the world's suffering to a time when people will have become used to Big Silicon. Then all of the Wunderkinds of the Valley will be too old to execute for gross negligence.
While this appears to be a good idea, we must understand that we can not grow too secure in silicon. When you get below
Pax Digitalia
Any real-time compression technology tends to make me want to run screaming down the hallway
This product does not promise to double your RAM, but "up to" doubling. Note that it says storage was doubled for "most applications".
On MicroSloth machines where 3/4 of the memory seems to store arrays of zeros, this could be useful. But the more memory-conscious the programmer, the poorer the performance of the technology. In other words, I'm not impressed.
With the disk compression utilities, one of the biggest troubles is that the amount that can fit on the disk suddenly depends on the type of data. You get amazing compression rates if you are storing text files, horrible ones if you are storing GIFs. (The most common compressed file when disk compression was all the rage.)
This will be similar. Suddenly the amount of memory an application uses is going to be less predictable. Perhaps this isn't much of an issue as with virtual memory, people are increasingly disconnected from application memory usage.
Because of virtual memory, this is likely to greatly improve the apparent speed of the system, at least in cases where memory is moderately tight (<128 Meg on a Windows box, for example). Disk access is something like three orders of magnitude slower than memory access. If compression avoids even a few page faults, the lower page-file requirements will more than make up for the extra time to compress.
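The page-fault argument can be sketched as a weighted average. Everything numeric below is an assumption for illustration: the 6 ns and 8 ns memory times, the disk cost taken as 1000 times the plain memory time (the "three orders of magnitude" figure above), and the two hypothetical fault rates.

```python
def effective_access_ns(mem_ns: float, fault_rate: float, disk_ns: float) -> float:
    """Average cost of one access when a fraction `fault_rate` of accesses goes to disk."""
    return (1.0 - fault_rate) * mem_ns + fault_rate * disk_ns

DISK_NS = 6.0 * 1000  # assumed: disk is ~1000x slower than a 6 ns memory access

# Hypothetical case: compression adds 2 ns per access but halves the page-fault rate.
plain = effective_access_ns(mem_ns=6.0, fault_rate=0.001, disk_ns=DISK_NS)
compressed = effective_access_ns(mem_ns=8.0, fault_rate=0.0005, disk_ns=DISK_NS)

print(f"plain:      {plain:.3f} ns per access")
print(f"compressed: {compressed:.3f} ns per access")
```

Under these made-up rates the compressed case comes out ahead even though every individual memory access is slower. With a fault rate near zero (plenty of RAM), the same formula shows the compression overhead as a pure loss.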
The cake is a pie
Strike 1: "IBM Memory eXpansion Technology" BiCapitalization is the first sign of bad tech - it means the marketing people got to it before engineering could get it out the door. It also boils down to yet another meaningless TLA to impress PHBs: MXT.
Strike 2: fake numbers. "as memory comprises 40 to 70 percent of the cost of most NT-based server configurations" Er, gee, not only is that an absurdly large error margin, but most servers cost, oh, we'll say $2000 and up. 40% of that is $800. $800 of PC133 right now is about 640MB of RAM. Most systems in that price range have 256-384. Oops.
Strike 3: Stating the obvious "and millions of tiny transistors" Oh, and how else would you do it? An analog circuit, perhaps?!
Strike 4: Not promising: "The new technology is seamless to the end-user because the compressed data can be uncompressed in nanoseconds when needed." Call me a pessimist, but memory right now is around 6ns for PC133. Now, assuming a very conservative 2ns to decode the data, that's 8ns, which is a 25% performance hit. How many admins do you know that would take a 25% hit on performance on their servers to save a couple hundred bucks?
In short, this new tech is gonna tank.
Software compression in RAM, or on disk? Software compression on disk is, of course, unchanged. Compressed data structures in RAM will now bloat...so you will be penalized for using them. That is why I hate forced compression.
Remember how for about a year or so manufacturers advertised that they were selling computers with 200MB hard drives but when you looked in the small print it said "With XXX installed" where XXX was some disk 'doubler'? I guess we're going to get a year of companies misleadingly trying to sell 1/2 Gigabyte PCs claiming they're Gigabyte PCs. You have been warned!
--
-- SIGFPE
What's so new is that it's implemented in hardware. The article claims approximately 4 orders of magnitude better performance than a software solution. Of course, whether that's enough to make it worthwhile is not addressed.
Yo dawg, I heard you like the Ackermann function, so OH GOD OH GOD OH GOD
"This would rock on the PC, where our bus speeds are slowly reaching 133mhz. If you could send the data compressed across the bus, that is."
Maybe, but I'll bet there will be a fairly large impact on latency, with the overhead needed for compression/decompression. Bandwidth is more important for some kinds of memory-intensive applications, like database and photo/video editing stuff. But for everyday applications or games like Q3, the latency is going to knock your performance way down. It's one of the problems with Rambus, where the bandwidth improves but latency actually gets worse.
Call me a pessimist, but memory right now is around 6ns for PC133. Now, assuming a very conservative 2ns to decode the data, that's 8ns, which is a 25% performance hit.
Well, let's say that the stated 2:1 compression ratio is achieved. Now we're moving twice as much data in 133% of the time, which is a 33% performance gain. (2x the data in 8ns, equivalent to same data in 4ns, as compared to the original 6ns.) The break-even point for 2:1 compression is a 2:1 slowdown in performance. If the compression averages 1.5:1, then the performance must be no worse than 1.5x as slow in order to avoid degrading access time. Can the average performance cost ratio go below the average compression gain ratio, and actually increase performance? If not, then how close is tolerable? I'd say there's room for this technology to be of value, especially as stated in the article, in servers with enormous amounts of RAM.
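The break-even arithmetic above can be written out as a small Python sketch. It takes the 6 ns and 2 ns figures from the quoted comment at face value and treats access time as pure throughput (time per unit of uncompressed data), so it says nothing about first-byte latency. The function names are made up for this example.

```python
def effective_ns(base_ns: float, decode_ns: float, ratio: float) -> float:
    """Time per unit of *uncompressed* data when each access carries `ratio` times as much."""
    return (base_ns + decode_ns) / ratio

def break_even_decode_ns(base_ns: float, ratio: float) -> float:
    """Largest decode overhead that still matches uncompressed throughput."""
    return base_ns * (ratio - 1.0)

BASE_NS = 6.0    # access time quoted in the comment above
DECODE_NS = 2.0  # the "very conservative" decode time from the same comment

for ratio in (1.0, 1.5, 2.0):
    eff = effective_ns(BASE_NS, DECODE_NS, ratio)
    limit = break_even_decode_ns(BASE_NS, ratio)
    print(f"ratio {ratio:.1f}:1 -> {eff:.2f} ns effective, break-even decode {limit:.1f} ns")
```

At 2:1 the overhead could grow to a full 6 ns before throughput drops below the uncompressed case, and at 1.5:1 the limit is 3 ns. At 1:1 (incompressible data) any decode time at all is a loss.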
Yo dawg, I heard you like the Ackermann function, so OH GOD OH GOD OH GOD
I use RAM because it's fast. If we didn't care about speed, we would all be writing to our hard drives, have about 2 megs of RAM, and just use swap partitions for all of our memory needs. This won't last because, no matter how fast it is, it will be faster to write straight to the RAM.
Eh...
Ok. Even if it's done in hardware this kind of thing has been around for quite some time. The Hobbit processors do this, along with a slew of other embedded processors.
One thing to think about is how much faster a hardware implementation really is. Time and time again general purpose CPUs seem to kick the butts of dedicated hardware in all but the most esoteric cases (like encryption). If it's done by the CPU then the data you need is already in the L1 cache and possibly in registers, all while avoiding pain for the outer caches. Then add in architectures like EPIC which have nothing better to do with some of their units anyway...
I don't know, I don't know if IBM's claims will pan out.
sigs are a waste of space
I took a snippet from /proc/kcore and compressed it to see how well it would work.
/spare/mem
dd if=/proc/kcore of=/spare/mem bs=1 count=1000000
that resulted in a chunk of kcore 1 million bytes long written to the file
bzip2 -9
that resulted in a file named mem.bz2 being 349791 bytes long.
This was on your typical RedHat system running the usual stuff.
If tits were wings it'd be flying around.
a rule-of-thumb i recall from that era was something along the lines of: 'software solutions to hardware problems are impractical'.
the fact that they do the compression in hardware may have some merit. so i did a bit of testing; i checked the sizes of
on my 32mb box: (4944k used, not counting cache)
on my 192mb box: (144872k used, not counting cache)
figures are probably quite skewed, since the core image was not a snapshot. but it looks like the actual used memory compresses better than the bit-soup that is in the dimms when the system powers up.
who knows... maybe ibm has a few tricks up their sleeves. be interesting to see some linux source to deal with these beasts... i'm assuming that it's os-dependent, and since ibm has been great about linux lately, i'd think they would release whatever kernel patches would be necessary to use these things.
--
I am interested in how big a block of memory this stuff operates on. Compressing a few K at a time, such as a disk block, may be big enough to win, but compressing a 128-bit cache line almost certainly loses. Where's the breakeven point? The hope of this technology is that if you stick it way out in L3 cache, you're usually hauling big enough chunks at a time that the decompression latency for getting the first byte is made up for by the bandwidth of getting the rest of the bytes from a smaller amount of memory.
Help, help, I'm being compressed!
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Is a caching system. From the article:
MXT is a hardware implementation that automatically stores frequently accessed data and instructions close to a computer's microprocessors so they can be accessed immediately -- significantly improving performance. Less frequently accessed data and instructions are compressed and stored in memory instead of on a disk -- increasing memory capacity by a factor of two or more.
Note two things: They are not compressing everything. They are not replacing the actual memory.
Most of the criticisms here are based on misunderstandings of those two things.
(Note that I'm guiltless. I posted a number of times before getting around to read it.)
The cake is a pie
RAM compression isn't going to deliver compression ratios as good as stream oriented algorithms like deflate and bzip2, because it has to be random-access and the compressor doesn't have as much context to look at.
But as you pointed out, memory tends to be pretty compressible because of the significant redundancy.
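A quick Python sketch of why random access costs compression ratio: the same buffer is compressed once as a single stream and then as independent fixed-size blocks, the way a random-access scheme would have to. zlib stands in for whatever the hardware really uses, the buffer is synthetic repetitive text rather than real memory contents, and the block sizes (16 bytes for a 128-bit line, 1 KB, 4 KB) are assumptions for illustration.

```python
import zlib

def stream_size(data: bytes) -> int:
    """Compressed size when the whole buffer is one stream (full context available)."""
    return len(zlib.compress(data, 9))

def blockwise_size(data: bytes, block: int) -> int:
    """Total compressed size when every `block`-byte chunk is compressed independently."""
    return sum(
        len(zlib.compress(data[i:i + block], 9))
        for i in range(0, len(data), block)
    )

# Synthetic, highly repetitive sample data (hypothetical, not a real memory dump).
sample = b"".join(
    b"record %06d: status=OK owner=root flags=0x0000\n" % i for i in range(5000)
)

print(f"original:        {len(sample)} bytes")
print(f"single stream:   {stream_size(sample)} bytes")
for block in (4096, 1024, 16):
    print(f"{block:>5}-byte blocks: {blockwise_size(sample, block)} bytes")
```

The smaller the independent block, the less context the compressor sees and the more per-block overhead piles up. At 16 bytes the "compressed" total ends up larger than the original with this compressor.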
Since this is hardware located on the motherboard, don't look for it in the desktop too soon, unless Rambus licensing fees drive it.
Since it is on the motherboard it would probably require support at the bios level. Also because it has its own caching system, it may only work with certain CPU chips, maybe, maybe not.
The real question is, where does this technology live? Is it in the North Bridge, or in some other bizarre location? Or is it on the DIMM?
See, for it to be truly transparent (Which it more or less has to be) it's going to have to be in the chipset, or on the DIMM. Putting it on the DIMM means there has to be one of these suckers for each DIMM, but then that might be true already. It seems to make the most sense to embed it into the chipset, or to sandwich it between the chipset and the memory.
If IBM does it right, then that means that you will not need any OS support for this "technology". From the article given, it doesn't really sound like there's anything amazingly new here, but that remains to be seen. I'd want to see the patent involved before I made any calls on this. In any case, hardware data compression has been around for a long time. Heck, take as a [lousy] example the FWB Jackhammer NuBus SCSI cards for the Mac. Those did compression in hardware, though they did have a software component (the driver written to the hard disk.) I'm envisioning this as a more transparent way of handling it.
It's not unreasonable to say that IBM could get 2:1 compression ratios reliably with a good algorithm and some high-speed logic. It's even likely that it will be cheaper to buy 512MB using this technology than 1GB without it; it's even more likely that it'll be cheaper to buy 2GB with it than 4GB without it, because the larger the DIMM, the higher the cost per megabyte, once you get past a certain point (about 32MB).
Anyway, show me the patent. I'd like to read it.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Bias Disclaimer: I think that predominantly transparent compression of a large storage area rocks. I'm also a huge fan of dedicated hardware acceleration. Thus I think this is the sort of cool idea I'd want regardless of whether or not it actually works.
That said, this should provide some performance increase to any system with a swap file, like my little portable (not that it can be retro-fitted) - but I would like to know how the OS will deal with a variable memory size...
Also, I'd tend to think that the dedicated hardware acceleration in today's video cards is not especially esoteric...
A good way to get a quick feel for this is to look at portable MP3 players vs. playing MP3s on a WinCE/PocketPC unit. A dedicated MP3 player gives you 64-96MB of RAM for around US$300. A WinCE/PPC unit that can play MP3s will typically cost US$600-$1000 and only include 16MB-32MB. And the battery life is 12 hours vs. 3 or less.
This really has nothing to do with disk caching, it compresses memory, not disk space. And it WILL increase latency because a section of memory must be decompressed before it can be sent. Even with dedicated hardware it will still take a few clocks.
A deep unwavering belief is a sure sign you're missing something...
It's not just the new drivers from nVidia. What nVidia does is DXTC (DirectX texture compression), which is a form of S3TC (S3 texture compression). 3DFx is doing this too with FXT1 (I think that's right).
A deep unwavering belief is a sure sign you're missing something...
Oops. Disregard my previous comment. I was listening to the topic instead of reading the article. You're right about everything. I'm sorry. Forgive me. ;)
A deep unwavering belief is a sure sign you're missing something...
mmap() a large encrypted file. Start decrypting.
When you do that in the background the OS will have to start paging in the encrypted file. It will stay in memory until that memory is needed for something else. But the encrypted file is a worst case scenario for the compression algorithm.
If someone has an encrypted filesystem this will actually be an extremely common case!
Really, your "Oh we will never hit that in the real world" is exactly how programmers f*ck up time after time again. You may not see how it will happen, but it will happen eventually and people will get hosed.
Deal with your corner cases before turning your code loose on the world, please.
Regards,
Ben
My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
One of the arguments for using software disk compression back in the day was that it would increase transfer rates. It was faster to move less data across the disk bus and then process it with a fast processor than it was to move the uncompressed data directly to memory.
1) How true was this claim, and if it was true, why isn't it still true?
2) Will the same effect be apparent in hardware memory compression? Furthermore, would more performance be seen if the compressor were moved onto the processor so that not as much data had to be moved across the memory bus?
Aah, change is good. -- Rafiki
Yeah, but it ain't easy. -- Simba
yeah, what he said!
I haven't had a look at the press release, only the AP article. The one place this would be useful is in the 2nd or 3rd level cache. If you can compress fast enough then the extra size of the cache would come in handy.
The reason why this technique would work on a cache but I can't see how it would work on main memory is that you don't ever see the cache, so you never make assumptions about how large it will be. So the encrypted data sitting in the cache fills it up after 1 Meg, while the empty matrix full of zeros can have almost 4 Megs cached.
Cache is transparent, so you can never be bitten by it changing logical size on you. Unlike main memory.
For main memory, like sig11 points out, the OS will make assumptions that it can store exactly that much info. And since the hardware has no provisions for asking the OS to page out memory (and has no business asking, either) eventually havoc will be wrought.
So I'm strongly suspecting that this will turn out to be used for 3rd or 4th level cache.