Gaining RAM For Free, Through Software
wakaramon writes with a piece from IEEE Spectrum about an experimental approach to squeezing more usable storage out of a device's existing RAM; the researchers were using a Linux-based PDA as their testbed, and claim that their software "effectively gives an embedded system more than twice the memory it had originally — essentially for free." "Although the price of RAM has plummeted fast, the need for memory has expanded faster still. But if you could use data-compression software to control the way embedded systems store information in RAM, and do it in a way that didn't sap performance appreciably, the payoff would be enormous."
That's an old idea - using transparent compression to gain more memory...
The 80s and 90s called - they want their technology back.
"Enjoy what you're doing! If it becomes drudgery, you're doing it wrong!" - Jim Butterfield
These days, with RAM bandwidth being a major bottleneck, it might actually make a lot of sense if you could do the compression / decompression in hardware between the cache controller and the RAM - you'd get more bandwidth to RAM at the cost of slightly more latency.
I am TheRaven on Soylent News
I haven't managed to find the patent application yet, but I wonder if Connectix's RAM Doubler product would be considered prior art.
SoftRAM claimed to do this, but the product didn't do anything except report to the user that it was doing something.
I didn't realize there were similar products that actually worked; I thought the whole concept was snake oil.
Since they patented it and are licensing it, it's not really free, is it?
Lightly compressing RAM to make it appear larger hardly counts as a new idea... I ran a program back on Windows 3.11 that did exactly that. While it did indeed allow running more things at once without suffering a massive slowdown, it came at the cost of making everything run noticeably (though not unusably, as with swap/pagefile use) slower.
Memory compression had one major drawback, aside from CPU use (which I suspect we would notice less today, with massively more powerful CPUs which tend to sit at 5-10% load the vast majority of the time)... It makes paging (in the 4k sense, not referring to the pagefile) into an absolute nightmare, and memory fragmentation goes from an intellectual nuisance that only CS majors care about to a real practical bottleneck in performance. Consider the behavior of a typical program: allocate a few megs on startup and zero it out. That compresses down to nothing. Now start filling in that space, and your compression drops from 99.9% down to potentially 0%.
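To put rough numbers on that last point, here is a small sketch that compresses a zeroed page and then a page of random bytes. It uses Python's zlib purely as a stand-in compressor; the 4096-byte page size and the helper name compressed_size are assumptions for illustration, not anything taken from the products discussed here.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes


def compressed_size(page: bytes) -> int:
    """Return the size of the page after zlib compression."""
    return len(zlib.compress(page))


zero_page = bytes(PAGE_SIZE)         # freshly zeroed allocation
random_page = os.urandom(PAGE_SIZE)  # worst case: incompressible data

zero_size = compressed_size(zero_page)
random_size = compressed_size(random_page)

print(f"zeroed page: {zero_size} bytes")
print(f"random page: {random_size} bytes")
```

The zeroed page shrinks to a few dozen bytes, while the random page comes out no smaller than it went in, so the space a compressed page needs can change drastically as a program fills in its buffers.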
Personally, I think it could work as an optional (to programs) OS-level alternative to memory allocation... The programmer can choose to use slightly slower compressed memory where appropriate (loading 200MB of textual customer data, for example), or full-speed uncompressed memory by default (stack frames, hash tables, digital photos, etc).
Just download more RAM if you need it!
Actually, I'd rate that informative, rather than funny. I've actually tried a couple of such programs back then, and invariably it was just a fancy way to slow your computer down. (Mildly.)
Basically, the way it worked was:
1. Report more RAM to the OS. That's actually what your swap file does too. Virtually any modern processor has _some_ way to pretend it has more memory than physically present, with the extra bytes being in a swap file.
2. Set aside half the memory as a kind of compressed, virtual (in memory) swap file.
So at this point, let's say your computer had 4 MB RAM (hey, back then we didn't measure RAM in Gigabytes). So now you'd only have 2 MB of it free as physical memory for your programs, and 2 MB set up as a compressed swap file. But your OS thought you had 8 MB, with 2 MB being the free RAM left and 6 MB of it being swap space.
3. However, you typically still wanted some actual swap space, because you don't know, and can't guarantee, how well that swap space compresses. If you swap out, say, a table of random numbers, you may not be able to compress at all. Funky things can happen when the OS thinks it has room to swap a page out, but it turns out that it doesn't fit there. The actual HDD swap file would be, at the very least, the safety net to catch whatever doesn't fit into that RAM buffer.
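The three steps above can be sketched as a toy model: a fixed-size in-RAM pool that stores pages compressed, and falls back to a stand-in "disk" when a page doesn't fit. This is only an illustration under assumptions. The class name CompressedSwapPool, its methods, and the use of zlib are all hypothetical and not how any of the actual products were implemented.

```python
import os
import zlib


class CompressedSwapPool:
    """Toy in-RAM compressed swap area with a fixed byte budget."""

    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.pages = {}    # page number -> compressed bytes held in RAM
        self.on_disk = {}  # page number -> raw bytes (stand-in for the HDD swap file)

    def swap_out(self, page_no: int, data: bytes) -> str:
        """Store a page; return where it ended up ('ram' or 'disk')."""
        packed = zlib.compress(data)
        if self.used + len(packed) <= self.capacity:
            self.pages[page_no] = packed
            self.used += len(packed)
            return "ram"
        # The compressed page did not fit: the real swap file is the safety net.
        self.on_disk[page_no] = data
        return "disk"

    def swap_in(self, page_no: int) -> bytes:
        """Fetch a page back and free the space it occupied."""
        if page_no in self.pages:
            packed = self.pages.pop(page_no)
            self.used -= len(packed)
            return zlib.decompress(packed)
        return self.on_disk.pop(page_no)


pool = CompressedSwapPool(capacity_bytes=8192)  # room for two uncompressed 4 KB pages
text_page = b"customer record " * 256           # 4096 bytes, highly repetitive
noise_pages = [os.urandom(4096) for _ in range(3)]

print(0, pool.swap_out(0, text_page))           # repetitive data compresses well
for i, page in enumerate(noise_pages, start=1):
    print(i, pool.swap_out(i, page))            # random data barely compresses
```

The point of the fallback branch is the one made in step 3: the pool cannot promise a compression ratio in advance, so something has to catch the pages that don't fit.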
Now the thing is:
A. That virtual compressed swap space was typically faster than HDD (we didn't have 15,000 RPM drives with huge caches, back then), but, here's the important part, _much_ slower than just plain old free RAM. ("Free" as in "available to the OS as it is.") Even the page fault itself, never mind the compression, was _much_ slower than the few cycles required to just read a memory page.
Compression didn't make it much better. Almost any decent compression algorithm is fast when decompressing, but slow when compressing. When handling a page fault in that context, you had to do both: compress the page you want swapped out, and decompress the page you want swapped in. Not only did that take time, but it was CPU time, unlike IO time, which happens on DMA in an ideal world and lets your CPU schedule some other task in the meantime.
B. However, now you had less free RAM _and_ were encouraged to load more into it. If you had 5 MB of memory in use on the above described computer, without RamDoubling scams, you'd have 4 MB of physical memory in use and 1 MB swapped to disk. With such a RamDoubling scheme, you had 2 MB in actual normal RAM, and 3 MB swapped out.
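The arithmetic in that example is easy to check. A minimal sketch, assuming all sizes are in MB and that the compressed pool is simply carved out of physical RAM; the function name split_working_set is made up for illustration.

```python
def split_working_set(in_use_mb: int, physical_mb: int, pool_mb: int = 0):
    """Return (resident_mb, swapped_mb) for a given amount of memory in use.

    pool_mb is the slice of physical RAM set aside as the compressed swap area.
    """
    directly_usable = physical_mb - pool_mb
    resident = min(in_use_mb, directly_usable)
    swapped = in_use_mb - resident
    return resident, swapped


print(split_working_set(5, 4))     # plain machine: (4, 1)
print(split_working_set(5, 4, 2))  # "doubled" machine: (2, 3)
```

Same 5 MB in use, but three times as much of it now sits behind a page fault.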
In almost all cases, the "ram doubling" inherently increased the number of pages swapped in and out per second. In some cases, dramatically. (E.g., Java's GC didn't play nice at all with swapping anyway. It already tended to push everything else out. Play with it in even less space, and things could get funny.)
So a lot of the time, sometimes even most of the time, all you'd get for your efforts was slowing your computer down. And a useless number telling you "now you have 8 MB RAM!!!!11oneeleventeen", but not what the cost is, or even what it really means.
A polar bear is a cartesian bear after a coordinate transform.
You just use a hole punch on your page file, and you can write to it from the other side!
sudo ergo sum
If you do it in hardware, and have the data decompressed or compressed between the cache controller and the memory controller then it might work better - you'd gain a bit of latency, but you'd get more throughput (because you'd need less bandwidth to transmit the same amount of uncompressed data between CPU and RAM) which might make up for it in a lot of cases, particularly on chips with SMT support.
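A back-of-the-envelope model of that trade-off, as a sketch only: every number below (10 GB/s bus, 60 ns base latency, 20 ns added by the compressor, a 2:1 ratio) is a made-up assumption, and the function name transfer_time_ns is hypothetical.

```python
def transfer_time_ns(size_bytes: float, bandwidth_gb_s: float, latency_ns: float,
                     ratio: float = 1.0, extra_latency_ns: float = 0.0) -> float:
    """Estimate the time to move size_bytes of uncompressed data across the memory bus.

    1 GB/s is treated as 1 byte per nanosecond. With compression, only
    size_bytes / ratio bytes cross the bus, at the cost of extra_latency_ns.
    """
    bytes_on_bus = size_bytes / ratio
    return latency_ns + extra_latency_ns + bytes_on_bus / bandwidth_gb_s


for size in (64, 4096):
    plain = transfer_time_ns(size, bandwidth_gb_s=10, latency_ns=60)
    packed = transfer_time_ns(size, bandwidth_gb_s=10, latency_ns=60,
                              ratio=2.0, extra_latency_ns=20)
    print(f"{size:5d} bytes: plain {plain:.1f} ns, compressed {packed:.1f} ns")
```

Under these assumed numbers a single 64-byte cache line gets slower, while a 4 KB burst gets noticeably faster, which is the "more throughput for slightly more latency" trade described above.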
The biggest problem I see with doing it in software is that, for it not to be horrendously slow, you need to keep the compression code (and data) in cache at all times. This means that you are reducing the amount of cache available to all other programs, which means you are going to be fetching data from RAM more often, which eliminates much of the use for this.
I am TheRaven on Soylent News
>If you had RTFA, you'd have found that the difference they've made is they've developed a compression scheme that doesn't have the huge performance penalty that old techniques had.
Back then, the real Stacker (not Microsoft's poorly implemented and unstable clone) didn't have a huge impact on performance either, as it used a big cache and had significantly less data to transfer to and from a hard disk, which back then didn't shine bandwidth-wise.
The reason Stacker died isn't the performance hit. The reason Stacker died is a combination of:
- Microsoft managing to instill paranoia about real-time compression thanks to their double-crap
- huge drops in storage prices, which made on-the-fly compression irrelevant
- newer data formats which are hard to compress anyway (Stacker could be efficient back then, when most graphics were RLE-encoded bitmaps. Now that everything is stored as JPEGs and MP3s, there's not much an additional layer of compression could do).
Ultra-fast compression algorithms like LZO aren't something new, and could easily be implemented in a hardware chip for even faster performance.
Such compression *could* have been useful a decade ago, when PDAs still had limited memory and cost a lot.
Now, with the price drops of memory and the increased popularity of solid state memory (micro-SD cards have just insane capacity these days), it's hard to be short on memory even on an embedded device.
So it's nice that they have gone through all the technical difficulties to have real-time compression at RAM-level bandwidth.
But they developed it a decade too late to have any marketable product.
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
It appears that the key phrase here is "embedded systems".
FTA, they appear to be making use of the regularity of certain patterns of data found commonly in embedded systems, and tailoring their compression algorithm to it.
I'm not sure that it is really a great feat to engineer a special-purpose compression algorithm that out-performs general-purpose algorithms.
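For a sense of what "tailoring to the data" can look like, here is a toy word-level encoder that only knows about two patterns: all-zero words and small values. To be clear, this is NOT the researchers' algorithm, just a hypothetical sketch; the tag values and the function names encode_words and decode_words are invented for illustration.

```python
import struct

ZERO, SMALL, FULL = 0, 1, 2  # hypothetical one-byte tags


def encode_words(data: bytes) -> bytes:
    """Encode little-endian 32-bit words using a one-byte tag per word."""
    if len(data) % 4:
        raise ValueError("input length must be a multiple of 4")
    out = bytearray()
    for (word,) in struct.iter_unpack("<I", data):
        if word == 0:
            out.append(ZERO)                # tag only
        elif word < 256:
            out += bytes((SMALL, word))     # tag + 1 payload byte
        else:
            out.append(FULL)
            out += struct.pack("<I", word)  # tag + 4 payload bytes
    return bytes(out)


def decode_words(blob: bytes) -> bytes:
    """Invert encode_words."""
    out = bytearray()
    i = 0
    while i < len(blob):
        tag = blob[i]
        i += 1
        if tag == ZERO:
            out += b"\x00\x00\x00\x00"
        elif tag == SMALL:
            out += struct.pack("<I", blob[i])
            i += 1
        elif tag == FULL:
            out += blob[i:i + 4]
            i += 4
        else:
            raise ValueError(f"unknown tag {tag}")
    return bytes(out)


# A made-up page: mostly zero words, some small counters, a few full-width values.
words = [0] * 900 + list(range(1, 101)) + [0xDEADBEEF] * 24
page = struct.pack(f"<{len(words)}I", *words)  # 1024 words = 4096 bytes
packed = encode_words(page)
assert decode_words(packed) == page
print(len(page), "->", len(packed))
```

A scheme like this is trivial to run fast because it does no searching at all, but it only wins when the data actually looks the way it expects; a page full of large values grows by 25%.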