What about using the performance monitor counter registers? I just thought of this earlier today, so I haven't had the time to investigate further, but as far as I know, those registers should only change if performance monitoring is actually set up via the event selection registers; otherwise you can just store a value into them and read it back later. Depending on the CPU model, you can store either 32 or 40 bits per register. They aren't included in context switches, so they should never be written to RAM, and access to them from user space can be disabled (in fact I think it is by default). But I don't know if they survive when the CPU is put to sleep, or if they get overwritten for any other reason.
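As a rough illustration of the capacity arithmetic, here is a minimal sketch of splitting a key into register-sized chunks and joining them back. It only models the bit packing: actually reading or writing the counters would need privileged MSR access (WRMSR/RDMSR on x86), which is not shown. The names `split_key` and `join_key` are made up for this example.

```python
def split_key(key: bytes, bits_per_reg: int) -> list[int]:
    """Split a key into integers that each fit in bits_per_reg bits."""
    value = int.from_bytes(key, "big")
    total_bits = len(key) * 8
    n_regs = -(-total_bits // bits_per_reg)  # ceiling division
    chunk_mask = (1 << bits_per_reg) - 1
    # Chunk 0 holds the least significant bits of the key.
    return [(value >> (i * bits_per_reg)) & chunk_mask for i in range(n_regs)]


def join_key(regs: list[int], bits_per_reg: int, key_len: int) -> bytes:
    """Rebuild the key from the register-sized chunks."""
    value = 0
    for i, chunk in enumerate(regs):
        value |= chunk << (i * bits_per_reg)
    return value.to_bytes(key_len, "big")


key128 = bytes(range(16))   # placeholder 128-bit key, not a real secret
key256 = bytes(range(32))   # placeholder 256-bit key

regs32 = split_key(key128, 32)
regs40 = split_key(key128, 40)
```

With these numbers a 128-bit key takes four registers at either width, while a 256-bit key takes eight registers at 32 bits and seven at 40 bits.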
Assuming they can be used to store information, we could put the AES (or Blowfish or whatever) key into them and use it to regenerate the round keys when needed, erasing the round keys from RAM again after a short timeout. Or the registers could be used to store the key for a simpler cipher, which is then used to encrypt the round keys in RAM. Even just putting random bits into the PMC registers and XORing them with the round keys would provide some protection, and would be very fast. There's generally quite a bit of redundancy in the round keys, though, which might give an attacker enough information to figure something out if, say, every 128 bits of the round key table is XORed with the same value.
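To make the last concern concrete, here is a minimal sketch of the XOR idea with one 128-bit mask reused for every round key. The mask stands in for the bits that would live in the PMC registers, and the round keys are random stand-ins rather than a real AES key schedule; both are assumptions for illustration only.

```python
import secrets

BLOCK = 16  # bytes per AES round key (128 bits)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def mask_round_keys(round_keys: list[bytes], mask: bytes) -> list[bytes]:
    """XOR every round key with the same mask (masking and unmasking are identical)."""
    return [xor_bytes(rk, mask) for rk in round_keys]


# Stand-in for the 11 round keys of AES-128; a real schedule is derived from the key.
round_keys = [secrets.token_bytes(BLOCK) for _ in range(11)]

# Stand-in for the random bits held in the registers, never written to RAM.
mask = secrets.token_bytes(BLOCK)

masked = mask_round_keys(round_keys, mask)   # what would sit in RAM
unmasked = mask_round_keys(masked, mask)     # recovered just before use

# The reused mask cancels out when two masked round keys are XORed together,
# so RAM still exposes the XOR of the two real round keys.
leak = xor_bytes(masked[0], masked[1])
```

Here `leak` equals `round_keys[0] XOR round_keys[1]` no matter what the mask is. With a real key schedule, where consecutive round keys are tightly related, that kind of difference is the sort of redundancy an attacker could try to exploit.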
Obviously this breaks the use of performance monitoring (although some CPU flavors have 18 or more PMC registers, so it should still be possible to use the ones that aren't storing crypto stuff), and will have a performance impact on encrypted disk I/O (regenerating or decrypting the round keys before you can encrypt or decrypt the disk data). But the performance impact should be MUCH less than disabling the memory cache. And I don't think very many people actually use the performance monitoring features of the CPU on their personal machines, so that shouldn't be much of a loss (and anyone who does use them can just skip this idea).
Back in my days as a sysmom (a.k.a. system admin), I used to joke that "time rm -rf /" was a good disk performance benchmark. One day one of my coworkers had a machine he needed to reinstall, so he brought me back the numbers. :-)
This was also from the days when there was no rmdir(2) system call - instead rm ran /bin/rmdir whenever it needed to remove a directory (rmdir was setuid root so it could unlink the directory). So after it had cleared out /bin, rm was unable to remove any more directories. He ended up with a skeleton of the root filesystem, with most of the directories but none of the files.