MIT Software Allows Queries On Encrypted Databases
Sparrowvsrevolution writes "CryptDB, a piece of database software that MIT researchers presented at the Symposium on Operating Systems Principles in October, allows users to send queries to an encrypted SQL database and get results without decrypting the stored information. CryptDB works by nesting data in several layers of cryptography (PDF), each of which has a different key and allows a different kind of simple operation on encrypted data. It doesn't work with every kind of calculation, and it's not the first system to offer this sort of computation on encrypted data. But it may be the only practical one. A previous crypto scheme that allowed operations on encrypted data multiplied computing time by a factor of a trillion. This one adds only 15-26%."
Why not just encrypt the database files on the HDD and in memory directly? That way the database can still run really fast and you can use any existing database software.
Mine too... Perhaps AC isn't the way to go.
This is not really the first practical such system, nor have all previous systems been a trillion times slower. As seems to be a pattern with MIT press releases, the press release makes exaggerated claims, but the paper itself is actually quite good and gives proper credit where it's due, discussing a number of previous systems that implement related functionality, and some existing algorithms from the literature that they borrow and implement directly in CryptDB.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
Order Preserving Encryption: how is it implemented? Page 4 of the paper simply says that it exists and points to an article that I have no access to.
I don't understand how this resists "known plaintext" attacks. Perhaps it's not intended to. Like I said, I have no access to the footnoted OPE article. So, let's say you've got a medical database of private health care info, where the diagnosis is a column. If you can sort it, all the folks with "aids" sort at the top, right above the "alcoholism" diagnosis, with the "worms, intestinal" and I suppose the "zoophilia" people at the bottom.
I suppose the solution is that, unless there is a business need to sort by diagnosis, you don't use OPE for that column; you use DET or, if there's no need for "group by", RND.
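To make the sorting worry concrete, here is a minimal Python sketch. It is a toy stand-in, not the OPE construction the paper footnotes (which I haven't read): `toy_ope_table` and `det_token` are hypothetical helpers, the key is made up, and the "DET" column is just an HMAC tag that can't be decrypted. It only shows what each kind of column leaks.

```python
import hashlib
import hmac


def toy_ope_table(domain, key):
    """Toy order-preserving map over a small known domain (illustration only)."""
    table = {}
    running = 0
    for value in sorted(domain):
        digest = hmac.new(key, value.encode(), hashlib.sha256).digest()
        # Keyed gap that is always >= 1, so the outputs strictly increase.
        running += 1 + int.from_bytes(digest[:4], "big")
        table[value] = running
    return table


def det_token(value, key):
    """Toy deterministic tag: equal inputs match, order is not preserved by design."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()


key = b"hypothetical demo key"  # assumption: any secret bytes would do
diagnoses = ["worms, intestinal", "aids", "zoophilia", "alcoholism"]

ope = toy_ope_table(diagnoses, key)

# Sorting by the OPE value reproduces the plaintext sort order exactly,
# which is the leak described above.
by_ope = sorted(diagnoses, key=lambda d: ope[d])

# Sorting by the deterministic tag still allows equality and GROUP BY,
# but the order is not designed to follow the plaintext.
by_det = sorted(diagnoses, key=lambda d: det_token(d, key))

print(by_ope)
print(by_det)
```

So anyone who can read the OPE column and knows roughly what the set of diagnoses looks like can line the two up by rank. The deterministic column only reveals which rows are equal.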
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
"MIT is overrated because I can't get into MIT."
Yeah. Keep telling yourself that.
Performing searches -- and other operations -- on encrypted data has big potential to help protect our privacy in an age where people are losing control of their personal information. Since the data is *never* decrypted by the server, no information will be leaked even if the communication channel between the client and the server is compromised, or the server itself is hacked. If the query is also encrypted before sending, which is the case for most schemes, the server does not learn anything about the operation or the contents of the data. It makes central storage of data a LOT less risky (take note Sony).
In addition to searching, researchers have had some success in doing other operations on encrypted data, such as multiplication and addition. This means that if you encrypt the value 20, and multiply it by the encryption of 2, the decryption of the result will be 40! Pretty amazing, if you ask me. While processing power is still a major problem for most of these schemes (they are far from ready for production-level data volumes), the next 10 years in this field will surely be very exciting.
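If you want to see the 20 times 2 trick for yourself, unpadded "textbook" RSA happens to be multiplicatively homomorphic. The sketch below uses tiny made-up primes and is hopelessly insecure; it is only an illustration of the property, not a claim about which scheme CryptDB or any production system uses.

```python
# Textbook (unpadded) RSA with toy parameters. Insecure, demo only.
p, q = 61, 53                 # assumption: tiny primes picked for readability
n = p * q                     # public modulus, 3233
phi = (p - 1) * (q - 1)       # 3120
e = 17                        # public exponent, coprime to phi
d = pow(e, -1, phi)           # private exponent (needs Python 3.8+)


def encrypt(m):
    return pow(m, e, n)


def decrypt(c):
    return pow(c, d, n)


c20 = encrypt(20)
c2 = encrypt(2)

# The "server" multiplies ciphertexts without ever seeing 20 or 2.
c_product = (c20 * c2) % n

print(decrypt(c_product))     # 40
```

It works because (20^e * 2^e) mod n equals (20 * 2)^e mod n, so decrypting the product gives 40 as long as the true result stays below n.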
What about crafting a sql query for timing attacks?
Mod parent up
I'm a dreamer, the world is my playpen. But hey, I'm a serious person, I can't dream all the time.
The case they have to the p vs np problem seems pretty trivial to me.
http://www.claymath.org/millennium/P_vs_NP/
So you solved the other four?
"A previous crypto scheme that allowed operations on encrypted data multiplied computing time by a factor of a trillion. This one adds only 15-26%."
So, basically you're saying that both schemes slow things down, right?
--PHB
Everything except the Riemann hypothesis. Fuck complex analysis.
Have to try it out first though, but this is one of the hardest security problems to solve: how do you trust others to maintain the servers your database is on? Even if there are attacks against this, it seriously raises the bar for attackers, especially if your attacker only has access to your database for a brief period of time. I can see attacks against this if your adversary can analyze your queries over a long period of time.
While this work is good, it has severe limitations that undermine their assumptions and considerations of their threat 2. If all architectures are compromised and the data is unencrypted at some point (on the client), the attacker can simply modify client side javascript to relay sensitive information to a third party.
I really wish there was an open review process for these works as big names on publication seem to result in reviewers jumping to conclusions about the technology.
Reading through the motivation for homomorphic encryption, isn't the same achievable by keeping the encrypted input on disk, the decrypted data only in memory, and the encrypted results back on disk, then building a single virtual machine image with the public key stored inside it and shipping that to an untrusted party? So long as the untrusted party can't get to the public key, the decrypted data is not accessible, no? Of course, one would package the encrypted input, the software to decrypt it, and the results all in one virtual machine image, and the only time any decrypted data is processed is when it is actually in memory. Another variant could be that even RAM holds the encrypted data, but there is a "memory bridge" that decrypts memory and feeds the CPU decrypted data, while the result written back to memory and disk is encrypted using the public key, which is also volatile within the virtual machine.