MIT Software Allows Queries On Encrypted Databases


Posted by Soulskill on Monday December 19, 2011 @09:59AM from the cutting-out-the-middleman dept.

Sparrowvsrevolution writes "CryptDB, a piece of database software that MIT researchers presented at the Symposium on Operating Systems Principles in October, allows users to send queries to an encrypted SQL database and get results without decrypting the stored information. CryptDB works by nesting data in several layers of cryptography (PDF), each of which has a different key and allows a different kind of simple operation on encrypted data. It doesn't work with every kind of calculation, and it's not the first system to offer this sort of computation on encrypted data. But it may be the only practical one. A previous crypto scheme that allowed operations on encrypted data multiplied computing time by a factor of a trillion. This one adds only 15-26%."

68 comments


Why? by InsightIn140Bytes · 2011-12-19 09:59 · Score: 2

Why not just encrypt the database files on HDD and memory directly? That way database can still act really fast and you can use any existing database software.
1. Re:Why? by Niobe · 2011-12-19 10:05 · Score: 4, Informative
  
  Reasons I can surmise:
  1 no decryption operation required on server
  2 the data can stay encrypted in transit
  1+2 = more security than on-disk encryption
2. Re:Why? by Anonymous Coward · 2011-12-19 10:06 · Score: 1
  
  Because you don't want to let the server ever see the decryption key.
3. Re:Why? by Anonymous Coward · 2011-12-19 10:07 · Score: 1
  
  Because in the traditional way, the DBAs can still see the contents of the DB.
4. Re:Why? by Anonymous Coward · 2011-12-19 10:07 · Score: 5, Informative
  
  Because you want to run your database in the Cloud(tm) for reliability purposes, and you don't want the provider to peek at your data.
5. Re:Why? by Rary · 2011-12-19 10:08 · Score: 5, Informative
  
  Why not just encrypt the database files on HDD and memory directly? That way database can still act really fast and you can use any existing database software.
  A few key phrases from TFA: "...a trick that keeps the info safe from hackers, accidental loss and even snooping administrators ... a useful trick if you need to perform operations on health care or financial data in a situation like cloud computing, where the computer (or the IT administrator) doing the calculations can’t always be trusted to access the private numbers being crunched".
  
  --
  "You cannot simultaneously prevent and prepare for war." -- Albert Einstein
6. Re:Why? by Kaz Kylheku · 2011-12-19 10:08 · Score: 3, Informative
  
  Because the database is on a remote server, and that is where the queries are executing!
  The model you're describing is that of the database running on the local machine. Data is encrypted between the database server and disk, but not encrypted in the database and not between the database and client. So the database is just a stock program running SQL queries or whatever in the usual way.
  But what if the database must be a remote server? That's how most people use databases, for the purpose of sharing data among many people, scalability, and availability.
  If the data in a database is naively encrypted, then the server cannot perform complex queries. The client must download entire tables, decrypt them, and perform the joins locally. Or so you would think.
  This is the part that these researchers seem to have attacked, from my understanding: somehow get the server to do useful queries on encrypted data without decrypting it, and without the monstrous overhead of the naive solutions.
7. Re:Why? by zill · 2011-12-19 10:14 · Score: 1
  
  Even if every row is individually encrypted the number of rows and the table layouts can still be leaked. Also by encrypting each row you've basically disabled all the relational operations.
  
  If you meant encrypting the entire database with one key then when the database is compromised all your data is compromised. With CryptDB only the data of currently logged-in users are compromised.
8. Re:Why? by InsightIn140Bytes · 2011-12-19 10:15 · Score: 1
  
  You can, however, connect to the server via ssh tunnel and then make a database connection from there. This way the data is encrypted with remote locations too.
9. Re:Why? by Kaz Kylheku · 2011-12-19 10:18 · Score: 4, Insightful
  
  Sorry, I don't see how that helps. The idea is that no program on the database server has the key to actually decrypt the data.
  The problem isn't only that you don't trust the network in between, but that you don't trust the database server admins.
10. Re:Why? by Anonymous Coward · 2011-12-19 10:21 · Score: 0
  
  You could indeed instead have your data on encrypted block storage.
  But the point is that in the Cloud(tm), storage is dirt cheap and bandwidth is horribly expensive, so you want the server to do the computation directly to save transfers.
11. Re:Why? by gman003 · 2011-12-19 10:23 · Score: 1
  
  It's for The Cloud. It lets you have the database hosted by people you don't fully trust, without compromising security.
12. Re:Why? by Obfuscant · 2011-12-19 10:53 · Score: 2
  
  This is the part that these researchers seem to have attacked, from my understanding: somehow get the server to do useful queries on encrypted data without decrypting it, and without the monstrous overhead of the naive solutions.
  I looked through the first few pages of the article. It is very much like how Unix passwords work. You don't decrypt the password in /etc/passwd to see if the user can log in; you encrypt his entered password with the same salt and see if there is a match. The trick is that here the DBMS is not doing the encrypting; there is a proxy that takes the performance hit, allowing the DBMS to run at full speed.
  The text comparison (LIKE) is done by encrypting each token in the DB text and allowing a token equality comparison. You can't, therefore, do a "LIKE 'Boston%'" to find things like "Bostonian".
  For comparison operations ( select * where salary > 60000) the encryption used maintains order. The encrypted value of 59,999 is less than the encrypted value of 60,000, for example. The paper seems to imply that the equality encryption (cleartext always encrypts to the same ciphertext, so an equality of ciphertexts means equality of cleartext) is optional. In reality, order always implies equality: if I search for $val>$x-eps and $val<$x+eps (where eps is the epsilon, or smallest interval in $x), the only answer can be where $val == $x.
  Hmmm. Just saying that, I realize that, unless the encryption of data in the DBMS is highly dependent on the actual data in the DB, eps must be the smallest step in the encrypted data, and since order is preserved, the only "encryption" is thus an offset (add or subtract a constant). Thus the DB encryption of data must be dependent on the range of data. I wonder if there is any useful information that can be extracted from that fact?
  For corporations, this system would be great. If the DBA didn't pre-define the salary column to be comparable, then nobody could do a "where salary > 100000" to find all the highly paid employees (or "bonus > 1000000", either).
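A rough sketch of the equality and whole-word matching described in this comment. It uses a keyed HMAC from Python's standard library as a stand-in for a deterministic cipher (an assumption for illustration only; a real DET layer must also be decryptable), and the key, rows, and helper names are all hypothetical:

```python
import hashlib
import hmac

# Hypothetical key that only the proxy holds; the DBMS never sees it.
KEY = b"hypothetical-proxy-key"


def det_tag(key: bytes, value: str) -> str:
    """Deterministic keyed tag: the same key and value always give the same tag."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize_tags(key: bytes, text: str) -> set:
    """Tag each whitespace-separated word (lowercased) so whole words can be matched."""
    return {det_tag(key, word.lower()) for word in text.split()}


# Proxy side: prepare what the server will store.
stored_rows = [
    {"city": det_tag(KEY, "Boston"), "notes": tokenize_tags(KEY, "Visited Boston twice")},
    {"city": det_tag(KEY, "Austin"), "notes": tokenize_tags(KEY, "Met a Bostonian there")},
]


# Server side: sees only tags and compares them for equality.
def server_equals(rows, column, tag):
    return [i for i, row in enumerate(rows) if row[column] == tag]


def server_has_word(rows, column, tag):
    return [i for i, row in enumerate(rows) if tag in row[column]]


# The proxy tags the query constants before sending them on.
eq_hits = server_equals(stored_rows, "city", det_tag(KEY, "Boston"))
word_hits = server_has_word(stored_rows, "notes", det_tag(KEY, "boston"))
print(eq_hits, word_hits)
```

Searching for the tag of "boston" finds the first row only. "Bostonian" was tagged as a different word, which is why a prefix pattern like 'Boston%' cannot work under this kind of per-token scheme.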
13. Re:Why? by Gription · 2011-12-19 10:54 · Score: 1, Insightful
  
  Can you say, "Off shoring" anyone?
14. Re:Why? by Shifty0x88 · 2011-12-19 11:09 · Score: 1
  
  We still have to trust the person we gave a password to, so that they can access the plaintext information in the first place, and as you probably know, they are the worst with securing passwords and are probably using computers that are not secure (not up-to-date, has malware/virus on their machine). =(
  Plus you now have to (in some cases) rewrite your tables to make up for the limitations of CryptDB: they referred to storing dates as separate fields quite a few times in their PDF (if comparing dates is important to your DB). You have to add more hard drives to store the DB, now inflated by keys and by encrypted data being larger than plaintext data. And possibly more servers to make up for the increased complexity of getting data out of that DB: we now have an unknown number of encryption layers as well as encryption schemes to decrypt each time, as well as encrypting any information we add back to the DB, and then securing it all up again (wrapping up the onion).
  So for the cloud, this means giving Amazon or whoever you are using to host your information a lot more $$$$$$$$... It solves a lot of problems but we still have that at least one person who holds the key, and who we must trust implicitly... hope they don't get fired...
  Just my 2 cents, but it doesn't mean much
15. Re:Why? by Synerg1y · 2011-12-19 11:24 · Score: 1
  
  None of this matters because...
  A. guy (OP) at the very top of this chain doesn't have a clue wtf he's talking about
  I guess the scenario they're talking about is a machine they can't trust NOT to be compromised, thus loading plaintext in memory (how currently on the fly works), that's a client to database interaction. Decrypting the data nowadays works just fine, no idea here either, old hardware perhaps? They're talking about accessing the data w/o ever decrypting it, thus there is nothing to steal at any point "the onion model". What I don't get is, this data needs to be presented at some point, and maybe the slashdot description is misleading here, how would this do anything to an admin's ability to access the database? It can thwart something like a malware program or a virus, but the admin or application is decrypting the data into viewable plaintext at some point. So I guess the point of doing it this way is performance and memory protection? What else? I feel I'm missing something big here as I think I saw a $20 mil price tag. The concept in itself is cool.
16. Re:Why? by jedwidz · 2011-12-19 11:24 · Score: 1
  
  There's a grain of a good idea there though. We could have an escrow-like system where two cloud providers are involved, with network connectivity between them that is fast and cheap. One provider is used for storage only, and one for processing only.
  By combining storage and session keys, unencrypted data would only be exposed briefly on the 'processing' provider while it's being queried. And where feasible, store individual values encrypted end-to-end, and only decrypt them on a trusted machine.
  Not perfect of course, but perhaps a useful compromise. It might also be possible to offset the extra bandwidth costs by combining the best deal on storage with the best deal on processing.
17. Re:Why? by Anonymous Coward · 2011-12-19 11:25 · Score: 0
  
  If it's sensitive enough to warrant encryption, what the Hell is it doing in the cloud?!
18. Re:Why? by gman003 · 2011-12-19 11:50 · Score: 1
  
  We still have to trust the person we gave a password to, so that they can access the plaintext information in the first place
  The point of it is that you have the passwords, and all "The Cloud" has is the theoretically-useless-and-indistinguishable-from-garbage ciphertext. You can tell The Cloud to perform certain operations and retrieve the data, but there's (theoretically) no way for the Cloud to know what, exactly, they have.
19. Re:Why? by lgw · 2011-12-19 11:55 · Score: 2
  
  comparison operations ( select * where salary > 60000) the encryption used maintains order. The encrypted value of 59,999 is less than the encrypted value of 60,000,
  
  I've never understood this bit. If, without the encryption key, I can compare two pieces of data to see which plaintext is less than the other, that seems like a huge hole. For normalized data in the DB, if some of the plaintext is known or guessable, I can probably guess all the values (since normalized values are generally represented by small integers). Heck, if I have "less than", can't I find the plaintext result of subtracting one plaintext value from another, without the key? That's effectively the same as decrypting English text.
  
  --
  Socialism: a lie told by totalitarians and believed by fools.
20. Re:Why? by Thiez · 2011-12-19 12:02 · Score: 1
  
  What's the problem? Encryption tends to be pretty damn reliable, as long as you don't leave the keys lying around there isn't really any objection to having the encrypted data in the cloud.
21. Re:Why? by marcosdumay · 2011-12-19 12:09 · Score: 0
  
  It is in the cloud because it is encrypted. Thus the sensitivity of the information doesn't matter.
  
  --
  Rethinking email
22. Re:Why? by Obfuscant · 2011-12-19 12:19 · Score: 1
  
  how would this do anything to an admin's ability to access the database?
  It wouldn't. The admin would have the permissions to do a "select * from ..." query. It would, however, present him with only encrypted data as a result. The decryption of the data itself takes place only on the client (or proxy), which we assume is outside his scope of control.
  Now, that does raise the question of leaks from the "comparison" columns. That is, those that are encrypted with an algorithm that always returns the same ciphertext and the ciphertext has the same ordering properties as the clear. It would seem that if he knows his own salary, for example, he could figure out all the rest, except that he won't know which row applies to him because the "employee name" data is also encrypted and he doesn't have that key, either.
23. Re:Why? by ChatHuant · 2011-12-19 13:36 · Score: 1
  
  If, without the encryption key, I can compare two pieces of data to see which plaintext is less than the other, that seems like a huge hole.
  Read the GP a bit more carefully (more exactly, the part where it says "The encrypted value of 59,999 is less than the encrypted value of 60,000"). Both the value in the query and the contents of the database are encrypted, and the operator of the database can't read either. The operator cannot compare two values without the encryption key - all he sees is an unknown query containing an unknown value, and a number of resulting records, also with unknown values.
  
  I'm not sure whether the query itself is encrypted - but I assume it is, because otherwise it would allow the (untrusted) database operator to run a kind of traffic analysis - in the example, if you ran a number of "where salary > X" queries with different values, the operator will notice that some records are always returned, and some not, and extract some ordering information.
24. Re:Why? by Obfuscant · 2011-12-19 14:11 · Score: 1
  
  Both the value in the query and the contents of the database are encrypted, and the operator of the database can't read either. The operator can not compare two values without the encryption key -
  
  Well, if the article I read was correct, that's not true. The "operator" (DBA) certainly can read the contents of the database. When he created the database, he was told that the "salary" column contains data that needs to be compared, so it will be encrypted with an algorithm that allows comparison of values.
  He can easily extract that column and compare the values contained therein, which will sort in exactly the same manner as the cleartext.
  The only question is, can he get any significant information from that data. He can certainly do a distribution analysis and probably a rudimentary guess at what the values are close to. (If you know the maximum salary -- CEO, e.g. -- then you should be able to make some pretty accurate guesses at the other numbers.) He might call up HR and ask them what his salary is supposed to be, and then attempt to identify the HR query to locate the ciphertext version that corresponds to his salary by watching the encrypted queries as they come through. ("While I'm on the phone, here's the only query that included what I think is the salary field in the response...")
  This is based on number theory. If a set of numbers with a minimum difference of 'eps' (for integers, eps==1, e.g., for salaries, eps is likely to be 0.01) is encoded in a way that maintains both the properties of "same cipher every time" and "maintains sort order", then you should be able to get a lot of information out of a large enough sample.
  For example, if three real salaries are $1, $2 and $3, then the cipher versions must have as many possible values between $1 and $2 as between $2 and $3, otherwise you could not encode all the possible values. Either you can do this analysis, or the client does some shuffling of data based on probabilities of occurrence and doesn't need to encode all possible values. I don't know which.
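To make the ordering leak in this sub-thread concrete, here is a toy order-preserving mapping. It assumes nothing about the scheme the paper actually uses: it is just a key-seeded table of random positive gaps, so ciphertexts rise whenever plaintexts do. The key, names, and salaries are made up.

```python
import random


def build_ope_table(key: int, domain_size: int) -> list:
    """Map 0..domain_size-1 to strictly increasing ciphertexts via key-seeded random gaps."""
    rng = random.Random(key)
    table, current = [], 0
    for _ in range(domain_size):
        current += rng.randint(1, 1000)  # every gap is >= 1, so order is preserved
        table.append(current)
    return table


KEY = 2011  # hypothetical secret held by the proxy
TABLE = build_ope_table(KEY, 200_001)  # covers salaries 0..200000 in whole dollars


def encrypt(salary: int) -> int:
    return TABLE[salary]


salaries = {"alice": 59_999, "bob": 60_000, "carol": 185_000}
column = {name: encrypt(s) for name, s in salaries.items()}

# Server side: a range filter works on ciphertexts alone.
threshold = encrypt(60_000)  # the proxy encrypts the constant in the query
over = sorted(name for name, c in column.items() if c >= threshold)

# The same property lets anyone who can read the column rank the rows.
ranking = sorted(column, key=column.get)
print(over, ranking)
```

The gaps vary from value to value, so an order-preserving mapping need not be a constant offset. Whoever can read the column still learns the full ranking without the key, which is the leak being debated here.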
25. Re:Why? by KazW · 2011-12-19 14:42 · Score: 1
  
  comparison operations ( select * where salary > 60000) the encryption used maintains order. The encrypted value of 59,999 is less than the encrypted value of 60,000,
  I've never understood this bit. If, without the encryption key, I can compare two pieces of data to see which plaintext is less than the other, that seems like a huge hole. For normalized data in the DB, if some of the plaintext is known or guessable, I can probably guess all the values (since normalized values are generally represented by small integers). Heck, if I have "less than", can't I find the plaintext result of subtracting one plaintext value from another, without the key? That's effectively the same as decrypting English text.
  Incorrect. The GP was speaking about an integer comparison, not a string comparison. Integer sorting is useful for sorting data records, sorting data by the calculated result of text data wouldn't be so useful.
  
  --
  Geeks don't grock information, they grep it.
26. Re:Why? by Kjella · 2011-12-19 15:37 · Score: 3, Insightful
  
  Well, strictly speaking, they don't need to know. The DBA - as in the person that makes sure the database is running, upgrades are done, backups are made and so on - is often not really supposed to be privy to all the information in the database. Probably the same kind of place you won't let your developers see production data, the development server has a different encryption key and the production key is set once during install, backed up in a safe and the production application server logged to hell and back including remote logging and audits. The only access anyone is supposed to have to the system is through the application that's enforcing permissions, logging and all that. I've only worked in relatively low-security environments but I'm perfectly aware that "SELECT * FROM [table]" circumvents anything and everything the application does to protect the data. In many environments that's fine and an accepted risk, if you're managing the database you should be sufficiently trusted to not go poking about. But I can easily see situations where that's not the case, without everybody jumping up and down about outsourcing. It's nothing personal in that they don't trust IT, but just like you in accounting don't want one person who can put in an invoice, approve it and take delivery you don't want one person from IT with all the keys to the castle. That this is the practical reality many places is because there hasn't been any other convenient enough way, it's not by design.
  
  --
  Live today, because you never know what tomorrow brings
27. Re:Why? by Anonymous Coward · 2011-12-19 23:14 · Score: 0
  
  +99 Insightful
  This is a major issue, not only for the organisation, but for the DBA - who might be suspected of acting on this information, even if s/he did not.
Re:MIT is overrated by Anonymous Coward · 2011-12-19 10:05 · Score: 4, Funny

Mine too... Perhaps AC isn't the way to go.
a little bit strong claim by Trepidity · 2011-12-19 10:07 · Score: 4, Informative

This is not really the first practical such system, nor have all previous systems been a trillion times slower. As seems to be a pattern with MIT press releases, the press release makes exaggerated claims, but the paper itself is actually quite good and gives proper credit where it's due, discussing a number of previous systems that implement related functionality, and some existing algorithms from the literature that they borrow and implement directly in CryptDB.

--
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
1. Re:a little bit strong claim by martin-boundary · 2011-12-19 10:35 · Score: 1
  
  A trillion maybe not, but certainly much, much, much, much slower than an unencrypted calculation. If you factor in cache effects, CPU stalls, the need to do a shitload of work just to unencrypt each value before using it, then performing a simple arithmetic mean over the encrypted rows of a database table could easily add up to hundreds of thousands of wasted cycles per item, compared with doing the same calculation on an unencrypted chunk of memory.
  There's just no question about it. It's going to be dog slow any way you look at it.
2. Re:a little bit strong claim by Anonymous Coward · 2011-12-19 10:36 · Score: 5, Insightful
  
  It's a fundamental tension between the scientists and the PR departments. I see this where I work (at a DoE national lab). Basically, we scientists publish cool results, and submit them to the PR department as candidates for press releases. The PR department of course tries to jazz it up as much as they can. So we go back-and-forth with them for a bit, trying to compromise on something that isn't factually wrong while still being accessible to the general public, and giving people a good feel for why our work is important.
  
  Then the press release is interpreted by media outlets, which dumb it down even more and stretch the claims even further. After even just two or three levels of this, honest sensible papers turn into grandiose hyperbole. A nice theoretical result on metamaterials becomes "scientists invent invisibility cloak"; work on new semiconductors becomes "world's fastest transistor"; and a paper on tentative correlations between X and Y becomes "X causes Y!" Believe me when I say that most scientists are embarrassed when they see their results exaggerated and misinterpreted like this.
  
  This is not meant to excuse such behavior. Some PR departments are better than others. At some institutes there is too much pressure from on-high to be seen in the media as being innovative, revolutionary, and all those other buzzwords. But at the end of the day, scientists have to have the courage (and the authority) to prevent press releases from going out that are so stretched as to be factually incorrect.
3. Re:a little bit strong claim by reve_etrange · 2011-12-19 10:39 · Score: 2
  
  It seems typical of most universities' press releases. They have PR divisions which troll the research faculty for new developments they can turn into whiz-bang popularized "articles."
  I think that it's sort of the paradigm for how things are done at most large institutions: the researchers can't be bothered or don't have time to write popular accounts, do extraneous paperwork or file patents, so others are made to do it for them. The result is extraordinary claims in the press releases at best, and serious clerical mistakes or invalid patents at worst.
  
  --
  .: Semper Absurda :.
4. Re:a little bit strong claim by icebraining · 2011-12-19 10:52 · Score: 2
  
  the need to do a shitload of work just to unencrypt each value before using it
  I think the point of the system is that you don't need to unencrypt the values at all to perform the calculations. It's homomorphic encryption.
  
  --
  Dilbert RSS feed
5. Re:a little bit strong claim by Anonymous Coward · 2011-12-19 11:42 · Score: 1, Funny
  
  I can't see any Christian data center allowing this homo-encryption. Hate the sin, love the sinner, blah blah blah.
  Now we see what Alan Turing was really up to when he invented the computer!
6. Re:a little bit strong claim by Trepidity · 2011-12-19 13:38 · Score: 1
  
  Yeah, I agree on the last point, though pragmatically I make somewhat of a distinction between tenured and untenured faculty. If you're untenured at a place like MIT, there's huge pressure to get publicity and do Earthshattering Research, so I can cut overhyping some slack. I hold tenured faculty to a higher standard, though, because they don't have to overhype their research to keep their job. Looks like in this case one of the faculty co-authors is untenured, so maybe he should get some slack on account of his journeyman status.
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
7. Re:a little bit strong claim by martin-boundary · 2011-12-19 14:20 · Score: 1
  
  Sure, but you can never fully support the 4 arithmetic operations homomorphically, as then your encryption map would be an isomorphism, i.e. the "encryption" would be trivial to break.
  So something must always give, at best you might have a system where some operations work for some set of numbers, but will not work for all numbers. If you don't know anything about what the encrypted dataset contains, you won't be able to know for certain if your homomorphic calculation is even correct. And if you restrict to fewer than the 4 operations, your system will be severely limited.
8. Re:a little bit strong claim by Anonymous Coward · 2011-12-19 14:58 · Score: 0
  
  U work at doe national lab? I find it unlikely! Based on ur comment! Maybe as a janitor... Or at ORNL. Snort.
9. Re:a little bit strong claim by Anonymous Coward · 2011-12-20 01:36 · Score: 0
  
  You could have several columns for the same data under different homomorphisms, I think
10. Re:a little bit strong claim by martin-boundary · 2011-12-20 10:56 · Score: 1
  
  Yes, but how is that going to be useful (aside from having different encryption for the columns)? You'll have to anticipate the usage for the columns in your schema design, e.g. column 1 contains numbers that can only be added together but not multiplied, column 2 has numbers that can only be multiplied together but not added, etc.
  One reason why full arithmetic is important is that the designer doesn't have to know what exactly the user will want to do with the data later on.
Order preserving encryption by vlm · 2011-12-19 10:28 · Score: 3, Interesting

Order Preserving Encryption, how is it implemented? The paper, on page 4, simply states that it exists and points to an article that I have no access to.
I'm not understanding how this defends against "known plaintext" attacks. Perhaps it's not intended to. Like I said, I have no access to the footnoted OPE article. So, let's say you've got a medical database of private health care info, where the diagnosis is a column. If you can sort it, all the folks with "aids" sort at the top, right above the "alcoholism" diagnosis, with the "worms, intestinal" and I suppose the "zoophilia" people at the bottom.
I suppose the solution is that, unless there is a business need to sort by diagnosis, you don't use OPE for that column; you use DET or, if there is no need for "group by", RND.

--
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
1. Re:Order preserving encryption by Tacvek · 2011-12-19 11:59 · Score: 1
  
  The scheme they use is reference 4, which is available online at http://www.cc.gatech.edu/~aboldyre/papers/bclo.pdf (found with simple Google search for the paper's title: Order-Preserving Symmetric Encryption)
  My bigger issue is with a chosen plain-text attack. If a column is currently stored in say DET, and you have full view of the database, and you arrange for say an insertion of a row with some specific value, now you know all the rows with that value. Even if you can only arrange for a "select * from diagnosis where ICD9Code='042'" , even if it has a couple of unrelated joins, there will only be a couple of strings, and you can probably narrow it down to the one string you want. With that you now know which rows indicate a diagnosis of AIDS. A serious information leak. Combine that with a few other chosen plaintext queries, like one involving the patient Joe Smith, and before long, you have enough to check if Joe Smith has AIDS!
  
  --
  Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
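The chosen-plaintext concern raised in this comment can be illustrated with a small simulation. It is a sketch under stated assumptions: a keyed HMAC stands in for the DET layer, the attacker can read the raw column and get one known value inserted, and all codes, row ids, and names are invented.

```python
import hashlib
import hmac

# Hypothetical proxy key; the attacker never learns it.
PROXY_KEY = b"hypothetical-key-the-attacker-never-sees"


def det(value: str) -> str:
    """Stand-in for a DET layer: equal plaintexts give equal ciphertexts."""
    return hmac.new(PROXY_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


# What the server stores: (row id, encrypted diagnosis code).
diagnosis_column = [(1, det("250")), (2, det("042")), (3, det("401")), (4, det("042"))]

# The attacker arranges, through the normal application, for a row with a known code.
PLANTED_ID = 99
diagnosis_column.append((PLANTED_ID, det("042")))

# Reading the raw column, the attacker looks up the planted row's ciphertext...
planted_ciphertext = dict(diagnosis_column)[PLANTED_ID]

# ...and matches every other row against it, without ever touching the key.
exposed_rows = [
    row_id
    for row_id, ciphertext in diagnosis_column
    if ciphertext == planted_ciphertext and row_id != PLANTED_ID
]
print(exposed_rows)
```

Whether this works against a given deployment depends on whether the column is actually exposed at the DET level at the time, which is the point the replies below take up.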
2. Re:Order preserving encryption by Anonymous Coward · 2011-12-19 13:26 · Score: 1
  
  I recall an enterprise database system which (used to, not sure if it still does) allowed you to do an "explain" of a SQL query against a view even though you had no access rights to the underlying tables. The "explain" output showed estimated row counts of intermediate results even though some of those results would have been eliminated because the view limited your access (such as by checking your role in the organization and only letting you see your region's sales numbers). As the statistics got better, it was sometimes possible to reasonably infer interesting stuff using clever queries with joins and the like based on estimated row counts of interesting intermediate results.
3. Re:Order preserving encryption by Fnord666 · 2011-12-19 14:32 · Score: 1
  
  My bigger issue is with a chosen plain-text attack. If a column is currently stored in say DET, and you have full view of the database...
  If I understand correctly, one of the functions of the layering of crypto is to prevent an attacker from having a full view of the database. DET would be layered below an RND outer layer, preventing you from gaining that view.
  
  --
  'The tyrant will always find pretext for his tyranny.' - Aesop's Fables
4. Re:Order preserving encryption by Anonymous Coward · 2011-12-19 17:19 · Score: 0
  
  I'm betting the ones diagnosed with a sexual preference for animals would appreciate keeping that encrypted, if only because of the consequences to their animals in a world unenlightened about such a co-orientation.
5. Re:Order preserving encryption by Tacvek · 2011-12-20 14:17 · Score: 1
  
  The design is that once any legitimate query needs access to the DET layer, the proxy has the database replace the whole column with the one with RND stripped off, leaving just DET. They mention that it would be possible to re-encrypt back to RND if, after a long enough period of time, no further queries occurred that needed it.
  It would completely kill performance if they always restored the RND layer, since much of the overhead of the system comes from pulling and decrypting columns, so that operation must be as infrequent as possible. When the columns are already using the required level, the overhead is limited to the latency of the proxy, plus the time required to encrypt the constants in the SQL and translate the table and column names to the obfuscated ones.
  Thus in steady state, the amount of work done by the database server itself is nearly identical to what would be done on the same database without encryption. That is the key to the speed.
  
  --
  Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
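The layer stripping described in this comment can be sketched with a toy cipher built from HMAC-SHA256. That choice is an assumption made so the example runs with only the standard library. It is not the construction from the paper, it is not secure, and the keys and helper names are hypothetical.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over iv plus a counter (illustration only)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def _strip(key: bytes, ciphertext: bytes) -> bytes:
    """Remove one layer: split off the 16-byte IV and XOR the keystream back out."""
    iv, body = ciphertext[:16], ciphertext[16:]
    return _xor(body, _keystream(key, iv, len(body)))


def det_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Deterministic layer: the IV is derived from the plaintext, so equal inputs match."""
    iv = hmac.new(key, b"iv" + plaintext, hashlib.sha256).digest()[:16]
    return iv + _xor(plaintext, _keystream(key, iv, len(plaintext)))


def rnd_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Randomized layer: a fresh random IV makes equal inputs look unrelated."""
    iv = os.urandom(16)
    return iv + _xor(plaintext, _keystream(key, iv, len(plaintext)))


DET_KEY, RND_KEY = b"hypothetical-det-key", b"hypothetical-rnd-key"
values = [b"042", b"250", b"042"]

# Stored form: DET on the inside, RND wrapped around it.
onion_column = [rnd_encrypt(RND_KEY, det_encrypt(DET_KEY, v)) for v in values]
outer_equal = onion_column[0] == onion_column[2]

# An equality query arrives: the proxy releases RND_KEY for this column only,
# and the server rewrites the whole column once with the outer layer removed.
det_column = [_strip(RND_KEY, c) for c in onion_column]
inner_equal = det_column[0] == det_column[2]

# Only the proxy, which keeps DET_KEY, can go the rest of the way to plaintext.
recovered = [_strip(DET_KEY, c) for c in det_column]
print(outer_equal, inner_equal, recovered)
```

After the one-time rewrite, equality comparisons run on `det_column` with no further cryptographic work on the server, which matches the steady-state argument in the comment.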
Re:MIT is overrated by Unoriginal_Nickname · 2011-12-19 10:31 · Score: 2

"MIT is overrated because I can't get into MIT."
Yeah. Keep telling yourself that.
very useful technology by Anonymous Coward · 2011-12-19 10:47 · Score: 1

Performing searches -- and other operations -- on encrypted data has some big potential to help protect our privacy in an age where people are losing control of their personal information. Since the data is *never* decrypted by the server, no information will be leaked even if the communication channel between the client and the server is compromised, or the server itself is hacked. If the query is also encrypted before sending, which is the case for most schemes, the server does not learn anything about the operation or the contents of the data. It makes central storage of data a LOT less risky (take note Sony).
In addition to searching, researchers have had some success in doing other operations on encrypted data, such as multiplication and addition. This means that if you encrypt the value 20, and multiply it by the encryption of 2, the decryption of the result will be 40! Pretty amazing, if you ask me. While processing power is still a major problem for most of these schemes (they are far from ready for production-level data volumes), the next 10 years in this field will surely be very exciting.
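The 20 times 2 example can be reproduced with textbook RSA, which happens to be multiplicatively homomorphic. This is a sketch with tiny hand-picked primes, not the scheme any particular system uses; unpadded RSA at this size is trivially breakable.

```python
# Textbook (unpadded) RSA with toy parameters, for illustration only.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


c20 = encrypt(20)
c2 = encrypt(2)

# The party doing this multiplication needs only the ciphertexts and n.
product_ciphertext = (c20 * c2) % n

result = decrypt(product_ciphertext)
print(result)
```

The trick works because (20^e)(2^e) = (20*2)^e mod n, so the product of the ciphertexts is itself a valid encryption of 40. It only holds while the true product stays below n.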
And.. timing attacks? by Anonymous Coward · 2011-12-19 10:50 · Score: 0

What about crafting a SQL query for timing attacks?
1. Re:And.. timing attacks? by Anonymous Coward · 2011-12-19 11:02 · Score: 0
  
  Why are you so gay?
2. Re:And.. timing attacks? by maxwell demon · 2011-12-19 11:15 · Score: 1
  
  Can you even formulate a query if you don't have the key?
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
3. Re:And.. timing attacks? by kvvbassboy · 2011-12-19 12:22 · Score: 1
  
  No. Encrypted queries operating directly on an encrypted database. Sounds really rad! A snooping third party will only see random gibberish.
4. Re:And.. timing attacks? by Obfuscant · 2011-12-19 12:29 · Score: 1
  
  Can you even formulate a query if you don't have the key?
  As DBA:
  show databases;
  ...list of databases returned.
  use patientData; show tables;
  "Database changed" ... list of tables returned.
  select * from patients;
  ... encrypted data returned. Damn...
  The DBMS is unmodified. It's the data that's encrypted in a way that SQL can deal with it. Mostly. Redefined SUM and other math ops.
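  For the curious, a toy Python sketch of how a "redefined SUM" can work on ciphertexts, using the Paillier cryptosystem: multiplying ciphertexts adds the plaintexts underneath. The primes are tiny and the function names are invented for illustration; treat it as a sketch of the idea, not the actual server-side code.

```python
import math
import secrets

# Toy parameters (illustration only; real keys use very large primes).
P, Q = 61, 53
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # modular inverse of PHI mod N (Python 3.8+)


def paillier_encrypt(m: int) -> int:
    """Encrypt m (0 <= m < N) with generator g = N + 1 and fresh randomness."""
    while True:
        r = secrets.randbelow(N - 1) + 1  # candidate in 1..N-1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def paillier_decrypt(c: int) -> int:
    u = pow(c, PHI, N2)
    return ((u - 1) // N * MU) % N


def encrypted_sum(ciphertexts) -> int:
    """What the server computes in place of SUM: a product mod N^2."""
    total = 1  # 1 decrypts to 0, so it is the neutral element here
    for c in ciphertexts:
        total = (total * c) % N2
    return total


if __name__ == "__main__":
    column = [paillier_encrypt(v) for v in (120, 75, 300)]
    print(paillier_decrypt(encrypted_sum(column)))  # 495
```

  The server only ever multiplies opaque numbers; the key holder decrypts the single result. The true sum has to stay below N, which is why real moduli are huge.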
5. Re:And.. timing attacks? by maxwell demon · 2011-12-19 19:29 · Score: 1
  
  In that case, if I were using such a service I'd make sure that my database name and tables are generated from the real ones with a salted hash (where the salt never leaves my system). Therefore the DBA would not see a database "patientData" but a database "A3FE5653A554ADEC" or similar. Of course if the customer is a hospital, they could guess that it contains patient data. But then, it also could contain staff information, financial data, data about contracts with suppliers of medical equipment, ...
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
Informative by Esteanil · 2011-12-19 11:02 · Score: 1

Mod parent up

--
I'm a dreamer, the world is my playpen. But hey, I'm a serious person, I can't dream all the time.
meh. by Anonymous Coward · 2011-12-19 11:13 · Score: 0

The case they have for the P vs NP problem seems pretty trivial to me.
http://www.claymath.org/millennium/P_vs_NP/
Re:MIT is overrated by Anonymous Coward · 2011-12-19 11:27 · Score: 0

So you solved the other four?
so, basically... by mtrachtenberg · 2011-12-19 11:39 · Score: 1, Flamebait

"A previous crypto scheme that allowed operations on encrypted data multiplied computing time by a factor of a trillion. This one adds only 15-26%."
So, basically you're saying that both schemes slow things down, right?
--PHB
1. Re:so, basically... by Thiez · 2011-12-19 12:05 · Score: 1
  
  Yes. What is your point?
2. Re:so, basically... by bytesex · 2011-12-19 22:43 · Score: 1
  
  It's probably that his website with pictures of his dog, backed by MySQL, is blazing fast and doesn't need this, and that therefore he thinks that *nobody* will need this.
  
  --
  Religion is what happens when nature strikes and groupthink goes wrong.
Re:MIT is overrated by Unoriginal_Nickname · 2011-12-19 11:50 · Score: 1

Everything except the Riemann hypothesis. Fuck complex analysis.
Looks very useful by Sean · 2011-12-19 18:07 · Score: 1

Have to try it out first though, but this is one of the hardest security problems to solve: how do you trust others to maintain the servers your database is on? Even if there are attacks against this, it seriously raises the bar for attackers, especially if your attacker only has access to your database for a brief period of time. I can see attacks against this if your adversary can analyze your queries over a long period of time.
Severe Limitations not handled by Anonymous Coward · 2011-12-20 03:27 · Score: 0

While this work is good, it has severe limitations that undermine their assumptions and considerations of their threat 2. If all architectures are compromised and the data is unencrypted at some point (on the client), the attacker can simply modify client-side JavaScript to relay sensitive information to a third party.
I really wish there was an open review process for these works as big names on publication seem to result in reviewers jumping to conclusions about the technology.
Alternative to homomorphic encryption by Anonymous Coward · 2011-12-22 08:22 · Score: 0

Reading through the motivation for homomorphic encryption, isn't the same achievable by packaging encrypted data input on disk, the decrypted data only in memory, and the encrypted results back on disk, and creating a single virtual image with the public key stored in the virtual image and shipped to an untrusted party? So long as the untrusted party can't get to the public key, the decrypted data is not accessible, no? Of course, one would package the encrypted input, the software to decrypt it and the results all in one virtual machine image, and the only time any decrypted data is processed is when it is actually in memory. Another variant could be that even RAM holds the encrypted data, but there is a "memory bridge" that decrypts memory and feeds the CPU decrypted data, while the result written back to memory and the disk is encrypted using the public key, which is also volatile within the virtual machine.