Microsoft To Provide New Encryption Algorithm For the Healthcare Sector
An anonymous reader writes: The healthcare sector gets a hand from Microsoft, which will release a new encryption algorithm that lets developers handle genomic data in encrypted form, without needing to decrypt it, thereby minimizing security risks. The new algorithm is dubbed SEAL (Simple Encrypted Arithmetic Library) and is based on homomorphic encryption, which allows mathematical operations to be run on encrypted data, yielding the same results as if they had been run on the cleartext version. Microsoft will create a new tool and offer it as a free download. They've also published the theoretical research. For now, the algorithm can handle only genomic data.
Continuing the fine tradition of not RTFA around here, I didn't read the research paper, but I did skim Wikipedia's entry.
Nowhere do I see any mention of authenticity. This is as important as confidentiality and integrity. I'm not saying there isn't a solution (I'm not a cryptographer) but I wonder if anyone has any insight or links to a solution if it exists.
Here's the scenario. Homomorphic encryption lets us keep the data constantly encrypted, maintaining confidentiality. Ok, that's cool for data breaches, we stay much better protected from loss of confidentiality.
But what if a malicious actor purposely performs an operation on the data? Changing genomic data in this case might mess up diagnoses, research, etc. Future applications could be things like medical billing: what if it's easy to tack on another bill, even when you can't read the previous bills because they're encrypted? Is there any mechanism that checks that the operation we perform on the encrypted data was authorized, i.e., that I am a manager allowed to do the operation and I specifically consent to performing the operation? Typical integrity checks wouldn't catch this; integrity is about the correctness of the data, so such a check would only verify that the computation was performed correctly and then move on. Authenticity is a different issue.
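To make the billing worry concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the "encryption" is a toy additive mask (not SEAL, and not secure), and `tag_request` / `apply_add` are hypothetical names for an application-layer check. It shows that whoever holds a ciphertext can add to it without any key, and that one possible countermeasure lives outside the cipher: refuse any operation that doesn't carry a valid MAC from an authorized party.

```python
import hashlib
import hmac

MODULUS = 2**32  # toy value space for amounts


def mask(amount: int, key: int) -> int:
    """Toy additively homomorphic 'encryption': NOT secure, illustration only."""
    return (amount + key) % MODULUS


def unmask(ciphertext: int, key: int) -> int:
    return (ciphertext - key) % MODULUS


def tag_request(auth_key: bytes, record_id: str, op: str, operand: int) -> bytes:
    """MAC over the requested operation, issued by an authorized party."""
    message = f"{record_id}:{op}:{operand}".encode()
    return hmac.new(auth_key, message, hashlib.sha256).digest()


def apply_add(ciphertext: int, record_id: str, operand: int,
              tag: bytes, auth_key: bytes) -> int:
    """Apply a homomorphic add only if the request carries a valid tag.

    This sketch does not handle replay of an old, valid request.
    """
    expected = tag_request(auth_key, record_id, "add", operand)
    if not hmac.compare_digest(tag, expected):
        raise PermissionError("operation was not authorized")
    return (ciphertext + operand) % MODULUS
```

Without the gate, `(ciphertext + 50) % MODULUS` unmasks to the original amount plus 50, and the attacker never learns the amount or the key. The gate only helps if every code path that touches ciphertexts goes through it, which is exactly the "developers have to be careful" case.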
I would suspect Microsoft Research thought of this. My question is: is there a countermeasure that can be described as part of the algorithm? Or is the countermeasure "be careful with any software that uses this algorithm, make sure it checks authenticity before applying operations!"? If the solution is for developers to be careful, I'm not convinced the algorithm made anything better. Many developers do not know cryptography and may assume safety, or may not have the time and resources due to a manager driving a hard deadline; in these cases, "we use MS's algorithm!" can get advertised without any increase in safety (and possibly even a decrease, as some might look to this as a crutch and a reason why they can cut corners...).
Your initial thoughts are wrong.
This is a type of encryption algorithm known as homomorphic encryption, which allows one to operate on encrypted data without decrypting it.
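For anyone who hasn't seen what "operate on encrypted data" looks like, here is a minimal sketch using textbook Paillier, which is additively homomorphic. To be clear about the assumptions: this is not SEAL and not the scheme in the paper, the primes are toy-sized, and the function names are mine. Multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts.

```python
import math
import secrets


def keygen(p: int = 1009, q: int = 1013):
    """Toy Paillier key: tiny primes, illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """c = (n+1)^m * r^n mod n^2, with fresh random r coprime to n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    lam, mu = priv
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: needs only the public modulus, no secret key."""
    return (c1 * c2) % (n * n)
```

Usage: with `n, priv = keygen()`, decrypting `add_encrypted(n, encrypt(n, 20), encrypt(n, 22))` gives the sum of the two plaintexts (mod `n`), even though the party doing the addition never saw either number.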
This has no bearing on the strength of the encryption against an adversary.
Practical homomorphic encryption (like this MSFT product) is based on simplified encryption (to make it more practical, duh). AFAICT in this case the MSFT product is based on a derivative of the YASHE (yet another somewhat homomorphic encryption) scheme. This is a bit more like steganography than pure encryption, as it "hides" the encryption in a ring and requires lattice theory to generate a unique decryption (meaning you can only perform a few addition/multiplication operations before you have to decrypt and re-encrypt). Although theoretically you can make this encryption "strong" by selecting different parameters (and introducing more overhead and lower error bounds), at some point there is a fundamental limit related to the entropy of the data set itself (which for medical-like data is pretty low entropy).
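The "only a few operations" limit is easier to see in code. Below is a minimal sketch using a toy integer-based somewhat-homomorphic scheme on single bits. It is an assumption-laden stand-in, not YASHE or SEAL, and the parameters are far too small to be secure. Each ciphertext carries a small noise term; addition adds the noise, multiplication multiplies it, and once the noise reaches the secret modulus, decryption is no longer reliable.

```python
import secrets

P = 1_000_003  # secret odd modulus (toy size, illustration only)


def encrypt(bit: int, p: int = P, max_r: int = 15) -> int:
    """c = p*q + 2*r + bit, so the fresh noise 2*r + bit is at most 31."""
    q = secrets.randbelow(2**20) + 1
    r = secrets.randbelow(max_r) + 1
    return p * q + 2 * r + bit


def noise(c: int, p: int = P) -> int:
    """Noise term of a ciphertext (computing it requires the secret p)."""
    return c % p


def decrypt(c: int, p: int = P) -> int:
    """Correct only while the accumulated noise is below p."""
    return (c % p) % 2


def max_safe_factors(fresh_noise: int, p: int = P) -> int:
    """How many worst-case fresh ciphertexts can be multiplied together
    before the noise product reaches p."""
    factors, total = 0, 1
    while total * fresh_noise < p:
        total *= fresh_noise
        factors += 1
    return factors
```

Here `a + b` decrypts to the XOR of the bits and `a * b` to the AND, but the noise of `a * b` is the product of the two noises, so the budget runs out after a handful of multiplications. Larger parameters buy more depth at the cost of bigger ciphertexts, which is the overhead trade-off mentioned above.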
And then there is the (in)famous sum-product puzzle, which is interesting in that it illustrates how seemingly impossible obfuscation can be removed by the most innocuous oracle queries.
What will break this type of encryption is not brute force but, say, examining distributional anomalies in medical data to build a dictionary of sorts. Also, since this appears to be some sort of "ECB-like" encryption (most data is encrypted the same way so you can operate on it), we all know how weak that can be in some situations...
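Here is a minimal sketch of that dictionary attack, taking the "ECB-like" premise as given (a scheme that uses fresh randomness per encryption would not produce repeating ciphertexts). The deterministic cipher is stood in for by a keyed HMAC token, and the field values and the public frequency ranking are hypothetical. The attacker never touches the key; counting repeats and lining them up with known population frequencies is enough.

```python
import hashlib
import hmac
from collections import Counter


def deterministic_token(key: bytes, value: str) -> bytes:
    """Stand-in for an ECB-like cipher: equal plaintexts give equal outputs."""
    return hmac.new(key, value.encode(), hashlib.sha256).digest()


def frequency_attack(tokens, public_ranking):
    """Map ciphertexts to plaintext guesses by matching frequency ranks.

    public_ranking lists the possible plaintexts from most to least common,
    as an attacker might know from published population statistics.
    """
    ranked_tokens = [tok for tok, _ in Counter(tokens).most_common()]
    return dict(zip(ranked_tokens, public_ranking))
```

With a low-entropy field (say four possible values with skewed frequencies), the recovered dictionary maps every ciphertext back to its plaintext. That is the reason de-identification matters in addition to encryption.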
This is why in most medical research, data must be de-identified, not merely encrypted. Not that that fixes things by a long shot, but it's better than simply encrypting and hoping...