Building Better Systems

#4: Alex Malozemoff – New attack on homomorphic encryption libraries: what does it mean?

Episode Summary

Principal Researcher, Alex Malozemoff, walks us through what homomorphic encryption is, what CKKS is, and how a recent new attack on CKKS will impact progress on homomorphic encryption.

Episode Notes

Principal Researcher, Alex Malozemoff, walks us through what homomorphic encryption is, what CKKS is, and how a recent new attack on CKKS will impact progress on homomorphic encryption.

You can watch this episode on our Youtube channel.

On the Security of Homomorphic Encryption on Approximate Numbers by Baiyu Li and Daniele Micciancio

Episode Transcription

Intro (00:02):

Designing manufacturing, installing and maintaining the high-speed electronic computers, the largest and most complex computers that are built.

Joey (00:22):

Hello. Welcome to another episode of building better systems, the podcast, where we explore tools and approaches to make our software more correct and secure today, we're doing an episode that's a little different, we're exploring a more timely topic. We're exploring a paper that recently came out in the homomorphic encryption space called On The Security of Homomorphic Encryption on Approximate Numbers. Uh, the paper is by, Baiyu Li and Daniele Micciancio. So joining us to discuss this, uh, new paper today is Alex Malozemoff, a principal researcher at Galois focusing mostly on secure computation. Thanks for joining us today, Alex. Thanks for having me. So today we're going to talk about the recent new attack on the CKKS homomorphic encryption scheme. But before we dive into the details of the attack, can we talk a bit about what homomorphic encryption is?

Alex (01:12):

Sure. So homomorphic encryption is a form of encryption, which allows one to compute over the cipher texts. So with normal encryption, you encrypt a message and then you can decrypt the message and get back that same message with homomorphic encryption. You can encrypt a message and then compute in theory, whatever you want on that encrypted message and then decrypt the results and get back what would have been the computation on the original message.

Shpat (01:41):

And when you say compute, it means you can manipulate it. You can learn things about it without learning things about it. Um, you can get statistically relevant information out of a set of those messages. What does computing on data that's encrypted mean?

Alex (01:56):

So, uh, yeah, so in, in theory it means you can do any type of computation that you would do like written as a C program. Say, um, more formally you can view computation as what's called an arithmetic circuit, which is basically additions and multiplications. So you can do any sequence of addition and multiplication on one or more ciphertext.

Shpat (02:22):

Why would you want to do that as opposed to encrypted something, could transporting it or storing it in encrypted mode. And then when you need to either manipulate, compute on it, decrypted, why, why would you go for something like homomorphic encryption?

Alex (02:38):

Sure. So the advantage here is you are never revealing either the input or intermediate values in the clear. So, um, a potential use case where this is I'm a company and I want to do heavyweight analysis. I don't have the necessarily the resources for it. So I wanted to do that all in the cloud, but I don't trust the cloud provider with my data. So in theory, if homomorphic encryption reached a point where it was sufficiently performance, I could encrypt my data, send it to the cloud provider, the cloud provider can then do the computation. I tell it to do on that data, get a resulting ciphertext, send me back the cipher text. I can decrypt it and, um, see the results. And now the cloud provider never sees any of my data in the clear that, so that's kind of like the, uh, one of the most compelling use cases for homomorphic encryption.

Joey (03:38):

So that sounds incredibly promising. And there's a number of groups implementing libraries of homomorphic encryption primitives, right?

Alex (03:47):

Yeah. So I would say over the last, let's say 10 years, I guess homomorphic encryption has only really been around since I think 2008, but so, uh, but, but in the last say 10 years, there's been a huge number of libraries or a huge number of schemes and performance improvements to these games and libraries, um, as well as various startups and, and major companies, um, pushing a more free version. Um, so it's definitely a rapidly growing space in the crypto world.

Joey (04:18):

And C K K S, which we're here to talk about today is a scheme. So it's not any specific implementation, but it's a way of doing computations to accomplish homomorphic. Yeah,

Alex (04:31):

Correct. It is a particular form of homomorphic encryption that has certain properties, which I see more we'll get into. Um, I think it was published in 2017, so it's fairly recent, but due to various, um, performance characteristics, it has, uh, it has seen widespread adoption throughout the community. Um, there is an ongoing standardization effort in homomorphic encryption and CPTs is one of the important schemes that is trying to be standardized. So it's, it's definitely a very important, um, definitely a very important homomorphic encryption scheme.

Joey (05:09):

And so the paper that we're here to talk about, if I understand this correctly, there is a, there's a, these researchers have found essentially avenues of attack or break this scheme and a number of open source libraries that implement this scheme. Is that right?

Alex (05:27):

Correct. So they are able to, they have a theoretical attack on the scheme. So on the kind of pen and paper description of homomorphic encryption scheme, and they demonstrate its practicality by taking things three or four, um, opensource, homomorphic encryption libraries that implement this game and demonstrating that the attack works. So then they, so they both show that the attack is theoretical attack on the scheme and is practical to implement and deploy.

Joey (05:59):

And has this attack fully rendered the scheme useless or is it limited the situations and under which it's safe to use the scheme?

Alex (06:09):

So I think it's fairly subtle, but I would lean towards the latter, although you would have to make assumptions that I would argue you shouldn't be making. So the attack in particular depends on the adversary, seeing the decrypted results. So the adversary is allowed to, um, encrypt whatever they want, which normally in homomorphic encryption, you can, it's a public key encryption scheme. So anyone can encrypt something if they have the public key. Um, but only the secret key holder can decrypt this particular tech relies on the attacker being able to see the decrypted results. So, uh, at a first glance, it seems okay, maybe this isn't too realistic because why would the results be available to the attacker? Uh, but there, you can imagine many scenarios in which this would be the case, if the results are part of public research. And so those results are published somewhere. Um, any type of scenario where the results of the computation are made, made public, uh, could potentially be an Avenue to launch this attack.

Joey (07:22):

So if you're going to use this scheme, you would need to be very certain that this was not going to be a problem. So it's really raised the bar for the guarantees you need to have in order to apply it.

Alex (07:33):

Yeah, exactly. It's no longer a cryptographic guarantee on the scheme that you can rely on and deploy in any scenario you want, you would have to have some other measures to control the disclosure of the decrypted values. And generally that's not a very satisfying kind of cryptographic guarantee.

Joey (07:53):

So even at a performance cost, it might end up being preferable to pick something that gives you less risk at the end of the day. Yes.

Alex (08:01):

So the paper presents, I believe some heuristic approaches to addressing this attack. Um, there is no, uh, cryptographically secure approach to address this attack. Um, but you can address it, heuristically at least for now at a cost of performance. So there is, there's a bit of a trade off.

Shpat (08:25):

What do I need to be doing as if or who cares, who should care about this, uh, from, especially from the people who might be listening and why?

Alex (08:37):

Um, so I think there's various people that can care. Um, definitely if you are looking at using homomorphic encryption in particular CKKS scheme, it's worth being aware of this attack. And if you are deploying CKKS, seeing at least initially whether, um, required parameters or the attack are relevant for your deployment scenario, um, I think more generally it's, um, it's an interesting case study and the mismatch between security definitions and the intuitive guarantee you'd want out of a crypto scheme. So any crypto scheme that's proved against a security definition is proved against that particular definition, but that does not mean there couldn't be an attack that is not captured by that definition. That is still a realistic attack. And this is an example of that because the CKKS scheme is proven secure under what's called IND-CPA, which is indistinguishable, indistinguishable against chosen plaintext attack. It's a very common, uh, security notion for encryption and the scheme is secure in this setting, but because of the way CKKS works, it's a so-called, um, approximate, uh, scheme.

Alex (10:05):

So it doesn't work when you decrypt to get the exact results of the computation. You get an approximation of the result on the combination and that, so there's some error added to the result of the computation. Um, and that error is a function of the secret key. And so if you can see the results of the decrypted value, you're seeing some error term, which is a function of the secret key. And if you have enough of these, you can learn the secret key, um, in terms of your standard encryption notion, the scheme is fine, but there is a intuitively reasonable attack outside of the security notion that the notion doesn't capture. And it's important to keep this in mind when thinking about the guarantees of a cryptosystem is it's always secure to a particular definition. And so you can't just see a proof or you can't see the claim that a scheme is provably secure and necessarily be fine with it. It's provably secure against a particular definition. And it's important to also understand the definition of what the definition is claiming.

Joey (11:24):

So in one sense, this can feel like a bit of a blow to the homomorphic encryption community since people were starting to use this particular scheme more often. But in another sense, I think there's a really optimistic view where part of why the cryptography community is so strong is this constant re-evaluation of everything that people start to centralize around. And this is why we see NIST being so eager, for example, to standardize around post quantum is to really put the on someone's back.

Alex (11:55):

Um, and every time we get a reminder of what you just mentioned, this idea that a proof isn't enough, you need to really evaluate what the proof is saying in the context that it's being used. Uh, I think every time researchers see that it serves as an additional reminder, we've got to be careful about this. Let's think about now everybody that's using this kind of proof of just kind of scheme and evaluate them one by one. Um, so it definitely is, is overall to the benefit of this community that this kind of thing is going, and this, this happens regularly. Um, I mentioned this IND-CPA notion, indistinguishable and chosen plaintext attack, um, that for a while was the standard notion for encryption. And then people realized, well, it's actually not secure if the adversary is able to feed, uh, ciphertext to say, uh, the decryption, um, procedure. And so they came up with, uh, IND-CCA, indistinguishable against chosen ciphertext attack, which is stronger than the IND-CPA notion and something very similar is happening here where for this particular scheme, because the output of the decryption is an approximation of the correct value. The, uh, chosen plain text attack notion is insufficient. And these, these definitional issues come up every once in a while and the, in the crypto world. And yeah, it definitely points to this ongoing, uh, hardening of the, I guess the security claims.

Shpat (13:28):

That's definitely seeing the glass half, half full.

Alex (13:32):

Yeah. So, uh, I mean, this paper just came out recently. Um, so there's still hope that this new security notion that the authors propose can be achieved and it can be achieved relatively efficiently. So I wouldn't rule out the possibility that a variant of CKKS with similar performance, um, could be made secure against the stronger notion and those be resistant to, to these style of attacks. Um, so there's still, it's still early. Uh, I think it's too early to rule out that, you know, don't use CKKS because of this attack. Um, I'd be cautious of using it with the heuristics due to the reasons I mentioned. You really would have to have a good understanding of what you'll be doing with the outputs. Um, but I wouldn't rule out CKKS. We're good because there's still potential that someone could patch this attack in a cryptographically secure way.

Shpat (14:32):

So this is the first time I ever heard of CKKS. And I've been around the term homomorphic encryption for a few years now. Um, if some of these things are ironed out, but who, and in what ways do you think the scheme will be used? I know the paper mentioned in addition to the open source library elements, uh, machine learning on encrypted data. Is that something that this is promising for? Why and why not? What are the others?

Alex (15:01):

Yeah, so that's the main application of CKKS. Um, and the benefit there is machine learning as is, is approximate. There is there's error in your machine learning model. And so, uh, why worry about getting a completely now, um, accurate result using another homomorphic encryption scheme when there's already air in the model? What if we could make things much more efficient and add a little bit of homomorphic encryption error, um, and that's basically the ideal use case for, for CKKS where the computation itself already into introduce error. So sure. You're adding maybe a little bit more or different type of error using CKKS, but you gain a lot inefficiency.

Shpat (15:51):

What would that mean for ML, uh, you know, algorithms and schemes, does it mean that you can essentially feed a machine learning algorithm or, you know, call it whatever encrypted data without you ever seeing it, and then you get up, you get kind of a prediction engine or whatever that is out the other end.

Alex (16:10):

Yeah. So a classic example is like neural networks. You could imagine encrypting your, the weights of your neural network so that you can evaluate the neural network on some input, um, on some encrypted input. And that could allow someone to deploy an encrypted or, um, homomorphic encryption model of a neural net on a cloud, and then feed it data that worrying about leaking the neural net to the cloud.

Shpat (16:40):

That's amazing. Well, that was very informative. Thanks for taking the time to talk to us. Um, and, uh, yeah, we'll see you next time. This was a little bit of a different episode of building better systems, um, and see you all later. Thanks again for having me. This is fun.