Making Data Better
Making Data Better is a podcast about data quality and the impact it has on how we protect, manage, and use the digital data critical to our lives. Through conversation and examination, George Peabody and Stephen Wilson look at data's role in risk management, at use cases like identification, lending, age verification, healthcare, and more personal concerns. Privacy and data ownership are topics, as are our data failures.
EP13: The Future of the Cloud: Confidential Computing with Mike Bursell
How do we protect data while it’s actually in use? And how can we prove it?
Up until now, that’s been nearly impossible. We’ve addressed securing data in flight and at rest. When done right, they are strong protections. But what about when the data (and code is data, too) is actually in memory or being processed? And how much more complex does that become when everything is in the cloud?
Confidential computing addresses this weak and, until now, unsecured link.
In this episode of Making Data Better, Steve and George speak with Mike Bursell, executive director of the Confidential Computing Consortium and author of Trust in Computer Systems and the Cloud, a brilliant examination of trust in digital systems.
Confidential computing gives enterprises a way of securing their operation’s data and code in the face of rising threats and compliance demands. It provides attestation, the ability to prove in an auditable fashion how data has been handled, right down to the metal.
Mike’s clear that this is the future of computing. We already know the future is in the cloud. Now it needs to be secured. He’s hardly alone in that view. Indeed, Microsoft already uses confidential computing to protect its multi-billion dollar payment processing transaction system.
So take a listen. Confidential computing makes data better.
Welcome to Making Data Better, a podcast about data quality and the impact it has on how we protect, manage and use the digital data critical to our lives. I'm George Peabody, partner at Lockstep Consulting, and thanks for joining us. With me is Lockstep founder Steve Wilson. Hi, Steve. Greetings, George, how are you? I'm very well. I'm glad to be on the road a little bit, outside of my normal office, down in New York City now. To tee up our conversation today, I want to acknowledge the fact that you schooled me on the fact that privacy and security are very different concerns. But there is a linkage I did come up with connecting privacy and data breaches that might merit consideration by those who continue to be tempted to vacuum up every data crumb, and it's this: if you're scrupulous about data privacy, you collect only what you need and nothing more, which means it won't even be there when you get breached. And for more on that, take a listen to our discussion with Michelle Finneran Dennedy in episode 11.
Speaker 2:Yeah, we had a great time with Michelle, and I'm hoping that our guest today has overlapped with Michelle, the definitive privacy engineer. But, as discussed with Michelle, this is one of these areas where privacy and security overlap so nicely, because it doesn't matter how you look at data, it's best not to stockpile the stuff without a good reason. I mean, that's privacy 101, and it's also the security risk that everybody's running by having excessive data. You know, they talk about it as the new toxic waste. I'm actually one of those people who likes the "data is crude oil" metaphor. It's very much contested, like all metaphors, but I do love the image of a data spill being much like an oil spill, and the cost and the angst that it takes to clean it up, and, I think, the regulatory imperative that comes with the risks that go with this stuff, crude oil or data or whatever. So, look, I can talk some more about context, because this podcast is all about exposing and exploring the multiple facets of data quality, including methods to secure data. There's an orthodoxy.
Speaker 2:We talk about securing data at rest using encryption, cryptographic scrambling of data, tokenization (one of George's favorite pet topics, a really important technique). Data, of course, is vulnerable in flight, whenever it's moving through a network, when it's moving from place to place. One of my pet issues is the data supply chain and the transformation steps, the processing steps, the evaluation steps that data goes through from point to point through the network, increasingly complicated and increasingly needing security for data in motion. So, look, we've got that covered, haven't we? We've got data at rest and data in motion pretty well covered by conventional cryptography. But, as mentioned, and we don't want to get too technical, I guess, when you encrypt data, when you scramble it, you necessarily make it useless, or you put it beyond conventional use. And one of the challenges that we've had for a long time is: how do you protect data while it's in use? We had a great conversation a few episodes back with Mark Bauer of Anjuna. Gosh, that's a furphy, isn't it? I've known the guy forever.
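The at-rest versus in-use gap Steve describes can be sketched in a few lines of Python. This is a deliberately toy illustration, with invented names, and the XOR "cipher" stands in for real encryption such as AES-GCM: the point is only that stored data can be opaque, yet the moment you compute on it, plaintext must sit in memory, which is exactly the exposure confidential computing targets.

```python
import os

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a repeating key (illustration only)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = os.urandom(16)
record = b"patient-id=123; diagnosis=..."

# Data at rest: the stored ciphertext reveals nothing without the key.
at_rest = xor_cipher(record, key)
assert at_rest != record

# Data in use: to compute on it (even a simple search), we must decrypt,
# putting plaintext back into memory, which is exactly what an admin,
# kernel, or hypervisor on the host can read. This is the gap a TEE closes.
in_use = xor_cipher(at_rest, key)
assert b"diagnosis" in in_use
```

The same asymmetry holds for data in motion: TLS protects the wire, but both endpoints still hold plaintext while they work on it.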
Speaker 1:I think people probably do think the name of the company is Mark's last name. It's a good way to go, actually.
Speaker 2:A lot of people think that my surname's Lockstep. So there you go. Without further ado: the latest weapon that we have, the latest piece of our data protection arsenal, is confidential computing, and we are delighted to talk to, shall we say, Mr Confidential Computing. I'm sure he's too modest to take that, because it's a huge effort, but we're delighted to have Mike Bursell, security consultant and author. We could talk about his wonderful book, but for today's purposes he's the executive director of the Confidential Computing Consortium.
Speaker 3:Welcome, Mike. Thank you very much indeed, Steve, and nice to speak to you, and to speak to you, George. And I'm going to start off with just a quick sort of expansion before we go anywhere, which is that, although it's the one I care about the most, confidential computing isn't the only technology to help with privacy and data-in-use protection. There's a bunch of them, sometimes called privacy-enhancing technologies, or PETs (I'm a Brit, so I pronounce things weirdly). There's things like fully homomorphic encryption or secure multi-party compute or zero-knowledge proofs and stuff like that. So confidential computing falls in a family. But it's certainly the thing I care about the most and, I think, the most usable in most use cases. So I probably shouldn't be marketing other technologies, but I'm just going to put it out there and be entirely honest with you.
Speaker 1:It sounds to me as if you've been taken to task for not mentioning those techniques.
Speaker 3:No, no, no, no, no, no, absolutely not. They're actually. They can be really interestingly complementary. I suspect we're not going to have time to go into that in detail today, but there's actually some really interesting use cases where you might use two of them or use them at different stages in a process or in a system or stuff like that. So, yeah, no, I'm just being a good citizen.
Speaker 2:You are being a good citizen, mike, and being very humble. We could play good cop, bad cop, because I actually do think that what you're doing is going to be more impactful and more practical than some of those things that you mentioned. Homomorphic encryption I get it, but it's a difficult thing. It comes with tremendous constraints.
Speaker 3:I agree. The point about PETs is really well made.
Speaker 2:Mike, tell us more about your background and how you got to where you are today.
Speaker 3:How long have we got? So, yeah, my undergraduate degree is obviously in English literature and theology. Well, exactly, yeah, it's kind of weird, right. And I took an MBA along the way as well; that was kind of later on. But I left university not quite sure what I was going to do, ended up doing something called electronic publishing, which was kind of CD-ROMs, and then morphed into the web.
Speaker 3:And then the web got big. So I moved on, because obviously you wouldn't want to be spending too much time doing any one thing. And then I became a software engineer, not a hugely good one, but okay, and then moved into security, became a product manager, then became a security architect, and I've kind of moved between those roles forever, so for 25 years or so now. What I've always found interesting is being at the intersection, on the boundary between the business side and the technical side. I can go deep, low-level threat modeling or whatever, on one side, but on the other side it's about how this interacts with risk and the business and strategy and all those sorts of things. And I think it's really important that we've got people in the cybersecurity realm who can look in both directions and talk in both directions, because too often the security people are seen as the people who just say no, right, and that's no good for anybody. So I've really enjoyed that sort of stuff, and I did all those things with a whole bunch of people like Citrix (twice), Intel, Red Hat, some other folks, smaller players.
Speaker 3:And then I created an open source project, co-founded it with a friend of mine at Red Hat, and then we did a startup, which failed, in confidential computing around this open source project. And just after that, the Confidential Computing Consortium, which was set up in 2019 and which we'd been a member of, said: look, we're looking for an executive director. This is beginning to get big, we need to professionalize the consortium, we need someone who can help run it. You know this stuff, would you be interested? I joined as executive director in about April 2023, and I've been doing that ever since. I do some other stuff on the side as well, but that's my main gig.
Speaker 1:And you wrote a book, Trust in Computer Systems and the Cloud. And I have to say, how often do I get to read a book that mentions the movie WarGames, the 17th-century philosopher Thomas Hobbes, and includes its own playlist? So cheers to that. So, Mike, I've got to say, one of the things I really appreciate about the book is that you've taken that very human concern, trust, and examined it deeply, in such a way that you take us through the process of really asking: what are the implications of trust? What do you need to do to be able to claim trust in digital systems? Your definition is that trust is the assurance that one entity holds that another will perform particular actions according to a specific expectation. Gosh, those are a lot of human words there, and, as I say, you express those in digital terms, particularly in the cloud, a multi-vendor cloud. What was the inspiration for writing this book?
Speaker 3:Oh, I got really annoyed. That's the really simple answer. So I've been thinking about this stuff, authority and how authority works in the digital world and the non-digital world, for 20 years or so. And I went to a conference, and someone, I can't remember who it was, gave a talk, and I can't remember what the talk was, but it had the word trust in it, and I thought: this person does not know what they're talking about. And then it occurred to me that nobody really knows what they're talking about, because there's no definition, there's no sort of framework for us to talk about this stuff. People talk about zero trust, and every time I talk to someone about zero trust, at a conference or in a seminar or whatever, we're all saying different things using different words.
Speaker 3:I thought, well, we need some sort of agreement on what this might mean, what it could lead us to, and what's good or bad about it. And so I thought: I'll write a book with a framework trying to bring it all together, and I don't care if the framework's wrong; at least we as an industry have a starting point. And I looked at stuff like open source and community and trust. I looked at stuff like zero trust architectures. I looked at crypto, cryptocurrency and blockchain generally, and trust. And I looked at hardware roots of trust and all of those sorts of things, and tried to put it all together and walk through what a trust chain might look like, and just give people something to look at and argue about and say: at least we know what we're talking about. And that was why I wrote it, just because I was annoyed and I'd been thinking about this stuff for 20 years, and I thought, well, there's probably nobody else who's got as much in their head about it, so I'll just dump this in a book.
Speaker 1:And I love your reaction to being outraged. Write a book. That's terrific.
Speaker 2:But should we move on to confidential computing?
Speaker 1:The thrust of the book points to the value of confidential computing.
Speaker 3:It does.
Speaker 3:It does Absolutely Because, as Steve mentioned kind of at the beginning of the podcast, we've been able to do data in motion, data in transit, network encryption for ages and protect data on that. We've been able to do it in storage and on databases or on hard disks or wherever. That's fine when it's data in use has been much more difficult Because the basic classical model of computing and virtualization, which is how the cloud works, is that if you own the machine, if you have control of the machine, or if you've compromised the machine, so if you pwn it or own it or operate it, so if you have administrative access, kernel access or hypervisor access, you can look at and mess with any data or any applications on that machine. It's just that simple. That is literally the way that virtualization works and that's kind of fine for maybe my holiday synapse, that's kind of great.
Speaker 3:But if I'm a multinational company, or even a small company, with customer data or patient data or intellectual property or credit card information or research data or cryptographic keys or anything: basically, how happy am I, how happy are my regulators and auditors, about that data, those applications, being on the cloud, where anyone who has access to that machine can look at it? Now, I'm not saying that Azure and AWS and Google and IBM and folks are going to look at this stuff, but all you have is organizational process and legal ways to stop them. You don't have any technical way of stopping them, and that's just not good enough.
Speaker 2:You're on a tear, and good on you. There have been technical responses to your point, and you make a very good point. We've had things like hardware security modules in the cloud, which is really just leasing space on a Thales box. Not a bad idea, but very difficult to make it work.
Speaker 1:We've had multi-party computation.
Speaker 2:So there are promises. I mean, I've spoken with some of the big three-letter cloud providers who promise, and I believe them, that thanks to multi-party computation, "we, the cloud service provider, can't unpack your data even if we wanted to." But along comes confidential computing, and I want you to talk about how you tame the technology and make those technology tricks available in the cloud. But first: we started this podcast, and we reached out to you, Mike, because we found a shocking lack of awareness of confidential computing as a brand or as a concept. I think it's the future of the cloud; I think it's the most important thing I've seen for a long time. So talk to us about the awareness problem and what we do about it. It is surprising.
Speaker 3:The reason that the Confidential Computing Consortium was created was largely to deal with this issue, right? So the charter of the CCC, as we call it (not the Chaos Computer Club; this is the Confidential Computing Consortium), is to promote usage of confidential computing and encourage open source implementations and projects around it. It's a part of the Linux Foundation, so it allows a safe space for even competitors to get together and talk about this stuff and work out ways to market together or do technical stuff. So Intel, Arm, AMD, Nvidia, Huawei, they're all members of the top tier, the premier tier, and they can all have these conversations and we can do stuff together. But, as you say, there's this shocking lack of awareness of the technology, and it's kind of weird, because you can use this technology in all of the major clouds, and most of the minor ones these days, because it just requires them to have systems with a specific set of chips. There's a whole bunch of AMD chips, a whole bunch of Intel chips, a whole bunch of Nvidia chips, and there's some Arm-based chips coming soon as well, and you just turn it on. Why people don't know about it is kind of weird. Actually, using it has only just become easier, I think that's fair to say as well. So the maturity of the technology for just your standard user is only just coming on.
Speaker 3:And there's one other really interesting and quite complex piece, which I'll try and explain simply, which goes by a nice word: attestation. Without attestation you've got a problem. So let's say I want to use one of these chips, and I'm going to create what's called a TEE, a trusted execution environment. It's a way to do your computing using these chips so that anybody who has access to the machine can't look into it, even if they're admin, kernel, hypervisor, all that sort of stuff. So it's protecting your data and your application. So you want to use one of those.
Speaker 3:So I say to you: George, George, you are my cloud provider, can you set one of these up for me? And you say: of course I can, there we go. And I put all of my stuff into your TEE. And I suddenly think: wait a sec, the whole reason I'm using this thing is that I don't trust you, George. I don't trust you not to look at my stuff. So how can I trust you when you say you've set this thing up to protect me correctly? Oh dear.
Speaker 3:That sounds like a problem. And so the answer is: I can actually ask the chip that created this for me, rather than you, George, to give me a measurement, a cryptographic measurement, of this thing. It's kind of like a signed statement to say this has been set up correctly, and I then need to check that that's correct. And if it is correct, well, that's good, because it means that George hasn't messed with it. So you need a way to do attestation, and that is tricky, and if you don't do it just right, you've kind of lost the whole point. And part of the problem here is that I've got to make sure that the person who's checking the attestation for me is also not George.
Speaker 2:Thanks, Mike. You sort of read our mind: the TEE is the Trusted Execution Environment, which, as I understand it, is more and more a built-in feature of those big processors that you're talking about.
Speaker 3:It is, yeah, absolutely: the Arm and Intel chips.
Speaker 2:And attestation, so that we have an independent assurance of the state of the software, the state of the keys, the configuration. Nice. If you think that you're running elliptic curve 256, is it really that algorithm, properly implemented?
Speaker 3:Even more than that, I would say: not only has it been set up correctly, but the software that I think I put in it is what I expected. Which means that I can start doing some of that multi-party computation and collaboration that we talked about before, which is really interesting.
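Mike's attestation flow, in which the chip rather than the cloud provider signs a measurement of what was actually loaded into the TEE, and the relying party compares that to what they expected, might be sketched like this. It's a simplification under stated assumptions: HMAC with a shared key stands in for the chip's hardware-rooted asymmetric key and certificate chain, and all names are illustrative.

```python
import hashlib
import hmac

# Stand-in for the chip's hardware-rooted key. In real schemes this is an
# asymmetric key whose public half chains back to the silicon vendor.
CHIP_KEY = b"fused-into-silicon"

def measure(software: bytes, config: bytes) -> bytes:
    """Hash of what was actually loaded into the TEE."""
    return hashlib.sha256(software + config).digest()

def chip_attest(software: bytes, config: bytes) -> tuple[bytes, bytes]:
    """The chip (not George!) reports a signed measurement."""
    m = measure(software, config)
    return m, hmac.new(CHIP_KEY, m, hashlib.sha256).digest()

def verify(report: tuple[bytes, bytes],
           expected_sw: bytes, expected_cfg: bytes) -> bool:
    m, sig = report
    # 1. Signature check: the measurement really came from the chip.
    if not hmac.compare_digest(
            sig, hmac.new(CHIP_KEY, m, hashlib.sha256).digest()):
        return False
    # 2. Measurement check: the TEE contains exactly what I expected.
    return hmac.compare_digest(m, measure(expected_sw, expected_cfg))

report = chip_attest(b"my-app-v1.2", b"tee-config")
assert verify(report, b"my-app-v1.2", b"tee-config")      # untampered
assert not verify(report, b"my-app-v1.3", b"tee-config")  # wrong software
```

Note that verification happens on the relying party's side, which is Mike's last point: whoever checks the attestation must not be the same party you were trying to exclude.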
Speaker 2:I wanted to underline those two things, which I think are so important and complicated: having a good chip and having good attestation, things that are just so difficult to manage. So the consortium, the CCC, comes in to manage some of that. It's a great story. But tell us more about what's required on the enterprise side to tool up for this sort of thing. I mean, how do I engage with this if I already have my workloads in a particular cloud service? Is it much of a pivot to take advantage of confidential computing?
Speaker 3:And I think this is one of the areas where we're trying to help people understand what needs to happen. And the answer is: it depends how you want to do it. If you want to put existing workloads into a TEE, and as long as you've got an attestation mechanism, it's actually fairly easy to do that. If you want to be building a new application from the ground up to take advantage of these new technologies, then it's going to be a bit more complex, and there are open source projects to do this, lots of startups, and other people doing it. So, for instance, Microsoft has moved all of its credit card processing for Azure, $22 billion worth a year, into confidential computing. Good example. One of the big multinational anti-human-trafficking agencies is using confidential computing to keep data safe, and so that it can't be changed as well. So putting your existing stuff in it kind of works, but once you start thinking about how this stuff could be used, it kind of changes the way you think about computing. The primitives that you have to play with, to use data, to share data in ways that you can be sure are correct, make you start thinking: well, actually, I should probably be architecting my applications differently. And then, once you start doing that, obviously it's a longer journey.
Speaker 3:So everyone's talking about AI at the moment, right? What does AI have to do with confidential computing? Well, there's kind of two areas where you might want to use it. The first one is at sort of the end: you've got your model, it's all trained, you want to use it, right? So I want to engage with an AI model which is hosted by, I'm going to pick on George again, by George, right?
Speaker 3:But I may be asking questions of that model that I don't necessarily want George to know about, right? It might be stuff about domestic violence, for instance. Or it might be a proprietary model which is hosted by George but has information that's proprietary to me. So not only do I not want him to know what I'm asking of it, I also don't want him to know what the answers are. I also don't want him to change the data in that model, right? And if I run that model in a confidential computing, a TEE, world, then that's the sort of isolation that allows me to have those assurances.
Speaker 3:So that's kind of one end. But also, at the other end, there's a lot of concern at the moment about what data you're training your models on, right? And one of the things you can do with confidential computing and attestation, very, very important here, is to track what data's being used and prove, when you get to the actual usage of the model, that that is the only data that's been used. You can also use it for things like supply chain, to track exactly what's gone into your supply chain. You talked about supply chain earlier. So it just makes you think about it in rather a different way, and that's really exciting. But it takes companies, particularly in an economic recession, a while to think about how they could be doing new stuff in different ways. So in the next month or so we're going to be publishing a set of use cases, a white paper of use cases from some of the members, to give people an idea of how they might use these technologies.
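Mike's point about proving which data a model was trained on could, in miniature, look like a hashed manifest recorded by the (attested) training job and re-checked later by an auditor. All file names and contents here are invented for illustration; a production scheme would sign the manifest from inside the TEE.

```python
import hashlib
import json

def manifest(datasets: dict[str, bytes]) -> str:
    """Inside the TEE: record a hash of every dataset the training job read."""
    entries = {name: hashlib.sha256(blob).hexdigest()
               for name, blob in sorted(datasets.items())}
    return json.dumps(entries, sort_keys=True)

training_inputs = {
    "consented_records.csv": b"...approved data...",
    "public_corpus.txt": b"...licensed text...",
}
proof = manifest(training_inputs)

# Later, an auditor re-hashes the claimed inputs and compares.
assert manifest(training_inputs) == proof

# Any substituted or extra dataset changes the manifest.
tampered = dict(training_inputs, **{"scraped_pii.csv": b"..."})
assert manifest(tampered) != proof
```

The attestation piece from earlier is what lets you believe the manifest was produced by the training code you expected, rather than written by hand after the fact.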
Speaker 2:Thanks for speaking to the mental models too, because I think cloud computing is necessarily a fuzzy idea, I mean, it's in the name, and virtualization, I think, has led people to be, I don't want to say lazy, but pretty laid back about where the data is and who else has got access to it. We tend not to think about that anymore. I don't want to go to ownership per se, but control and, you know, leasehold over data matter now. We are entering an area where confidential computing is breaking ground on the mental models of who's got access to my data and how I can think about that.
Speaker 3:If you're a hospital, and a pharmaceutical company comes to you and says: look, we want to run some models and modelling on your patient data, give us all your patient data. You're going to say: I don't think so. And HIPAA, or whoever regulates you, is going to say: I don't think so either. Right? But if the pharmaceutical company can come up with an application and prove to you that the application won't exfiltrate your data, that it'll only do certain processes on it, and then run that in a confidential computing environment, then you can be sure that the data you're sending will only be used in the way that you expect it to be. So this is the sort of collaborative thing: you could combine that with other data from other hospitals and not worry about the wrong people seeing it, or it getting leaked to the wrong people. This is why things kind of change in how you think about it. You could use this for fraud management. You could use it for oil and gas exploration. We're seeing people talking about this for space usage, where you've got microsatellites, or edge use cases where you want to combine this stuff while making sure that it can't be tampered with or can't be seen where it shouldn't be. It's just new ways of thinking about it, and that's why I get so excited about it and what it does. And you used the word really early in this podcast, risk, Steve, and that's why I care about it, because this allows you to change your risk profile.
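The hospital scenario reduces to a key-release policy: the hospital hands the patient-data decryption key only to a workload whose attested measurement it has audited and allow-listed. A minimal sketch of that gating logic, with all names and measurements invented for illustration:

```python
import hashlib

# Measurements of application builds the hospital has audited and approved
# (e.g. "runs the agreed statistics, cannot exfiltrate raw records").
APPROVED_MEASUREMENTS = {
    hashlib.sha256(b"pharma-model-v2-audited").hexdigest(),
}

PATIENT_DATA_KEY = b"hospital-held-decryption-key"

def release_key(attested_measurement: str) -> bytes:
    """Hand out the data key only to an approved, attested workload."""
    if attested_measurement not in APPROVED_MEASUREMENTS:
        raise PermissionError("workload not on the approved list")
    return PATIENT_DATA_KEY

good = hashlib.sha256(b"pharma-model-v2-audited").hexdigest()
bad = hashlib.sha256(b"pharma-model-v3-unreviewed").hexdigest()

assert release_key(good) == PATIENT_DATA_KEY
try:
    release_key(bad)
except PermissionError:
    pass  # an unreviewed workload never sees the key
```

In a real deployment the measurement would arrive in a chip-signed attestation report, as in the earlier flow, and the key would be wrapped so only that TEE could use it; the policy check itself is this simple.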
Speaker 3:Before, putting stuff in the cloud set you up with a whole bunch of risks: that George is going to mess with it if he's the cloud provider (I keep picking on you, George, sorry), or that if that machine is compromised, there'll be problems. There's no perfect security, but this raises the bar for that particular set of risks and allows you to think: you know what, maybe there's stuff that I could put in the cloud now that I could only do internally before, because my risk profile has changed because of these security technologies.
Speaker 1:I'm excited to hear about this, Mike, because it strikes me that I'm hearing the inklings of competitive differentiation based on security. I'm hearing that there are opportunities for re-architecting business processes which would give the architect, and its compliance, a competitive differentiation over another company or another provider. And that's been so rare: security has always been a sunk cost as far as enterprise is concerned. But Steve and I talk about how we make data better, how we put an economic model underneath data to drive investment, to drive a marketplace, to drive really strong businesses around data provenance and data custodianship, as opposed to today's current one, which is either internet advertising on the one hand, or hoovering up every data crumb, repackaging it and selling it back out.
Speaker 3:So, I talk about the vacuum cleaner problem, which is kind of a weird one, but what I mean by that is that I don't know anybody who buys a vacuum cleaner because they want to. Everyone buys a vacuum cleaner because they have to, right? It's just one of those things: it's got old, you've moved to a new house, you have pets now, whatever, but you buy it because you have to. And security is often kind of like that. We need to change security from being a vacuum cleaner to something which adds value to you, allows you to create value, and this, I think, is a technology which absolutely allows that. Steve, you mentioned supply chains earlier on, and I thought it might be nice to come back to that briefly, because we've just recently had this XZ problem ("ex-zee", or "ex-zed" if you're a Commonwealth person, right). People may not be aware of that, but it was an attack on basically the Linux operating system and ecosystem supply chain, by people taking a long time to sort things out and doing long-term attacks, what we'd call an APT, I guess. It's over two years these things have been happening. We need to think about how we imbue trust into the supply chain, and confidential computing does not fix all of the problems, but it allows you to fix some of them, because it allows you to build your piece of the supply chain in a confidential computing environment where you can know what the build environment is. You can carry that assurance, that attestation, through into the supply chain, through your SBOMs or whatever you're using to do that, and it allows you to start having more assurance in particular points in the supply chain. And I think we've got a long way to go on supply chains. There's things like in-toto which are helping us with this.
Speaker 3:But if we come back to trust and the book, you gave the definition, but there's three corollaries, one of which is that trust is always contextual. So when I do a build and I sign that build as the maintainer, or whoever it is: am I signing to say, yes, this was the software I think it was that has been built? Am I signing to say, yes, the build system is what it should be? Yes, the hardware is what it should be? Yes, the implementation is correct? Yes, the underlying cryptographic algorithms, for instance, are correct? You need to contextualize every single thing there, otherwise you start having gaps. And once you can contextualize that, then you can put assurances in place that each of those has been checked, and that flows up or down the supply chain, whichever way you look at it. Things start changing, and you start to differentiate, again, George, to your point, the offering that you provide.
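The corollary that trust is contextual, so a signature should name exactly which claims it covers, is the idea behind in-toto-style attestation statements. A stdlib sketch, where HMAC stands in for a real asymmetric signature and the claim fields are invented for illustration:

```python
import hashlib
import hmac
import json

MAINTAINER_KEY = b"maintainer-signing-key"  # stand-in for an asymmetric key

def sign_statement(artifact: bytes, context: dict) -> dict:
    """Sign the artifact hash together with explicit claims about context."""
    statement = {
        "artifact_sha256": hashlib.sha256(artifact).hexdigest(),
        # Each claim is scoped: the signer says exactly what they checked.
        **context,
    }
    payload = json.dumps(statement, sort_keys=True).encode()
    sig = hmac.new(MAINTAINER_KEY, payload, hashlib.sha256).hexdigest()
    return {"statement": statement, "signature": sig}

def verify_statement(bundle: dict) -> bool:
    payload = json.dumps(bundle["statement"], sort_keys=True).encode()
    expected = hmac.new(MAINTAINER_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(bundle["signature"], expected)

bundle = sign_statement(b"compiled-binary", {
    "source_reviewed": True,     # "the software is what I think it was"
    "build_env_attested": True,  # "the build system is what it should be"
    "crypto_impl_checked": False,  # explicitly NOT claimed
})
assert verify_statement(bundle)
bundle["statement"]["crypto_impl_checked"] = True  # inflate a claim...
assert not verify_statement(bundle)                # ...and verification fails
```

Because the claims are inside the signed payload, a gap in what was checked stays visible instead of being silently implied by a bare artifact signature.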
Speaker 2:Yeah, you're telling the story behind the data, or you're providing infrastructure that allows stakeholders to tell the story behind their own data.
Speaker 3:Yeah, and the applications as well. What is data? What is application? At this point, you know, everything is data. It's all ones and zeros until we start going quantum. But that's a whole bunch of other questions, right?
Speaker 2:It's going to change our mental models.
Speaker 3:Oh yeah, I know, my brain already hurts whenever I think about it. So, as you can tell, I'm really passionate about this stuff, and for me it's this combination of security, trust, risk and privacy, which you brought up right at the beginning, George, and how these interact, and what it allows us to do when we think of them in new ways, that allows us to build new things and differentiate our businesses, our projects, whatever we're doing. And I think that's really exciting. And, as I say, at the moment it's: why would you use these technologies? In 5 or 10 years, it's going to be: why wouldn't you? Right? It's going to be: if you're not using this, I'm suspicious of you. In the same way that it used to be: why would you bother with HTTPS? Now you would find it very difficult to find a website which doesn't have HTTPS, and if it doesn't, you start worrying. And that's where we should be with confidential computing.
Speaker 1:So, Mike, let's leave it there. But before we let you go, what do you see as the next frontier in handling data?
Speaker 3:It's being able to trace it and its supply chain, and having a knowledge of who's interacted with it, when and how, so that we think of data not as contextless but as full of context. And I think that changes the way we think about a whole bunch of stuff. I think open source is taking over the world, but the world hasn't quite understood the business side of that yet. We're beginning to address this, and I think this is one of the ways that happens too.
Speaker 1:Well, that's terrific. I know, Stephen, I couldn't agree with you more.
Speaker 2:Tell us about your next steps. The organization has a bit of an event coming up, I think.
Speaker 3:Yeah. So we've got the Confidential Computing Summit coming up in June, I believe early June, in San Francisco. In fact, I'll be speaking at RSA in San Francisco in just a couple of weeks' time, so maybe people will make it to that. But yes, the Confidential Computing Summit in San Francisco; we'd love to see you there. Come and talk, ask us questions, and we'll take it from there.
Speaker 2:And here's another plug: between RSA and the Confidential Computing Summit, we'll see you at Identiverse for a panel on confidential computing and the future of identity. Indeed, I'm really looking forward to that. This is hot stuff, Mike. Thanks for sharing, thanks for demystifying, and hopefully we're raising awareness of this thing. It's the future of the cloud. Thank you. Thank you.