Building Better Systems

#13: Rod Chapman – It's Either Automated or It's Wrong

Episode Summary

Rod Chapman explains his recent verification of TweetNACL using SPARK/ADA. We discuss how every aspect of his proofs are automated, how the correctness proofs actually enabled better performance after compilation, and higher confidence in some otherwise risky-seeming optimizations.

Episode Notes

Watch all our episodes on the Building Better Systems youtube channel.

Joey Dodds: https://galois.com/team/joey-dodds/

Shpat Morina: https://galois.com/team/shpat-morina/

Rod Chapman: linkedin.com/in/rod-chapman-7b60266

https://github.com/rod-chapman/SPARKNaCl

Galois, Inc.: https://galois.com/

Episode Transcription

Intro (00:01):

Designing manufacturing, installing and maintaining the high-speed electronic computers, the largest and most complex computers ever built.

Shpat (00:21):

Hello everyone. And welcome to the building better systems podcast, where we chat with people in industry and academia that work on hard problems around building safer and more reliable software and hardware. My name is [inaudible] Moreno

Joey (00:34):

And I'm Joey Dodds.

Shpat (00:36):

Joey and I work at Galois it's a research and development lab that focuses on high assurance systems development and broadly hard problems in computers.

Joey (00:46):

And today we're going to be talking with Rod Chapman. Rod's had an ongoing involvement and an influence on the spark ADA community where he's helped develop both tools and proofs in that ecosystem. Rods applied from a verification to some really important systems. And notably he's done that really practically. And cost-effectively in this episode, we're going to talk about how spark aid has been used to correctly improvement, critical systems, such as air traffic control software. And we'll also talk about just how successful those efforts have been. Ronald, go ahead and explain some of his more recent effort verifying tweet NaCl. Also in the spark ADA ecosystem, tweet NaCl is a tiny implementation of cryptographic library. So it's very complex. He'll also explained how he was able to use spark a tool, which was traditionally intended for primarily proving correctness. Uh, he'll take spark, but he was actually able to improve the performance of that tweet NaCl library once it was compiled.

Shpat (01:42):

And just a side note for those of you who are watching the video, um, this conversation was recorded earlier in the pandemic when trimming my beard somehow became optional for a while there. So please excuse the cave man look,

Joey (01:57):

Thanks for joining us. Right. Okay.

Rod Chapman (01:58):

Thanks for the invitation.

Joey (02:00):

So we'll start off with our normal question, rod. Uh, what is your approach to building better systems?

Rod Chapman (02:06):

That's a good question. Um, my, my background is in building mostly safety and security critical embedded systems. Uh, for many years, I, I worked for a company that used to be called Praxis here in Barth is now called ultra and, and I, I work independently at the moment, but I guess the company and the approach was, was I guess, best known for it's what I would call up sort of pragmatic approach to formalism, where we were known for trying to apply more formal methods, um, to system development. I mean, it's all routine, I suppose, in a really trying to do a mature approach to requirements engineering, um, for site particularly safety insecurity. You're trying to get that non-functional stuff nailed early in the project,

Shpat (02:52):

Your approach to building better systems start sounds like starts with specification.

Rod Chapman (02:57):

Indeed. I mean, requirements engineering and specification is, is fundamental. If you're going to build something right. You know, you've, you've got to know what it is you're trying to build and possibly more importantly why you're trying to build it. And the other thing that we find is very important is understanding the domain knowledge is trying to formalize and write down all the assumptions and sort of tacit knowledge about the operational environment of the system, because very often people get that wrong and then systems fail when they, when they hit the real world as it were, um, you find out you've, you've got something well. Um, and that's, that's pretty difficult that involves a lot of work in invalidating that stuff with, with real system engineers and users and so on.

Joey (03:41):

Yeah, there's a few things I want to dive into there. One is that you represent this idea and this is gonna maybe seem surprising to some people that you represent this idea of we're going to start with formal methods and build in formal methods from the start, including, um, you know, including mechanical specifications, um, including even things to check these specifications and to consider that pragmatic might seem surprising to some people who think of horrible methods is something that's expensive and unreasonable to apply in most production settings.

Shpat (04:16):

Yeah. In other words, this sounds very expensive, right? Sit down, write all the domain knowledge, write the technical specification and then start working. I mean, we, we got stuff to do here.

Rod Chapman (04:27):

Yeah. I it's, it's a fair point and it depends on the system. It depends on how, you know, how long

Shpat (04:33):

Well, and just, and to be clear, I, you know, I'm being facetious because I know you've actually managed to accomplish a lot of things in real systems with that stuff. So, so I'm curious about your take on that.

Rod Chapman (04:45):

It's clearly this kind of approach is perhaps not appropriate to very fast moving industries. If you want something out next month, this may not be for you, but typically we're building systems, which take a fair amount of time to build in the first place, but also systems which are going to be around for 20 or 25 or even 40 years. We look at, um, you know, long lived embedded systems in the nuclear industry or the rail industry. These things have a very long lifespans. So actually taking your time to get them right. Is, is, is okay. So, I mean, for example, on the, the certificate authority that we built for the Mondex system, we probably spent a year doing requirements engineering and foremost specification work before we write any code at all. And that was, so that was formal requirements. Formal spec had architecture design, basically, you know, sort of building a bunch of stuff so that we knew it was going to work.

Rod Chapman (05:37):

And we understood what we were building. We probably did about a year with that before we, we got near writing any code, but that system, you know, when we built it, it was designed to have a lifetime of 20 years, but maybe so, so that's not outrageous. I mean, it's also worth saying that the, the team that are doing the formal specification requirements work is actually quite a team in comparison to the overall size of the project. Um, you know, that formal spec work might only be 10 or 15% of the budget for the whole project in terms of people, um, and time. And it's okay to admit that those people are particularly skilled at what they do. They combine a very interesting skill set, which is the, the ability to do requirements elicitation, to actually talk to people and resolve conflicts and the code of equal help, the, this domain knowledge stuff from the background, from, from users, but also the ability to write it down formally is a very interesting combination of skills. So, so these people aren't, you know, easy to find or train, but they exist, but you only, you don't need a huge number of them. I think we found, as I said, it's not like everybody on the project has to have that skill set and that's okay. Um, to be in that kind of mode. So yeah, it's appropriate for some projects, certainly not for all. And that's perfectly reasonable and pragmatic, I think.

Shpat (06:59):

Well, it sounds like, you know, when you're done with these things, you can really sleep well at night because you know, you've done everything to make sure that these things don't fail.

Rod Chapman (07:07):

Yeah. We hope we get it right. And clearly, I mean, perhaps another myth with this is that people say, oh, well, our, you know, our system is going to change and we have to have lots of, we have to cope with change. So we can't do four more methods, but we don't find that's the case. Um, the formal specifications we found over the years, or as amenable to change as much as any other notation. And of course this stuff changes. We don't write it down once and then it's all perfect cause that never ever happens. And of course we iterate and we do incremental style development, um, just as is common on all projects. And this formal stuff does get updated and changes as a project goes along as requirements change and other things change, but we don't find this material any more difficult to change in any other, but certainly it's sometimes easier to change in code for sure, because the formal material tends to be a lot more concise than source code. So we, we maintain these, these design artifacts throughout the lifetime of a project just as you would with anything else. Um, I'm just a bit worried. There is a persistent meowing in the background because my cat's trying to get in. We can't hear it. Should I let the cat in or not? Um,

Shpat (08:20):

Um, that's up to, well, that's dangerous having a cat. I think that's dangerous, but it always

Rod Chapman (08:25):

Happens in these conversations. The cat will invade.

Joey (08:29):

That's part of why she fights in the office. Right. That's

Shpat (08:30):

Exactly right. I was going to say

Joey (08:32):

Because of his cats only trying to board the bus right now,

Shpat (08:39):

It's completely up to you.

Rod Chapman (08:40):

Okay. If I, I think he's gone away, so let's press on.

Joey (08:44):

So to summarize what you said, basically you've worked in this environment where at the outset of the project, uh, a small team of reasonably expert, um, in proof engineers, in some sense, specification engineers, I guess, cause they're not even doing proofs yet at this point, architects, um, are, are setting out to understand the, the goals of the project and the knowledge necessary to undertake the project and to record that, um, in a precise, mechanical language, basically.

Rod Chapman (09:18):

Yeah, that's a reasonable characterization. I think it's worth mentioning as well with reviewing this material. We do have experience of having, we have taught customers to read the formal specifications and the skill set. You need to read this stuff and understand it is very different from the skillset. You need to be able to write and construct this stuff. Um, so we have different training courses, for example, for reading and writing or language like Zed. Um, and we have successfully trained customers, you know, on another big project that will probably come onto, um, part of the UK, the UK air traffic control infrastructure. We trained air traffic controllers to read Zed so that they could check what we were writing and say, yeah, that's what I want because these people is the, you know, the buck stops with the air traffic controllers because they're the actual users. So we trained them to read the read spec, um, to validate with them that what we were writing really was what they thought they were getting and that's possible with, with, so, so teaching people to read this stuff is okay. And even a notation like Zed, which is famous for being, uh, yeah, it's a tricky notation. So we say it's, it's, it's, it's, you know, it's, it's not perhaps the most readable notation in the world, um, is certainly possible. Yeah.

Shpat (10:33):

I'm not familiar with that. Could you briefly tell us about it?

Rod Chapman (10:37):

That is a formal specification language for particularly excels at functional behavior. It's based on set basic maths, it's set theory, uh, relational algebra and functions with a kind of overall structuring mechanisms called schemers, but it does use a particularly mathematical notation. It's very concise, but there's an awful lot of mathematics. It looks like maths and it tends to, that tends to put people off. It is very, very powerful in terms of what you can say. It's expressive power is very high. It is perhaps held back because, because of its very expressive power tool support for it is not as advanced as we would like, which is a shame. I'm sorry,

Shpat (11:16):

But it sounds like you got, you got non-experts to read it and to compare it to essentially what they do every day. How was it?

Rod Chapman (11:25):

I think that worked reasonably well. As I said, I wouldn't describe this week. It was non-experts so they are, they are not software engineer experts, but they're experts at air traffic control. And that matters because these people really are the domain experts about how they work, how they, you know, how they actually do their job of controlling aircraft above the UK. They are the domain experts and that is crucial to access their knowledge, to get them to understand what we're, what we're proposing to do and what we're proposing to build. Um, so they, they have an enormous role to play. It was really, really important in that project, for sure. Otherwise we we'd produce something that was all formal and beautiful, but you know, th th they wouldn't use it and they wouldn't actually accept it at the end of the day. So it's really

Joey (12:10):

Well. Yeah. So I think it's implicit in a lot of what you're saying is this idea that you can take something and experts can produce it in this case, this formal model, but then a lot of other people can need to be in, can be able to consume this stuff, because also it doesn't sound like just the air traffic controllers in this example, we're consuming it, but also that the architects are handing off artifacts to engineers that are not as expert in building up the architecture, but that are able to actually do implementation based on these specifications. So that group also needs to understand

Rod Chapman (12:48):

That's right. So, so we, we on say that the air traffic control project, we run a kind of development pipeline where the requirements are spec is being done for, you know, two builds down the line, a build is being implemented and abilities being tested and accepted. So there's a kind of a three or four stage pipeline to it. So when the formal spec stuff is done, it gets handed over to the development team, and it actually goes in two directions. At that point, there's a development team that will read the formal spec and implement code essentially. Um, but there's also a verification team who read the formal spec and based on the formal spec, do all the analysis to construct essentially construct test cases and test scripts and validation and all the stuff you'd expect. And that's done by an independent team, if you like.

Rod Chapman (13:32):

So there are people who need to read that material, and that's a much bigger team. Clearly the development team, uh, the D the, the, the air traffic control project is called effects. So the effects development team probably peaked at over a hundred people, uh, split across kind of development and testing verification. So you've got their report of about a hundred people who need to be able to read that formal notation and either produce, you know, to do the analysis, to produce test cases and scripts and so on, or read the thing and be able to produce code to actually implement it. And that takes us into the kind of implementation pipeline. Um, so the formal spec becomes the kind of gold standard. If you like that, it's what the tests are based on. It's what the code is based on. And it's what the users have actually agreed, like wants us to build. And it has a really important kind of central point of the process if you like. And then we try and carry on that theme, I suppose, and do, do the implementation in a formal style as well, where we try and again, use formal languages and tooling, um, at the implementation stage as we go on towards, towards, you know, actual code and deployment as well.

Joey (14:44):

It sounds like that part is useful, but not required. In other words, if you do this careful mechanical architecture step, you're likely going to get benefits from that, whether you sort of go prove your implementation equivalent to that or not, you can further the benefits by doing formal engineering as well, but the specs stand alone as a, as a useful artifact that can be used on a software engineering,

Shpat (15:13):

As long as the engineering team is looking at it carefully while implementing.

Rod Chapman (15:17):

Yeah, you could. I mean, they, the, the, the, the upfront requirements and formal spec work has enormous benefit in just, you know, I mean, I mean, we've always said to people, you know, if you're going to have a problem, would you rather have it now, or in six months or a year's time, um, because of the cost of the cost of getting it right by fixing a few lines of Zed is pretty trivial. The cost of fixing it in six months or a year's time, and having to refactor thousands of lines of code is poor and thousands of test cases, you know, the, the, the cost can be orders of magnitude higher. Uh, if these things come out late in the process, so it's just standard lean, or people call it shift left, or lean engineering or correctness by construction, or whatever, this kind of effort on getting it right upfront pays off because of knowing that we're building the right thing, the formal implementation style, I think is interesting.

Rod Chapman (16:07):

Cause it, it, it ensures that we don't drop the ball in implementation by, you know, I mean, you can have a great formal spec and then do a hopelessly defective implementation of it that will not satisfy anyone if it just doesn't work. So, and, and, and it ensures that our refinements, if you like from spec to code is actually strong or sound, or at least the code is, you know, free from large classes of D you know, common or garden defects that you would expect in software. Um, so where the spec jumps from the mathematics of pure integers, where we jumped to a computer and we're jumping into the world of [inaudible] 32 bit, you know, signed bounded integers, there's a huge scope there for screaming up, making a mess by overflowing and all the things we know can go wrong in software and the formal tools at the implementation level, kind of keep us honest at that level.

Rod Chapman (17:02):

And they also forced us to ask hard questions, because when you do jump to a constraint, computer implementation in a programming language, the verification tools will say, oh, well, what about, what about there? You might divide by zero and you think, oh yeah, I don't know what to do about that dividing by zero. So you look at the spec and the spec doesn't say, so you go and talk with the requirements people, and you say, what about this case? Cause if this and this, and this happened at this point, we're going to divide by zero. And typically the requirements people will say, oh yeah, we didn't think of that. Well done. Oh yeah, we'll go and ask the customer and you go and ask the customer. And lo and behold, you found a hole. You found an incompleteness bug in the requirements, and they can be quite difficult to find sometimes. So the, the formal implementation style kind of keeps us honest and forces to ask hard questions, which again is a good thing basically, um, in, in, in, in general. So it does fill in some gaps where otherwise, you know, defects could fall through the fall, through the post

Shpat (18:02):

Speaking of the customer. Um, I know you said, you know, this is a relatively small overhead to go after this specification, um, kind of upfront costs of formality, um, when your experienced, what's been, um, has it been hard to get customers and the end, the end user on board with spending that year or doing that? I assume part of that is, you know, something's been, it's going to be used for 20 years, then they're like, do whatever you need to do, but

Rod Chapman (18:34):

It can be a very tough sell to, to, to, to, um, if you're talking to a couple of you, that's used to very fast moving development, trying to, you know, pitch this kind of approach to them is quite difficult. If it's an organization that's coming to, you know, high assurance, either critical or security related development for the first time, it can be a very hard sell to say, Hey, we need to change the way you're thinking about this. Um, in some regulated industries, uh, so, you know, traditional, uh, industries, which are traditionally having regulatory structure, so typically civil aerospace or the rail industry, or the nuclear industry, this kind of approach is more accepted as being the right way of doing things because the cost of failure is so enormous either in terms of lives or my ability, or, you know, financial damage that the systems are very long lived, you know? Um, so there were some industries where it's, it's an accepted by working some industries it's still seen as being rather strange, and it can be very difficult, um, uh, a difficult approach to, to, to, to, to bring to other businesses. Sure.

Joey (19:44):

Yeah. So the re the regulation seems like a big factor. That's, what's the, that makes people want to work in this kind of way. And I, and, and even the cost of lives and, um, the cost of, you know, the financial costs of the side in a lot of these situations, it's really hard to even update the code if it's just like a little annoyingly wrong, right? Like if it, if it crashes every so often and it kind of, you restart it and you growl at it, even that is incredibly costly to fix that, and you can't do it right away. And a lot of these settings, right.

Rod Chapman (20:18):

Regulation or regulate tools are a double-edged sword. If you like, they are, sometimes they are good and help us to work in the right way. But yes, they are renowned for being slow moving and, and yeah, creating new for it's very difficult to make changes to a project. And because of the cost of having to fix things sometimes could be, can be very expensive and time-consuming, um, uh, different industries have different regulators and they have their own way working and their own way of doing things depending on which industry or country you're in and which, which, um, legal setup. So regulation is a tricky thing to deal with. We'd obviously we'd rather people chose to go down this path because they think it's a good idea. Not because they think some regulators going to hit them with a stick. You know, we, we prefer carrot to not stick with this approach to kind of lead people in this direction, which has worked on some occasions very, very well.

Rod Chapman (21:15):

Um, and as I said, certifications, we really, we, we, we've struggled to bring this approach to, to, to bear on industry. Um, but we continue to try. I mean, as I said, the company that I used to work without training is still very much in business and doing this kind of work very successfully. So there's, you know, that they've done so successfully for 25 to 30 years now, um, and are still doing it. So that's a good sign. Um, I think, um, the, that it's, it's, it's paying off and they're a profitable business. So that's a good thing.

Joey (21:46):

If I'm someone listening to this podcast, and I hear that this idea of formal specifications, and it makes sense to me, and I want to try things out, um, what are ways that I could do that? Um, the follow-up question is going to be on the people, the side of the people building formal methods tools, like what, what can we do to close that gap a little

Rod Chapman (22:07):

So how to get into it? I eat, I, I certainly, I probably advise, don't try and get into Zed immediately. Cause it's, it's, it's not the first language straight. I think there are, there are formal spec languages and these days, perhaps more, more, much more recent contemporary approaches, which have much better to all support, um, and are more approachable, perhaps are a little bit more domain specific. They are very, very, very good at certain types of problems. So, um, in control engineering control theory, there were superb tools. I mean, this kind of formal stuff is very well known as you guys know in hardware engineering in, in, in building, um, VLSI and FPGA designs and circuits, this kind of formal modeling stuff is it's just standard practice these days. So that's, that's one industry that's moved in that direction. So there were languages like alloy, which comes from, from MIT, which has a nice, easy to approach model of the universe, where you can write things and a very good tool support that gives you a lot of bang for the buck.

Rod Chapman (23:06):

We've got really good tools now, particularly tools that generate counsel examples of just awesome is basically you press the button and the tool says, no, it's wrong. And here's how it goes wrong. It gives you a, a test case or something to say in this case, this would be wrong. And you go, oh, of course you get what I call the head slap moment where the tool says, no, it's wrong because here and you go, don't like Homer Simpson and realize you've got it wrong in a really tangible way. So, so, you know, probably the biggest advance we've seen in tool support in the last few years. And these thoughts is this thing called counterexample generation, which is fantastic. Um, so tools like alloy have a good, you guys have the Quip toll system, which is awesome, but for a particular domain of things like specify and cryptographic, primitives, and protocols and formats, that's a really, really good, good system.

Rod Chapman (24:04):

We great tool support. Um, and there are probably other examples out there which are, um, you could cite as well that I can't call it. You think of, but that there were ways to get into this kind of thing. And the other thing I'd suggest people try is there's quite a lot of so-called lightweight, formal methods tools out there. So tools that aren't promising to kind of prove everything, or make some grand statement about your, your thing or your code, but there are tools out there that will check certain important properties of your code. So for example, there's something called a bounce were bounced so-called bounded model checking tools for programming languages like C or Java, which do a great job of finding some of the bugs. They don't promise to find all the bugs, but they're very good at automatically finding some of the bugs and typically giving you a counter example. They'll say there's a buffer overflow here. And if you put in this test data, you know, if X equals this and Y equals this, you've got a buffer overflow down there, those tools are really good. You get a lot of bang for the buck, from them without having to do too much work. You throw your code at them and, you know, go and get a cup of coffee and come back and see what it says. So they can be pretty, pretty impressive. I think, as a kind of entry point for this kind of material,

Joey (25:13):

Then the follow-up question was, and I, and I think in some sense, you've answered this quite a bit already is what can we do for, for, from the tool development side? And I think a lot of what I'm a lot of what I was hearing is that helping people understand how something like alloy can apply to their domain, almost out of the box is going to be critical and not forcing a lot of work between a really general purpose tool and the domain it needs to be applied to is a lot of the legwork that we can do on the tool side to help people use these approaches.

Rod Chapman (25:44):

Yeah, I think that, that the technology underpinning the tools has come on in leaps and bounds in the last decade or so with, you know, research on theory of improving and model checking and all the underlying technologies. It's amazing. I think perhaps what we need from tea to help adoption needs. It's more about user experience these days. So we need tools that have, you know, a really good, robust experience with the user nice user interface. So I say council example generation is fantastic, but also just stuff like tutorials and training material, you know, give me an example of how this technology and this applies to my application domain, something I can start with and play with, and you have to inject bugs. I can try and inject a defect in it and see what the tool says. So material that people can play with to get going to kind of get over the hurdle of getting into a new tool or a new technology.

Rod Chapman (26:40):

And then, you know, that that takes effort to produce, you know, I've been there with tools and as a tool developer I've been there and you think, oh, we've got this great technology. Why won't anyone use it? Oh, that's funny. Um, and it's about having the whole package. It's about having really good user experience, training material. Yeah. Tutorial, material, online stuff that people can access and support and you know, and all that stuff as well. Um, and that's, that's, you know, part and parcel of maturity in the market, I think for these things. And it's, it's, I mean, it's something that, you know, your, you guys have done and, you know, countless times, um, I've been there as well and, and other companies have been around that cycle.

Shpat (27:19):

Yeah. I mean, you hit the nail on the head and we've been hearing that loud and clear, uh, you know, over the, especially over the last couple of years, I want to switch topics a little bit and bring up tweet and ACL. Um, I was hoping you could tell us a little bit about that. Um, it's, it's a very impressive project, so I'd love to hear kind of an overview of it from your side.

Rod Chapman (27:46):

So yeah, this has been my kind of lockdown project, or I've been unable to get out much. Uh, and I, I needed a side project if you, like, I gotta do something or I've been stuck here in my, in my office for the last year. So let me give you some background. Tweet. NaCl is a compact reference implementation of the NaCl cryptographic library. Uh, it was designed by the designers of NaCl, um, a guy called Dan Bernstein and his colleagues. The interesting about tweet NaCl is it's written in C, but it's designed to be an extremely compact implementation of that cryptographic library. In fact, so compact, it was designed to that. You could actually tweet it, you know, a hundred tweets, which sounds like a strange idea, but there you go, as, as is common with a lot of cryptographic code, it uses there's quite a lot of rather advanced mathematics going on under the hood.

Rod Chapman (28:39):

There's, you know, all the implementation of this stuff could elliptic curve, cryptography, and there to make it go fast. There's quite a lot of mathematical trickery going on. Also, it has this rather strange coding style where it's written to be Constantine execution time. So the execution time of the algorithms does not vary based on the input data, like the data you're encrypting or the key that you're using or whatever. So this is a cryptographic property that is desirable. That means it's written in a rather weird style. If you read the code, you just look at it and go, huh, what, what is this doing? Because it's written in this bizarre style and you think, well, what am I just putting it statement there? And the answer is all they're not allowed to put an if statement in, cause that wouldn't be time constant. So I looked at this sometimes and I was amazed at how compact it was and stuff, but it's extremely difficult to understand how it works and why it works.

Rod Chapman (29:36):

So what I decided to do was to try and rewrite it in a formal language. And the formal language is this is programming language called spark, which by the way, it's spelled with a K on the end. Um, and it's not a patchy spark. It's got nothing to do with big data or Hadoop or any of that stuff. We were using the name first, like 30 years ago. Right? So it's a long story. Um, and it's not spark with a C on the end either. That's another long story. Um, so spark is, uh, a formal programming language. It's rather odd in that it's a real programming language that you can compile with GCC and run on a real target, but it's formal. And we have very strong verification technology for it. We have a full-blown program verification system based on the whole logic and theory of improving and all that kind of stuff.

Rod Chapman (30:21):

The language spark is actually a subset of ADA, ADA 2012. Um, and it has con it uses contracts. The other big thing about it is it's got contracts embedded into its core. So we have preconditions and post conditions of assertions in the language, which are used by the proof tools to prove things about your code. So what I decided to do was to rewrite tweet NaCl from the see original into spark, and then see if I could prove stuff about it. And see, for example, if I could prove it would never crash or never raise an exception, we call that pro posh, flashy, programming, language, people call this type safety or memory safety. And basically it means your program never crashes or never raises an exception and never kind of goes nuts. It never has an undefined behavior, which, you know, things you don't want to happen in your program.

Rod Chapman (31:09):

So I tried to do that and, you know, I'll fast forward a bit and I was successful. I managed to get this to work and then prove it was really typed safe. And, and the proof is also automatic and you just press the button and the proof tool grinds away for a couple of minutes and then says, yes, and it's proven something like 1500 verification conditions. Um, that show you the code basically is, is kind of bomb-proof and can't do anything stupid. And I also proved them correctness properties of the mathematical stuff on pass on. I had to prove if you correct those things as well while I did it. So that all went well. And that was about nine months ago. And then much more recently, what I tried to do was to actually test it for performance to see if the formal spark code could actually match the C in terms of execution, time efficiency.

Rod Chapman (31:59):

Cause that matters to people with particularly cryptographic stuff. You want it to go pretty fast. So I wanted to see how that would come out as well. Did match imagine I'm pleased to say eventually, eventually the answer was yes, initially it was a bit, the spot code was a bit slower than the C. Um, and that's not unsurprising. Um, there were certain things about spark spark, for example, for, for proof spark rewards, a more functional programming style. If you're used to programming in something like Haskell or ML is one, it sounds surprising. It's very elegant. We employ a functional programming style that comes with a certain penalty in terms of execution time. We're we tend to be copying data from one function to another, a bit more, but it wasn't that far off. And I was quite surprised. So I then set about doing, uh, quite a detailed analysis of the execution time and seeing how I could improve it by doing some transformations on the code, if you like.

Rod Chapman (32:58):

And while I did that, for every transformation that I made, I was running the proof tool to make sure I hadn't made a mess of it. And the proof tool, my basic golden rule was that whatever I did to the code to make it faster, the proof tool had to check it and tell me everything was fine. Otherwise I wouldn't do that transformation. So I kept proving it over and over again, every time I fiddled with it, and there was spaces about bunch of tricks that I did, um, and having concluded that work now, the spot code is actually running appreciably faster than the C. Um, and that's quite a surprise because of some mathematical tricks I was able to exploit. We've actually got it to be quite a bit faster now, which is quite nice because people tend to assume that this formal stuff is great, but oh, it's too slow.

Rod Chapman (33:45):

Um, and people come to us and say, well, we'd love to do spark, but I'll know this code has to go really fast and we have to write it in C. And the answer to that is now you can get this stuff. This formal stuff can be really fast. In fact, it can be faster because the programming language has certain intrinsic properties that make it highly. It also, it's highly amenable to optimization the optimizer in GCC, for example, looks at spark code and says, oh, this is nice and simple. It's got all these lovely correctness properties are great. I can optimize all these function calls. So for example, all that function calling and returning and copying overhead actually all disappears because the functions don't have side effects and there's no aliasing in the language and so on. And so on, the compiler takes advantage of that and actually generates really cool efficient code, which is a really nice kind of synergy of the formality and the component of technology working together. Yeah,

Shpat (34:37):

That's a, that's an amazing kind of result from, from all of that. Um, and what's, what's interesting. Well, there are some benefits you get by just using spark. Um, and then there's benefits just going all the way back to specification since it's a constant and it's not changing, you can do those uptime optimizations without wearing in a way. So you can really dive in and then you have this prover just checking constantly and telling you, Hey, you've actually ruined it or not.

Rod Chapman (35:07):

Um, in, in 20 or 30 years of programming, because it's 30 years, it's more like 14. I've been programming for nearly 40 years. Cause starting in my bedroom as a teenager with a home computer in the 1980s, I learned to program in assembly language. You know, I have an acute understanding now of how brilliant I'm not as a programmer. And knowing that there's proof tool is kind of saving my bacon all the time. Cause it spots mistakes in what I'm doing and, and helps me prove that I haven't mistakes, you know, with, with checking that the types of all still, okay, and it's not overflowing and I'm introduced to bathroom overflow or whatever. It's really nice to have that tool backing me up. Um, and this kind of proof based refactoring we call it is a really interesting concept because the proof, obviously it's just in a, in a team-based software environment, if we're working in a big team, obviously you have a continuous integration kind of thing going on.

Rod Chapman (36:04):

Whether the continuous integration stuff is continuously checking the proof of the software all the time and knowing that that proof is always going to happen for every single change gives you a real sense of confidence that you've, you've not introduced some gray area, stupid mistake. Um, and you get the feedback within minutes, you know, on a, on a decent computer these days, you know, you're that proof tool is going to come back and, you know, point out your mistake within, uh, within a matter of seconds or minutes of you, of you doing anything. And that's really, really nice to know. So it, it's, it's a really nice way of working. Um, and, and something, we, you know, when we're doing this stuff with big teams, it just becomes normal practice. All the developers do the proof stuff all the time and that's, you know, just standard discipline for the teams working on the, you know, the big projects with spark all the time.

Joey (36:57):

This gives a really different perspective in when, when we look at adding formal methods to a project or having formal methods in a, in a project and writing mechanical specs, what a lot of people might hear is that it's going to be something that holds them back and that every time I want to do something, I'm going to need to worry about the proofs. Um, it sounds like when applied correctly, such as in this situation, yes, you did need to write the proofs, but in the end it really liberated you and let you try things that you would have maybe been afraid to try otherwise, or that trying would've been prohibitively expensive in the sense that you would have had to heavily test create new tests every single time you, you made a speculative performance change to your code.

Rod Chapman (37:45):

Uh, particularly, I mean, some of the transformations I did in the, in the NIC elementation were really quite risky in terms of things like changing the layout of data and squishing data into more bit things or fewer bits than you would have otherwise. And things like this. These are changes where you could really easily make a mess. And the proof tool just saves me from, from getting that wrong, which is really, really nice. And I, you know, I using spark, I have that confidence and I, I am acutely worried if I have to write C and I know I haven't got that kind of technology backing me up, oh, I'm acutely aware of how error prone I am writing code. Um, so it, it, it is a nice feeling. And I mean, you talk about writing proofs and that's not really what I'm, well, it kind of isn't it, isn't what I'm, what I'm writing is source code.

Rod Chapman (38:36):

Remember all of this proof stuff is automated. I just push the button and it runs. So what I'm really writing in the code, I think is like preconditions and assertions to say, you know, at this point in the calculus should be true. Um, so what we do with on teams is we say to people, well, the proof has to run, but be fully automated. It's got to just run on its own on the machine. There's, there's no kind of maintaining a proof script or a sort of different artifact for the proof. It's either gotta be fully automated or it's not right. So with big teams, we set a target, we say to people, well, 99% of your proofs have to be fully automated, or you don't commit your code. So we set a target for automation that the teams need to hit. And then they get into that habit of hitting that target.

Rod Chapman (39:24):

And when they hit 99%, we CA we turn the dial up. We say, okay, now it's 99.5. And we, we, you know, we, we go a little bit further and on some projects it hits a hundred and once it hits a hundred, we stay there. Once you get complete automation, we just say, right, that's it, we're staying automated. Everything proves every time or you've broken it. And if you break it, the answer is to write a simpler program. That's usually what you need to do. If you've written something that the prover can't handle, it's probably too complicated, which means it's probably wrong anyway. So go back and write a simpler program and the proof system will handle it. And it's okay again. So it's a nice way of looking at it. You know, it's a different way of looking at the problem, but we found it works very well.

Joey (40:11):

And this, this sort of circles back around to your original point, which is while some people might hear right. Simple program, right. A simpler program as, as right. A slower program. The reality of the proof tools is that the simpler program in the end might also end up being faster, which is maybe counterintuitive, but it sort of requires an understanding of compilers and what compilers work well on at the end of the day.

Rod Chapman (40:36):

Yeah. When I say simpler, I don't necessarily mean lacking functionality. That's not what I mean. I mean, the program can be perfectly simple, but it can be structurally simple, um, or easier to understand, uh, or easier to prove in a way that yeah, you get all the functionality you want and possibly more performance because the compiler can take advantage of simplicity. That's a good thing, you know, and, and some famous engineer once observed, you know, simplicity takes work. It doesn't, it's not easy, uh, to be simple, sometimes achieving simplicity actually requires a bit of thought and a bit of energy to be put in. And that, that can be a really good thing. Um, so it's a, an interesting exercise. If you like to kind of simple, simply simplify your software to the point where the proof tools will, will really do well on it is, is a, is an interesting exercise.

Joey (41:28):

So all engineers want formal specs. All engineers likely want to be writing simple programs at the end of the day, but the realities of the day to day demands on engineers often pull them away from that direction. And what I'm hearing a bit about tools like spark is that they serve to pull engineers back in this direction of, in a sense where they want it to be in the first place, which is writing simple code and writing code that is doing a well-defined thing. Is that an accurate statement?

Rod Chapman (42:05):

That seems perfectly reasonable to me. Um, and yeah, the, the tools do keep us honest. Um, it's, you know, I, I don't curse any more when the tools spots a mistake in what I say in what I've done, I don't curse the tool. I say all things, goodness, it's probably that mistake. And that means I'm not burdening that mistake, or one of my colleagues who has to spot it by reviewing the code and, or, you know, and I have to fix it later or something. And if the tool tells me I've got it wrong, I fix it immediately. And I think few that was good. You know, it saved me again. And I liked that a lot with the tool. Um, I mean, if people are interested, the spark NaCl code is on get hub in my account. It's, it's, I've released it. It's permissively licensed.

Rod Chapman (42:48):

So you can see, and actually in the GitHub repo, you can see the steps of the simplifying transformations that I made. If you'd like to get the performance out of it, for example, you can see all the commits to the get hub repo, that show how I changed it to various stages. And there's a, uh, uh, a blog entry I wrote recently about how I did it, if you like the kind of goes into it. So if you want to see it really well, I would say anyway, I would say that obviously, if you want to see a bit of really well written spark code, go and look at the spark and ACL sources and see how it fits together, because it's, it's, you know, I'm, I'm pretty proud of it. So like, I don't mind if people want to have a look, I'd be all with it.

Shpat (43:30):

Yeah. We'll, we'll definitely include a link to both of those things, actually. I'm sure a lot of people will want to look at it.

Joey (43:36):

That's about the end of the time that we have today. Thank you so much for joining us rod. It's been a wonderful discussion.

Rod Chapman (43:41):

Okay. Thanks very much for the chance to chat today. I really appreciate it.

Shpat (43:45):

Thank you. Um, this was another episode of building better systems. We'll see you all later.