Building Better Systems

#6: Dan Guido – What the hell are the blockchain people doing, and why isn't it a dumpster fire?

Episode Summary

Dan Guido, CEO of Trail of Bits, walks us through how they work with customers to make long-term improvements in security and software quality. He also describes what blockchain has done right, and how the rest of the software world should learn from them.

Episode Notes

Dan Guido, CEO of Trail of Bits, walks us through how they work with customers to make long-term improvements in security and software quality. He also describes what blockchain has done right, and how the rest of the software world should learn from them.

You can watch this episode on our YouTube channel.

https://youtube.com/c/BuildingBetterSystemsPodcast

Joey Dodds: https://galois.com/team/joey-dodds/ 

Shpat Morina: https://galois.com/team/shpat-morina/  

Dan Guido: https://www.linkedin.com/in/danguido/

Trail of Bits blog: https://blog.trailofbits.com/

Galois, Inc.: https://galois.com/ 

Contact us: podcast@galois.com

Episode Transcription

Intro (00:00:02):
Designing, manufacturing, installing and maintaining the high-speed electronic computers, the largest and most complex computers ever built.

Joey (00:00:22):

Hello everyone. Welcome to the Building Better Systems podcast, where we explore tools and approaches that make us more effective engineers and make our software safe and reliable. My name is Joey Dodds. I'm a researcher focusing on providing assurance for industrial systems.

Shpat (00:00:35):
And I'm Shpat Morina. Joey and I both work at Galois, an R&D lab focused on hard problems in computer science. And today we're joined by Dan Guido.

Dan (00:00:47): Hey, there. Sorry.

Joey (00:00:52):

No, that was great. I love the enthusiasm. Dan works for a company called Trail of Bits. Trail of Bits is a company similar to, but also incredibly different from, Galois. And one of the things we're really looking forward to with Dan, and this came across as we prepared with him, is that Dan and Trail of Bits approach formal methods and better systems in general with an incredibly open mind, which enables them to tackle a huge range of problems with a wide variety of approaches.

Joey (00:01:25):
So we're really looking forward to talking with you today, Dan.

Dan (00:01:28):
Great. Cool. Thanks. So I'm happy to be here.

Joey (00:01:32):
So at a high level, can you tell us a bit about your approach to building better systems?

Dan (00:01:38):

Yeah, sure. So I'll give a tiny little bit of background about Trail of Bits, because I think that informs the approach, right? At Trail of Bits we're typically working with commercial firms: technology companies, people in defense and finance, where they need results right now. It's not really okay to suggest a complete rewrite of a lot of the systems that these companies use. We're still currently working with Zoom, and you're not going to get through an engagement with them by offering formal verification and formal methods and saying, oh, we're just going to chop this thing into bits and pieces and rewrite whole parts of it.

Dan (00:02:14):

And everything will be great. You have to work within the constraints of a lot of these companies and recognize that security is not the top priority. The top priority is keeping your users happy and making sure you've got a functional product that pulls in, you know, a billion dollars a year or whatever it might be. So a lot of the work that we do with firms tends to be very educational in nature, first because the engineers we work with aren't experts in these kinds of systems and we're only there for a small period of time, and then it also needs to be rapid: we need to get results quickly. So a lot of the methods that we end up using are very lightweight and very high level, but they end up giving greater assurances out the other end, more so than if we just hunted for bugs with our bare hands. So yeah, I'm really happy to talk through that and a few other topics during this interview. There's an area of work that Trail of Bits does that's kind of a bizarro world that we'll get into a little bit: we do a lot of work in blockchain and a lot of work on smart contracts, which is not... [laughing]

Shpat (00:03:18):

I look forward to getting into that. But yeah, it sounds like when it comes to building better systems, what you're saying is you don't start with security. Better systems, for you, are maybe systems that are actually usable by people.

Dan (00:03:35):

Yeah. I mean, we take a holistic approach to it. We have a product manager and a UX designer on staff, and they consult with us sometimes. So we talk through user-abuse-related issues, privacy concerns, just unknown ways that people can work with your product and shoot themselves in the foot because you allowed them to. But from another perspective, yeah, a lot of the root causes of software insecurity aren't really going to be just that you're not doing testing and verification correctly. It's going to be a lot of stuff like: what does your commit hygiene look like? How many branches of code do you have, and which one do you ship from? Do you have a master branch at all? It's going to be stuff like the tools that you provide your developers and the tools that you chose to use, and whether you've migrated to ones that enable you to get visibility and understanding of what you've built. There are a lot of really easy leading indicators for whether you are going to have a secure product, and it doesn't take an expert sometimes to figure them out.

Joey (00:04:36):

I'm curious: when people call up Trail of Bits and say, help with our security, or help with these challenges we're having, do they expect those holistic answers? Does it sometimes feel like it's coming out of left field when you say, look, you've got to look from the top down at how your engineers are working? Are they expecting that?

Dan (00:04:54):

No. So, you know, there's a really rich history of companies that do code reviews like that, that hire us to do security reviews of their software. It kind of all starts with @stake and Foundstone back in the late nineties and early two thousands, where, you know, originally I think those companies wanted to come in and say, hey, just give us commit access to your code and let's go fix it. And they had to figure out what they could sell, and what they could sell was: we'll just pile a whole bunch of bugs into your bug tracker. So that's a little bit of the lineage that I come from. I used to work for iSEC Partners, which itself was an offshoot of @stake, and that influenced a lot of the way that I run Trail of Bits now. But always involved in that process have been things like: whenever we report a bug to a client, we always give a short- and a long-term recommendation.

Dan (00:05:43):

We want to tell them how to patch an individual bug, but we also want to tell them what process they used that made that bug, and how they can fix the process. There needs to be this continuous improvement in the software development life cycle, which I think a lot of firms lack, especially on the more operational side of things. Like Galois, that's not your deal, right? You're all strategic; I know that you guys really focus on the root cause. But out of the whole industry of people that are doing more of that code-review type of work, those two-week-long projects where it's just a flat-out race to hunt for as many bugs as possible, a lot of those long-term recommendations get lost. No one's thinking about the SDLC, nobody's thinking about the organizational issues that prevent producing secure code.

Shpat (00:06:37):

So you advocate for not only looking at the symptoms and just disappearing, but, you know, getting to the root cause and explaining maybe ways to avoid that.

Dan (00:06:42):

Yeah, 100%. We got on this because you were asking: when people walk in our door, do they ask for that? Is that what they're looking for? And a lot of times, yes. Everybody wants to be a good engineer. At the end of the day, engineers are motivated to produce quality work. Nobody shows up to the office in the morning and says, I just want to turn in a pile of garbage today. So the carrot approach works; you don't need a stick as often as people might use one. A lot of the engagements that we end up performing are focused more on kind of a training exercise. We want people to be better engineers when we leave, because ultimately they're the ones who are going to have to carry this forward.

Dan (00:07:28):

We're in a very similar box to, you know, the iSEC Partners of old, where many of our projects are two, four, six, eight weeks long. So we don't have time to fundamentally address all of these issues on our own while we're consulting for a company. There needs to be some element of ownership, and there needs to be a directional change in terms of how the company produces software after we leave. It really depends on what the company is looking for, right? We've had people ask us about regulatory compliance stuff. It's got to be driven from two different sides: the company that shows up at our door is going to have a list of concerns, and then we're going to have expertise about what we see, and you have to merge those two visions and figure out what a good prioritized backlog of things to work on ends up being. It can't all be driven by us, but it probably shouldn't all be driven by them. So you end up playing matchmaker in the middle.

Joey (00:08:26):

That makes sense. And I can see it; we see this with formal methods tools all the time. A huge stack of bugs in the bug tracker can be more paralyzing than helpful. So I can see how your customers would really find value in concrete recommendations: what do you do today, and how do you think about your engineering going forward?

Dan (00:08:49):

Yeah. And that's actually been something we've had to warn people about when we think about the upper end of what a productive engagement looks like. Sometimes you can just put people on a treadmill, and they're not going to lose 50 pounds by next week. So there's a limit. What I would rather do is engage with a company more frequently over a long period of time than have an enormous monolithic engagement that produces a thousand bugs. So there have been times when I've said, you know what, this project is too large. Let's break it into smaller parts and address them one at a time. First iteration: do you manage your third-party dependencies? Do you know which ones you have? Have you evaluated the risk of any of them? How do you test them? When do you patch them? That kind of stuff. And then maybe the next round will be, okay, let's talk about attack surface, let's talk about compiler tools, whatever, and get it done in slices, because no one's going to do this all at one time.

Joey (00:09:52):

Yeah. And even if you get to a point where, let's imagine, you could find all the bugs, and you could say, well, do all these patches and your code is going to be good: the bugs come back. And a lot of times the bugs were never there in the first place; they seem to result from the evolution of software being quite challenging. Everybody starts with, oh, I've got my super clean, highly documented code with all these, we won't call them invariants, but I understand what each system does. And then as that evolves, things really start to go off the rails.

Dan (00:10:25):

Yeah. A mantra that we try to use at Trail of Bits is that we never want to find the same bug twice. And there are ways to get that done, even in a short project. I'll give you an example. We always try to get evidence out the other end. It feels bad to me when I get hired to do a security review, and I find a bunch of bugs, and then they ask me a question like, hey, so is my software safe now? And I can't give them an answer. I can't give them a measurable answer. So I want to derive evidence from the process that we use. A reasonable example: we had a client last year that showed up at our door with a 250,000-line Rust code base. Enormous.

Dan (00:11:10):

Right, it's just massive on the scale of Rust applications that are out there. Probably one of the largest, if not the largest, Rust application in the world. And we've got two weeks to review it. So where do you spend your time? In this case, every part of the application was extraordinarily serious: there were multiple complex components, there was a virtual machine, a compiler, there were network protocols. And the client wasn't even slowing down for us; they were developing code while we were reviewing it, even in the span of those two weeks. And they had hundreds of developers working on this code base, obviously, to get to 250,000 lines of code. So where do you start on an engagement like that? The Trail of Bits approach would be: okay, let's just focus on the unsafe stuff.

Dan (00:11:57):

We want to get measurable outcomes out the other end. Let's understand the state of unsafe code and try to provide a measurable metric that shows how much we could reduce it. And if we can't reduce it, are there ways that we can get high-coverage testing across all of the unsafe code being used? During those two weeks, we ended up splitting up our team, where half of them went to find bugs and understand architecturally where risk was pooling, and the other half wrote a tool. We wrote something called Siderophile. Siderophile is like a divining rod for finding unsafe code: it walks the call graph and figures out exactly what unsafe code you've imported from what libraries and exactly what gets called inside of them. And then it helps walk you through a custom fuzzer definition.

Dan (00:12:44):

So now the evidence we get at the end of the project is: okay, we've done a security review, we found 20, 30, 40 bugs, whatever, so we've provided immediate value to you. But the long-term value we're going to give is that as long as you run this tool, you will get high-coverage fuzz tests for every single use of unsafe code through the whole code base. And that's something that you can take to the bank. So that's a really good example of how we try to do these traditional application security reviews a little bit differently, to provide lasting evidence that the code is safe.
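
For a rough sense of what a "divining rod" for unsafe code does, here is a minimal sketch. It is not Siderophile, which analyzes the crate call graph to trace reachable unsafe functions; this hypothetical Python version just counts occurrences of "unsafe" per source file, a much weaker approximation, but it shows where fuzzing effort might be aimed.

```python
# Crude sketch of the idea behind a tool like Siderophile: find where
# unsafe code pools in a Rust project so fuzzing effort can be aimed
# at it. The real tool walks the crate call graph; this version just
# greps the source tree, which is a much weaker approximation.
import re
import sys
from collections import Counter
from pathlib import Path

UNSAFE_RE = re.compile(r"\bunsafe\b")

def unsafe_hotspots(root: str) -> Counter:
    """Count occurrences of `unsafe` per .rs file under `root`."""
    counts = Counter()
    for path in Path(root).rglob("*.rs"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = len(UNSAFE_RE.findall(text))
        if hits:
            counts[str(path)] = hits
    return counts

if __name__ == "__main__":
    # Usage: python unsafe_map.py path/to/vendored/crates
    for path, hits in unsafe_hotspots(sys.argv[1]).most_common(20):
        print(f"{hits:4d}  {path}")
```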

Shpat (00:13:18):

And it sounds like something that they could continue to do without you.

Dan (00:13:23):

Yeah, easily. I mean, a lot of what we end up doing is stuff that fits into a continuous integration pipeline. Not all of it; the evidence we extract out the other end doesn't always have to be a CI tool. But in this case, the CI tool was the right thing to do, and the organization was competent at using it. So it did provide value.

Shpat (00:13:45):

I want to change the topic a little bit. I know that you're basically neck-deep in what you've called blockchain stuff. One of the things when we were chatting before: you actually said it, and I put it in quotes because I'm going to pose the question that you formed yourself earlier. What the hell are the blockchain people doing, and why isn't it a total dumpster fire?

Dan (00:14:15):

Yeah, great. So, right: blockchain. Trail of Bits is kind of on top of the world when it comes to blockchain stuff. We got in real early, started a little bit before the DAO hack happened, and then threw a bunch of resources at it because we thought it was fun. We looked at it and we saw there are all these brand-new compilers, there's a brand-new runtime environment, and there's money. People actually care about security here: they see the risks, and they're willing to spend in order to get somewhere. So we thought these were all the things that interest us, and that we could make a good market out of it. So we did that. What we saw was a greenfield, right? Usually when you walk into an industry, there are all kinds of existing norms and practices and tools and techniques, it's all been decided, and you have to work really hard to change the way people work.

Dan (00:15:10):

In this case, nothing existed. There were no real security experts, there were no real security tools. People didn't even know what bugs could exist, let alone how to fix them. So this was exciting. And since then we've grown a team of about 20 people that work in blockchain-security-related areas, sometimes on smart contracts, other times on blockchain clients themselves. Asset custody systems are kind of a fun offshoot, where the question is: how do I store a billion dollars in cryptocurrency somewhere, when, if somebody gains access to it for a millisecond, I irreparably lose all of it? So, fun stuff. But what I think a lot of people hear when they hear blockchain is these massive hacks. They hear that everybody's getting hacked all the time, that massive amounts of money have been lost.

Dan (00:16:05):

And that's true, for the most part. And you think, okay, so these guys must not know what they're doing, and you look at it, and they really don't. The smart contract language that most people program in is called Solidity, and Solidity, unlike every other modern language, seems to have reinvented all of the vulnerabilities that were wiped out of languages like Rust or Go or Swift. It's the only language I've seen developed in the last 10 to 20 years that has integer overflows, that has uninitialized variable issues, that has variable name shadowing issues. My favorite one about Solidity is actually that null dereferences are exploitable, which really screws with people. Apparently somebody thought it was a smart idea to store the owner of a contract at the zero memory page, which means that if you can get a null dereference, you can change who owns that contract, and in smart contract land that's basically who owns the money. So it's a complete mess from all the foot guns that are in it. They changed the way that parentheses and plus and minus and everything else get evaluated, and they have all these weird type confusion issues. It's just a complete disaster.

Shpat (00:17:31):

But despite all of that, it is actually not a total dumpster fire, right? The hacks are relatively rare when you consider the amount of money that, essentially, goes through these systems in transfers. So I'm just curious what that's about.

Dan (00:17:47):

Yeah. So I've built up this straw man a little bit, because there are a lot of easy bugs to be found. It's mostly true that you can go around the entire blockchain, look at the whole thing as a whole, and find issues. Analysis of it is a little bit tricky: it runs inside of a stack machine, and that's kind of a pain in the neck. So a normal person would think that confidence in these systems is near impossible to gain. But the reality, when we work in this field, is that it's kind of leading the technology industry. We have clients walk in the door with property-based test results, with full symbolic verification suites, built by people that never went to college. And they ask us for help at that point.

Dan (00:18:34):

And they know they have to go farther, which is completely backwards from what you'd expect. You'd expect that these untrained software engineers who dropped out of college to write smart contracts in a language filled with foot guns would be unable to get any kind of concrete proof that their code works correctly out the other end. But instead, there's a mature field of symbolic-execution-as-a-service vendors competing with each other to be the best. There's not just one choice; there are five, of which Trail of Bits has one, and then there's a whole bunch of others. So when you think about it, it's really easy to cast that aside and say, oh, well, this is because code is law and you need to be right the first time.

Dan (00:19:21):

And there's no recourse if you're hacked. Which, well, sometimes they cheat and fork the blockchain and roll back the hacks every once in a while; if it's big enough, they do that. But to that I would say that regular software needs to be correct too. Smart contracts don't have exclusive ownership of the claim that they need high security. I'm sure that over at Galois you know there are people in industries all over the place that have extraordinary needs for secure code. There are people that live and die by whether code is safe. iPhones, even: there are billions of them, people store their most intimate information on them, and Apple spends a billion dollars a year on security for them. So there's a lot of regular software out there that needs to be correct too.

Dan (00:20:11):

So what's really different about smart contracts, when it comes down to it? If this code-is-law idea isn't it, then what is it? For this I'll step back and use an analogy. I'm sure everybody's heard the joke about how a dairy farmer asks his physicist friend to help him compete with the big agro giant down the block. The physicist goes back to his workshop, thinks about it a whole lot, and comes back to the dairy farmer: he says he has a solution, but it only works in the case of spherical cows in a vacuum. Right, it's a famous joke. The thing about smart contracts, and about blockchain software in general, is that it makes only spherical cows; everything is a spherical cow on the blockchain. It has all these properties that really enable people to do research, and to use testing research and verification research, in a way that other software just doesn't. So that's what I think is different.

Joey (00:21:19):

So your hypothesis is that the setting of the blockchain, maybe not even intentionally, has somehow worked out in such a way that it's very conducive to doing things that result in high security, to approaches that successfully test for security and result in security. Is that the point that you're making, or did I get it a little wrong there?

Dan (00:21:45):

Yeah, 100%. I think that there's actually some stuff that regular software people can learn from the blockchain crazy people, and I think it has a lot of benefits for security. I think security people should probably be paying more attention to what's going on. A lot of people are really quick to write them off, because they hear about these hacks and they throw up their hands like, oh, if only these people understood the gospel of software security. But actually, the blockchain people understand it better than a lot of regular software people do. So, a couple of things here. On the blockchain, everybody's got only one blockchain. If I install a piece of software, a lot of its behavior is affected by the state of my computer: I might have different DLLs and different libraries installed.

Dan (00:22:32):

There's a lot of global mutable state. Certain issues might only be present when my clock is at a certain time, or when my network has been primed in some way by having received a whole bunch of packets. But there's only one blockchain, and all software runs on it, so it's been really trivial to get a really good testing environment set up. And that ends up being where a lot of this stuff gets resolved. A lot of the orthodoxy around software security has been that you need good compilers and development tools in order to build secure software right the first time. What I think a lot of blockchain software proves is that you can fix it in testing: if the testing tools are good enough, you can remediate the issue of having a foot-gun-filled programming language in which it seems impossible to write secure code on the first try.
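
As a minimal sketch of what "fix it in testing" can look like: the snippet below models the wrapping 256-bit arithmetic that pre-0.8 Solidity used, and lets a property-based tester find the overflow foot gun automatically. This is illustrative Python using the Hypothesis library, not any specific tool Dan mentions; add_uint256 and the property are made up for the example.

```python
# Modeling a foot gun and catching it in testing: EVM-style uint256
# addition silently wraps, and a property-based tester finds the
# counterexample for us. Requires the `hypothesis` package.
from hypothesis import given, strategies as st

UINT256_MAX = 2**256 - 1
uint256 = st.integers(min_value=0, max_value=UINT256_MAX)

def add_uint256(a: int, b: int) -> int:
    """EVM-style addition: wraps around silently at 2**256."""
    return (a + b) & UINT256_MAX

@given(uint256, uint256)
def test_addition_never_loses_value(a, b):
    # The property a human expects of balances: adding a deposit never
    # shrinks the total. Hypothesis finds a counterexample quickly
    # (for instance a = UINT256_MAX, b = 1), exposing the overflow.
    assert add_uint256(a, b) >= a

if __name__ == "__main__":
    try:
        test_addition_never_loses_value()
    except AssertionError:
        print("property falsified: addition wraps at 2**256")
```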

Dan (00:23:21):

Other things: every change of state is a transaction, the transactions are atomic, and they're usually really small. Input size is really expensive, so you don't tend to use a whole lot of it. You have to pay per instruction executed, and termination is guaranteed too, which means you know when you've reached the end of your analysis. So maybe there are some ways that incentives matter, but they don't matter in terms of, oh, well, if you get hacked, it's irreparable. Where they matter is that they've ended up producing software with constraints that force people to write software that's easier to analyze. We're talking about input sizes that are hundreds of bytes, state variables that are hundreds of bytes, instruction traces that are thousands of instructions. So you can use all these fancy tools that work through your whole state space.

Dan (00:24:12):

Symbolic execution actually ends up being the most common technique in the whole space. That is the out-of-the-box thing: I can just run ./manticore, which is our symbolic executor, and it will find bugs for you immediately, which is completely backwards from what you'd think about symbolic execution on any other kind of software. For that, you're like, this is a joke, right? Who's going to use symbolic execution on anything that came out of LLVM? Not only that, but the binary formats that they're using are trivial. A lot of kids, when they go through college, might write a symbolic executor, just like you might write a portable executable parser or something like that. You could do it on a language like Brainfuck in, like, an afternoon. That's a really fun weekend project.

Dan (00:25:00):

But if you try to do that on x86, say goodbye to two years of your life; there are just so many instructions and so much hidden complexity there. But to do it on something like the EVM, the Ethereum virtual machine, which is how a lot of these things get executed in the blockchain world, I'd give it a three out of 10. It's not trivial, but it's not super hard either. So what it comes down to is that even amateur developers can use these techniques pretty well, and there's a lot of opportunity for people to apply the more advanced ones; obviously experts can do a whole lot more. The toughest problem in the space is defining what a bug is. What does a bug look like? Not necessarily, how do I test for one?
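
To give a flavor of why a toy symbolic executor is a weekend project, here is a sketch that symbolically executes a hypothetical four-opcode stack machine using the z3 solver (pip install z3-solver). It is nothing like Manticore or the real EVM; it just shows the core move of the technique: carry symbolic values through the program and ask a solver whether a bad state is reachable.

```python
# A toy symbolic executor: run a tiny stack machine over a symbolic
# 256-bit input and ask z3 whether a "forbidden" state is reachable.
from z3 import BitVec, Solver, sat

def run_symbolic(program):
    x = BitVec("input", 256)   # the one symbolic input
    stack = []
    solver = Solver()
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "INPUT":
            stack.append(x)
        elif op == "ADD":
            a, b = stack.pop(), stack.pop()
            stack.append(a + b)        # bit-vector addition wraps
        elif op == "ASSERT_NE":
            a, b = stack.pop(), stack.pop()
            solver.push()
            solver.add(a == b)         # can the forbidden equality hold?
            if solver.check() == sat:
                print("bug reachable, input =", solver.model()[x])
            solver.pop()

# Can input + 1 ever equal 0? Yes: the all-ones input wraps to zero.
run_symbolic([("INPUT", None), ("PUSH", 1), ("ADD", None),
              ("PUSH", 0), ("ASSERT_NE", None)])
```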

Joey (00:25:47):

Have you coined a phrase yet for the opposite of a foot gun? Because it sounds like that's what you're talking about. In some sense, the blockchain has maybe accidentally saved itself from shooting itself in the foot.

Dan (00:26:02):
I don't know. That's a good question.

Joey (00:26:06):
I'm trying to think of something clever here and it's not coming to me. Maybe by the end it'll pop into one of our heads.

Dan (00:26:13):

Yeah. So try to compare this to normal software testing, or normal software security. We review hundreds of code bases a year. People come to us with whatever: their giant C++ code base, their Rust code base, their Go code base. They drop it on our heads, and we take notes every time. Surprise, clients: we're spying on you. The notes that we're taking are things like: do they have unit tests? Can I build it in one step? What compiler do they use? I take a lot of that metadata and we keep track. And what you find is that if it's not a blockchain company, half the time they don't have unit tests. And the people that do have unit tests are high-fiving themselves. They're like, this is awesome.

Dan (00:27:04):

You know, our company is really all about... yeah, we've reached the end, we've got unit tests, this code is amazing. It's really funny: if you're hunting for jobs anytime soon and you're looking at job descriptions, people are so proud of the unit tests they write that they'll put them in the job description. They're like, our team has tests, we care about correctness. And that's kind of the state of the union. So the next phase up might be some random testing. You've got your positive concrete unit tests; the next step up might be some random negative testing that would knock out more spots in the input space for the program. You want to know how many people use fuzzing who aren't driven there by consultants?

Dan (00:27:51):

Trail of Bits did a really funny research project about a year or two ago; a guy named Artem Dinaburg on my team did this. He dug up some of the earliest work from Barton Miller on fuzzing, from 1990, and he unpacked all the fuzzers, which were perfectly preserved and still compiled, because I guess Barton is a badass and thought that far ahead. And we took that code and reapplied it to the newest versions of Linux that are out there. So in 1990 it was on Slackware 2.1, and in 2018 it was on Ubuntu 18.10. And we just asked the question: what happens if we run this again? Are we going to get the same number of crashes, or fewer? And the answer is, you get the same. Thirty years later, most of the software, if you rerun it with a 30-year-old fuzzer, still ends up failing in almost exactly the same way.

Dan (00:28:45):

So we thought, okay, maybe this is a fluke, maybe it's just Linux. Microsoft is all about software testing: they've got SAGE, they've got all these famous people working at Microsoft Research, investing so much in automated testing. So we dug up a fuzzing paper from Barton Miller again, from the year 2000, focused on Windows message parsing issues, that he wrote a fuzzer for. And we did it again on Windows, I think it was Windows 7 this time, so from Windows 3.1 to Windows 7. And we found that 93% of the crashes he found were still present. So basically, fuzzing is a next-level set of goals for most people; it is the cutting edge. You see that in academia too: a lot of academia is hyper-focused on how to improve the performance of fuzzers right now. Everybody's got their own fork of AFL at a different conference, and it's just nuts. But the whole idea that, oh, we could use symbolic execution too, is considered wildly impractical. Nobody takes that seriously, except for the blockchain people.
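
A Miller-style "dumb" fuzzer is small enough to sketch in a few lines. The Python below is a hypothetical stand-in for the 1990 tools, not the code from the study: it pipes random bytes into a command-line program and keeps the inputs whose runs died with a signal.

```python
# Miller-style dumb fuzzing: throw random bytes at a program's stdin
# and record which inputs crash it (negative return code = signal).
import os
import subprocess
import sys

def dumb_fuzz(argv, trials=1000, max_len=4096):
    crashes = []
    for i in range(trials):
        data = os.urandom(1 + i % max_len)
        try:
            proc = subprocess.run(argv, input=data, timeout=5,
                                  stdout=subprocess.DEVNULL,
                                  stderr=subprocess.DEVNULL)
        except subprocess.TimeoutExpired:
            continue  # hangs are interesting too, but skip them here
        if proc.returncode < 0:  # killed by SIGSEGV, SIGABRT, ...
            crashes.append(data)
    return crashes

if __name__ == "__main__":
    # Usage: python dumb_fuzz.py /path/to/utility [args...]
    target = sys.argv[1:] or ["cat", "-v"]  # placeholder target
    print(f"{len(dumb_fuzz(target))} crashing inputs found")
```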

Joey (00:30:00):

I guess two questions for you, since it sounds like this is what you answer day to day. For everybody else that needs to step into testing or fuzzing, and maybe even knows they need to: A, what in your opinion is often preventing people from doing that? And B, where do they start? Where do you tell people to start if they need to be testing or fuzzing?

Dan (00:30:28):

Yeah. I mean, I think there are some high-level trends that are working in a good direction. I think the industry is finally moving away from dynamic types. When I think about testing, I've got this big square that I'm trying to throw my inputs into, and there are a couple of different strategies: I can knock out huge sections of it with really effective testing techniques that get me high coverage of all the possible inputs, or I can shrink the input space to be smaller. And dynamic types are really one of the key offenders when it comes to having a super huge input space. So you look at languages that are out here now, and a lot of them are statically typed. You've got your TypeScript; we finally converted the 90% of programmers out there that really just do web development, right?

Dan (00:31:17):

CRUD applications; they're using typed languages now. That's awesome. So instead of everything being an object, everything's kind of turning into a function. Package managers, too, are getting a lot more reproducible. We've got stuff like, what is it, NixOS, Docker containers. If we had had that just a few years earlier... That actually lets me plug in a lot of the testing techniques that I'd like to use on an engagement where it otherwise would be impossible. So that kind of reproducible environment, and the ability to build with one step and then hook into it and actually do some testing, is a really fundamental capability that I think pays off as much for security as it does for DevOps and operations. As security engineers, you shouldn't forget that, which is why on some of our engagements, when we're brought in to do these massive bug hunts, some of what we end up working on is build system improvements for our clients, because that ends up having more of an impact than us finding five extra bugs.
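
As a small illustration of how type information helps testing: once values are typed, a property-based tester can derive input generators mechanically instead of sampling "anything at all". The sketch below uses Hypothesis's from_type with a made-up Order dataclass; the deliberately failing property shows that even type-derived generation probes corners (quantity: int admits negatives), and that tightening the types shrinks the input space you have to test.

```python
# Types as test generators: Hypothesis derives a generator straight
# from the dataclass annotations, no hand-written strategy needed.
from dataclasses import dataclass
from hypothesis import given, strategies as st

@dataclass
class Order:
    item_id: int
    quantity: int

def total_items(orders) -> int:
    return sum(o.quantity for o in orders)

@given(st.lists(st.from_type(Order)))
def test_total_is_nonnegative(orders):
    # Fails: `quantity: int` admits negative values, and the derived
    # generator finds them. Narrowing the type narrows the input space.
    assert total_items(orders) >= 0

if __name__ == "__main__":
    try:
        test_total_is_nonnegative()
    except AssertionError:
        print("falsified: a negative quantity slipped through the type")
```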

Joey (00:32:23):

Well, and you even end up in this situation where maybe, as a company, you get a bug report from a user and you can't even do anything with it, because it's like, well, fine, but I don't know what was going on in their machine. So if you can answer that: making every bug report more valuable is a win over seeing more bug reports that you can't do anything with.

Dan (00:32:42):

Yeah, 100%. And that's just a consequence of having global mutable state that you can't identify. Why is the program in this state? How did it get here? I have no idea; it's not reproducible. So yeah, I think those are some of the first steps to get there. And just recognizing... there's this common awareness in regular software engineering of unit tests, feature tests, integration tests; everybody knows those three words, and then they don't really know that there's anything beyond them. So a lot of what I end up doing with these fairly typical software engineering teams is just making them aware that, hey, there's a lot deeper we can go here, and the techniques aren't super challenging. We don't have to break out anything heavyweight to make some progress.

Dan (00:33:31):

There are ways that we can extract evidence of progress in software security using techniques that are slightly more advanced than a unit test: a fuzzer, or whatever it might be, or static analysis of some kind. And we can help you get the first few steps done. That tends to be compelling, because usually what you're missing in security is any kind of measurement. What you're doing is sitting under a table hoping that nothing bad happens today, and if nothing bad happens today, it's a good day. You don't really know how to measure that you've done a good job. So we do find good uptake of a lot of these techniques with the companies we work with.

Joey (00:34:13):

Yeah. I'm still curious, I guess. As you said, a lot of people have heard of testing. And I've been in a situation where I've written software, and when it comes to something like C, sometimes you're like, well, I'll write the tests later. And you say that for a long time, and then maybe you don't do it, and you're like, well, now in C everything is so hard, and I don't even know how to test C, because I've got to set up my whole environment, I've got to get my libraries spun up in the right way to unit test. And then you just never write them. But I'm curious: is that what you see across the board? Are people surprised that testing is as useful as it is?

Dan (00:34:51):

So, a couple of things. I do think that those first few steps can be the hardest, and I think that's a big reason why people aren't using more advanced techniques like fuzzing: you've already invested a lot of time to understand how a unit test framework works and how it applies to your code, so trying to learn another tool and another tool and another tool just isn't going to make sense. And I think that as security engineers, we need to be a lot more cognizant of that kind of cognitive overhead that we might be trying to shove onto a developer's desk. This is really apparent when I talk about all those academic research papers: Angora comes out, AFLFast comes out, and this other thing comes out, and each one of them necessitates an engineer learning a new tool, understanding how it works uniquely from all the other ones, and reintegrating it into their code base in order to get that 5% improvement in the number of bugs they find. It's completely untenable, and no one's going to do it.

Dan (00:35:48):

So one of the things that we did at Trail of Bits to attempt to solve this is we made a common abstraction layer, where we can take a unit test, lift it into a property test, and then allow any number of fuzzers to consume it and provide the inputs back through a common interface. We call this DeepState. What DeepState lets us do is write the integration between DeepState and the fuzzer once, for whatever new academic fuzzer comes out. But on the developer side, all they've ever done is run their existing unit tests through a test runner that never changes. So you have this write-once cost, but you get continuing benefits as you go. We have a good paper about how it works, and we've got some sample code that's been marked up with DeepState tests. I know there are things like DeepState in other languages, but DeepState is the only one that I know of that works on C and C++. I think that's where a lot of security engineers fall over: we have this egomaniacal view that, oh, if I write a fuzzer that's better than everybody else's fuzzer, then obviously everyone's going to use it. And we never think through: no, that's actually a lot of hard work. How do I make it easier?
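
DeepState itself targets C and C++; purely to illustrate the "lift a unit test into a property test" idea compactly, here is the same check written both ways in Python with Hypothesis. The parse_age function is hypothetical.

```python
# Lifting a unit test into a property test: same oracle, but the test
# runner supplies the inputs instead of one hand-picked value.
from hypothesis import given, strategies as st

def parse_age(s: str) -> int:
    """Hypothetical function under test; rejects implausible ages."""
    n = int(s)
    if not 0 <= n <= 150:
        raise ValueError("implausible age")
    return n

def test_parse_age_unit():
    # The classic unit test: one concrete input, one concrete answer.
    assert parse_age("42") == 42

@given(st.integers(min_value=0, max_value=150))
def test_parse_age_roundtrip(n):
    # The lifted version: the property holds for every valid age. Swap
    # the input source (a fuzzer, a symbolic executor) and this body
    # does not change; that is the write-once benefit Dan describes.
    assert parse_age(str(n)) == n

if __name__ == "__main__":
    test_parse_age_unit()
    test_parse_age_roundtrip()
```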

Joey (00:37:07):

Right, right. So it sounds like the thought you're going after is that even when you write a fuzzer... interface-wise, fuzzers are relatively simple. I remember being somewhat surprised the first time I ran afl-fuzz, because it was like, well, actually, I just point it at this directory of files and an entry point and off it goes. But even then, it took more than a day to understand how to use the thing, and that's far too long to get started with something, for an engineer. Especially an engineer that's like: I have some time today to make things better, what am I going to do? If you spend the day setting up the tool, you've lost that day. It sounds like lowering that barrier to entry is really critical: letting people get something quick, and then maybe build up more expertise as they go. And I feel like all too often tool developers are not aware of just how little you can initially ask of users. I know we fall into this trap occasionally as well. You've really got to make the first hit super easy, and find things quick, and then let people build up capability as they go.

Dan (00:38:19):

Yeah. I think that disconnect is just apparent, and it's holding back progress in the whole industry. I think that security engineers need to better understand the value of found objects. If your developers or your client show up with unit tests already written, how do you enhance the value of those without causing an organizational problem inside their company? You're not going to be able to retrain all the developers to become fuzzing experts overnight, even if that might only take four hours, because four hours with all of them is just going to be impossible to get. So what is a way that you can incrementally improve unit tests, maybe convert them into some kind of negative tests, or add some randomness to them, that gets you X amount more coverage? That's still going to have massive benefits, even if you can't ultimately solve the whole problem.

Shpat (00:39:13):
Yeah, that resonates a lot, even from a business sense: not trying to go boil somebody else's ocean overnight. But, you know, what are the small things?

Dan (00:39:25):

You know, having reproducible builds that you can test more easily doesn't just make security engineers happy. But most security engineers, I'm sad to say, aren't really software development experts. They're not people who have been in the position of writing a whole bunch of code in a previous life. It's something that I struggle with. If you're familiar with my background, I taught in the computer security curriculum at NYU for a long time; I helped them get an NSA Center of Excellence certification, and I went through that program and got a concentration in information assurance. A lot of folks that go through programs like that end up becoming super specialized on the security part of things too quickly, and then they lose the kind of breadth they might have gotten from learning how to be a software engineer. So you've got a lot of people that are really good at pointing out bugs, and a lot of people that aren't so great at fixing the fundamental issues underneath.

Shpat (00:40:19):
It sounds like you had some very firsthand experience with this stuff even before starting Trail of Bits.
Dan (00:40:25):

This is why I'm the pointy-haired boss now and I don't have to write any code, because I think I fell into that trap by accident. But I empathize, right? Even though I'm not a software engineer, at least not anymore, I still know that that's the correct solution to these kinds of problems. Earlier in the conversation you were asking about how there are a lot of things that people come to Trail of Bits with where the solution isn't always better testing; testing is just one part of building a secure application and arriving at safe software. So one way that we've tried to measure this is: what are the limits of testing? How often do we run into bugs that are impossible for a software testing tool to find?

Dan (00:41:12):

And this is an area where, again, we've tried to extract information from the projects that we work on. We have the real privilege of working with hundreds of code bases a year, so we have really good metrics about what kind of stuff is out there. A research project we did last year: we collected every single smart contract review we've ever done, put them into a big database, and then individually re-reviewed every single bug and tried to differentiate: could we have found this bug with a testing tool? Not even a testing tool we have right now, but hypothetically, if the best testing tool in existence were created, would we have been able to find that bug? We got really neat answers out the other end of it. And what we found was that, at least for smart contracts, about 80% of the highest-risk bugs,

Dan (00:42:00):

the critical bugs, could have been found by a testing tool. But we found that, out of the total, about 50% of all the bugs that we discovered on all engagements were outside the scope of what any kind of automated anything could ever find. So even if you've got incredible testing, there are still going to be issues for which a human is needed; you need to have a conversation with a smart engineer in order to make it apparent that there is risk in what you're doing. Which is good: I'm not out of a job yet, and neither are you guys. But it was also a little bit shocking to me that the number was as high as it was for the critical issues, because that's what people generally care about. A bad anti-pattern for people on my side of the industry is that when you get a pen test, or you get an AppSec review, or you get a code review from a company like Trail of Bits, you just go and fix the highs, or you go and fix the highs and the mediums.

Dan (00:42:58):

And you're like, ah, these low ones don't need to get fixed; throw them out, forget about them. So it was really interesting to see that if you just did the automated kinds of approaches, you'd actually get the majority of the things you end up caring about, since those probably have the highest severity and would have caused you the greatest downside. One really neat thing that came out of that study: unit tests had no impact whatsoever on the number or the severity of the bugs that we found. On a given two-, three-, four-week project, the unit test coverage, from zero to a hundred percent, had zero correlation with how many bugs we found or how critical they were. I think a lot of that speaks to the fact that it's possible to write negative tests as unit tests, it's possible to test your access control systems, it's possible to do X, Y, and Z, but nobody does it. Unit tests are functional: they want to make sure your software works correctly, and that's the way most engineers write them. Security is the opposite end of it. Security is mostly the negative testing, and unless you are making a deliberate effort at negative testing, you're not doing stuff that's improving your security.

Joey (00:44:21):

Let's make sure we describe what negative testing is, in case someone listening hasn't heard that.

Dan (00:44:27):

Oh, sure. The really layman's version: a positive test would be, well, if I put five into this function, I expect to get 10 out the other end. So did I get 10? Yes? Sweet. Negative testing would be: if I send in everything that's not five, make sure I don't also get 10 out. So negative testing is just the flip side; it's everything that positive testing isn't. And the problem, as most security engineers will be aware, is that there is no limit to what negative tests can test for. It kind of never ends, which is why symbolic execution is so nice: it builds up constraints, and you can sort of find the limit sometimes. Unless you've made a dedicated effort to write unit tests for security things, you shouldn't expect that by virtue of having unit tests, you are safe. So that was a cool paper to work on. We also broke down what different categories of controls were the most common failures in the smart contract world. I would love to write this again for more typical software: large C++ code bases, large Rust code bases, whatever. But all those tend to be a lot different.
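
Dan's layman's example, written out as a hypothetical Python sketch with a stand-in function: the positive test pins one input to one output, while the negative test has to quantify over everything else, which is why it needs a generator rather than a hand-picked case.

```python
# Positive vs. negative testing for a stand-in function.
from hypothesis import given, strategies as st

def double(x: int) -> int:
    return x * 2

def test_positive():
    # "If I put five in, I expect to get ten out."
    assert double(5) == 10

@given(st.integers().filter(lambda x: x != 5))
def test_negative(x):
    # "Send in everything that's not five and make sure I don't also
    # get ten out." The input space is unbounded, which is why
    # negative testing never really ends.
    assert double(x) != 10

if __name__ == "__main__":
    test_positive()
    test_negative()
```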

Shpat (00:45:41):
Right. It sounds like the exact same reasons that make blockchain more amenable to testing are what made it more amenable to writing this paper.

Dan (00:45:50):

Yep. So I think it's definitely not one of a kind; I know someone's going to pop up and say, but I have a paper that says exactly the same thing about a different area. But it's certainly a rare piece of work.

Shpat:

Is this public? I'm assuming this is public.

Dan:

Oh yeah. There is a blog post that summarizes it that I'd recommend reading; academic papers can be super dry, so the blog post is a good summary.

Shpat (00:46:17):

We'll share that. Perfect. So I want to go back to the world of blockchain for a second. Since you do a bunch of work in this area, you probably have opinions: they're doing a lot of things right, but what else could they do? If somebody listening works on smart contracts, or just on blockchain projects, what should they be doing that maybe they're not doing, that would make their software even better?

Dan (00:46:46):

That's a great question. Clearly the testing part of things works very well on the blockchain. A lot of people write property tests: they specify ahead of time what their software should do, and then they have programs that automatically generate tests that prove to them that it works that way. That's excellent. But it's not a question of, should I have good software tools or should I do testing? Why not both? You know, there are a lot of blockchains out there, but the way things are going, Ethereum is just going to be it. Hate to break it to anybody, but they've got 90% of the attention, 90% of the money, 90% of the developers. That's the way things are going. There are some other really great ones out there that I think have incredible ideas.

Dan (00:47:35):

And what many of them are doing is interoperating with Ethereum to bootstrap and gain that kind of focus and energy. That's great too, and I hope they do well, but that's what I see from a realist's perspective. And I think the entire Ethereum community is hobbled by the poor development tools that they use. The Solidity language in particular needs to go away. There's no reason that people should write in a language that has that many foot guns, that makes it impossible to do software engineering correctly. So whether that means using an entirely separate language, or just aggressively removing features from Solidity to turn it into something more like a DSL, a domain-specific language: that would be excellent. And that's something we've offered to do.

Dan (00:48:24):

I know that there are other teams of people that are trying to take Solidity, deprecate the whole code base that compiles it, and reimplement it inside LLVM. That's a great thing, too. I didn't even get into the fact that the Solidity compiler itself inserts bugs into people's programs half the time when they compile: they make patch releases multiple times per week because they keep having miscompilation issues that can affect your program. It's insane how many potholes and mines are buried when you're trying to develop secure smart contracts, and people fix it all in post, right? They fix it all in tests after the fact. And it's nuts that it works. It does work, though: we work with companies that have hundreds of millions, if not billions, of dollars in smart contracts, and I feel very good about that, because they have the tests to prove it. But getting there shouldn't have taken that much work. So please, it's not a choice. Just do both.

Shpat (00:49:29):

So that's at the very high level, right? If you're somebody who maybe can't influence that, or really doesn't have the skills, or doesn't live in a world where you can rewrite the programming language that all this works in. If you're just a regular developer, and I don't mean that in any way negatively, if you're a developer...

Dan (00:49:52):
There's gotta be people out there that write the code that, you know, lets people use it.

Shpat (00:49:58):
Exactly. You know, testing sounds like it's great. I'm curious if you have any other advice.

Dan (00:50:06):

You know, people succeed more when they're writing high-assurance software like blockchain software when they have a spec, or when they have invariants that they're going to adhere to, or when they have security properties they've documented. I don't care what word you use for it, but if you know what you're going to build, and you can test against that later, you're going to have a better time. So seek out whatever tools let you do that. And usually there's a choice of what kind of tools to use. From a consulting perspective, we like tools that generate results quickly, and I think that as a development organization you should too, because there are trade-offs in the approaches you can take. For instance, the K language is fairly popular in the Ethereum space. People have used it to great effect, but actually taking advantage of all of its benefits takes months of time.

Dan (00:50:56):

It's a really poor approach if that's the first thing you're going to do to improve your safety. You want to make those decisions around where you can spend the lowest effort and get the greatest return. Going down a really heavyweight verification path like K is very heavyweight, but also very high return; that's not where I would start. Where I would start would be very high-level properties, things like: if I transfer $10 to you, that means my bank account has $10 less and your bank account has $10 more. That's a really good security property for everyone transferring money. And you should be able to express that in some kind of language that looks a lot like whatever you're coding in. And then you should have a test generator that produces negative tests that make sure you conform to that invariant.

Dan (00:51:50):

And the test generator can be super janky at first. It can be a fuzzer; it can be /dev/urandom, whatever, just piped to your input. That's fine. And what you do is improve it over time: after you get your dumb fuzzer, you get your smart fuzzer, then you get your symbolic executor, and then maybe you start throwing all kinds of fancy abstract interpretation, whatever, at it. And that's when you call up someone at Galois or someone at Trail of Bits and ask, what kind of crazy research things can I do? But as long as you have the properties defined, that lets us do something with it. So that's kind of my spiel here.
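
Here is a minimal sketch of that transfer invariant as a property test, again in Python with Hypothesis. The Ledger class is hypothetical; the point is that the invariant (the two balance deltas, plus conservation of money) is stated once, and the test generator, dumb or smart, hunts for violations.

```python
# The transfer invariant as an executable property: after
# transfer(a, b, amount), a is down by amount, b is up by amount,
# and no money appears or disappears.
from hypothesis import given, strategies as st

class Ledger:
    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, src, dst, amount):
        if amount <= 0 or self.balances[src] < amount:
            raise ValueError("rejected transfer")
        self.balances[src] -= amount
        self.balances[dst] += amount

money = st.integers(min_value=0, max_value=10**9)

@given(money, money, money)
def test_transfer_invariant(a_start, b_start, amount):
    ledger = Ledger({"a": a_start, "b": b_start})
    try:
        ledger.transfer("a", "b", amount)
    except ValueError:
        # Rejected transfers must leave balances untouched.
        assert ledger.balances == {"a": a_start, "b": b_start}
        return
    assert ledger.balances["a"] == a_start - amount
    assert ledger.balances["b"] == b_start + amount
    assert sum(ledger.balances.values()) == a_start + b_start

if __name__ == "__main__":
    test_transfer_invariant()
```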

Shpat (00:52:28):
It makes sense. A lot of sense.

Joey (00:52:29):

Yeah. And I think you mentioned this a bit: in the blockchain space, we have seen an unusually high incidence of what you described as really heavyweight formal methods stuff, often straight out of academia, being applied to blockchain technology. Has that been providing value, in your opinion, to blockchain? We're not necessarily talking about optimal value, because you've expressed that, in bang-for-your-buck terms, you're better off applying some of the techniques we've been talking about. But is the long-term value apparent in that community?

Dan (00:53:08):

So there are a lot of use cases that would be impossible to achieve if not for these testing tools. You would not be able to run an exchange and let people custody assets on your exchange unless you had these testing tools, because there's no way you can gain the kind of assurance that says: I'm holding a billion dollars' worth of cryptocurrency safely on behalf of other people, in a way that I can't recover from if I'm hacked, unless I have something like symbolic execution, or stronger guarantees out the other end, that I didn't inadvertently leave some code in here that's going to screw me. So what I'm definitely seeing is that there are a lot of risky things that people would like to do that are enabled by the fact that we have good testing.

Dan (00:53:56):

Now, there are a lot of areas where they've over-indexed on that; they feel a little bit too safe. And that's where I see things like, if you're really in the middle of the blockchain community right now, you know about DeFi, decentralized finance, which is people making these little autonomous finance bots that trade on behalf of people who've put money into them. It's all on chain and there's no way to recover from any of it. And most people don't have any sense of owner privileges or private keys that let them stop what's going on. They just give it away to the community. There's no governance, right? I think that's an area where people feel a little bit too confident in the quality of the tools; they've raced ahead of what they can reasonably assure. But still, there are a lot of people that have built applications like that, that are taking an extraordinary amount of risk, and they're completely dependent on the ability of these kinds of tests and tools to get them there. So, you know, I think that the future of the blockchain field depends on, and is firmly intertwined with, software security testing and research, which is very weird. I don't really know any other field of technology where that's so much the case.

Joey (00:55:10):

And I think maybe where things haven't matured as much as most would hope, and I think this is the case for testing as well, is explaining what a result means: what you can depend on and what you can't. That's typically a step that gets skipped, because it's really a pain to do, and it makes people nervous. When you say, well, I can guarantee that things are going to work right under these conditions, which is, you know, the conditions I've been testing and the conditions I've been proving in, then they say, well, what about the rest of the conditions? And you're like, yeah, we've got to work on that; at least we know where else to look now. But doing that consistently is something that, you know, we work on really hard, and we're not quite where we want to be. And I know that anybody that has tests maybe wants to explain to their boss, I've unit tested this code, so I know it's great, and that feels good. But when it comes back that something has gone wrong, the boss is going to say, well, you told me there were unit tests. What happened?

Dan (00:56:12):

Yeah. So definitely, understanding the limits there is important. You know, one related thing that really screws me up on blockchain software is, sometimes you do have those gaps. Let's say you've put in all the work: you have your symbolic test suite, you evaluate every potential state change, you know the constraints at all of them, you've understood all the bad states you could get into, and you've eliminated them by changing the code, and you're just, like, totally safe. And then the blockchain forks underneath you and introduces new behavior that makes you vulnerable. Because smart contracts are supposed to be forever: you're supposed to put them on the blockchain and be able to trust them forever. But that assumes that the underlying blockchain and the semantics and all the instructions and everything don't change. And in fact, once every six months or so, the Ethereum blockchain has a hard fork, and it does change.

Dan (00:57:01):

So reassessing and re-evaluating the efficacy of your test suite, and understanding what it proves and what it doesn't, is the kind of nuance that has been, I think, the most difficult thing to educate people in the blockchain industry about. Because now that they've caught on to these buzzwords, they're like, okay, so formal verification is a good thing and we can do it, right? And it's like, okay, so you're formally verified; what does that even mean? Formally verified for what? How? So a really big conversation that I've had to have with people is: look, there are two parts here. How many properties do you have, and what do they express? And then, how are you proving them? The combination of those two things indicates what you're getting. And that kind of nuance, I think, is lost on a lot of people.

Dan (00:57:49):

So right now we see this race toward the community demanding verification. And this is, again, like, bonkers, right? What user of Chrome is like, you have to formally verify this or I won't use it? But on the smart contract side of things it's, no, I refuse to use any software that's not formally verified. Like, okay. That's users from an alternate universe. But a lot of the users only know to look for a checkmark. They're like, oh, formally verified, check. Yes. Awesome. So that's been a conversation to have, you know, when we get into it. It's a question, too, of how we communicate. It's not just on our clients, and it's not just on users becoming more sophisticated about what software they use and why. From the Trail of Bits perspective, we've started to find visual ways to report long-term risk.

Dan (00:58:40):

So something we've started doing in our reports now is a maturity report. There are, like, 12 yes-or-no questions we can ask to get at the heart of whether you're a competent software engineering organization. Do you have tests, yes or no? There's a lot behind that question. So in our code maturity reports, we have a stoplight graph with 10 critical controls that we believe should be present in every code base, and we rate them from red to yellow to green. It gives you a very easy visual indicator of how much effort the team has put into securing their own code base. And that's usually a little more understandable than trying to explain what formal verification actually means to people. They know that red means bad. Like, yes, this is good.
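Roughly, you can picture that stoplight report as a checklist of controls, each rated red, yellow, or green. The control names and ratings in the sketch below are invented for illustration; they are not Trail of Bits' actual criteria.

```python
# A toy stoplight-style maturity report: rate each control red,
# yellow, or green and print a simple summary. Control names and
# ratings are invented for illustration only.
from enum import Enum

class Rating(Enum):
    RED = "red"
    YELLOW = "yellow"
    GREEN = "green"

controls = {
    "Automated unit tests run in CI": Rating.GREEN,
    "Fuzzing of input parsers": Rating.YELLOW,
    "Documented security properties": Rating.RED,
}

for control, rating in controls.items():
    print(f"[{rating.value:>6}] {control}")
```

The value is in the legibility: a third party can scan the colors without needing to understand what any individual testing technique proves.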

Dan (00:59:28):

We're helping. So, you know, it's again about having some empathy for users. And this is weird, because in this case the users aren't the client; the users are actually a third party to this whole conversation. There's some rando on the internet who loaded up a PDF that I wrote and needs to figure out if they're going to put money into a DeFi project. Like, oh my God, this is not a problem that I was equipped to solve. But, you know, that's an approach that I think works. It empathizes with where people are and with what they understand about what you get from different kinds of tests.

Joey (01:00:02):

Yeah. And so you don't even get to solve this problem just once, basically. You don't get to describe what you've done one time, in one way, and say we're done. You need to describe it for that third-party user who's just looking for the yes or no, and for them maybe the green light/red light is enough. But then there's also your direct client, who maybe needs to be ready, the next time a fork happens, to reevaluate the properties one by one and say, did the fork break this property? Did the fork break that property? And to understand where they stand in that situation. So it's a really layered, heavy-duty approach to even describe what's going on in these cases.

Dan (01:00:39):

So what's really cool is that when the blockchain does fork, because there is just one blockchain and everybody has access to all the code on it, everything's reproducible, right? Like, I have the same blockchain you do. If you have really good tools, you can just evaluate the impact of those changes with a program, across the whole thing, which is super cool. That's something we did with another company called ChainSecurity, some academic researchers over at ETH Zurich, when we were really worried that there were some things changing in Ethereum. We had a static analyzer; they had, I think it was a symbolic executor. And with our two powers combined, we managed to evaluate every single piece of code in existence to see if it would be safe or unsafe after that change took place. So obviously a client of ours could do that on their own against their own code, but there's really this extra opportunity for, like, security surveillance that doesn't come around very often. The only other place you might see something like that is Microsoft: with their GitHub acquisition, they might be able to tell you the same thing, but even they won't be able to build all the code on GitHub. Fat chance they can figure that one out.
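The shape of that kind of whole-chain sweep might look something like the sketch below. Both helper functions are hypothetical stand-ins, one for a chain data source and one for the analyzer; the repriced-opcode check is just an example of the kind of semantic change a fork can introduce.

```python
# Sketch of a whole-chain sweep after a fork: run a checker over every
# deployed contract and report which ones the change might affect.
# fetch_all_contracts and affected_by_fork are hypothetical stand-ins
# for a real chain data source and a real static analyzer.

def fetch_all_contracts():
    """Hypothetical: yield (address, bytecode) for every deployed contract."""
    yield from []  # e.g., read from a local archive-node export

def affected_by_fork(bytecode):
    """Hypothetical analyzer: does this code rely on the changed semantics?"""
    # Example heuristic: flag any contract using a repriced opcode,
    # such as BALANCE (0x31), whose gas cost a fork might change.
    return b"\x31" in bytecode

unsafe = [addr for addr, code in fetch_all_contracts()
          if affected_by_fork(code)]
print(f"{len(unsafe)} contracts potentially affected by the fork")
```

Because the chain is a single shared, reproducible artifact, anyone with the tooling can run this same sweep and get the same answer, which is what made the joint analysis possible.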

Joey (01:01:52):
People can't build one project on GitHub, so going after all of them is a pretty tall order.

Dan (01:01:57):
Yeah. And that is the difference between normal software and blockchain so far.

Joey (01:02:02):
So we're coming close to the end of our time, Dan. Do you have anything else you'd like to share before we wrap up?

Dan (01:02:08):

Yeah, sure. So I think people should be aware that even though Trail of Bits is very strongly focused on practitioners, and we want to solve problems right now, we're also neck-deep in the academic community. We publish our own peer-reviewed academic work on a regular basis, and we've also incentivized other people to do the same in areas where we'd like to see research done. So we have something called the Crytic prize: it's a $10,000 prize for people building novel tools on top of the research that we've done. You can find more information about it on our blog, and if folks are interested in anything I've had to say, I'd really encourage them to go check out our blog. We try to share as much knowledge as we can, all the time, and I personally think it's a pretty good read. So come on down.

Shpat (01:02:53):
We'll include a link to that. Thanks, Dan.

Dan (01:02:55):
It was great being here on the podcast. Thank you so much for having me. This was a great conversation, and I hope all you listeners out there got something out of all the crazy things I shared with you.

Shpat (01:03:06):
Well, this listener did, for sure. So we'll start there. Cool.

Dan (01:03:12):
Well, I'll catch up with you guys at work.

Speaker 3 (01:03:16):
We'll catch up for sure. Dan, thank you so much for being with us today.