Building Better Systems

#3: Stephen Magill & Tom DuBuisson – Musing on continuous code analysis

Episode Summary

The founders of MuseDev discuss making modern static analysis usable and leveraging the latest promising research for automatic bug finding. MuseDev is a spin-off of Galois.

Episode Notes

The founders of MuseDev discuss making modern static analysis usable and leveraging the latest promising research for automatic bug finding. MuseDev is a spin-off of Galois. 

Video of this podcast can be found on our YouTube channel: 

Galois, Inc.: https://galois.com/ 

Joey Dodds: https://galois.com/team/joey-dodds/ 

Shpat Morina: https://galois.com/team/shpat-morina/ 

Muse.dev 

Tom DuBuisson: https://www.linkedin.com/in/thomas-dubuisson-62910453/ 

Stephen Magill: https://www.linkedin.com/in/stephen-magill-2070a096/

Continuous Reasoning: Scaling the Impact of Formal Methods, by Peter W. O’Hearn: https://bit.ly/2I0TJEs

Contact us: podcast@galois.com 

Episode Transcription

Shpat (00:00:00):

Welcome to another episode of the Building Better Systems podcast, where we explore tools and approaches that make us more effective engineers and make our software safe and reliable. My name is Shpat Morina.

Joey (00:01:04):

And I'm Joey Dodds.

Shpat (00:01:07):

And today we're joined by Stephen Magill and Tom DuBuisson. Stephen and Tom are co-founders of MuseDev, which is actually a Galois spinout. It's a software company that aims to bring developers analysis tools that help them quickly find and fix bugs. We're going to get into what that means later on, but Stephen and Tom, thanks for joining us.

Stephen (00:01:29):

Thank you, happy to be here.

Shpat (00:01:33):

We're happy to have you. You just recently launched, so you're probably doing a lot of these interviews and things like that, or is this kind of new for you?

Stephen (00:01:44):

Yeah, it's an exciting time for sure. We're out there on the GitHub Marketplace now, and so we've been talking to some reporters, putting some blog posts out there, just describing the system and the capabilities and making people aware that it's available. I'm very excited.

Shpat (00:02:03):

It's fantastic. We kind of start with: what is your approach to building better systems? Tom, do you want to start with that?

Tom (00:02:25):

Absolutely. Choosing a good foundation and good people, that's the very start. After foundation and people, then you can start talking about quality processes like code review, but it all starts with that basis. I'm never going to start a financial company written in assembly with the wrong people.

Stephen (00:03:00):

Yeah, I agree. I think when it comes to quality systems, and by that I mean systems that are reliable, that are secure, that perform well, it starts with cultural practices: getting people in the right roles, but also getting people focused on the right things. You hear a lot about DevSecOps as a new approach to security, and really all that is, is having developers focus on security, making security a core concern throughout the process, not just at the end of the process when the software is delivered. And so the practices that support that, things like code review and developer training, end up being hugely important.

Joey (00:03:49):

And this is something, I guess, with the company that you two started, that's not just something you need to be aware of for yourselves, but something you're actually enabling others to be more effective at.

Stephen (00:04:00):

Yeah, that's right. Like I said, the processes in place and the culture, that's where it all starts. But then there are certainly ways that tools can support that process, and tools can help identify things. People are good at finding certain types of errors; tools are good at finding certain other types of errors. And so you want to leverage each for what it's best suited to. I think the key thing with tools is that when you bring them in, you want to bring them in in a way that works with your process. You don't want to set up this agile, DevOps-forward process where you're pushing code to production immediately after each code change, and then bring in tools in a way that slows that down. You put that process in place for a reason; speed is a competitive advantage nowadays. So you want to be able to bring in tools in a way that works with that process. And that's really been our key focus as we build MuseDev, the company supporting this platform, Muse, that incorporates a bunch of static analysis tools: we're always thinking about how we can do that in a way that works with developers.

Shpat (00:06:01):

So actually, for the people who might not know, could you give us a little bit of a primer on what you started Muse to do and what you're doing?

Stephen (00:06:46):

Sure, sure. So we started Muse for a variety of reasons. One thing we wanted to do was make it really easy for people to try out and make use of static analysis tools. The founding team had all been doing work in program analysis and static analysis for a while; I started working in that space 15-plus years ago when I was doing my PhD at Carnegie Mellon, and then continued doing that sort of work for quite a while. But I got more and more interested in how to get these tools in front of developers, how to have an impact on day-to-day development processes, because often cool research happens and it stays in the lab. You get these tools that work really well but are kind of fiddly and require a lot of expertise to deploy. We really wanted to bridge that gap and make those tools much more accessible. The way we've done that is we integrate with GitHub. We're available in the GitHub Marketplace, packaged as a GitHub App, so it's a single-click install process, like all GitHub Apps, to enable it on your repositories. And then you get access to a broad suite of static analysis tools. We worked really hard to make all the setup and configuration an automated process, so that in most cases you can just turn it on, it'll analyze your code and give you results. And then once it's enabled, there's been a focus on how you most effectively deliver results to developers, because people don't want to go look at a list of a thousand bugs that are sitting in the repository right now and start working that down. That is a way to do things, it's the way some companies do things, but it's never very successful, and it doesn't bring people joy.

Shpat (00:08:34):

What works?

Stephen (00:08:35):

Yeah. So what's much more effective, and I think it's really Google and Facebook that first discovered this, they experimented with a lot of ways of deploying these sorts of tools within their development processes and published a lot on what they tried, what worked, and what didn't, and what they really found to be effective was integrating into code review. Like we were saying earlier, it all starts with people and processes and workflows, and code review is the key step you can introduce into your development process to really increase the quality of the software you're delivering. It's a place where everyone comes together: the team looks at the code, your peers examine the code change you made and offer feedback and comments, and everyone agrees on what changes should happen before it gets merged. So it's also a perfect place to surface the results of static analysis tools. If a tool finds a bug in the code change you just wrote, it's a great time to surface that bug report, because the developer just wrote the code and it's all fresh in their mind. You don't have to do any work to remember what you were doing and think about how to fix it. It's the same sort of process you'd use to respond to your peers' feedback during code review. And so that's really a key stage that we focused on.

Joey (00:09:48):

So mechanically, the tool is running on cloud infrastructure to find bugs in your code and then interacting with the code review. Is that what you're saying?

Tom (00:10:04):

That's right. And it's amazing that it took us so long to get to this point as an industry. You would think that when we're reviewing code, when we're talking about the quality of the code and the bugs that might exist, an automated process and a human telling you a bug might exist should be treated similarly. But no, it's taken us a long time to evolve to this point, and some of that's just technological barriers, and some of it's people's concepts about how to use a system.

Joey (00:10:31):

So maybe this is too tricky of a question, but how should I think about this? Obviously the very best time to get feedback is the second you save your code, or the second you type it in. How do I think about analyses that belong in that setting versus analyses that belong in the setting that you all are providing?

Stephen (00:10:51):

Yeah. What delivering results at code review time lets you do is perform a deeper analysis of the code, because you have a longer window in which to work, in which to analyze the code. We shoot for a 20-minute turnaround time on most projects, and that really gives you enough time to do deeper analyses that catch things like thread safety errors, synchronization problems in multithreaded Java code, and null pointer dereferences that span compilation units and span libraries. So I think code review is a great time to surface those sorts of results. The IDE is a great time to surface things that are more local and syntactic, so obviously syntax errors and so forth that would be caught by the compiler; basically, things that can be caught by the compiler can often be shifted into the IDE itself.
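
To make those categories concrete, here is a small illustrative Java sketch (my example, not code from the episode): the unsynchronized read below races with the locked write, and a caller in another class dereferences a possibly-null return value, the kinds of cross-procedural defects that deeper analyses such as Infer's race and null-dereference checkers are designed to report, whereas a plain syntax error in the same file would already be caught by the compiler or IDE.

```java
import java.util.HashMap;
import java.util.Map;

// A cache shared between request threads.
public class UserCache {
    private final Map<String, String> cache = new HashMap<>();

    public synchronized void put(String id, String email) {
        cache.put(id, email);
    }

    // Race: this read is not synchronized, so it can run concurrently with put()
    // on a non-thread-safe HashMap. A race detector reports the inconsistent
    // locking on `cache`.
    public String lookup(String id) {
        return cache.get(id); // may also return null
    }
}

class NotificationService {
    // Null pointer dereference across compilation units: lookup() can return
    // null, but this caller never checks before calling length().
    static int emailLength(UserCache users, String id) {
        return users.lookup(id).length();
    }
}
```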

Shpat (00:11:51):

When you were describing Muse to us, you spoke about this drive to not let cool research sit on a shelf, but to make it actually accessible for people so they get the benefit of it when it comes to finding all sorts of bugs, reliability, and security. What did you identify as some of the important things to lift from the cool research space into usability? What are you excited about when it comes to what Muse brings to folks who might use it?

Stephen (00:12:27):

Yeah, so the race detection that I mentioned is a great example of cool research technology that's really at the cutting edge in terms of what program analysis tools can find.

Stephen (00:12:52):

Tom mentioned, in talking about why we haven't been at this point before, some of the technological barriers to getting here. One of those is just the sheer scalability of deep program analysis. A lot of the tools you see out there, Fortify and Checkmarx and these sorts of systems, do a deep analysis of the code, but it takes a long time, and often people deploy them in overnight runs that they run maybe every evening, or once a week, or something. And really, that's the way program analysis worked five or ten years ago. It's only more recently that there's been a lot of work on extremely scalable analysis technology that lets you analyze millions of lines of code and still turn around an answer in 20 minutes. The thread safety analysis I mentioned and the null pointer analysis have various really high-powered, from a mathematical perspective, analysis techniques underlying them. What we're trying to do is make those more accessible, so you don't have to understand the deep math. It's there, and there are great research papers that describe how it works, but you shouldn't have to be a program analysis expert to use the tools. The point of the tool is to package all that up in a way that just delivers results.

Shpat (00:14:15):

Right. And so if I were a developer who wanted to use Muse after hearing this, what would be the sorts of things that maybe I'm not even aware I should be worried about, that I wouldn't have to worry about later, and that wouldn't be easily accessible with other things?

Stephen (00:14:35):

Yeah. I think one thing that tools do a really good job of is catching all the corner cases that may not be top of mind. Maybe you read about it six months ago but forgot, or that language feature has been deprecated and there's something else that's recommended now. All of these quirks of various APIs and frameworks and language features are really hard to keep in your head as a developer when you're writing the code, and then as a code reviewer, trying to go through that checklist when you're reviewing someone's code is just really draining. Better to spend your attention and focus on things like architecture, the algorithms involved, the data structures, these higher-level concerns that really require human input. So I think working through that long list is one source of value.
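
As one illustration of the "quirky or deprecated API" corner case (my example, not one from the episode): java.util.Date.getYear() has been deprecated since JDK 1.1 and returns the year minus 1900, exactly the sort of detail a tool remembers so the developer and reviewer don't have to.

```java
import java.util.Calendar;
import java.util.Date;

public class DeprecatedCornerCase {
    // Deprecated API with a surprising contract: getYear() returns year - 1900.
    // Analyzers flag the deprecated call so nobody has to remember the quirk.
    static int yearOf(Date d) {
        return d.getYear() + 1900;
    }

    // A non-deprecated replacement (java.time.LocalDate is the modern option).
    static int yearOfFixed(Date d) {
        Calendar c = Calendar.getInstance();
        c.setTime(d);
        return c.get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println(yearOf(now) + " == " + yearOfFixed(now));
    }
}
```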

Tom (00:15:29):

You covered it pretty well. I feel like what constitutes a corner case is actually really domain specific. We have one customer, for example, who has a tool for internal processing of big data. It's really interesting to see: we have all these analyses that run on their code, and some of them report type errors, like Pyre on Python, really useful stuff that says this isn't going to function the way you think it's going to function if there's a bug. Then there are other security tools that we run on their code base that say, oh, well, you're executing this improperly, or you're reading this from an unsafe source. And they don't care, because in this context it's a tool they run on internal data. So what constitutes a corner case is hard to pin down, and this feeds really well into the distinction between the static analysis of ten years ago and static analysis today. Ten years ago you might run this weekly, and you just know that those bottom 50 errors are ones you ignore every single week, because they're exactly the corner-case variety that your company happens not to care about. But when we can run it in an incremental manner, we can show you only the incremental bugs, and so unless you've actually introduced one, you never see it again. It has this filtering effect where false positives just don't arise anymore, and that's really helpful when you get into tools that detect corner cases.
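
A hypothetical Java analogue (invented here, not the customer's code) of such a context-dependent finding: a security analyzer would typically flag the command built from file contents as potential command injection, but a team running the job only on trusted internal data may reasonably triage that report away.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class InternalBatchJob {
    public static void main(String[] args) throws IOException {
        // A security tool would typically report this as command injection:
        // the command is built from data read off disk ("an unsafe source").
        // For a tool that only ever runs on trusted internal data, the team may
        // decide this finding is a corner case they don't care about.
        String dataset = Files.readString(Paths.get("jobs/next-dataset.txt")).trim();
        new ProcessBuilder("sh", "-c", "process-dataset " + dataset)
                .inheritIO()
                .start();
    }
}
```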

Joey (00:16:53):

Is there a way to label things that maybe are actually problems, but you haven't gotten around to fixing yet?

Tom (00:17:00):

It depends on the language and the tool; Muse is a platform supporting many tools. But yes, there are, especially in the Java ecosystem, a lot of ways to annotate code and make a tool ignore issues. What I see the industry drifting toward more, though, is having machines learn about the false positives that have persisted in code for a while and stop reporting on them, plus human mechanisms such as giving feedback on pull requests and only on changed code, so that we don't present the user with data about bugs that are preexisting, bugs that have nothing to do with what they've written. So it's this sort of one-two punch: a technological solution, where we filter out bugs we believe to be false positives, and a human solution, where we present the right data at the right time.
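
For the Java annotation route Tom mentions, here is a small sketch of how suppression typically looks with the kinds of tools discussed later in the episode; the specific check names are just examples, and the SpotBugs-style annotation comes from the separate spotbugs-annotations artifact.

```java
import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
import java.util.Random;
import java.util.concurrent.ExecutorService;

public class KnownIssues {
    // Error Prone honors the standard @SuppressWarnings annotation keyed by check name.
    @SuppressWarnings("FutureReturnValueIgnored")
    void fireAndForget(ExecutorService pool) {
        pool.submit(() -> System.out.println("best effort, result intentionally dropped"));
    }

    // SpotBugs / Find Security Bugs use their own annotation and encourage
    // recording a justification alongside the suppressed pattern.
    @SuppressFBWarnings(
            value = "PREDICTABLE_RANDOM",
            justification = "Used only for test-data shuffling, not for secrets")
    int testSeed() {
        return new Random().nextInt();
    }
}
```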

Joey (00:17:53):

So it sounds like you, and your tools, are sitting as a bit of an interface between what may have historically been research tools and actual developers and large systems. It sounds like some of the work you're doing is modifying the results you're getting back from those tools, but are you also providing feedback into those tools to help them meet the needs of developers better?

Stephen (00:18:19):

Yeah. When we set up and configure the tools, we do bring to bear a lot of our own experience, but also what we've seen from running these tools at scale over a lot of code: what people care about, what works, what's considered noise, what's considered valuable. And I think it impacts not only what you present but also how you present it. You might want to report something but not make it block the build. There's a lot of nuance you can have, especially when you're part of the code review process: that issue is going to be seen, someone's going to look at it and have some workflow for triaging it and ultimately signing off on the merge request or not. We've certainly learned a lot. Coming from the academic side of things, you write tools thinking all I need to do is produce a list of results and then the tool will be valuable, and the more things I can find the better, even if it's a little bit noisier. That doesn't necessarily intersect well with the day-to-day development world. So we've learned a lot as we've brought these tools into practice about the developer side, really the user interface side: what do people expect to see, what makes sense and works well with their work, and what makes them take pause and wonder what's going on.

Joey (00:19:51):

That's really cool. Is that one of the things that's going on? I assume you're getting enough of a view into some of your customers' workflows that you're actually seeing people interacting and communicating around some of the issues the tool raises. That actually seems like a really big benefit: I'm not the only one who's going to see the bug and the tool's response, but my team can see it too, and we can have a conversation about it right there.

Stephen (00:20:22):

Yeah, that's right. We do see discussions of the bug, we see people commenting on how they fixed it and following those code review workflows. And I think it's a really exciting thing. It's honestly something we want to find good ways to share with the research community, to involve the research community, because it's really hard as a researcher developing a tool to, first of all, roll it out to a lot of repositories, and then collecting feedback is yet another hurdle. You can try to reach out to people, email people, or run user studies, but that's all very involved and adds a lot of overhead to your research. So to the extent that we can make our platform available as an experimental platform, especially when it comes to open source repositories, easily running tools over open source, collecting results, and then acting on those and reporting them as part of the experimental results section of a research paper, that's definitely something we want to support. I gave a talk at the Infer practitioners workshop at PLDI, and we have a tutorial coming up at SecDev as well on using the Muse platform for exactly these purposes. So I'd encourage any listeners who are themselves program analysis researchers to check that out and see if it would be useful in their work.

Shpat (00:21:46):

It's interesting that you're leveraging pretty cutting-edge research and tools to help people build better systems, essentially, but at the same time, the way this conversation started, we highlighted cultural practices at the very top, and that really resonates when it comes to tech companies in general. Unfortunately, not every culture is great. From time to time there are places, often there are places, that are a little bit more toxic, where maybe developers are put in a situation where they're blamed for things. I know you've thought a little bit about that. I'm just curious what your take is on tools that uncover problems, and how Muse tackles that in terms of helping the developers as well as the whole system it's trying to find issues with.

Stephen (00:22:56):

Yeah. There are a wide range of cultures and practices out there. One thing I was excited to get to do recently was participate, for the second year in a row, in the State of the Software Supply Chain research with Sonatype. That's research we've done the last couple of years on development practices in open source, and this year we looked in particular at development practices at enterprise companies and did a survey covering cultural practices, but also the tooling that's in use, and then things like job satisfaction as well. There were a few interesting things we found. The report's not out yet, it'll be coming out in the next month or so, but some of the interesting findings: we had this collection of attributes that we called open source enlightenment, things like, do you support contributions back to open source, do you upstream your changes, to what extent do you share some of your work as open source, and if you do, do you accept external contributions or not. That was very highly correlated with job satisfaction; it was really interesting how close to the top it was. We also found that when it comes to practices around security, and in particular compliance, compliance is a big concern, especially in highly regulated industries like finance and banking, but even for retail brands and consumer brands. Any time you're handling credit card information or personal information, you have a lot of regulations to adhere to, and that necessarily imposes requirements on the software in terms of the development process and the checks you're doing. And so you could see there was a group of companies that didn't have those concerns, didn't have a compliance process, didn't worry about that, and they reported certain outcomes in terms of agility and velocity, how quickly they release and how quickly they can turn around code changes. As soon as you shift to having any sort of process, that's unattainable: you can't move that quickly and still pay attention to what you need to pay attention to on the security and compliance side. But then within that subset, once you make the decision that yes, we have to comply with this, we have to have internal processes that ensure X, then tooling becomes very important. Things like tools to analyze dependencies, to do static analysis, to help manage your workflow: those start to be the factor that differentiates the high performers within that class from the lower-performing companies.

Shpat (00:25:49):

That's very convenient.

Tom (00:25:51):

That's a great external view. On the internal view: you said we've thought a little bit about the tension between the developers in an organization and maybe the management of those developers. That's actually not true, we've thought a lot about that tension. And so we really try hard to be the developer's best friend. We want to bring all this information and all this power to developers while they're developing, so that these sorts of issues don't propagate into the security team's purview, the QA team's purview, or production, where suddenly you do have that negative impact, or that perceived "you're making my life hard" sort of dichotomy. We want the developer and the security analyst to cozy up and be more cooperative than adversarial. We believe that with an avenue for other teams to add tooling that automatically checks the code before it gets out of development, we can make this a much more cooperative process, to the benefit of the entire company.

Joey (00:26:55):

So the idea is maybe to try to avoid the situation that a lot of developers have been in, where you kind of dread the pen test results. You don't want to dread them, because really that's what you want at the end of the day: you want to know, is my code secure, did I do a good job? But you kind of know maybe your manager's going to be like, why do we have five findings, why are three of them high priority, high impact? And it sounds like maybe you're suggesting part of the problem is that we're waiting until the end of the process to get that feedback, and if we can do it from day one, if you do your first day of development and you get some feedback there, it's not such a big deal, because yeah, of course I made a mistake, I've only been working on it for a day. Let's fix the mistakes, let's move this forward, and we'll keep doing that process every time. Because I think a lot of people have been in that situation, and it's stressful. It's not just stressful for the developers; it's stressful for the managers, who now have this really fancy-looking report saying how insecure their software is.

Stephen (00:27:55):

Yeah. I think the best teachers make sure that you're set up at the end of the semester to pass the test if you've been doing the work and paying attention, and I think the best security teams set developers up to pass whatever that ultimate gating process is. Running these tools as a continuous part of development can be a big factor in making sure that you pass that test. We talk a lot about continuous assurance, which is this idea that as you're trying to raise the quality of your software, as you're trying to attend to various compliance requirements, you try to do that in a more and more automated way, the same way that continuous integration and continuous deployment have automated the build, test, and deploy processes. We see more and more people in industry, large companies out there, making changes to automate their compliance workflows and their security workflows. And so one thing we've tried to do with Muse is build a platform that's focused on supporting that sort of security- and compliance-oriented automation, because it's huge in terms of the difference it makes in the day-to-day life and happiness of the development team.

Joey (00:29:10):

And this is something that both of you have been building toward for quite a while. I know the two of us, Stephen, worked on the Amazon stuff, which has a bit of this flavor. And then Tom, I remember what I believe was in some ways the inception of Muse: at one point you said, we've got to just do this, we have these tools at Galois, how do we integrate them into GitHub, basically. And I don't remember how long it took you, but you wrote a Haskell program.

Tom (00:29:39):

Yeah, it took about three months, and I glued together some Galois tooling into a GitHub comment system, and it worked quite well. The main motivation there was that to run that particular tool, you had to understand approximately five languages: you have the language you wrote your code in, the language you wrote your specification in, and the language that glues the two together, which has a couple of sub-languages, and, oh, it becomes a nightmare. That's why people pay Galois to run this thing, because you get true value out of it. But can we make it more automatable, more easy to apply, so that we can apply it to code bases that don't need to invest such high amounts in their correctness? That was definitely a huge motivation: making all this work accessible.

Joey (00:30:20):

And you know, now the obvious question is, was that two or three years ago?

Tom (00:30:25):

Uh, yeah. Yes, it was two.

Joey (00:30:28):

How has it been to go from that three-month hackathon project, more or less, to now, where you've just released and, it sounds like, you're seeing your tools make a real impact on large systems? How has that been for you two?

Tom (00:30:48):

Gratifying, in a word.

Stephen (00:30:50):

Yeah, it's been really exciting to see it come together. You discover a lot of things between the prototype stage and feeling like you have a finished product, things you didn't count on or anticipate, but we're working through that. We have an amazing team at MuseDev, just a great team of developers and people supporting on the business side as well, and it's been a really fun process to put it all together.

Joey (00:31:22):

And a really unique part of the way you work, not unique at Galois, I guess, but unique in the world of software engineering, is that you made the choice to develop this largely in Haskell, right? Has that been a good choice for you? Are you enjoying it? Is it productive?

Tom (00:31:38):

It is very productive; it's been a great choice. The language expressivity is tremendous, and the ability to mock and model aspects of the real world and simulate things so that we can run the whole system in isolation is hugely beneficial. That's a whole conversation in itself. At the same time, it's not without costs. You work in a small ecosystem, and it is bushwhacking everywhere you go. If you want to connect to this not entirely rare, but not mainstream, database: okay, make a database library, you write that driver, and now you own it.

Joey (00:32:14):

And a lot of what I'm hearing from both of you is that in some sense it doesn't matter, because you care far more about the team than the tooling, and I would guess that you would make a bet on your team regardless of the language you're all programming in.

Tom (00:32:27):

Without a doubt.

Joey (00:32:30):

How have you put that team together? That's not an easy thing, getting to that point, and you've done it in a pretty quick turnaround.

Tom (00:32:41):

We tried to pull a bit on social networks, and our first few hires came from social networks. They're just some great people we knew through the grapevine; you didn't really interview them so much as they interviewed you, because they came with such high recommendations. So that was a nice start. And then tech interviewing is a long, painful process, because there are just so many great people out there and you need to figure out who is the best fit, and who would be quite pleased to be there versus who is looking for just their next thing, which is legit. But a lot of hard work is the summary.

Joey (00:33:20):

And the hard work doesn't stop once you've got the team. You've also clearly put thought into how you allow that group to be productive. You think about it externally, and then, as you mentioned, internally. What's the culture like at Muse?

Tom (00:33:39):

Well, there's a lot of individual responsibility. We have a team that tends to accept assignments that are customer driven, that say, hey, we need this and that, and then they'll come back at you not only having finished that in some sprint, but also saying, oh, by the way, I noticed we needed a dashboard, or we didn't have these tables tracking this information in the database, so I just made that too, and here it is at review time. It's amazing to me that when you give good people the freedom to understand the space and see the needs better than you can, because they're the ones building it, they will come back with things built that you didn't know you needed. That has been a great thing to experience.

Shpat (00:34:23):

Sounds familiar. Sounds like you've taken a trick or two from the parent company, from the mother ship.

Tom (00:34:31):

Yeah, I was going to respond with a quote from the mothership: the reward for a job well done is more hard work. That's exactly what you're saying.

Joey (00:34:40):

And what's the background of the team you've been hiring? Because you're in this interface position where you need to understand what's coming out of these tools, you need to understand how to drive these tools, some of which are pretty deep research tools, and you need to condense that down for developers. So it's a really challenging position. Have you been able to train people to understand what's going on with these tools? Have you been hiring people who already know about them? How has that worked for you?

Tom (00:35:11):

The team is just tremendously diverse. I feel like I'm talking a lot here, Stephen, so if you want to jump in, feel free. But we have an ML expert, we have a mathematician, we have someone who just really loves dependent types and came to us by a reference from their instructor, and then we have someone whose day to day at a prior company was gluing Terraform together and keeping everything afloat. So there's no one you look at and say, this is exactly like another person in the company; they all brought a very different set of skills to the table, and that has been one of the biggest benefits, I think, of having the team as it stands.

Stephen (00:35:51):

Yeah. I think if there's a common thread connecting everyone, it's a continual interest in learning and growing, and ambition, that self-driven, curious quality that I think underlies a lot of good developers. A great example of that is this internal tech talk series that the development team put together. They were like, we want a time when we can discuss interesting things, not necessarily work-related, that we've been looking into on our own time. And so we've heard about Kalman filters, we've heard about data analysis and machine learning with Python, just a wide variety of different things that people have shared that they're personally interested in.

Joey (00:36:42):

And presumably you're all absolutely thrilled to have released. What does release mean for the world? What does it mean for me, as a person external to Muse?

Stephen (00:36:51):

Yeah. So it means you can sign up on GitHub. Before this, we were in a closed beta period, so we sent out some invite links and had some people using the system as we developed it, so we could get that early feedback, but there wasn't a way for just anyone to go out and sign up for it as a service. Now that exists: anyone can go to the GitHub Marketplace and install it. It's free for public and open source repos, and during this early access period that we're in right now, it's free for private repos as well.

Stephen (00:37:22):

So whatever sort of software development you're using GitHub for, whether it's at your company or for personal projects, you can go and try it out. We're also continuing in a private beta phase with an on-prem product. One thing about enterprises and how they approach software development is that it's still mainly an on-premises development process: there's infrastructure that the company manages, either physical infrastructure in their data center or virtual infrastructure in their AWS instance. They're managing their version of GitHub Enterprise, they're managing a Jenkins instance; whatever is involved in touching their code exists within their firewall. So to reach those customers and bring the value of Muse there, we have to support an on-premises product as well. We do provide that, we have some beta customers using it, that's continuing in beta, and certainly if people are interested in trying it out, reach out. But I think the cloud SaaS version is what's out there, and we're really excited about it, because that really is the easiest way to try things out. The onboarding process is very smooth.

Joey (00:38:40):

And if I'm listening to this and wondering if Muse is for me, how would I know?

Stephen (00:38:47):

Yeah, try it, sign up. If you go to the marketplace and install it, it'll walk you through the few steps to pick a repo to run it on, and it'll try to auto-configure and run a variety of analysis tools on that repo and give you results. So go through that process, see what the results are, see if it looks interesting. It's interesting: when people first try it out, everyone starts with, analyze my whole repo and show me all the existing bugs, which they'll probably never look at again. But that's your first window into what sorts of things this tool can find. So I'd say look through that window, take a look at the results it provides, but then think about using it day to day as part of your development process, because the typical mode of interaction is that you're just developing code, you're just submitting PRs, and then you're seeing those results when they come up and when they're applicable, and you don't have to think about it otherwise.

Joey (00:39:51):

And the evolution is, as Tom mentioned, this iterative process where you're not seeing all of the old issues, some of which might be false positives, but you're seeing the things that changed and the things that might be new risks as time goes on. Is that the evolution you're hinting at?

Stephen (00:40:08):

Well, yeah. As part of that pull request based interaction, you'll only see results that pertain to the code change you made. So it's necessarily scoped to the change, to that bit of code you just wrote.

Joey (00:40:23):

That makes sense. And that works for open source setups as well, where maybe I'm accepting pull requests from people I barely even know. Is that a really valuable thing for them?

Stephen (00:40:33):

Yeah, that's right. And one thing we've heard particular interest in from some of the open source projects relates to what you were talking about, Joey, with IDE support and things you might run in the IDE. A lot of companies will have a standard where everyone uses this IDE, or everyone uses this IDE plugin, and so certain sorts of bugs are expected to be caught at the IDE level. But when you're an open source project and you're getting external contributions from the community, you don't know what they ran on the code. You don't know if they've run Checkstyle with the style preferences you told them to and everything. You could run that as part of CI and get a report at the end of the day that you can go and access in Travis or whatever, but you can also just plug that tool into Muse and then see those results during code review, only in the cases where they're applicable. So it gives you a way to centralize your scanning infrastructure, even when it's a community project.

Joey (00:41:35):

So people who have been running those in the IDE will never have to deal with them as part of the pull request, but if somebody hasn't been, it still lets you maintain the standards you expect of your project, more or less.

Stephen (00:41:45):

That's right. That's right. Yep.

Shpat (00:42:49):

For people who develop research prototypes, whether assurance tools or otherwise, that might get used by developers: I'm curious to hear from you what the things are that have surprised you in the process, and what the various things are that end up being important from an adoption perspective when you're building things that come out of research and will be used by a more general population. I'm curious if you have any thoughts on that.

Stephen (00:43:27):

Yeah. There are a few things that come to mind. One is just what separates an actionable bug report from something that maybe makes sense to the tool author, but not to a developer without that background. You see that even in commercial tools, where tools aimed at security teams will tend to present security issues differently, usually in a more technical manner, than tools aimed at developers. So that's important. I think also just providing the right sort of feedback. I mentioned whether you break the build or whether you don't: what is universally something that people want to know about, versus something that's maybe more stylistic. And then the other thing that comes to mind is configurability. Often different projects will want to deploy tools in different ways, and there's a balance: not everyone wants to take the time to configure something, so you want to provide value out of the box and have things tuned appropriately to begin with, but it is nice to provide those levers, those things you can go in and tweak. One example of that is the work we've been doing to support information flow analysis, finding cross-site scripting and cross-site request forgery type things, data that's not properly encrypted before being written to disk, those sorts of information-flow-based errors. That's a place where configuration is really important, because the sources of user data in one application are going to be very different from the sources of user data in another application, and you need a good way to specify that.
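
A small illustrative Java sketch (mine, not from the episode) of the source-to-sink flow such an analysis tracks; which parameters count as user-controlled sources is exactly the per-application configuration Stephen describes.

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class GreetingServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // "name" is a source of user-controlled data; writing it straight into
        // the HTML response is the sink. An information flow analysis tracks
        // this path and reports a potential cross-site scripting (XSS) issue.
        String name = req.getParameter("name");
        resp.setContentType("text/html");
        resp.getWriter().println("<p>Hello, " + name + "</p>");

        // A fixed version would HTML-encode the value before writing it out,
        // for example with an escaping helper from a library such as OWASP's encoder.
    }
}
```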

Joey (00:45:22):

So what I'm hearing is that some of these analyses you're able to run through Muse are going to more or less work out of the box, and you're going to get some nice feedback right away, but with others you're going to need to give a little bit to get better information back.

Stephen (00:45:35):

Yeah. We work hard to make sure that it delivers what we consider valuable results, what we think developers in general will find valuable, as it's configured when you first turn it on. But you can certainly reach in and tweak what you want to see, and we see that in particular in the stylistic space. Do you have a particular coding style that you're trying to enforce? If so, then you might want to do some configuration, or even add a tool. We have a plugin interface where you can easily add your own analysis tools, and people will use that sometimes if they have a particular tool that's looking for coding style issues, or maybe scanning for authentication tokens or other secrets that shouldn't be in the repo, things like that.
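
To give a feel for the secrets-scanning case, here is a toy, standalone Java sketch of the kind of custom check a team might plug in; Muse's actual plugin interface isn't described in the episode, so the patterns and output format here are purely illustrative.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Toy secret scanner: flags lines that look like hard-coded credentials.
public class SecretScan {
    private static final List<Pattern> PATTERNS = List.of(
            Pattern.compile("AKIA[0-9A-Z]{16}"),                      // AWS-style access key id
            Pattern.compile("(?i)(api[_-]?key|token|secret)\\s*=\\s*['\"][^'\"]{8,}['\"]"));

    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(Files::isRegularFile).forEach(SecretScan::scan);
        }
    }

    private static void scan(Path file) {
        try {
            List<String> lines = Files.readAllLines(file);
            for (int i = 0; i < lines.size(); i++) {
                for (Pattern p : PATTERNS) {
                    if (p.matcher(lines.get(i)).find()) {
                        System.out.printf("%s:%d: possible hard-coded secret%n", file, i + 1);
                    }
                }
            }
        } catch (IOException e) {
            // Skip binary or unreadable files in this sketch.
        }
    }
}
```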

Joey (00:46:27):

And if I'm a developer on the other side, if I'm making tools, let's say I'm building a new Python static analysis and I don't know how I'm going to make it work on thousands and thousands of Python projects, does Muse provide something for me, or would I need to reach out to you?

Stephen (00:46:45):

Yeah, definitely reach out to us, because we do have an API for kicking off those analysis processes. It's basically an API for saying, run an analysis on this GitHub repo and give me the results. So then you can programmatically kick off those experiments, collect all those results, do your data processing, and so forth. It's a really great way to run those experiments at scale. And also, let us know if you have a really awesome new Python analysis, because we're always looking for new tools in that space, and in other languages as well.
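
The episode doesn't describe the API itself, so the following Java sketch is purely hypothetical: the base URL, endpoint path, request body, and token variable are invented to show the general shape of scripting "run an analysis on this repo and collect the results" across many repositories.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

// Hypothetical client: endpoint, JSON shape, and auth header are placeholders.
public class AnalysisExperiment {
    private static final String BASE = "https://analysis.example.com/api";
    private static final HttpClient HTTP = HttpClient.newHttpClient();

    public static void main(String[] args) throws Exception {
        List<String> repos = List.of("some-org/repo-a", "some-org/repo-b");
        for (String repo : repos) {
            // Kick off an analysis run for each repository in the experiment...
            HttpRequest start = HttpRequest.newBuilder(URI.create(BASE + "/analyses"))
                    .header("Authorization", "Bearer " + System.getenv("ANALYSIS_TOKEN"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString("{\"repo\":\"" + repo + "\"}"))
                    .build();
            HttpResponse<String> started = HTTP.send(start, HttpResponse.BodyHandlers.ofString());
            System.out.println("started " + repo + ": " + started.body());

            // ...then poll for results and feed them into the paper's data pipeline.
        }
    }
}
```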

Joey (00:47:28):

And then I guess on the other side, if I'm using all of the Python analyses in Muse and you all discover a great new one, what happens exactly? Do I get that right away?

Stephen (00:47:39):

Yeah, you do. That's the beauty of software as a service: it's just always up to date. And not just with new tools; as new versions of the existing tools we support come out, you get those new versions, and if they add features, you get configurations for those features that are tuned to be low noise and high value. All the tools we incorporate right now are open source, so you could go get them and integrate them yourself, build up the infrastructure we have, try to replicate that. But there really is a lot of work that goes into not just building those integrations to begin with, but supporting them over time. If you think about the number of new releases of tools and rule packs and configurations over time, and then keeping that up to date as language ecosystems evolve, as new versions of Java come out, as the Maven build system changes, there's really a lot of support and maintenance that goes into that process.

Joey (00:48:41):

Can you give me a sense of how that plays out? Say I have a Java project: how many analyses and tools am I likely to have running on my code base?

Stephen (00:48:51):

Yeah. Right now we have three tools that we run by default over Java code. There's Infer, from Facebook; there's Error Prone, which is a tool out of Google's developer productivity group; and then there's Find Security Bugs, which is a community-supported project that's really focused on security: API misuse errors, probably-insecure uses of crypto, things like that. And then we also support PMD as an optional tool, and some teams are using that. One thing that's cool about a tool like that is that PMD has a number of rules that span the range from "this is definitely an error" to "this is stylistic, and maybe it applies to this code base or maybe it doesn't." So you might have a PMD configuration that you're already running in CI that catches certain types of things, but there are other rules where sometimes it's a bug and sometimes it's not, and it's not often enough a bug that you want to break the build on it. That sort of rule you can still trigger via Muse, because it'll just show up as a pull request comment. You can deal with it as part of your pull request workflow and decide, in that instance, whether it's something you care about or not. You don't have to break the build; you can triage it as part of the code review process.
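
For a flavor of what those default Java tools report, here is a small illustrative sketch (my examples; the exact rule names vary by tool and version): reference equality on strings is the kind of correctness bug Error Prone flags, and hashing passwords with MD5 is the kind of insecure crypto use Find Security Bugs flags.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DefaultToolExamples {
    // The kind of bug Error Prone flags: reference equality on strings,
    // which "works" only when both values happen to be interned.
    static boolean isAdmin(String role) {
        return role == "admin"; // should be "admin".equals(role)
    }

    // The kind of issue Find Security Bugs flags: hashing passwords with MD5,
    // a weak digest, instead of a modern password hashing scheme.
    static byte[] hashPassword(String password) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        return md.digest(password.getBytes(StandardCharsets.UTF_8));
    }
}
```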

Joey (00:50:19):

I see. So what you're saying is that some of these tools are more configurable, and if you want to always get perfect results you're going to spend a lot of time twiddling the dials and trying to get them just right. In some sense you don't have to worry about that quite as much with Muse, because you're only going to get a few results with every small change, and it's much easier to filter those as they come.

Stephen (00:50:41):

Yeah. I think you can be a bit more permissive in terms of how you set up your rules. You can turn the dial a bit more towards "tell me things I might care about" instead of having to set it to "only tell me things that I know I'll care about."

Joey (00:50:56):

And presumably you're doing something somewhat smart here, so if I put a big comment at the top, I'm not going to see all of the old issues again just because the line numbers have changed, or something like that.

Tom (00:51:05):

Not at all.

Joey (00:51:07):

How smart are you being?

Tom (00:51:09):

Naming bugs is a PhD hard problem.

Joey (00:51:12):

Unquestionably. Heuristics work very well without writing another dissertation, but it's a PhD-hard problem that hasn't been fully solved yet. So at least one more PhD-hard problem, it sounds like.

Shpat (00:51:31):

Well, speaking of PhD-research-type work: I know both of you, before doing this, were pretty much focused on really cutting-edge computer science research, working on prototypes, proving things out. What has that transition been like, to now working on a production, cloud-deployed system?

Tom (00:52:04):

It's been a lot more all-encompassing. Prototypes, everyone knows, are simpler than something going out to a production SaaS system with large numbers of customers, but the breadth of concerns is just truly immense. It's hard to overstate, and easy to underestimate when you come from a research prototype environment. It's not that I'm surprised at the amount of work there is, but it is something worth stepping back from and nodding your head at, saying, yes, there's a lot here. The experience has been one where every time I thought we had a team of sufficient size, I found that the requirements grow and we could easily justify a team of much larger size. It's interesting how much automation is absolutely necessary to stay afloat. Every little thing really needs to be automated. Even if it seems simple, it takes five minutes, we do it every couple of weeks: no, automate that now, don't you dare skip it. You've got to take advantage of every minute you've got.

Stephen (00:53:18):

Yeah, for me too, it's been interesting learning how you have to think about things when your tools are going to be used by a bunch of people who aren't doing a PhD in program analysis. When you're writing a research prototype, you can assume, well, you can probably count on one hand the number of people who are going to actually get it working, replicate the results, and build on it. That's not completely true, there are various successful research systems, but the point is you can assume a certain amount of expertise from the people using it, and a willingness to dive deep and read the research paper that underlies the tool and understand how it works, so that when it breaks and gives you an obscure error message, you have a mental model for how it broke. You definitely don't have that luxury here. Even the notion of workarounds: workarounds are not a thing. It's not enough to say, oh, you can get this working in this roundabout way. And even when you do have an actual solution for everything, communicating that solution is really hard. Describing how to set things up, structuring the documentation, structuring the workflow so that it's clear to people using the tool how to use it, how to set it up, and what to do next, is really tricky.

Joey (00:55:21):

Have you started tackling that?

Stephen (00:55:24):

Yeah, we have started tackling that; you have to tackle that. And we just learn more and more about it as we go. You see how people use the system, you see where they get stuck, and you think about ways you can make it clearer what the fix is, what the approach to solving that problem is. We try to automate as much as possible in terms of the setup, but you can't handle everything. We have to understand the build process pretty deeply to do the sorts of analysis that tools like Infer do; they have to know all the source code that goes into the compilation process and have it all available to run over. And so there are always quirks: nonstandard build targets, dependencies that are required, oh, you need the source for this dependency in a directory called this or it won't build. Things like that require setup and configuration, and making it clear how to do that setup and configuration is something we've been spending a lot of time on.

Joey (00:56:32):

Is this a somewhat interactive process, or does the tool pretty much always just figure out what the repository needs and take care of it?

Tom (00:56:42):

I'd say over half of the time we successfully take care of it, but there is a very long tail. We support eight or ten different build setups, different ways of going about building a project, across several languages, and then there are going to be another 30 or 40 that are rather unpopular but still out there. So documentation is quite critical. You're going to reach a lot of people with automation, but you can't skimp on the latter, because you reach all the people with documentation.

Shpat (00:57:24):

This podcast is about building better systems. So we have people who watch and listen who are more into research and developing new approaches, and then folks who are maybe not in that world but are interested in having their workflows be better and building better systems in general, whether software or hardware. Have we missed anything that you're dying to share with that audience, from all the lessons you've learned building this company from the ground up?

Tom (00:58:05):

There's more tooling out there than you know; there really is. And it's probably applicable to you. You know, the dynamic languages community has woken up to the idea of static guarantees over the last decade. You've seen Flow, we've seen TypeScript and Sorbet and Pyre, the Python type checker. That world has really seen huge advances, but so too have other, more niche tools: the Haskell community has in the last few months seen a static analysis tool crop up; Go is statically checked, but it still has additional static error analysis on top of it. Same with Java, with Infer among others coming through. You know, there is so much that you can bring to the table, invest relatively minimal time, get it working in your workflow, benefit your entire team, and save yourself days of debugging every year.

Joey (00:58:59):

And one of the keys to what you all are doing is making that actually take relatively little time. Because I'm sure we've all gone on the tooling-binge days where you sit down and you're like, I'm going to get everything working, I'm going to get all these checks running in my own environment, and it takes you all day. You're at the end of the day and you're like, all right, I finally got a red squiggly line under my Haskell code that wasn't type checking.

Tom (00:59:21):

Yeah, we've definitely worked hard to make it easy, but I'm really interested in the world moving forward, in computer science becoming not a stone age, early-days sort of field. You know, I want to see this field progress. However you get the tools into your platform, into your process, I encourage you to do it. Reach out and talk to us.

Joey (00:59:43):

So it sounds like you really feel like we've been behind for a long time and that this has been a long time coming, which I guess again is what drove you to say it's time to do a three-month demo and actually show some people what we can do.

Tom (00:59:56):

Oh yeah. I could give a very long spiel about how I feel like computer science has not progressed very quickly in supporting itself and maturing as a profession, as a field. But it's getting there.

Stephen (01:00:07):

Yeah, I think there's been some really great progress there, especially on the program analysis side, when it comes to the collision between the program analysis communities and the industry development communities. Right? There are a whole lot of things that happened on the industry side, you know, DevOps and agile and new ways of building systems, microservice architectures. So there's been a lot of progress in how software's built that I think took a little bit of time for the program analysis community to catch up on and plug into. But it happened, and I think it's interesting that it happened largely by virtue of researchers moving into industry and back. That back and forth, that exchange of personnel, is great for exchanging ideas and really changing people's point of view. And so I think we've seen a lot of progress there. There's still a lot of opportunity to make tools even better suited for industry development and even more focused on the sorts of concerns that pop up in modern development environments.

Joey (01:01:08):

And I don't mean this question in a Muse-specific way, but I'm nodding my head when Tom says we're in the stone age and we need to move things forward. What would you recommend doing? How did you both take the step to start actually going from one research project to the next, to making impact on a wide range of systems? What did you both do?

Tom (01:01:45):

I really just started to use the tools. You know, I started to use SAW and prove things with SAW. So if I wrote an answer to someone's question on the internet, because I really enjoy the whole mentoring aspect, the connection and the educational aspect that the internet provides, then I would try to prove something about what I built with SAW to teach myself something. And I iterated from that. And then people came to me and said, well, try these other analyzers. Someone, I think it might have been Mike Dodds, came to me and said, hey, try this Infer tool. So, okay, I tried Infer, and while I was working on a crypto research project I was also working on making Infer work in the same environment I had SAW working in with GitHub. And I had a deadlock issue with the crypto project, and it took me an embarrassingly long time to realize, oh, I could actually take this thing I'm doing with the other 20% of my time and apply it to the thing I'm doing in my day job and solve the problem. But it did solve the problem. It found my bug in my day job; my side research project did it. And that's when I said, wow, it took me that long. I wonder how much evangelism it is going to take to get the rest of the community on board?
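Tom doesn't describe the bug itself, so the following is only an illustrative Java sketch of the classic lock-ordering deadlock, the kind of concurrency issue that static checkers such as Infer's concurrency analyses aim to surface. The class and method names are hypothetical.

```java
// LockOrderingExample.java
// Minimal sketch of a lock-ordering deadlock: two code paths acquire the
// same pair of locks in opposite orders, so two threads can each end up
// waiting on the lock the other already holds.
public class LockOrderingExample {
    private final Object lockA = new Object();
    private final Object lockB = new Object();

    public void transferAtoB() {
        synchronized (lockA) {
            synchronized (lockB) {
                // ... update shared state ...
            }
        }
    }

    public void transferBtoA() {
        synchronized (lockB) {
            synchronized (lockA) { // opposite order: potential deadlock
                // ... update shared state ...
            }
        }
    }
}
```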

Stephen (01:03:10):

Yeah, you know, I think another thing that can be really valuable, especially if you're still a student doing research, is internships in industry, to get a sense for how software is actually built. Obviously that's a big time commitment and requires the right opportunity, but more and more there are conferences and workshops with significant industry involvement. Facebook has a great workshop, the Testing and Verification Symposium, that they host every year and that brings together industry and academia so that people can learn from each other. And other large conferences, you know, PLDI, POPL, have a lot of industry participation now. So I'd say seek out the people who are embedded at the Amazons and Facebooks and Googles and Apples, and learn from them. Tell them about the research that you're doing and ask them if it would be useful in their workflows, or if they can think of some variation on it that would make it more useful.

Joey (01:04:20):

Yeah, that's great. Tom's answer struck me in that I suspect anybody who's done a certain class of tool development, which I've participated in, has run into this situation: they spend, we'll say, years of their PhD building a tool, and at the end they run it on a hundred-line program and they're like, look, it worked. And then it's like, well, my PhD is done and I'm going to go on to the next thing. I don't know how to support that sort of thing in academia; that's, I think, an open question.

Stephen (01:04:55):

Yeah. I think the incentives have to change somehow. You know, what's considered publishable and novel would have to shift a bit to be more supportive of new applications of existing tools. You see conferences accepting case study reports and so forth, but that sort of thing is still not viewed on equal footing, not as interesting as a new analysis technique. Right? And so I think we need to find a way to reward continued development and use, like you were saying. You need more than one user, you need more than one set of examples you're applying things to. It's really the systems that are applied over a number of years and built on by multiple people that have the biggest impact. And those can be hard to support. I don't have a good answer, but we need to find more ways to encourage those projects.

Joey (01:05:53):

Yeah. I really liked, I don't remember the exact paper, but it was a paper by the Infer group, and even the anecdotal material in it made such a big impact on me. I'm sure you have read this many times over, but they sort of broke it down: we were looking at bug reports from Facebook, and if we delivered them overnight, then nobody fixed anything, like literally zero bug fixes resulted; we brought it down to a few hours and started to see a little bit more; and when we got it within five minutes, all of a sudden everybody was fixing everything. Reading some of that stuff is so incredibly valuable and impactful, even when it's anecdotal. I would love to see more of that, and I sometimes just don't know where to look.

Stephen (01:06:42):

Yeah, I love that story, because it's the same analysis just running in a different way, right? And you have this difference between almost a 0% fix rate when you're reporting overnight and 70% in code review. I think those lessons are great. The paper that comes to mind, maybe it's the one you're referring to, is the paper on continuous reasoning that Peter O'Hearn published in the last couple of years. There he describes that principle; he coins the term ROFL, for Report Only Failure List: as a tool, all I have to do is report a list of failures and then my job is done, people will love it, and the world will change. And then that's not what they found when they tried to apply it at Facebook. I think there are a lot more stories like that out there, and it's good to see industry sharing them. Google has a couple of really great Communications of the ACM articles about their use of static analysis internally. But I guess that's another thing researchers can do: familiarize themselves with the reports from industry that we do have.

Joey (01:07:50):

And I guess efforts like Muse are in some ways a chance for everybody to win a bit. Because it sounds like you all are doing a lot of this work to bridge the gap: I'm making a tool, but as an academic researcher I don't have the months it would take to integrate with 20 build systems, for example. And that's not work that should have to be done once per research project, right? It's work that should maybe be done once or twice per build system, not thousands of times over. And if you're consistent about the API, you mentioned you had a programmatic interface, if you're consistent about that, then hopefully other people don't have to do that work but still actually get a chance to run their tool on a lot of different code.

Stephen (01:08:38):

That's right. What we're finding so far is that the sorts of integrations you need to support applications in industry, integration with GitHub and things like that, can be done in a way that's separate from the tools themselves. And so you can solve that problem once and for all, for all the various tool authors. And like you said, no one should have to do that, certainly not as part of your PhD research; that doesn't need to stand between you and graduating.

Joey (01:09:08):

And of course, when you say solve the problem once and for all, you mean start a company that's going to work for as long as you all can possibly work to keep supporting everything as it changes. Right?

Stephen (01:09:16):

It's a process. Yeah. Yeah.

Joey (01:09:23):

Great.

Shpat (01:10:21):

So for people on both sides, researchers as well as developers who want to leverage Muse, what should they know?

Stephen (01:10:30):

Yeah. So on the developer side, I would say go try the tool: connect on the GitHub Marketplace and install it there, check it out, and make use of the feedback feature. There's a form in there that you can fill out to tell us how it's working for you. We want to make this the best way to try out static analysis tools, especially cutting-edge results coming from the research community, so let us know how that's working so we know what changes we might need to make. On the research side, if you're someone producing one of these tools, developing it, trying to experiment and see how useful it is, how generalizable it is, what the false positive rates are and so forth, reach out to us about using the Muse API to run large-scale experiments and collect that data, because we really want to support that research and experimentation process.

Shpat (01:11:25):

Fantastic. That's the end of our questions, I think. Thank you so much for joining us today and spending this time with us. It was a pleasure chatting with you.

Stephen (01:11:47):

Thank you. It was great to be here.

Shpat (01:11:50):

Absolutely. Well, this has been another episode of the building better systems podcast. Thank you very much for watching and listening and we'll see you next time.