Protecting Against Supply Chain Attacks

Listen to the following security and product experts share their insights in the webinar:

  • Jasson Casey, Chief Technology Officer at Beyond Identity
  • Mike Starr, CEO of a stealth startup



Welcome to our talk on protecting against supply chain attacks. So we'll get into a lot more detail about what this means, but at a high level, you are your customers' supply chain. You probably build software products. You probably build services. You probably have customers, and if you don't, you won't be in business long. But in all of those cases, the choices that you make and the actions that you take today are going to impact your customers tomorrow. 

You are their supply chain. They view us as what they typically call third-party risk, and things have been heating up over the last five to 10 years. Previously we thought about third-party risk as a theoretical concern from a security perspective, but sadly that's no longer true. Many breaches and security incidents have ridden their way in through the supply chain. 

So this talk is really just meant to scratch the surface. We can only cover so much in 25 minutes. But what should you be thinking about as a developer, as a DevOps engineer, as a product manager, and as a CTO about how to secure your environment, your process of building and delivering software so that you don't cause problems for your customers? 

So with that, you'll see that I don't know how to advance my slides. With that said, we'll move on to the second slide and kind of introduce ourselves. Since I've been talking, I'll go ahead and introduce myself. If you can't tell, I'm the gentleman on the right. My name's Jasson Casey, I'm the CTO at a company called Beyond Identity. We're a relatively young company. 

We're a startup. And if you haven't guessed by now, we're in the space of trying to help companies secure themselves, as well as secure their technical workers, specifically around identity and what that can do for you, but we'll get into more of that later. And with me is Mike Starr. 


I'm Mike. I've recently been working on a stealth startup in the cybersecurity market, focused on making malicious cyber activity financially unviable for bad actors. 

So we'll zoom out a bit before diving into supply chain attacks and first define what's even being attacked. The term supply chain typically refers to the people and processes required for the manufacturing of a product or the distribution of that product. 

And typically when we think of manufacturing, we think about physical things like cars or airplanes. An airplane manufacturer might build some components for the airplane themselves and consume others created by third parties. They'll assemble those parts based on a blueprint or a schematic and conduct some testing to make sure that it actually flies before giving it to the pilots, hopefully. 

We build software in a similar fashion. It has many parts. It's got a lot of developers, infrastructures, both internal and external to the developing body. And like airplanes, some software actually has to continue to operate under duress. So if we put software in front of supply chain, we can say that it includes anything that impacts software testing, development, distribution, or more plainly, anything that goes into or affects your code. 


So before we get into any of the details, we thought it might be best to ground the audience in some common supply chain attacks that you might have heard of, and maybe just remind everyone of some salient details. We'll also cover one attack that's a little more obscure, one you may not have heard of, just to highlight the ease with which some of these attacks are actually possible. 

By the way, if you're interested in following up on any of this information, we're mostly doing third-party research. The first-party research was conducted by others. In particular, on this slide, the European Union Agency for Cybersecurity has some reports on supply chain attacks, and for some of the others, we actually source the material directly in the slides. 

But why don't we go ahead and get started with the SolarWinds breach, because I think it's kind of hard to find someone in tech who doesn't know about it. Just a quick high-level reminder: SolarWinds makes network management software. Little agents get deployed on people's machines and laptops, and a service or a server in the sky helps you understand what's going on and monitor systems in a continuous way. 

It can also manage and monitor routers and switches, and kind of just give you more of a bird's-eye view of what's going on. That part isn't really that pertinent. The most pertinent part is that SolarWinds has software or agents that end up getting installed on most of the technical assets of a customer's environment. So that's kind of an interesting target for an adversary or threat actor. 

And, in fact, one did take advantage of that over the period from 2019 up until mid-2020. What's believed to be a nation-state threat actor accessed SolarWinds, the primary company, and they were able to compromise the SolarWinds software development process. 

And in the process of compromising, and I said process twice, in the process of compromising the SolarWinds development process, they were able to inject Trojans into targeted SolarWinds' customers, right? So how did they actually do that? And by the way, FireEye was the original organization that actually detected the compromise. They didn't detect it in SolarWinds. 

They actually detected it in SolarWinds' customers. But most of the really good reporting on this comes from CrowdStrike because I think CrowdStrike ran the incident response for it. So the incident responders were unclear on what the initial access into the SolarWinds environment was, but they were able to narrow it down to a couple of things. 

Either it was a third-party compromise, i.e., some vendor of SolarWinds was the route of the threat actor into the SolarWinds environment, or it was social engineering or phishing, right? Someone sent a phishing link and either compromised a local endpoint, or the victim was brought to a man-in-the-middle service to basically steal their credentials and access token. Or they brute-forced some services with either just guessing or credential stuffing.

Once they actually had access, they were able to move around between different services, right? We typically call that lateral movement. They were definitely observed doing something called credential hopping, where every time they went to a different machine, they actually used a different set of credentials. They were observed stealing access tokens, and then placing those access tokens in new browsers to kind of hijack authorized sessions. 

And what's really, really interesting and, obviously, we can say this after the fact and the damage has been cleaned up, but they inserted an implant into the build machine of SolarWinds itself. So when SolarWinds was building new versions of SolarWinds software, it would actually build this implant that the threat actor wanted into the SolarWinds update. 

And this is what was called Sunburst, if you've gone into some of the documentation. And the implant was then distributed down to SolarWinds customers just through the standard update process. And so then as the customer was updating their SolarWinds software, they started running this malicious software that was actually inserted at the point of build. 

So a couple of things to point out there. The insertion of the Trojan actually happened in the build process. It happened just before source files were getting compiled into binaries. Another interesting detail that came out of this was that they kept their infrastructure partitioned. 

So the Trojanized SolarWinds software was phoning home to threat actor infrastructure that was separate from the infrastructure the implants inside SolarWinds itself were phoning home to. So there was just a lot of sophistication that was actually displayed in this particular compromise. 


Yeah. And before we move on: there are three threat categories in the software supply chain, source, build, and dependency threats, and this is a great example of a build threat, like you mentioned. 


So I wanted to highlight one more and this one's probably a little bit more obscure. Like, I don't think many people have heard about this. But everybody's heard about Linux, and everyone knows an operating system has a kernel. And most people know that Linux maintainers, and specifically kernel maintainers, are largely a volunteer workforce that offer up their time to maintain the code that really kind of helps run the internet. 

And a couple of years ago, some researchers at the University of Minnesota wanted an answer to the question of how hard it is to basically compromise the Linux kernel just through the standard development and submission process. And they wrote a really interesting paper. We've sourced it at the bottom, and everybody should take a look at it, but essentially what they were doing is they were submitting compromised code as patches to the maintainers of the Linux kernel and trying to figure out, is it possible for us to exploit what they called an immature bug? 

So an immature bug is something that's a little bit fishy, no one sees it actually causing any real problems. And because of the Linux mantra of we only have time to vet code that's fixing real problems, immature bugs are largely left alone. An immature bug can very quickly metastasize into a security vulnerability. And so that's exactly what these researchers did. 

They focused on how to cause these immature bugs with very subtle and very slight code changes that turn into vulnerabilities that in fact, they could use as remote compromises later. Now, luckily, these researchers were white hat researchers, so they didn't actually go through with the full process. Once they got their submissions accepted by the maintainers, they would usually follow up and say, "Oh, never mind. I realized I didn't want to do this." 

It also caused quite the stir in the Linux development community. But it was really illustrative of the question: how do I know where the source code is coming from? How do I know the source code is in fact coming from Mike Starr, not someone saying they're Mike Starr? Do I know anything about the environment in which it was developed? 

And then also, another thing that's kind of pointed out here is the limits of what a human can do through two-party review, right? We use compilers and we use strongly-typed languages because we know humans make mistakes. And if they write a lot of code, the only way to find some of those mistakes is with mechanized help. Why would it be any different for vulnerabilities? 


Right. Yeah. And so here's a great example of a source threat in these threat categories, and a terrifying one, right? Because if the maintainers of the Linux kernel struggle with this problem, we can reasonably assert that we regular people will too. But who cares? Hopefully, after this talk, you will, because the software you write, sell, and test has many contributors, open-source libraries, and dependencies and, again, build infrastructure probably governed by third parties like GitHub or Codacy or Bitbucket. 

These are avenues that malicious actors are targeting to compromise your software supply chain and they're obviously opportunistic. They're going to take advantage of the things that they can as they come up. So let's jump to the next slide, I think. 

So like Jasson said, do you really know who's committing to your repos? Is Jasson Casey actually committing or someone masquerading as Jasson Casey? Do you know all the open-source dependencies your code has and the dependencies those dependencies have? Do you trust your supply chain infrastructure, your source code repositories, your build pipelines, your package mirrors? 

And if you're saying yes to these questions, why? Like, what gives you this trust, right? And do you know what, if anything, they're doing to protect your code? "They" in this case being a third party that you trust. So let's go to the next slide. 

So here are some ways in which malicious actors target or can target your software supply chain. Some of them are more difficult to pull off than others. Like, the SolarWinds attack was extremely sophisticated. But in the case of the hypocrite source submissions, you have to have the ability to cut some code in C, but largely, it's a fairly simple avenue. 

But yeah, in general, the sophistication level required to be successful in these kinds of attacks is pretty low. So how do we start tackling the problem then? I think that's the next slide. There's a relatively new open framework known as Supply Chain Levels for Software Artifacts. It's abbreviated SLSA and it's pronounced "salsa." 

It gives us a good starting point that attempts to comprehensively address the software supply chain by breaking it down into the three categories that I mentioned briefly in the previous slides: source, build, and dependency threats. And we have the illustrative attacks that we previously covered. For dependency threats, are you pulling the software that you actually intended? For source and build, you have the hypocrite commits and the SolarWinds attack. So, knowing what's in your code, I think that's the next slide, right? Yeah. So knowing what's in your software, knowing what's in your code, validating the dependencies you're pulling in (are you pulling from the correct source?), and monitoring the controls of your entire software development process will help ensure that you're actually prepared to secure your software supply chain. 
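As a rough illustration of "are you pulling the software that you actually intended," one common approach is to pin a checksum for each dependency when it is first reviewed and refuse anything that doesn't match later. The sketch below is a minimal Python example under stated assumptions: the artifact name, its contents, and the `PINNED_DIGESTS` table are all made up for illustration and are not from the talk.

```python
import hashlib
import hmac

# Hypothetical dependency contents as they looked when they were reviewed.
REVIEWED_BYTES = b"def left_pad(s, n):\n    return s.rjust(n)\n"

# Hypothetical lockfile: artifact name -> SHA-256 digest recorded at review time.
PINNED_DIGESTS = {
    "left_pad-1.0.0.tar.gz": hashlib.sha256(REVIEWED_BYTES).hexdigest(),
}


def verify_dependency(name: str, downloaded: bytes) -> bool:
    """Return True only if the downloaded bytes match the pinned digest."""
    expected = PINNED_DIGESTS.get(name)
    if expected is None:
        return False  # unknown dependency: fail closed
    actual = hashlib.sha256(downloaded).hexdigest()
    return hmac.compare_digest(actual, expected)
```

A build step would call `verify_dependency` on every fetched artifact and stop if it returns `False`. A pinned digest only proves the bytes haven't changed since review; it says nothing about whether the reviewed version was trustworthy in the first place.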


So this brings us to a little bit of what can you do about it. And one thing I just want to double-check on, just bear with me, coming back to this particular slide. The SLSA framework is by no stretch of the imagination perfect or even comprehensive, but it's a really good start, helping us build this model to visualize where problems or threats can arise in how we actually build and deliver software to our customers. 

And so each label here is a discrete point that could be taken advantage of. In the traditional world of building networks, as well as building software, we generally have been lazy and we use this concept called transitive trust. The whole idea with transitive trust is that I don't really put too much scrutiny on things if they happen to be in the right room. 

So if one day I saw Mike in the middle of my house, I wouldn't really ask any questions because I would just assume that because he is inside of my house, he probably ought to be inside of my house, right? We had a similar analogy with perimeter-based security, right? Like, if you're on the trusted side of the network, I don't spend too much time trying to figure out if you really should be there, I just kind of assume. 

It's really no different in how we've been building software. There's very little verification of trust or integrity throughout this software delivery pipeline. And most companies, when you talk to them, understand very well how to handle the question of am I running genuine software, software that was built by my supplier named X, right? 

And here that's actually covered with H, right? Like, am I actually using a software artifact that was in fact built by the vendor that I expected it to be built by? And so we all have software signing keys that we use at the very end of our CI/CD pipeline to assure that. But most of us stop there and don't really think about anything else in the process: where I'm getting dependencies, whether they're internal or not, and whether the thing that I'm building is, in fact, what was actually in my repository. 

The stuff that's in my repository, is it actually contributed by my developers? How do I know the answers to these questions? And this fundamental turn that's happened in the world of computing over the last five years is this concept of trusted computing. And it entered in with the mass availability of secure enclaves or things called TPMs and T2s, which let us know that we're talking to a particular device and how we're talking to that particular device. 

It also helps us answer questions about software. And as we talk through some of the solutions slide here in a little bit, it's going to be no different and we're going to keep hammering home on some of those same concepts. So what can you do? The first thing that's the obvious thing and a lot of companies do have something here is tooling around prevention and detection of vulnerabilities in your software itself, right? 

And so that can be as simple as static analysis tools, symbolic execution tools, linters. But it can also be a little bit more complicated than that in terms of how do you handle third-party libraries? How do you handle third-party dependencies, right? There's almost nothing today built without a significant amount of open-source software. 

So how do you actually know that what you're building is valid? How do you know that what you're building is in fact from that project and not actually inserted elsewhere? The next thing, and obviously this is where we feel that we have a play as a company on the Beyond Identity side, is the integrity of a delivery pipeline or development pipeline. 

It starts with trust and strong identity, right? So what is identity? Identity, in our mind, in this particular context, is all about developer identity. How do I know that this is Mike Starr's source code and not someone else's? How do I know this is the source code submitted by Mike Starr and not someone claiming to be Mike Starr? How do I know the build system is actually building the source code that I previously verified came from Mike? 

So you can see there's like this chain of analysis that starts to build. And, you know, you can insert your appropriate blockchain metaphor or symbolism. It's very, very similar. But the reality is you need integrity, right? And traditionally, integrity is made by creating checksums of data, and then signing those checksums. 

And if you want to link two pieces of data, you just include or chain the checksum from before and essentially sign over the new package. So this gives you really, really strong integrity controls, which again, integrity is all about understanding if something has been tampered with. But if you also have strong identity, you can link things back to is this in fact the person that I actually expect? 
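The chaining idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage contents are invented, and the signing step is left out because Python's standard library has no asymmetric signing. In a real pipeline each link digest would also be signed by the identity responsible for that stage.

```python
import hashlib
from typing import List, Optional


def link_digest(previous_digest: str, payload: bytes) -> str:
    """Checksum of this stage's payload, bound to the previous stage's checksum."""
    h = hashlib.sha256()
    h.update(previous_digest.encode("ascii"))
    h.update(hashlib.sha256(payload).digest())
    return h.hexdigest()


def chain(stages: List[bytes]) -> List[str]:
    """Compute one chained digest per stage, in order."""
    digests = []
    previous = ""  # the first stage has no predecessor
    for payload in stages:
        previous = link_digest(previous, payload)
        digests.append(previous)
    return digests


def first_mismatch(recorded: List[str], stages: List[bytes]) -> Optional[int]:
    """Index of the first stage whose recomputed digest differs, or None."""
    for index, (old, new) in enumerate(zip(recorded, chain(stages))):
        if old != new:
            return index
    return None
```

Because each digest covers the one before it, changing an early stage (say, slipping something into the build input) changes every digest after it, so the tampering shows up at the first altered stage.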

And the reason why we talk about passwordless strong identity is remember back to the compromises we were describing earlier. So much of the lateral movement and initial access of SolarWinds was based on password compromise, MFA compromise, access token theft. We now have technology. And most of your developers, if not all of your developers, now have technology resident in their machines, resident in their phones, resident in their laptops that enable you to actually get rid of those weaknesses and actually deploy things that are based on asymmetric authentication, where the private keys are in fact stored in secure elements. 

And those elements are guarded by biometrics or local PINs, or maybe even both. Like, there is a way to actually assert strong identity across your workforce, but you can't build integrity in your delivery pipeline without actually starting with strong identity. So once you have strong identity in place, establishing integrity across your software delivery and development process is kind of the next thing. SLSA has a lot of guidelines on how you actually go about looking at that process. 

But at a high level, you're really trying to answer the question with evidence, how do I know I'm delivering an artifact that was based on something my developers actually contributed and nothing else, right? Now, clearly, there's a lot of nuance to that statement. 

And there's a lot of technical detail that you actually have to figure out. But, you know, it all starts with strong identity and then integrity protection over those artifacts. And then finally, building software is not a solo sport, it is a team sport. Almost everyone out there uses libraries from other companies, if not other parties. So in terms of companies, you do need to treat them as part of your third-party risk. 

You do need to actually vet them. And in most cases, you know, software suppliers have to go through the same level of security audit and compliance as any other supplier in kind of a modern company. 


Right. And this isn't limited just to the software packages that you're pulling into your software, it's the things that are hosting your software. So the code repositories that you're committing your code to, or the third-party build pipelines that are building your software as well. 


So this is a quick infographic on how to start with strong identity. So strong identity starts with moving away from symmetric secrets to using asymmetric cryptography for authentication. When someone authenticates, they need to be authenticating with a private key, right? Now, this is not new technology. We've known about this for a long time. It has this great property that if someone pops the authentication database, all they get are public keys, that is, the ability to verify that in fact, yes, that is Mike Starr. 

Great, the average user doesn't like to manage keys. No problem. So the next thing that we want to do is we want to make sure that we're constructing those keys inside of enclaves. It's hard to buy hardware after 2016 that doesn't have some kind of secure enclave technology embedded in it. By constructing these keys inside of those enclaves, the keys can't move. 

The keys can't leave the devices. So it changes the mode a little bit. When you authenticate, you no longer say, "Hey, go get the key and then use the key to sign the challenge." You actually take the challenge and you send the challenge to the enclave and you say, "Sign this challenge." And depending on how that key was created, it might require proof that, in fact, it is Mike Starr standing at the laptop, as opposed to someone standing at Mike Starr's laptop. 
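To make the challenge flow concrete, here is a minimal Python sketch. It is deliberately a toy and rests on loud assumptions: the `ToyEnclave` and `AuthServer` classes are invented for illustration, and the signing uses textbook RSA with two publicly known Mersenne primes and no padding, purely so that it runs with the standard library. It is not secure and is not how any real enclave or product works. What it does show is the shape of the exchange: the server keeps only a public key, sends a fresh challenge, and the key holder signs without ever handing over the private key.

```python
import hashlib
import secrets
from typing import Dict, Tuple

# TOY PARAMETERS, NOT SECURE: well-known primes, textbook RSA, no padding.
_P = 2**89 - 1
_Q = 2**107 - 1
_E = 65537


def _digest_as_int(data: bytes, modulus: int) -> int:
    """Hash the data and reduce it into the range the toy key can sign."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


class ToyEnclave:
    """Stands in for a secure enclave: it signs, but has no key-export method."""

    def __init__(self) -> None:
        self._n = _P * _Q
        self._d = pow(_E, -1, (_P - 1) * (_Q - 1))  # private exponent

    def public_key(self) -> Tuple[int, int]:
        return (self._n, _E)

    def sign(self, challenge: bytes) -> int:
        return pow(_digest_as_int(challenge, self._n), self._d, self._n)


class AuthServer:
    """Stores only public keys, so a database leak exposes no secrets."""

    def __init__(self) -> None:
        self._public_keys: Dict[str, Tuple[int, int]] = {}

    def enroll(self, user: str, public_key: Tuple[int, int]) -> None:
        self._public_keys[user] = public_key

    def new_challenge(self) -> bytes:
        return secrets.token_bytes(32)  # fresh and random for every login

    def verify(self, user: str, challenge: bytes, signature: int) -> bool:
        n, e = self._public_keys[user]
        return pow(signature, e, n) == _digest_as_int(challenge, n)
```

A login would look like: the server calls `new_challenge()`, the device passes it to `enclave.sign(...)`, and the server calls `verify(...)`. A signature captured from one login does not verify against the next challenge, which is the point of using a fresh one each time.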

That enclave might be guarding the key with a local PIN or a local biometric. Now, the reason we say local is those authentication techniques are local to this device. That information lives nowhere else in the world, which also has another interesting property of shrinking the blast radius of compromise. If you compromise an authentication service using symmetric cryptography, or symmetric secrets, generally the haul at the end is all of the credentials of all of the users. 

In a model that's based on asymmetric cryptography, you have to compromise the endpoint, and if you are successful, which is difficult, all you get is that one person's ability to communicate from that one device. Finally, you need to be able to incorporate the security posture of the device. Yes, this is Mike Starr, but are the security controls that I would expect to be in place for Mike Starr and Mike Starr's device, in fact, in place at the time he's actually trying to sign this Git commit, or trying to push this particular code? 

And that has to be part of the authentication process itself. And then, of course, all of these transactions need to produce tamper-resistant logs that are available for audit and forensics. So this is what we mean when we say laying down a foundation for strong identity. Getting specific to the source code problem, we were talking about source threats earlier, specifically the hypocrite commits that happened with the University of Minnesota and Linux. 

How do I actually answer the question of I know who this is coming from, right? Well, if you go into most development shops today, they're going to tell you that they have this problem solved because they're putting Git or GitHub or GitLab in front of an SSO. And so you then have the conversation basically saying, well, that doesn't really prove where my code's coming from. 

That just says that I have strong identity on anyone who's logging into the administrative feature of GitHub or GitLab. So the next thing that they'll then move to is SSH keys. It's like, oh, my developers, they all have SSH keys. SSH keys are how they actually push code back and forth, and so that's secure. So then we kind of run through two things. Number one, generally, these SSH keys are unmanaged, right? 

And what we mean by unmanaged is the developer has to kind of manually construct the keys, register their public key. So there's a certain amount of error in there. There's also an ability for the developer to export the private key and move it wherever they want. But then let's assume for a moment they're doing that correctly, when you push code, you're really just syncing Git repositories. You're not making any claims about the authorship of code. 

That actually comes from a commit. And the SSH keys don't actually have a statement around authorship and integrity of commits. That only comes from a little-known feature called commit signing. I've only run into one company that was actively using this. And again, this is usually unmanaged. 

It's best effort on the developer's part. So it's interesting, right? There are hooks in place to actually start at the top of this SLSA framework and solve some of the problems around source threats and source integrity in ways that are mechanized, that don't devolve back to let's have a two-phase commit, let's have a two-party review. It's mechanized in that we have an opportunity for strong identity to actually prove where it's coming from and who it's coming from, not just at that moment in time, but also later. 
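One way to picture the difference between "someone authenticated to push" and "this commit was authored by who it claims" is a simple policy check. The Python sketch below is an assumption-heavy illustration: the `Commit` record, the `ENROLLED_KEYS` registry, the email addresses, and the fingerprints are all hypothetical, and it assumes the cryptographic signature on each commit has already been verified elsewhere. In Git itself, signing can be turned on with the `commit.gpgsign` setting and checked with `git verify-commit`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    author_email: str                  # free text chosen by whoever made the commit
    signer_fingerprint: Optional[str]  # key that signed it, or None if unsigned


# Hypothetical registry: corporate identity -> fingerprint of its managed key.
ENROLLED_KEYS: Dict[str, str] = {
    "mike@example.com": "SHA256:aaaa",
    "jasson@example.com": "SHA256:bbbb",
}


def authorship_verified(commit: Commit) -> bool:
    """True only if the commit is signed by the key enrolled for its claimed author."""
    if commit.signer_fingerprint is None:
        return False  # unsigned: the author field is just a claim
    return ENROLLED_KEYS.get(commit.author_email) == commit.signer_fingerprint
```

The author field alone proves nothing, since anyone can type any name into it. The check only passes when the claimed author and the enrolled signing key line up.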

So what does that actually look like? This is kind of the obligatory tell us a little bit about the thing that we do. So we certainly have a strong identity system that can plug into the back of your existing SSOs and help you get access to general services, but we will also take over key management. We'll construct those keys inside of your enclaves and we'll set up signers on the local developer workstations so that when the developer is going through their normal workflow of development and commit and push and pull, things are transparent. 

They don't have to change their workflow, but all of a sudden their commits are signed by keys that are cryptographically linked to their developer identity or their corporate identity, it's tamper-proof, and the developer can't actually move the keys anywhere. So go ahead, Mike. 


Yeah. So to wrap up, use tooling that prevents and detects vulnerabilities, not only in your own code, but try to assert these kinds of things with your third parties as well. And in your code and your build pipeline, use strong passwordless identity, something that allows you to provide provenance and integrity across your commits and the things that are being introduced into your software. 

And yeah, I kind of just said the integrity thing, but it's the auditable provenance and integrity across the software supply chain in your source, build, and dependencies. And then insist that your third parties do the same. 


And that's kind of a wrap for our talk. You know, establishing integrity for your build process starts with strong identity. And we've got a bunch of folks around to answer any questions you may have. So looking forward to talking. Thanks and bye.