Thought Leadership

Offensive AI Could Replace Red Teams

Published On
Mar 14, 2023

Join our host Reece Guida, Beyond Identity's CTO Jasson Casey, Product Evangelist Nelson Melo, and VP of Product Strategy Husnain Bajwa (HB) as they discuss AI, red teams, and whether AI might eventually replace red teams.

Transcription

Reece

Hello, and welcome to Cybersecurity Hot Takes. We are in the office on a beautiful sunny day. Not summy, because that's not a word, and we have a lot to talk about. So, I'm going to introduce myself. I am Reece Guida, your host and also a sales lady. To my left is Nelson. 

Please tell the people who you are and why you're here. 

Nelson

I'm Nelson, I'm the founding engineer. 

Reece

Okay, that was quick. Jasson, what about you? 

Jasson

I'm Jasson. I'm the CTO. 

Reece

That sounds about right. HB, can you confirm or deny that for us, please? 

HB

I can confirm it, although it's hard with him behind the microphone on his video, but yes, I can confirm it. And I am Husnain Bajwa. Everybody calls me HB, and I do product strategy here. 

Reece

Great. Let's get into the hot take. So here it is: Offensive AI could replace red teams. Now, please note, I'm not talking about offensive as in mean or rude. I'm talking about offensive as opposed to defensive. So, first of all, what the heck is Offensive AI?

Nelson, you're the one that wanted to talk about this in the first place. 

Nelson

Oh, man. So I don't know much about it, and the team here can help, but what I found was a GitHub compilation of Offensive AI resources that kind of breaks it down by AI that can help with voice generation and image generation and text, and I thought it was interesting. 

I want to know more. So, what do you guys know? 

Reece

So, wait, before that, what is Defensive AI? Because Offensive AI kind of implies the existence of another side of the coin. 

Nelson

Is it detecting that an Offensive AI is trying to offend you? 

Reece

That sounds pretty likely. 

Nelson

Okay. 

Reece

Consider me offended. 

Jasson

If we had two Twitter bots that could chat with each other, and one was offensive and one was defensive, and we pointed them at each other, what would result?

Nelson

Something offensive. 

Jasson

Yeah. It reminds me of the Microsoft Twitter bot from 2016. It's like, what is it? It got shut down in like a day, and then they're like, "Oh, no, we fixed it," and it got shut down in another day. It went from zero to full-blown racist just like that. Which is an interesting reminder, right? A lot of these technologies are…sometimes it feels weird calling them AI, because everyone thinks of AI and thinks of the movies, and it's like, "Oh, you've got Data from Star Trek, or you've got the little kid from the movie A.I."

But the reality is, a lot of the techniques that we use for AI right now are really just slightly more sophisticated versions of drawing a line through a scatter plot. It's the most average version of things we can teach a computer model about ourselves, right? And the chatbot going racist maybe should be a concerning reflection. But more importantly, when we look at a lot of these AI techniques, they're really looking at what's come before, right?

The data sets that we train it on, and it's literally just saying: if we fast-forwarded time and gave it a little bit of a prompt, what would we have produced on our own? So when you think about it in that way, there are some fundamental limitations, some ceilings. It's not going to do something that can't really be constructed out of these statistical models. But again, that's based on a very specific set of techniques, which, for the most part, the industry has found useful in augmenting people, right?

And helping us focus. We can't focus on the million events we get on our screen, but maybe you can shine the spotlight on the 1% that is riskiest or most deserving of attention. But I do still think Offensive AI could actually be a dual replacement, not just for parts of the capability of a red team, but also maybe for the verbal game of a red team.
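To make Jasson's "line through a scatter plot" point concrete, here is a minimal sketch in Python: "training" is just a least-squares fit to past data, and "prompting" is extrapolation from that fit. The numbers are invented for illustration.

```python
import numpy as np

# Invented "past behavior" data: x is some input, y is what we observed.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.8, 2.1, 2.9, 4.2, 4.8])

# "Training" is a least-squares fit: the most average summary of the past.
slope, intercept = np.polyfit(x, y, 1)

# "Prompting" is extrapolation: fast-forward and ask what comes next.
x_new = 6.0
print(f"prediction at {x_new}: {slope * x_new + intercept:.2f}")
```

The model can only produce what sits on the line drawn through its past; that is the ceiling Jasson is describing.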

Nelson

Did you guys see the tweet from, I forget who it was, someone who was engaged in red teaming OpenAI?

Reece

No, I didn't see that. That sounds like a cool tweet. Was it a thread? 

Nelson

I think I found it. So, Paul Röttger said, "I was part of OpenAI's red team for GPT-4, testing its ability to generate harmful content." And he has a thread that's kind of cool about how seriously they're taking red teaming at OpenAI.

Jasson

Go ahead. 

HB

I was just going to say that people have been jailbreaking OpenAI's models in amazing ways, right? Like, let me convince the generative model to think that it's impersonating a different generative model, in order to jailbreak that generative model's safety mechanisms and let you do things that sort of violate its rules.

It's been kind of neat. And going back to Jasson's point about Tay: after Tay, the industry slowed down a bit, a lot actually, and I think we're really lucky to have ChatGPT coming out now and having everyone focused on responsible AI, because that kind of stuff is what I think most of those red team activities right now are focused on.

Jasson

And just to clarify, Tay was Microsoft's racist chatbot, right? I believe we're looking for an affirmative or a negatory big bird. 

HB

It was. So Tay is definitely Microsoft's amazing flameout, faster than their Zune flameout, or whatever their Zune Phone, I guess, was...

Reece

Zune Phone. Oh, I'm kind of glad I've never heard of this.

Jasson

Oh, yeah, you're too young. I was about to say go watch the TV show Chuck. They make references to it, but even that show is probably not quite hitting the age range here.

Reece

Surprise, podcast listeners. I'm five years old. 

Jasson

Yeah. And it's easy to take pot shots at the world's biggest company, because what does it matter to them, right?

Reece

Absolutely nothing. 

Jasson

So there are a couple of things I would say on using chatbots for really any job, right? Whether it's offensive security or testing, QA, knowledge-base search, or whatnot. Everybody needs to remember that when you are prompting a system, you're actually training it a little bit. And when you're feeding it your proprietary data, your customers' proprietary data, you may actually be inadvertently disclosing things that you shouldn't be, right?

So, really understanding the model that's in place with the generative program that you're using is important. Is it a multilayer system where you actually have a guarantee that the information you put in has no propagation capability? Or is it a free-for-all? Right? So there are some important things to really think about when you're using some of these tools, and in fact, we actually discourage using the more popular ones right now for that specific reason.

Nelson

Can you guys think of any countermeasures to someone feeding ChatGPT proprietary information from your company?

Jasson

You can't solve stupid. 

Nelson

Yeah, but is there…

Jasson

Actually, I shouldn't say you can't solve stupid. People are always going to make mistakes, and you can train people to make fewer mistakes, but you're never going to train their mistake rate to zero. I don't know, my intuition on that is like, can I train people to not click on bad links? And again, we can reduce their rate. We can't make it zero, so why would we expect that to carry over to other problems? 
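One concrete countermeasure of the kind Nelson is asking about is an egress filter that scrubs recognizable proprietary tokens from a prompt before it leaves the network. A minimal sketch, with all patterns and names invented for illustration; a real deployment would use the organization's own DLP rules:

```python
import re

# Hypothetical markers of proprietary data; real rules would be org-specific.
PATTERNS = [
    (re.compile(r"\b(?:PROJ|INT)-\d{3,}\b"), "[TICKET]"),        # internal ticket IDs
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),             # US SSN shape
    (re.compile(r"\b[A-Za-z0-9._%+-]+@corp\.example\b"), "[EMAIL]"),
]

def redact(prompt: str) -> str:
    """Scrub recognizable proprietary tokens before a prompt is sent out."""
    for pattern, label in PATTERNS:
        prompt = pattern.sub(label, prompt)
    return prompt

print(redact("Summarize PROJ-4821 for alice@corp.example"))
# -> Summarize [TICKET] for [EMAIL]
```

As Jasson says, this only lowers the rate: anything the patterns don't match still leaks.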

HB

I don't worry about it as much on ChatGPT, to be honest with you, because I've been using ChatGPT, or GPT-3 and 3.5, since last year, maybe even a little earlier, whenever Jasper first introduced their products for copywriting and kind of cleaning things up in writing. The more direct link to sort of intellectual property challenges is probably the stuff going on with GitHub Copilot and the stuff that's going to happen with Office Copilot.

I think those kinds of things create a lot more proprietary challenges. Right now, my bigger concern on Offensive AI is just the speed at which we're about to see a shift in the offensive capabilities of relatively unsophisticated threat actors. It reminds me of self-driving cars. 

The Society of Automotive Engineers has this cool little autonomy roadmap, and people always talk about: is it Level 3, is it Level 4, is it Level 5 driving? Will we ever get to Level 5 driving? Do we have the hardware? And a lot of times it gets lost that there's this great model where the first three levels, zero to two, are essentially human-led, and the latter three levels are essentially supposed to be machine-led, with the human only assisting.

It's very similar with Offensive AI: the automation of attacks on initial access and reconnaissance has been pretty unsophisticated so far. If you start having a bunch of threat actors who are no longer subject to the ridiculous phishing training exercises where you're supposed to go, "Oh, this person uses 40,000 commas, or this person didn't use any commas."

And that means that this is a phishing email. When you lose all of that information so quickly, will people be able to adapt and even figure out anything about who's a good actor and a bad actor? I just don't know right now.
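HB's point about surface-level phishing heuristics is easy to see in code. The toy scorer below keys on crude tells like comma abuse and stock phrasing; it is a caricature, invented for illustration, but once a language model writes the email fluently, there is nothing left for it to measure:

```python
def heuristic_phish_score(email: str) -> int:
    """Toy surface-feature heuristic of the '40,000 commas' variety."""
    score = 0
    score += email.count(",,")                 # doubled commas
    score += email.lower().count("kindly")     # stock phishing phrasing
    score += sum(1 for w in email.split() if w.isupper() and len(w) > 3)
    return score

clumsy = "KINDLY verify,, your ACCOUNT now"
fluent = "Hi Sam, the Q3 vendor invoice is attached; please review before Friday."
print(heuristic_phish_score(clumsy), heuristic_phish_score(fluent))  # 4 0
```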

Jasson

I do think it's going to expose the difference between kind of heuristic, probabilistic actions versus known…it's too abstract. What exactly am I trying to say? So, we've talked about trusted computing a lot, right? And just to bring other people up to speed, the core concept behind trusted computing is: how do I know a thing is true about a piece of hardware or a piece of software?

And the answer in a trusted computing world isn't, "Well, I ran a bunch of tests and everything looks good, or I audited the software six different ways, and I had peer-reviewed code and it looks good." The answer is I actually ran it through some sort of mechanical proof that's based on principled logic, and the answer proved true. Right? 

And of course, we have practical instances of this in navigation systems for rockets, airplanes, auto-landers, all those sorts of things. These are practical things. They're not just in research labs anymore. And as the barrier keeps getting lower for folks to fuzz, for folks to build out...kind of automate the reconnaissance phase and get a set of payloads and delivery mechanisms queued up quickly without having to do any sort of thought, I think it's really just going to expose even wider the fact that a lot of engineers' thinking on defense, and even their code-writing, to be honest, is not logically principled.

It's more of, is this good enough? And kind of back to the clicking-the-link situation, right? Like, do we want to solve that problem by training people, which means we're just lowering the error rate, or do we want to solve that problem by making sure when they do click a link, nothing bad happens? I do think you're going to see trusted computing show up more and more as an actual defense against a lot of this. 
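The "mechanical proof" Jasson contrasts with testing can be shown in miniature. In a proof assistant such as Lean 4, a property is not sampled on a few inputs the way a test suite samples it; the checker verifies it for every possible input. A toy example, just to give the flavor of a machine-checked guarantee:

```lean
-- A machine-checked guarantee in Lean 4. A test suite could only try
-- a few pairs (a, b); this proof covers every pair of natural numbers.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```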

Nelson

But isn't this the ultimate opposite of trusted computing, where you're going down to the proof? In AI and training models, you're completely probabilistic. Wouldn't you have to attack it with probabilistic models as well on the defense side?

Jasson

I mean, you could. The point I'm trying to make is that probabilistic defense, in my mind, is going to be incremental defense. Whereas if you can actually carve up the problem, and not all problems can be carved up this way, right? But if you can carve up the problem to where it's solvable in a trusted computing way, you'll have a guarantee, right?

And no matter what the adversary comes up with, that guarantee is going to hold unless one of your assumptions is violated, right?

HB

Right after this topic came up, one of the things that I ended up watching on YouTube, sort of accidentally, was this guy…this YouTube series where an expert explains an extremely complicated topic at five different levels.

Jasson

Starts with a five-year-old and then it goes up, right? 

HB

Yeah. And the professor was explaining zero-knowledge proofs, and it immediately reminded me, Jasson, of how you've been particularly strenuous internally on our threat models, on the idea that zero means zero, and that if you're going to do something, do it with correctness in mind.

And I feel like that's the counter, right? Probabilistic is a way to sort of almost make it work. But there are actual solutions out there that are implementable and have just traditionally been seen as too much work.

And I think now they just can't be seen as too much work. 

Jasson

There's a really good YouTube video of Leslie Lamport giving a talk. I can't remember if it was just one of his "Hey, come give a talk" appearances or if he was accepting an award, but he was basically talking about when he…so for people who don't know, Leslie Lamport is a very famous computer scientist. He's a big shot over at Microsoft.

But more importantly, he kind of discovered, right? In computer science, you say discovered, not invented, because it was always there, right? But he discovered a lot of the core distributed algorithms that kind of make the modern Internet work. So, whenever you're thinking about how to do interprocess communication, whenever you're thinking about, I have multiple threads and I want to make sure that I'm sharing data in a way that's consistent and safe. 

You're either using something that he designed or you're using something that relies on something that relies on something that he designed. Anyway, the talk that he gave I thought was really, really good because it was a combination of like here's a practical example of something that was happening with database replication at the time. And I think he was referencing something in the '70s or the '80s. 

And this person had published a paper about how to actually do this correctly. And he's looking at it and he's thinking, "This makes no sense." Like, I can come up with these different sequences where your data becomes inconsistent very, very quickly. One of his first contributions, I'm going to forget the name of the paper, was how to actually do a consistent update across two different data sets that were trying to stay in synchronization.

And the point that he's trying to stress in his talk is, when you're really focused on solving kind of principled problems, right? Like, whether it's scale, whether it's performance or whatnot, there is a math and a science that actually can give you absolutes under specific boundary conditions, right?

But it can give you absolutes, and you don't have to guess, and you don't have to say, "Well, we tried hard." You can actually know. And number one, he's built a career on that. But number two, we've kind of built the Internet on top of that career. And yeah, I think it's just another good…I'll go try and dig up the URL so we can post it later. But it's a really good reminder, I think, for anyone, whether it's a defender, a red team engineer, or even just a general software engineer, to really have that perspective, have that tool in their toolkit, and understand when to move from one to the other.
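For a taste of the kind of contribution being described: Lamport's logical clocks, from his 1978 paper "Time, Clocks, and the Ordering of Events in a Distributed System," order events across processes without any shared wall clock, and they fit in a few lines. A minimal sketch:

```python
class LamportClock:
    """Lamport's logical clock: orders events across processes
    without relying on synchronized physical time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # A local event: advance the counter.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Stamp an outgoing message with the current logical time.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # On receipt, jump past the sender's stamp so causality is preserved.
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
stamp = a.send()         # a's clock: 1
b.tick()                 # b's clock: 1
print(b.receive(stamp))  # 2: max(1, 1) + 1, so the send precedes the receive
```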

Reece

So, I'm going to hold your feet to the fire, everybody, and I'm going to ask each of you individually a binary question. Yes or no: do you think Offensive AI could replace red teams? Nelson?

Nelson

No. 

Reece

Why is that? 

Nelson

I think it'd become a tool for folks to target whatever they're trying to solve for. It's just like fuzzing. It adds more to the arsenal of how you choose to test a system, but you could automate it, of course.

There's going to be folks driving that. 

Reece

Jasson, do you think Offensive AI could replace red teams? 

Jasson

I'm going to give you my answer in a Heisenberg state. 

Reece

That one went over my head, Jasson, but HB loved it. 

Jasson

The answer is simultaneously yes and no. 

Reece

Oh, God, I love those kind of answers. 

Jasson

And until we peek at the cat, we're never going to know. The cat's alive or dead in Schrödinger…

Reece

Yeah, Schrödinger's cat. I know that one.

Jasson

Okay. So, there are scenarios where, sure, it can replace a red team, but in those scenarios, whoever is actually seeking out the work is kind of fooling themselves or they don't really need the full...they just need some low-level fuzzing. They don't really need the value of what a red team brings. 

The scenario where it's clearly not replacing a red team is when you really need the value of what a red team brings.

Reece

What is that value? 

Jasson

The brains. 

Reece

The fleshy brains. 

Jasson

Yeah, I mean, we're all zombies and they bring us brains. No, a good red team is not going to show up with a bag of tools, mash a button, and go wait for the clock to ring. They're going to analyze whatever system you're asking them to take a look at. They're going to think back to first principles and understand: all right, what is the lifecycle of this thing, and what are all the possibilities of attack? Then they're going to think through likelihoods, like, what's hard to get, right?

And then they're going to bring tools to bear. Then they're going to bring automation scripts to bear. So, there's a certain level of deep thinking, in terms of figuring out an approach, setting up an approach, choosing the right tools, choosing the right team, that, no, it's not going to get replaced. Maybe with AGI.

Right? But AGI is kind of like quantum computing, right? It's still 10 years out. 

Reece

Got it. So, we'll see what the future holds on that one. HB, what about you? Are you going to hit me with a gray area answer, or are you going to go hard yes or no here? 

HB

No, I'm going to go with a yes. When I look at the market today, Jasson's point about pen testing being an important brain exercise is a great one, but it's kind of theoretical.

Like, the reality is that you have tons of pen-testing-as-a-service products popping up, and they're not that great, but they're what a large portion of the industry keeps adopting, especially as requirements around OWASP and other compliance standards aren't clear. So, I think pen testing is largely going to be in the realm of compute platforms and automation and products that can scale smart people's insights to larger audiences, because the adversaries will not be fully Offensive AI.

I think the adversaries, the top 160 adversaries in the world, the likes of the ones CrowdStrike tracks, or any of the sort of larger cartel-type entities, those guys will keep going. And I think maybe in the Global 2000 you might still have pen-testing capabilities, and maybe within security software companies, but there's already a real lack of the people who can be on the opposite side of these adversaries.

And I think with Offensive AI, it just becomes harder and we'll need to figure out some way to scale, and I think platforms are the way. 

Nelson

Did you just invent the Uber of pen testing? 

Jasson

Doesn't that already exist? Isn't it like HackerOne and Bugcrowd? 

HB

There's a bunch of them that are essentially doing like…

Jasson

I mean, I guess one thing I would poke onto that, HB, will an underwriter actually give you an insurance policy if you don't have a real pen test? 

HB

But will an underwriter care if the pen test came from a third-rate independent in Idaho? 

Jasson

Fair enough, nothing against Idaho. 

Reece

Yeah. Please, Idaho listeners, don't quit on us. We love you. We love your potatoes. I think that's a beautiful note to end it on today. That's all for today. We'll see you in the next episode. 

Please like and subscribe. Goodbye. Good riddance.

Get started with Device360 today
Weekly newsletter
No spam. Just the latest releases and tips, interesting articles, and exclusive interviews in your inbox every week.

Offensive AI Could Replace Red Teams

Download

Join our host Reece Guida, Beyond Identity's CTO Jasson Casey, Product Evangelist Nelson Melo, and VP of Product Strategy Husnain Bajwa (HB) they discuss AI, Red Teams, and if AI might eventually replace Red Teams.

Transcription

Reece

Hello, and welcome to Cybersecurity Hot Takes. We are in the office on a beautiful sunny day. Not summy, because that's not a word, and we have a lot to talk about. So, I'm going to introduce myself. I am Reece Guida, your host and also a sales lady. To my left is Nelson. 

Please tell the people who you are and why you're here. 

Nelson

I'm Nelson, I'm the founding engineer. 

Reece

Okay, that was quick. Jasson, what about you? 

Jasson

I'm Jasson. I'm the CTO. 

Reece

That sounds about right. HB, can you confirm or deny that for us, please? 

HB

I can confirm it, although it's hard with him behind the microphone on his video, but yes, I can confirm it. And I am Husnain Bajwa. Everybody calls me HB, and I do product strategy here. 

Reece

Great. Let's get into the hot take. So here it is. Offensive AI could replace red teams. Now, please note, I'm not talking about offensive like mean, or rude. I'm talking about offensive as opposed to defensive. So, first of all, what the heck is Offensive AI? 

Nelson, you're the one that wanted to talk about this in the first place. 

Nelson

Oh, man. So I don't know much about it, and the team here can help, but what I found was a GitHub compilation of Offensive AI resources that kind of breaks it down by AI that can help with voice generation and image generation and text, and I thought it was interesting. 

I want to know more. So, what do you guys know? 

Reece

So, wait, before that, what is Defensive AI? Because Offensive AI kind of implies the existence of another side of the coin. 

Nelson

Is it detecting that an Offensive AI is trying to offend you? 

Reece

That sounds pretty likely. 

Nelson

Okay. 

Reece

Consider me offended. 

Jasson

If we had two Tweet bots that could chat each other and one was offensive and one was defensive and we pointed them at each other, what would result? 

Nelson

Something offensive. 

Jasson

Yeah. It reminds me of the Microsoft Tweet bot from the 2014. It's like, what is it? It got shut down like in a day and then they're like, "Oh, no, we fixed it," and it got shut down in another day and it was like zero to full-blown racist just like that. Which is an interesting reminder, right? A lot of these technologies are…sometimes it feels weird calling them AI because everyone thinks of AI and thinks of the movies and it's like, "Oh, you've got data from Star Trek or you've got this little kid from the Movie AI." 

But the reality is a lot of the techniques that we use for AI right now, they're really kind of slightly more sophisticated versions of drawing a line between a scatter plot. It is the most average version of things we can teach a computer model about ourselves, right? And the chatbot going racist is maybe should be a concerning exercise or concerning reflection. But more importantly, when we see a lot of these AI techniques, they're really looking at what's come before, right? 

The data sets that we train it on and it's literally just saying, right, if we fast-forwarded time and we gave it a little bit of a prompt, what would we have produced on our own? So when you think about it in that way, there are some fundamental limitations, there are some ceilings. It's not going to do something that can't really be constructed out of these statistical models. But again, that's based on a very specific set of techniques which for the most part, the industry has found useful in augmenting people, right? 

And helping us focus. We can't focus on the million events we get on our screen, but maybe if you can shine the spotlight on the 1% that is most riskiest or most deserving of attention. But I do still think Offensive AI could actually be a dual replacement, not just for the parts of the capability of a red team, but also maybe for the verbal game of a red team. 

Nelson

Did you guys see the tweet from I forgot who it was? Someone who was engaged in red teaming OpenAI. 

Reece

No, I didn't see that. That sounds like a cool tweet. Was it a thread? 

Nelson

I think I found it. So, Paul Röttger said, "I was part of OpenAI's red team for GPT4, testing its ability to generate harmful content." And he has a thread that's kind of cool about how seriously they're taking red teaming OpenAI. 

Jasson

Go ahead. 

HB

I was just going to say that the number of people who have been jailbreaking OpenAI's models in amazing ways, right? Like, let me convince the Generative Model to think that it's impersonating a different generative model in order to jailbreak that generative model's safety mechanisms and let you do things that sort of violate its rules. 

It's been kind of neat. And I think going back to Jasson's thing about Tay, after Tay, the industry sort of slowed down a bit, a lot actually and I think we're really lucky to have ChatGPT coming back now and having everyone focused on responsible AI, because that kind of stuff is like, what I think most of those kinds of red team activities right now are kind of focused on. 

Jasson

And just to clarify, Tay was Microsoft's racist chatbot, right? I believe we're looking for an affirmative or a negatory big bird. 

HB

It was. So Tay is definitely Microsoft's amazing flame out faster than their Zoon flame out or whatever their Zoon Phone, I guess was... 

Reece

Zoon Phone. Oh, I'm kind of glad I've never heard of this. 

Jasson

Oh, yeah, you're too young. I was about to say go watch Chuck TV show. They make references to it, but even that show is probably not quite hitting the age range here. 

Reece

Surprise, podcast listeners. I'm five years old. 

Jasson

Yeah. And it's easy to take hot shots at the world's biggest company because what does it matter to them, right? 

Reece

Absolutely nothing. 

Jasson

So there are a couple of things I would say on using chatbot not just for really any job, right? Whether it's offensive security or testing, QA, knowledge-based search, or whatnot. Everybody on news remember that when you are prompting a system, you're actually training it a little bit. And when you're feeding it your proprietary data, your customers' proprietary data, you may actually be inadvertently disclosing things that you shouldn't be, right? 

So, really understanding the model that's in place with the generative program that you're using is important. Is it a multilayer system where you actually have a guarantee that the information you put in has no propagation capability? Or is it a free for all? Right? So there are some important things to really think about when you're using some of these tools, and in fact, we actually discourage using the more popular ones right now for that specific reason. 

Nelson

Can you guys think of any countermeasures to someone feeding ChatGPT proprietary information in your company? 

Jasson

You can't solve stupid. 

Nelson

Yeah, but is there…

Jasson

Actually, I shouldn't say you can't solve stupid. People are always going to make mistakes, and you can train people to make fewer mistakes, but you're never going to train their mistake rate to zero. I don't know, my intuition on that is like, can I train people to not click on bad links? And again, we can reduce their rate. We can't make it zero, so why would we expect that to carry over to other problems? 

HB

I don't worry about it as much on ChatGPT, to be honest with you, because I've been using ChatGPT or Gpt-3 and 3.5 since last year, maybe even a little earlier whenever Jasper first introduced their products for copywriting and kind of cleaning things up in writing. The more direct link to sort of intellectual property challenges is probably the stuff going on with GitHub Copilot and the stuff that's going to happen with Office Copilot. 

I think those kinds of things create a lot more proprietary challenges. Right now, my bigger concern on Offensive AI is just the speed at which we're about to see a shift in the offensive capabilities of relatively unsophisticated threat actors. It reminds me of self-driving cars. 

The Society of Automotive Engineers has this cool little autonomy roadmap thing and people always talk about, is it Level 3, is it Level 4, is it Level 5 driving? Will we ever get to level five driving? Do we have the hardware? And a lot of times, it gets lost that there's like this great model where the first three levels zero to two are essentially human-led and the latter three levels are essentially supposed to be machine-led with the human-only assisting. 

It's very similar in this Offensive AI that the sophistication of automation attacks on initial access and reconnaissance has been pretty unsophisticated so far. If you start having a bunch of threat actors who are no longer subject to the ridiculous phishing training exercises where you're supposed to go like, "Oh, this person uses 40,000 commas, or this person didn't use any commas." 

And that means that this is a phishing email. When you lose all of that information so quickly, will people be able to adapt and even figure out anything about who's a good actor and bad actor? I just don't know right now. 

Jasson

I do think it's going to expose the difference between kind of heuristic probabilistic actions versus known…it's too abstract. What exactly am I trying to say? So, we've talked about trusted computing a lot, right? And the core concept and just bring up other people up to speed, right? The core concept behind trusted computing is, how do I know a thing is true about a piece of hardware or a piece of software? 

And the answer in a trusted computing world isn't, "Well, I ran a bunch of tests and everything looks good, or I audited the software six different ways, and I had peer-reviewed code and it looks good." The answer is I actually ran it through some sort of mechanical proof that's based on principled logic, and the answer proved true. Right? 

And of course, we have practical instances of this, and navigation systems for rockets, airplanes, auto landers, all those sorts of things. These are practical things. They're not just in research labs anymore. And as the barrier keeps getting lower for folks to fuzz, for folks to build out...kind of automate the reconnaissance phase and kind of get a set of payloads and delivery mechanisms kind of queued up quickly without having to do any sort of thought, I think it's really just going to expose even wider the fact that a lot of engineers thinking on defense and even writing code, to be honest, is not logically principled. 

It's more of, is this good enough? And kind of back to the clicking-the-link situation, right? Like, do we want to solve that problem by training people, which means we're just lowering the error rate, or do we want to solve that problem by making sure when they do click a link, nothing bad happens? I do think you're going to see trusted computing show up more and more as an actual defense against a lot of this. 

Nelson

But isn't this the ultimate opposite of trusted computing where you're going down to the proof? In AI and training models, you're completely probabilistic. Wouldn't you have to try to attack it with probabilistic models as well for the defense side? 

Jasson

I mean, you could. The point I'm trying to make is probabilistic defense, in my mind, is going to be incremental defense. Whereas if you can actually carve up the problem, and so not all problems you can carve up this way, right? But if you could carve up the problem to where it could be solvable in a trusted computing way, you'll have a guarantee, right? 

And that no matter what the adversary comes up with, that guarantee is going to hold unless one of your assumptions are violated, right? 

HB

Right after this topic sort of came up, one of the things that I ended up watching on YouTube sort of accidentally was this guy…was this YouTube series where an expert explains a topic that's extremely complicated at five different levels. 

Jasson

Starts with a five-year-old and then it goes up, right? 

HB

Yeah. And the professor was explaining zero-knowledge proofs, and it immediately reminded me, Jasson, of how you've been particularly strenuous internally on our sort of threat models, on the idea that zero means zero and not if you're going to do something, do it with correctness in mind. 

And I feel like that's the counter, right, is that probabilistic is a way to sort of almost make work. But there are actual solutions out there that are implementable and have just traditionally been seen as too much work. 

And I think now they just can't be seen as too much work. 

Jasson

There's a really good YouTube video by Leslie Lamport. He's giving a talk. I can't remember if it was like just one of his, "Hey, come give a talk," or if he was winning an award, but he was basically talking about when he…so for people who don't know, Leslie Lamport is a very famous computer scientist. He's a big shot over at Microsoft. 

But more importantly, he kind of discovered, right? In computer science, you say discovered, not invented, because it was always there, right? But he discovered a lot of the core distributed algorithms that kind of make the modern Internet work. So, whenever you're thinking about how to do interprocess communication, whenever you're thinking about, I have multiple threads and I want to make sure that I'm sharing data in a way that's consistent and safe. 

You're either using something that he designed or you're using something that relies on something that relies on something that he designed. Anyway, the talk that he gave I thought was really, really good because it was a combination of like here's a practical example of something that was happening with database replication at the time. And I think he was referencing something in the '70s or the '80s. 

And this person published a paper about how to actually do this correctly. And he's looking at it and he's thinking, "This makes no sense." Like, I can come up with these different sequences where your data becomes inconsistent very, very quickly. One of his first contributions, I'm actually going to forget the name of the paper, was how to actually do a consistent update across two different independent data, or not independent, but two different data sets that were trying to stay in synchronization. 

And the points that he's trying to stress in his talk is when you're really focused on solving kind of principled problems, right? Like, whether it's scale, whether it's performance or whatnot, there is a math and there is a science that actually can give you absolutes on specific boundary conditions, right? 

But it can give you absolutes, and you don't have to guess and you don't have to say, "Well, we tried hard." You can actually know. And number one, he's built a career on that. But number two, we've kind of built the Internet on top of his career of that. And yeah, I think it's just another good…I'll go try and dig up the URL so we can post it later. But it's a really good reminder, I think, to anyone, whether it's a defender, a red team engineer, or even just a general software engineer, to really kind of have that perspective, have that tool in their toolkit and kind of understand when to work from one to the other. 

Reece

So, I'm going to hold your feet over the fire, everybody, and I'm going to ask each of you individually a binary question. Yes or no. Do you think Offensive AI could replace red teams? Nelson? 

Nelson

No. 

Reece

Why is that? 

Nelson

I think it'd become a tool for folks to target whatever they're trying to solve for. It's just like futzing. It's more to the arsenal as to the arsenal of how you choose to test a system, but you could automate it, of course. 

There's going to be folks driving that. 

Reece

Jasson, do you think Offensive AI could replace red teams? 

Jasson

I'm going to give you my answer in a Heisenberg state. 

Reece

That one went over my head, Jasson, but HB loved it. 

Jasson

The answer is simultaneously yes and no. 

Reece

Oh, God, I love those kind of answers. 

Jasson

And until we peek at the cat, we're never going to know. The cat's live or dead in Schrodinger…

Reece

Yeah, Schrodinger's cat. I know that one. 

Jasson

Okay. So, there are scenarios where, sure, it can replace a red team, but in those scenarios, whoever is actually seeking out the work is kind of fooling themselves or they don't really need the full...they just need some low-level fuzzing. They don't really need the value of what a red team brings. 

In the scenario where it's clearly not replacing a red team is when you really need the value of what a red team brings. 

Reece

What is that value? 

Jasson

The brains. 

Reece

The fleshy brains. 

Jasson

Yeah, I mean, we're all zombies and they bring us brains. No, a good red team is not going to show up with a bag of tools and mash a button and go and wait for the clock to ring. They're going to analyze whatever system you're asking them to actually take a look at. They're going to think back to first principles and just understand, all right, through what is the lifecycle of this thing and what are all the possibilities of attack, then they're going to think through likelihoods, like what's hard to get, right? 

And then they're going to bring tools to bear. Then they're going to bring automation scripts to bear. So, there's a certain level of deep thinking in terms of figuring out approach, setting up an approach, choosing the right tools, choosing the right team, that no, it's not going to get replaced. Maybe with AGI. 

Right? But AGI is kind of like quantum computing, right? It's still 10 years out. 

Reece

Got it. So, we'll see what the future holds on that one. HB, what about you? Are you going to hit me with a gray area answer, or are you going to go hard yes or no here? 

HB

No, I'm going to go with a yes. When I look at the market today, Jasson's point on pen testing being an important brain exercise is a great one that's kind of theoretical. 

Like, the reality is that you have tons of pen testing as a service products popping up, and they're not that great, but they're what a large portion of the industry keeps adopting, especially as requirements around OWASP and other compliance standards isn't clear. So, I think pen testing is largely going to be in the realm of compute platforms and automation and products that can scale smart people's insights to larger audiences because the adversaries will not be fully Offensive AI. 

I think the adversaries, the top 160 adversaries in the world, the likes of CrowdStrike Track, or any of the sort of larger cartel type of entities. I think that those guys will keep going and I think maybe in the global 2000, you might still have like, pen testing capabilities and maybe within security software companies, but there's a real lack already of the people who can be on the opposite side of these adversaries. 

And I think with Offensive AI, it just becomes harder and we'll need to figure out some way to scale, and I think platforms are the way. 

Nelson

Did you just invent the Uber of pen testing? 

Jasson

Doesn't that already exist? Isn't it like HackerOne and Bugcrowd? 

HB

There's a bunch of them that are essentially doing like…

Jasson

I mean, I guess one thing I would poke onto that, HB, will an underwriter actually give you an insurance policy if you don't have a real pen test? 

HB

But will an underwriter care if the pen test came from a third-rate independent in Idaho? 

Jasson

Fair enough, nothing against Idaho. 

Reece

Yeah. Please, Idaho listeners, don't quit on us. We love you. We love your potatoes. I think that's a beautiful note to end it on today. That's all for today. We'll see you in the next episode. 

Please like and subscribe. Goodbye. Good riddance.

Offensive AI Could Replace Red Teams

Phishing resistance in security solutions has become a necessity. Learn the differences between the solutions and what you need to be phishing resistant.

Join our host Reece Guida, Beyond Identity's CTO Jasson Casey, Product Evangelist Nelson Melo, and VP of Product Strategy Husnain Bajwa (HB) they discuss AI, Red Teams, and if AI might eventually replace Red Teams.

Transcription

Reece

Hello, and welcome to Cybersecurity Hot Takes. We are in the office on a beautiful sunny day. Not summy, because that's not a word, and we have a lot to talk about. So, I'm going to introduce myself. I am Reece Guida, your host and also a sales lady. To my left is Nelson. 

Please tell the people who you are and why you're here. 

Nelson

I'm Nelson, I'm the founding engineer. 

Reece

Okay, that was quick. Jasson, what about you? 

Jasson

I'm Jasson. I'm the CTO. 

Reece

That sounds about right. HB, can you confirm or deny that for us, please? 

HB

I can confirm it, although it's hard with him behind the microphone on his video, but yes, I can confirm it. And I am Husnain Bajwa. Everybody calls me HB, and I do product strategy here. 

Reece

Great. Let's get into the hot take. So here it is. Offensive AI could replace red teams. Now, please note, I'm not talking about offensive like mean, or rude. I'm talking about offensive as opposed to defensive. So, first of all, what the heck is Offensive AI? 

Nelson, you're the one that wanted to talk about this in the first place. 

Nelson

Oh, man. So I don't know much about it, and the team here can help, but what I found was a GitHub compilation of Offensive AI resources that kind of breaks it down by AI that can help with voice generation and image generation and text, and I thought it was interesting. 

I want to know more. So, what do you guys know? 

Reece

So, wait, before that, what is Defensive AI? Because Offensive AI kind of implies the existence of another side of the coin. 

Nelson

Is it detecting that an Offensive AI is trying to offend you? 

Reece

That sounds pretty likely. 

Nelson

Okay. 

Reece

Consider me offended. 

Jasson

If we had two Tweet bots that could chat each other and one was offensive and one was defensive and we pointed them at each other, what would result? 

Nelson

Something offensive. 

Jasson

Yeah. It reminds me of the Microsoft Tweet bot from the 2014. It's like, what is it? It got shut down like in a day and then they're like, "Oh, no, we fixed it," and it got shut down in another day and it was like zero to full-blown racist just like that. Which is an interesting reminder, right? A lot of these technologies are…sometimes it feels weird calling them AI because everyone thinks of AI and thinks of the movies and it's like, "Oh, you've got data from Star Trek or you've got this little kid from the Movie AI." 

But the reality is a lot of the techniques that we use for AI right now, they're really kind of slightly more sophisticated versions of drawing a line between a scatter plot. It is the most average version of things we can teach a computer model about ourselves, right? And the chatbot going racist is maybe should be a concerning exercise or concerning reflection. But more importantly, when we see a lot of these AI techniques, they're really looking at what's come before, right? 

The data sets that we train it on and it's literally just saying, right, if we fast-forwarded time and we gave it a little bit of a prompt, what would we have produced on our own? So when you think about it in that way, there are some fundamental limitations, there are some ceilings. It's not going to do something that can't really be constructed out of these statistical models. But again, that's based on a very specific set of techniques which for the most part, the industry has found useful in augmenting people, right? 

And helping us focus. We can't focus on the million events we get on our screen, but maybe if you can shine the spotlight on the 1% that is most riskiest or most deserving of attention. But I do still think Offensive AI could actually be a dual replacement, not just for the parts of the capability of a red team, but also maybe for the verbal game of a red team. 

Nelson

Did you guys see the tweet from I forgot who it was? Someone who was engaged in red teaming OpenAI. 

Reece

No, I didn't see that. That sounds like a cool tweet. Was it a thread? 

Nelson

I think I found it. So, Paul Röttger said, "I was part of OpenAI's red team for GPT4, testing its ability to generate harmful content." And he has a thread that's kind of cool about how seriously they're taking red teaming OpenAI. 

Jasson

Go ahead. 

HB

I was just going to say that the number of people who have been jailbreaking OpenAI's models in amazing ways, right? Like, let me convince the Generative Model to think that it's impersonating a different generative model in order to jailbreak that generative model's safety mechanisms and let you do things that sort of violate its rules. 

It's been kind of neat. And I think going back to Jasson's thing about Tay, after Tay, the industry sort of slowed down a bit, a lot actually and I think we're really lucky to have ChatGPT coming back now and having everyone focused on responsible AI, because that kind of stuff is like, what I think most of those kinds of red team activities right now are kind of focused on. 

Jasson

And just to clarify, Tay was Microsoft's racist chatbot, right? I believe we're looking for an affirmative or a negatory big bird. 

HB

It was. So Tay is definitely Microsoft's amazing flame out faster than their Zoon flame out or whatever their Zoon Phone, I guess was... 

Reece

Zoon Phone. Oh, I'm kind of glad I've never heard of this. 

Jasson

Oh, yeah, you're too young. I was about to say go watch Chuck TV show. They make references to it, but even that show is probably not quite hitting the age range here. 

Reece

Surprise, podcast listeners. I'm five years old. 

Jasson

Yeah. And it's easy to take hot shots at the world's biggest company because what does it matter to them, right? 

Reece

Absolutely nothing. 

Jasson

So there are a couple of things I would say on using chatbot not just for really any job, right? Whether it's offensive security or testing, QA, knowledge-based search, or whatnot. Everybody on news remember that when you are prompting a system, you're actually training it a little bit. And when you're feeding it your proprietary data, your customers' proprietary data, you may actually be inadvertently disclosing things that you shouldn't be, right? 

So, really understanding the model that's in place with the generative program that you're using is important. Is it a multilayer system where you actually have a guarantee that the information you put in has no propagation capability? Or is it a free for all? Right? So there are some important things to really think about when you're using some of these tools, and in fact, we actually discourage using the more popular ones right now for that specific reason. 

Nelson

Can you guys think of any countermeasures to someone feeding ChatGPT proprietary information in your company? 

Jasson

You can't solve stupid. 

Nelson

Yeah, but is there…

Jasson

Actually, I shouldn't say you can't solve stupid. People are always going to make mistakes, and you can train people to make fewer mistakes, but you're never going to train their mistake rate to zero. I don't know, my intuition on that is like, can I train people to not click on bad links? And again, we can reduce their rate. We can't make it zero, so why would we expect that to carry over to other problems? 

HB

I don't worry about it as much on ChatGPT, to be honest with you, because I've been using ChatGPT or Gpt-3 and 3.5 since last year, maybe even a little earlier whenever Jasper first introduced their products for copywriting and kind of cleaning things up in writing. The more direct link to sort of intellectual property challenges is probably the stuff going on with GitHub Copilot and the stuff that's going to happen with Office Copilot. 

I think those kinds of things create a lot more proprietary challenges. Right now, my bigger concern on Offensive AI is just the speed at which we're about to see a shift in the offensive capabilities of relatively unsophisticated threat actors. It reminds me of self-driving cars. 

The Society of Automotive Engineers has this cool little autonomy roadmap thing and people always talk about, is it Level 3, is it Level 4, is it Level 5 driving? Will we ever get to level five driving? Do we have the hardware? And a lot of times, it gets lost that there's like this great model where the first three levels zero to two are essentially human-led and the latter three levels are essentially supposed to be machine-led with the human-only assisting. 

It's very similar in this Offensive AI that the sophistication of automation attacks on initial access and reconnaissance has been pretty unsophisticated so far. If you start having a bunch of threat actors who are no longer subject to the ridiculous phishing training exercises where you're supposed to go like, "Oh, this person uses 40,000 commas, or this person didn't use any commas." 

And that means that this is a phishing email. When you lose all of that information so quickly, will people be able to adapt and even figure out anything about who's a good actor and bad actor? I just don't know right now. 

Jasson

I do think it's going to expose the difference between kind of heuristic probabilistic actions versus known…it's too abstract. What exactly am I trying to say? So, we've talked about trusted computing a lot, right? And the core concept and just bring up other people up to speed, right? The core concept behind trusted computing is, how do I know a thing is true about a piece of hardware or a piece of software? 

And the answer in a trusted computing world isn't, "Well, I ran a bunch of tests and everything looks good, or I audited the software six different ways, and I had peer-reviewed code and it looks good." The answer is I actually ran it through some sort of mechanical proof that's based on principled logic, and the answer proved true. Right? 

And of course, we have practical instances of this, and navigation systems for rockets, airplanes, auto landers, all those sorts of things. These are practical things. They're not just in research labs anymore. And as the barrier keeps getting lower for folks to fuzz, for folks to build out...kind of automate the reconnaissance phase and kind of get a set of payloads and delivery mechanisms kind of queued up quickly without having to do any sort of thought, I think it's really just going to expose even wider the fact that a lot of engineers thinking on defense and even writing code, to be honest, is not logically principled. 

It's more of, is this good enough? And kind of back to the clicking-the-link situation, right? Like, do we want to solve that problem by training people, which means we're just lowering the error rate, or do we want to solve that problem by making sure when they do click a link, nothing bad happens? I do think you're going to see trusted computing show up more and more as an actual defense against a lot of this. 

Nelson

But isn't this the ultimate opposite of trusted computing where you're going down to the proof? In AI and training models, you're completely probabilistic. Wouldn't you have to try to attack it with probabilistic models as well for the defense side? 

Jasson

I mean, you could. The point I'm trying to make is probabilistic defense, in my mind, is going to be incremental defense. Whereas if you can actually carve up the problem, and so not all problems you can carve up this way, right? But if you could carve up the problem to where it could be solvable in a trusted computing way, you'll have a guarantee, right? 

And that no matter what the adversary comes up with, that guarantee is going to hold unless one of your assumptions are violated, right? 

HB

Right after this topic sort of came up, one of the things that I ended up watching on YouTube sort of accidentally was this guy…was this YouTube series where an expert explains a topic that's extremely complicated at five different levels. 

Jasson

Starts with a five-year-old and then it goes up, right? 

HB

Yeah. And the professor was explaining zero-knowledge proofs, and it immediately reminded me, Jasson, of how you've been particularly strenuous internally on our sort of threat models, on the idea that zero means zero and not if you're going to do something, do it with correctness in mind. 

And I feel like that's the counter, right, is that probabilistic is a way to sort of almost make work. But there are actual solutions out there that are implementable and have just traditionally been seen as too much work. 

And I think now they just can't be seen as too much work. 

Jasson

There's a really good YouTube video by Leslie Lamport. He's giving a talk. I can't remember if it was like just one of his, "Hey, come give a talk," or if he was winning an award, but he was basically talking about when he…so for people who don't know, Leslie Lamport is a very famous computer scientist. He's a big shot over at Microsoft. 

But more importantly, he kind of discovered, right? In computer science, you say discovered, not invented, because it was always there, right? But he discovered a lot of the core distributed algorithms that kind of make the modern Internet work. So, whenever you're thinking about how to do interprocess communication, whenever you're thinking about, I have multiple threads and I want to make sure that I'm sharing data in a way that's consistent and safe. 

You're either using something that he designed or you're using something that relies on something that relies on something that he designed. Anyway, the talk that he gave I thought was really, really good because it was a combination of like here's a practical example of something that was happening with database replication at the time. And I think he was referencing something in the '70s or the '80s. 

And this person published a paper about how to actually do this correctly. And he's looking at it and he's thinking, "This makes no sense." Like, I can come up with these different sequences where your data becomes inconsistent very, very quickly. One of his first contributions, I'm actually going to forget the name of the paper, was how to actually do a consistent update across two different independent data, or not independent, but two different data sets that were trying to stay in synchronization. 

And the points that he's trying to stress in his talk is when you're really focused on solving kind of principled problems, right? Like, whether it's scale, whether it's performance or whatnot, there is a math and there is a science that actually can give you absolutes on specific boundary conditions, right? 

But it can give you absolutes, and you don't have to guess and you don't have to say, "Well, we tried hard." You can actually know. And number one, he's built a career on that. But number two, we've kind of built the Internet on top of his career of that. And yeah, I think it's just another good…I'll go try and dig up the URL so we can post it later. But it's a really good reminder, I think, to anyone, whether it's a defender, a red team engineer, or even just a general software engineer, to really kind of have that perspective, have that tool in their toolkit and kind of understand when to work from one to the other. 

Reece

So, I'm going to hold your feet over the fire, everybody, and I'm going to ask each of you individually a binary question. Yes or no. Do you think Offensive AI could replace red teams? Nelson? 

Nelson

No. 

Reece

Why is that? 

Nelson

I think it'd become a tool for folks to target whatever they're trying to solve for. It's just like futzing. It's more to the arsenal as to the arsenal of how you choose to test a system, but you could automate it, of course. 

There's going to be folks driving that. 

Reece

Jasson, do you think Offensive AI could replace red teams? 

Jasson

I'm going to give you my answer in a Heisenberg state. 

Reece

That one went over my head, Jasson, but HB loved it. 

Jasson

The answer is simultaneously yes and no. 

Reece

Oh, God, I love those kind of answers. 

Jasson

And until we peek at the cat, we're never going to know. The cat's live or dead in Schrodinger…

Reece

Yeah, Schrödinger's cat. I know that one. 

Jasson

Okay. So, there are scenarios where, sure, it can replace a red team, but in those scenarios, whoever is actually seeking out the work is kind of fooling themselves or they don't really need the full...they just need some low-level fuzzing. They don't really need the value of what a red team brings. 

The scenario where it's clearly not replacing a red team is when you really need the value of what a red team brings. 

Reece

What is that value? 

Jasson

The brains. 

Reece

The fleshy brains. 

Jasson

Yeah, I mean, we're all zombies and they bring us brains. No, a good red team is not going to show up with a bag of tools, mash a button, and go wait for the clock to ring. They're going to analyze whatever system you're asking them to take a look at. They're going to think back to first principles and understand, all right, what is the lifecycle of this thing and what are all the possibilities of attack. Then they're going to think through likelihoods, like, what's hard to get, right? 

And then they're going to bring tools to bear. Then they're going to bring automation scripts to bear. So, there's a certain level of deep thinking, in figuring out an approach, setting up that approach, choosing the right tools, choosing the right team, that, no, is not going to get replaced. Maybe with AGI. 

Right? But AGI is kind of like quantum computing, right? It's still 10 years out. 
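
One way to see where the line falls in that workflow: the ranking below is the mechanical part a machine can own, while the likelihood, impact, and effort estimates are the part that takes brains. Every name and number here is hypothetical, a sketch rather than any real methodology:

```python
# Ordering candidate attack paths by expected payoff per hour of
# effort. The scoring fields and the example paths are made up.
from dataclasses import dataclass

@dataclass
class AttackPath:
    name: str
    likelihood: float   # human-estimated chance the step works, 0..1
    impact: float       # human-estimated value of success, 0..1
    effort_hours: float

def prioritize(paths: list[AttackPath]) -> list[AttackPath]:
    """Rank attacks by (likelihood * impact) per hour of effort."""
    return sorted(
        paths,
        key=lambda p: (p.likelihood * p.impact) / p.effort_hours,
        reverse=True,
    )

engagement = prioritize([
    AttackPath("phish helpdesk for creds", 0.4, 0.9, 8),
    AttackPath("fuzz the public API gateway", 0.2, 0.6, 40),
    AttackPath("reuse leaked credentials", 0.3, 0.8, 2),
])
# The sort is trivially automatable; the estimates feeding it are not.
```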

Reece

Got it. So, we'll see what the future holds on that one. HB, what about you? Are you going to hit me with a gray area answer, or are you going to go hard yes or no here? 

HB

No, I'm going to go with a yes. When I look at the market today, Jasson's point about pen testing being an important brain exercise is a great one, but it's kind of theoretical. 

Like, the reality is that you have tons of pen-testing-as-a-service products popping up, and they're not that great, but they're what a large portion of the industry keeps adopting, especially as requirements around OWASP and other compliance standards aren't clear. So, I think pen testing is largely going to be in the realm of compute platforms, automation, and products that can scale smart people's insights to larger audiences, because the adversaries will not be fully Offensive AI either. 

I think the adversaries, the top 160 adversaries in the world that the likes of CrowdStrike track, or any of the sort of larger cartel-type entities, will keep going. And I think maybe in the Global 2000 you might still have pen testing capabilities, and maybe within security software companies, but there's already a real lack of the people who can be on the opposite side of these adversaries. 

And I think with Offensive AI, it just becomes harder and we'll need to figure out some way to scale, and I think platforms are the way. 

Nelson

Did you just invent the Uber of pen testing? 

Jasson

Doesn't that already exist? Isn't it like HackerOne and Bugcrowd? 

HB

There's a bunch of them that are essentially doing like…

Jasson

I mean, I guess one thing I would poke at on that, HB: will an underwriter actually give you an insurance policy if you don't have a real pen test? 

HB

But will an underwriter care if the pen test came from a third-rate independent in Idaho? 

Jasson

Fair enough, nothing against Idaho. 

Reece

Yeah. Please, Idaho listeners, don't quit on us. We love you. We love your potatoes. I think that's a beautiful note to end it on today. That's all for today. We'll see you in the next episode. 

Please like and subscribe. Goodbye. Good riddance.
