Augments Authentication Flows

This video is number three of the Zero Trust Authentication Master Class series.


I am still Jasson Casey, and you are still watching, apparently. So now we're going to drill in and talk a little bit more and add a little bit more sophistication to the authentication sequence with Alice and her bank. 

So let's start and actually show Alice as kind of an independent actor here relative to her laptop. And then let's also show the bank not as one giant machine that knows everything, but as a collection of devices that provide the bank's services. 

So Alice and her laptop are what we would call an end user. And the bank essentially is served by their services and a database. 

So Alice will start with wanting to access her bank account. So she types in the bank account's URL or domain name. We then get kind of our first access attempt, right? Where it just loads the bank's primary site, right? So that would be a 200 OK. 

And in the body of this is the app, right? Which it then presents or displays back to Alice. So Alice then wants to complete the login. So she will then essentially input or enter her username and password as we talked about before. So the username and password come across in an authenticate request, and they're in the body as we described before. 

And then we go over here and we want to do a verification, but before we do the verification, what are we verifying it against? So we need to do a lookup. So the server reaches out to the database and it basically says, "Hey, give me all the information on this user named Alice." Obviously, I'm trivializing things here. 

It does its internal computation, making sure that there is an Alice, gives that information back. So the result on Alice, right? There's a lot of stuff it's going to have on Alice. And then there's a verification step, right? Essentially we want to know, is the username correct and is the password correct? Let's assume the answer is yes, right? 

So then we will respond and we will respond with an authenticated session that then gets displayed. And at this point right here, Alice now has an authenticated session and is actually...she's in her service. 
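The lookup, verify, and respond steps above can be sketched roughly as follows. This is a minimal illustration, not the bank's actual implementation: `USER_DB`, `SESSIONS`, `lookup_user`, and `handle_login` are hypothetical names, and the password is held in the clear only to mirror the simplified flow described so far.

```python
import secrets

# Hypothetical stand-in for the bank's database: username -> stored record.
# The password is in the clear here only to mirror the simplified flow.
USER_DB = {"alice": {"password": "correct horse battery staple"}}

# Hypothetical in-memory table of authenticated sessions kept by the server.
SESSIONS = {}


def lookup_user(username):
    """Database step: return what is stored about the user, or None."""
    return USER_DB.get(username)


def handle_login(username, password):
    """Server step: look the user up, verify the password, and respond."""
    record = lookup_user(username)
    if record is None:
        return 401, None
    # Constant-time comparison avoids leaking how much of the password matched.
    if not secrets.compare_digest(record["password"].encode(), password.encode()):
        return 401, None
    session_id = secrets.token_urlsafe(16)
    SESSIONS[session_id] = username
    return 200, session_id
```

Note that even in this toy version the password shows up in the request, in the server's memory, and in the database record, which is the point the rest of the discussion builds on.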

So I said I was going to show you something a little bit more different this time. And what I was doing is I was showing you the different actors here and the different actors in the bank, the servers, and the database, because I wanted to kind of point out a couple things. Remember earlier, we kind of established this framework of surface area, right? Surface area is really just a... think of it as the things that we have to protect, right? 

So password is a piece of data. So the surface area that I want to worry about is this piece of data essentially at rest and the surface area in motion, right? So by expanding this out, we clearly kind of understand that the ecosystem is a little bit more complicated than we previously talked about. 

So now let's make this a little bit more realistic. No one logs into their bank without having to issue a second factor. Most oftentimes, that second factor is a rotating code, right? A time-based rotating code. We call those TOTP. So let's talk about how that actually plays into this scenario. So rather than saying yes and let Alice in, what in fact this would be, is it would actually be a display, not unlike the username and password display, but basically saying, "Enter your one-time code," right? 

Alice would then enter her one-time code. Now, oftentimes, there's an algorithm that, once it's synchronized, really just follows time. And we'll see how that verification works in a minute. So she could have that application on her laptop, or she could have it on a phone in her possession. 

It doesn't really matter. The only thing that matters is that that device was synchronized at some moment before all of this with the database, right? So Alice puts in that one-time code. That one-time code comes across, again, not unlike what we've seen before, right? It's still part of the auth, TOTP equals blah, blah, blah, blah, whatever this code is, right? 

And maybe I looked it up before, maybe I didn't. Let's say we're inefficient and I do a lookup to try and understand, again, some information about Alice. That comes back and I perform a verification step. And the verification step is really just trying to understand, is the code that Alice provided me, in fact, valid against the seed that she had bound this with? 

And let's assume that that is yes. We just send back 200 OK with the authorized session app, right? Of course, that gets displayed. And Alice is now in. 
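To make "valid against the seed" concrete, here is a small sketch of time-based code verification along the lines of RFC 6238 (HMAC-SHA1, 30-second steps). The function names and the one-step drift window are assumptions made for illustration; the talk does not say which parameters the bank uses.

```python
import hashlib
import hmac
import struct
import time


def totp(seed: bytes, at_time: float, step: int = 30, digits: int = 6) -> str:
    """Derive the code for the time step containing at_time."""
    counter = max(int(at_time) // step, 0)
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_totp(seed: bytes, submitted: str, at_time=None,
                step: int = 30, window: int = 1) -> bool:
    """Accept the code for the current step or a neighbor (clock drift)."""
    now = time.time() if at_time is None else at_time
    for drift in range(-window, window + 1):
        expected = totp(seed, now + drift * step, step)
        if hmac.compare_digest(expected, submitted):
            return True
    return False
```

Both sides only need the shared seed and a roughly synchronized clock, which is why it doesn't matter whether Alice's authenticator runs on her laptop or her phone.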

So this is a slightly more realistic authentication flow. It's not completely realistic because there are more things in play. But it is a pretty accurate first-order representation. So now let's try and analyze a couple of things, right? So a username and password lives in a lot of places in this picture, right? It clearly lives in the database. 

It clearly exists in the laptop for a moment in time. It clearly exists in the servers for a moment of time, right? It clearly is in transit both between the laptop and the servers and the servers and the database. So, you know, we talk about surface area. 

Surface area, it's kind of a fancy way of saying what's the size of the things that we actually have to secure? Because as we know, a username and password in this sense, anyone who knows the username and password and has the bound authenticator can now represent Alice. It also turns out that Alice, like most people, is probably reusing, if not this password, a variation on this password at a bunch of other services that Alice has. 

So it's not just a vulnerability on this service having access to this username and credential, but also a vulnerability on other services. TOTP provides a little bit more protection over just the password, but we'll get into some scenarios in a little bit where we show that little bit of protection actually isn't that much. And Alice is still vulnerable in a lot of phishing scenarios. 

But let's try and quantify what's the risk here, right? So let's imagine that we could represent surface area really with this two-dimensional graph. So on the x-axis, let's worry about data really at rest, right? And I'm just going to keep things simple. Every time there's a unique instance of a password being at rest over here, I'm going to have a little tick mark because it's something I have to think about. 

And I either have to mitigate it in some way, risk accept it, i.e., basically just realize that it's a weakness that I'm going to have to live with, or risk transfer it, which is a fancy way of saying buy insurance for a rainy day. So we know the password lives in Alice's head, that's kind of out of scope, so we're not going to worry about that. 

But when Alice types the password into the computer, it clearly lives in memory in the computer, right? We all know from just kind of basic operating system concepts that if Alice's computer swaps memory, it could exist in the file system, right? And memory gets swapped when memory usage gets quite large. If that application or the browser were to crash, the memory dump could exist in the file system as well, right? 

On the servers, it clearly exists in memory. And then the two scenarios that I talked about before could also exist, right? In the database, it clearly exists in memory. The two scenarios I talked about before apply as well, and specifically because of how a database works, it most definitely exists in some persistent store, so that when the database is started, the data is actually still present. 

We know it exists in transit over the TLS connection that's actually shepherding the laptop to server, right? We know that it exists between the server and the database. So put in another way, we kind of have this map or score, if you will, right? So it's one, two, three, four, five, six, seven, eight, nine, 10. 

So we'll call it a 20, right? It's kind of a meaningless unit but we'll compare it to something here in a minute, right? So this is the surface area that I have to worry about when protecting the username and password: an adversary could basically take advantage of any of these points in this grid. So a couple of you may be thinking, "Yeah, but Jasson, no one stores passwords in the clear, passwords are stored and they're salted." 

And you're right, right? So let's actually dig into that a little bit. So first of all, what that means is the concern is what if someone were to get access to the file system that backs up this database, then they could kind of dump everyone's password, right? So we clearly want that password to not exist in the clear, and the typical approach that people take is they have what's called a salt, right? 

So you can think of a salt as just a random string. So the salt plus the password is run through a cryptographic hash function. And so this produces a unique value, right? And this unique value or string is very easy to predict if you know the salt and the password. It's very hard to predict if you don't know the salt and password, and because it's a hash, it's something called a one-way function. 

And what that basically means is if someone were to have access just to the unique value, it's not possible for them to recompute the password. Now it turns out that there's a lot of attacks against this. That while it's technically true, this is not an invertible function, I can't easily go from this direction to this direction. 
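As a rough sketch of the salt-plus-hash idea, assuming SHA-256 purely for brevity (real systems typically prefer a deliberately slow function such as scrypt or PBKDF2), enrollment and verification might look like this. The function names are hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Salt plus password run through a cryptographic hash function."""
    return hashlib.sha256(salt + password.encode()).digest()


def enroll(password: str):
    """Pick a random salt and return what the database would store."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the hash from the submitted password and compare."""
    return hmac.compare_digest(hash_password(password, salt), stored)
```

Notice that `verify` still needs the cleartext password as input: the server has to hold the password in memory to recompute the hash, even though only the salt and the hash are written to disk.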

It turns out most users actually use dictionary words, dictionary phrases. And so if I know the salt, I can actually just start salting words in a dictionary, producing hashes. And then if I have access... This is an offline attack. So if I have access to this file, then I can just compare all these hashes to what's in the file. 

And if I get a hit, then I know the dictionary word that matched over to that particular hash that I got a hit on is in fact the actual password. Even though this is a one-way function, that's how people actually attack this sort of system. But let's say that does give us some measure of protection. 
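The offline attack just described can be sketched in a few lines. This assumes the attacker has a leaked file of per-user salts and hashes and a word list; the SHA-256 construction and all names here are illustrative assumptions, not a description of any particular system.

```python
import hashlib


def salted_hash(password: str, salt: bytes) -> bytes:
    """Same construction the defender is assumed to use: hash(salt + password)."""
    return hashlib.sha256(salt + password.encode()).digest()


def dictionary_attack(leaked: dict, wordlist: list) -> dict:
    """Salt and hash each dictionary word, then compare against the leaked file.

    leaked maps username -> (salt, stored_hash), i.e. the contents of the
    file backing the database. Returns username -> recovered password.
    """
    cracked = {}
    for user, (salt, target) in leaked.items():
        for word in wordlist:
            if salted_hash(word, salt) == target:
                cracked[user] = word
                break
    return cracked
```

Nothing is inverted here: the attacker only runs the hash forward over guesses, which is why a one-way function alone does not protect a password that appears in a dictionary.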

So how does it actually affect the surface area? Well, remember, only one of these data points had to do with the file system. The rest of it had to do with memory, right? I have to have the password to then salt and hash the password to then compare against the unique string. So I'm only reducing my surface area by two units, right? 

And again, we kind of said these units were meaningless, but just by magnitude, yes, 18 as a number is smaller than 20, but I haven't effectively changed the surface area that I have to mitigate, risk accept, and/or risk transfer. So again, the whole point of that is really just to show that when I'm using a username and password, there are techniques to protect against insider threat and offline attack, i.e., if someone is able to get access to the file that backs up this database. 

And most mitigations are to do this thing where you basically just hash the inputs in a way that's predictable to where you can use it successfully for future authentications. But if an adversary were to get access to it, it's not easy for them. Now it turns out there are techniques, but again from a surface area perspective, we're not materially impacting the vulnerability of the password. 
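One way to reproduce the 20 and the 18 is to treat the score as points on the two-dimensional grid: every at-rest location paired with every in-motion path. That reading is an assumption made for this sketch (the exact counting is on the whiteboard in the video), and the location labels are paraphrased from the walkthrough above.

```python
from itertools import product

# Places the cleartext password can sit at rest, as enumerated in the talk.
AT_REST = [
    "laptop memory", "laptop swap", "laptop crash dump",
    "server memory", "server swap", "server crash dump",
    "database memory", "database swap", "database crash dump",
    "database persistent store",
]

# Paths the cleartext password travels over.
IN_MOTION = ["laptop <-> server (TLS)", "server <-> database"]


def surface_area(at_rest, in_motion):
    """Assumed scoring: count the grid points of rest x motion."""
    return len(list(product(at_rest, in_motion)))


baseline = surface_area(AT_REST, IN_MOTION)

# Salting and hashing only removes the cleartext from the persistent store.
after_hashing = surface_area(
    [place for place in AT_REST if place != "database persistent store"],
    IN_MOTION,
)
```

Under this counting, removing a single at-rest tick removes two grid points, which matches the "two units" reduction and shows how little of the grid the hashing step actually touches.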

And in the next video, we'll get into why the second factor isn't really giving us that much protection either.