Security Unlocked


Inside Insider Risk

Ep. 23
Throughout the course of this podcast series, we’ve had an abundance of great conversations with our colleagues at Microsoft about how they’re working to better protect companies and individuals from cyber-attacks, but today we take a look at a different source of malfeasance: the insider threat. Now that most people are working remotely and have access to their company’s data in the privacy of their own home, it’s easier than ever to access, download, and share private information.

On today’s episode, hosts Nic Fillingham and Natalia Godyla sit down with Microsoft Applied Researcher Rob McCann to talk about his work in identifying potential insider risk factors and the tools that Microsoft’s Internal Security Team is developing to stop them at the source.

In This Episode, You Will Learn:
• The differences between internal and external threats in cybersecurity
• Ways that A.I. can factor into anomaly detection in insider risk management
• Why the rise in insider attacks is helping make it easier to address the issue

Some Questions We Ask:
• How do you identify insider risk?
• How do you create a tool for customers that requires an extreme amount of case-by-case customization?
• How are other organizations prioritizing internal versus external risks?

Resources:
• Rob McCann’s LinkedIn
• Rob McCann on Uncovering Hidden Risk
• Risk Blog Post
• Nic Fillingham’s LinkedIn
• Natalia Godyla’s LinkedIn

Transcript

Nic Fillingham:Hello and welcome to Security Unlocked, a new podcast from Microsoft where we unlock insights from the latest in news and research from across Microsoft security engineering and operations teams. I'm Nic Fillingham. Natalia Godyla:And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft Security, deep dive into the newest threat intel, research and data science. Nic Fillingham:And profile some of the fascinating people working on artificial intelligence in Microsoft Security. 
Natalia Godyla:And now, let's unlock the pod. Natalia Godyla:Hello Nic, welcome to today's episode, how's it going with you? Nic Fillingham:Hello Natalia, I'm very well, thank you, I hope you're well, and uh, welcome to listeners, to episode 23, of the Security Unlocked podcast. On the pod today, we have Rob McCann, applied researcher here at Microsoft, working on insider risk management, which is us taking the Security Unlocked podcast into- to new territory. We're in the compliance space, now. Natalia Godyla:We are, and so we're definitely interested in feedback. Drop us a note at to let us know whether these topics interested you, whether there is another avenue you'd like us to go down, in compliance. Also always accepting memes. Nic Fillingham:Cat memes, sort of more specifically. Natalia Godyla:(laughing) Nic Fillingham:All memes? Or just cat memes? Natalia Godyla:Cat memes, llama memes, al- Nic Fillingham:Alpaca- Natalia Godyla:... paca memes. Nic Fillingham:... memes. Yeah. Alpaca. Yeah, this is a really interesting uh, topic, so insider risk, and insider risk management is the ability for security teams, for IT teams, for HR to use AI and machine learning, and other sort of automation based tools, to identify when an employee, or when someone inside your organization might be accidentally doing something that is going to create risk for the company, or potentially intentionally uh, whether they have, you know, nefarious or sort of malicious intent. Nic Fillingham:So, it really- really great conversation we had with- with Rob about what is insider risk, what are the different types of insider risk, how is uh, AI and ML being used to go tackle it? 
Natalia Godyla:Yeah, there's an incredible amount of work happening to understand the context, because so many of these circumstances require data from different departments, uh, uniquely different departments, like HR, to try to understand, well is- is somebody about to leave the company, and if so, how is that related to the volume of data that they just downloaded? And with that, on to the pod. Nic Fillingham:On with the pod. Nic Fillingham:Welcome to the Security Unlocked podcast, Rob McCann, thank you so much for your time. Rob McCann:Thank you for having me. Nic Fillingham:Rob, we'd love to start with a quick intro. Who are you, what do you do? What's your day to day look like at Microsoft, what kind of products or technology do you touch? Give us a- give us an intro, please. Rob McCann:Well, I've been at Microsoft for about 15 years, I am a- I've been an applied researcher the entire time. So, what that means is, I get to bounce around various products and solve technical challenges. That's the official thing, what it actually means is, whatever my boss needs done, that's a technical hurdle, uh, they just throw it my way, and I have to try to work on that. So, applied scientist. Nic Fillingham:Applied scientist, versus what's a- what's a different type of scientist, so what- what's the parallel to applied science, in this sense? Rob McCann:So, applied researcher is sort of a dream job. So, when I initially started, they're sort of the academic style researcher, that it's very much uh, your production is to produce papers and new ideas that sort of in a vacuum look good, and get those out to the scientific community. I love doing that kind of stuff. I don't so much like just writing papers. And so, an applied researcher, what we gotta do, is we gotta sort of be this conduit.Rob McCann:We get to solve things that are closer to the product, and sort of deliver those into the product. So we get very real, tangible impact, but then we're also very much a bridge. 
So, part of our responsibility is to keep, you know, fingers on what's going on in the abstract research world and try to foster, basically, a large innovation pipe. So, I freaking love this job. Uh, it's exactly what I like to do. I like to solve hard technical problems, and then I like to ship stuff. I'm a very um ... I need tangible stuff. So I love it. Nic Fillingham:And what are you working on at the moment, what's the scope of your role, what's your bailiwick? (laughing) Rob McCann:My bailiwick is uh, right now I'm very much focused on IRM, which is insider risk management, and so what we've been doing over the last year or so, insider risk management GA'd in February of 2020, I want to say. So, Ignite Today is a very festive sort of one year anniversary type thing. That with compliance solutions. So, over this last year, what we've done a lot of is sort of uh, build a team of researchers to try to tackle these challenges that are in insider risk, uh, and sort of bring the science to this brand new product. So, a lot of what I'm doing on a daily basis is on one hand, the one hand is, solve some technical things and get it out there, and the other hand is build a team to strengthen the muscle, the research muscle. Natalia Godyla:So, let's talk a little bit more about insider risk management. Can you describe how insider risk differs from external risk, and more specifically, some of the risks associated with internal users? Rob McCann:It's uh, there's some overlap. But it's a lot different than external attack. So, first of all, it's very hard, not saying that external attack is not hard, I- I work with a lot of those people as well. But insiders are already in, right? And they already have permissions to do stuff, and they're already doing things in there. So there's not like, you have a- a ... some perimeter that you can just camp on, and try to get people when they're coming in the front door. Rob McCann:So that makes it hard. 
Uh, another thing that makes it hard is the variety of risks. So, different customers have different definitions of risk. So, risk might be um, we might want to protect our data, so we don't want data exfiltrated out of the company. We might want trade secrets, so we don't want people to even see stuff that they shouldn't see. We don't want workplace harassment, uh, we don't want sabotage. We don't want people to come in, and implant stuff into our code that's gonna cause problems later. It's a very broad space of potential risks, and so that makes it challenging as well. Rob McCann:And then I would say the third thing that makes it very challenging is, what I said, different customers want- have different definitions of risk. So it's not like ... like, I like the contrast to malware detection. So, we have these external security people that are trying to do all this sophisticated machine learning, to have a classifier that can recognize incoming bad code. Right? And sort of when they get that, like, the whole industry is like, "Yes, we agree, that's bad code, put it in Virus Total, or wherever the world wants to communicate about bad code." And it's sort of all mutually agreed upon, that this thing is bad. Rob McCann:Insider risk is very different. It's um, you know, this customer wants to monitor these things, and they define risk a certain way. Uh, this customer cares about these things, and he want to define risk a certain way. There is a heightened level of customer preferences that have to be brought into the- the intelligence, to- to detect these risks. Natalia Godyla:And what does detecting one of those risks look like? So, fraud, or insider trading, can you walk through what a workflow would look like, to detect and remediate an insider attack? Rob McCann:Yeah, definitely. So- so, first of all, since it's such a broad landscape of potential damage, I guess you would say, first thing the product has to do is collect signals from a lot of different places. 
We have to collect signals about people logging in. You have to collect signals about people uploading and downloading files from a- from OneDrive, you have to ... you have to see what people are sharing on Teams, what people are ec- you know, emailing externally. If you want the harassment angle, you gotta- you know, you gotta have a harassment detector on communications. Rob McCann:So the first thing is just this huge like, data aggregation problem of this very broad set of signals. So that's one, which in my mind is a- is a very strong advantage of Microsoft to do this, because we have a lot of sources of signals, across all of our products. So, aggregating the data, and then you need to have some detectors that can swim through that, uh, and try to figure out, you know, this thing right here doesn't quite look right. I don't know necessarily that it's bad, but the customer says they care about these kind of things, so I need to surface that to the customer. Rob McCann:So, uh, techniques that we use there a lot are anomaly detection. Uh, so a lot of unsupervised type of learning, just to look for strangeness. And then once we surface that to the- the customer, they have to triage it, right? And they have to look at that and make a decision, did I really- do I really want to take action on this thing? Right? And so, along with just the verdict, like, it's probability 98% that this thing is strange, you also have to have all this explanation and context. So you have to say, why do I think this thing is strange? Rob McCann:And then you have to pull in all these things, so like, it's strange because they- they moved a bunch of sensitive data around, that- in ways they usually didn't, but then you also need to bring in other context about the user. This is very user-centric. So you have to say things like, "And by the way, this person is getting ready to leave the company." That's a huge piece of context to help them be able to make a decision on this. 
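As a rough illustration of the "verdict plus explanation and context" idea Rob describes here, the following Python sketch scores one day of activity against a user's own history and attaches human-readable reasons. It is only a sketch under stated assumptions, not the product's method: the z-score stands in for whatever anomaly detector is really used, and the function name `score_activity`, the `leaving` flag, and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev

def score_activity(history, today, leaving=False):
    """Return (z_score, reasons) for today's count versus the user's own history.

    A plain z-score is an assumed stand-in for a real anomaly detector.
    `history` needs at least two daily counts.
    """
    mu = mean(history)
    sigma = stdev(history) or 1.0  # a flat history has zero spread; avoid dividing by zero
    z = (today - mu) / sigma
    reasons = [f"{today} sensitive files moved vs. a usual {mu:.1f} per day (z = {z:.1f})"]
    if leaving:
        # Hypothetical HR signal, included as context for the person doing triage.
        reasons.append("user has a pending departure")
    return z, reasons

z, reasons = score_activity([3, 5, 4, 6, 2], today=40, leaving=True)
for line in reasons:
    print(line)
```

The point of returning `reasons` alongside the score is the one made in the conversation: the number alone is not enough for someone to decide whether to act.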
And then once the customer decides they want to make a decision, then the product, you know, facilitates uh, different workflows that you might do from that. So, escalating a case to legal, or to HR, there are several remediation actions that the customer can choose from. Nic Fillingham:On this podcast, we've spoken with a bunch of data scientists and sort of machine learning folks who have talked about the challenge of building external detections using ML, and from what you've just explained, it sounds like you probably have some, some pretty unique challenges here to give the flexibility to customers, to be able to define what risk means to them. Does that mean that you have to have a customized model built from scratch for every customer? Or can you have a sort of a global model to help with that anomaly detection that then just sort of gets customized more slightly on top based on, on preferences? I, I guess my question is, how do you utilize a tool like machine learning in a solution like this that does require so much sort of customization and, and modification by the, by the customer? Rob McCann:That's, that's a fantastic question. So, what you tried to do, you scored on that one. Nic Fillingham:(laughs). Rob McCann:You try to do both, right? So, customers don't wanna start from scratch with any solution and build everything from the ground up, but they want customizability. So, what you try to do, I always think of it as smart defaults, right? So, you try to have some basic models that sort of do things that maybe the industry agrees is suspicious type, right? And you expose a few high-level knobs. Like, do you care about printing? Or do you care about copying to USB? Or do you want to focus this on people that are leaving the company? Like some very high level knobs. 
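The "smart defaults plus a few high-level knobs" pattern can be sketched in Python roughly as follows. This is an illustration only: the `PolicyKnobs` fields, the event dictionary shape, and `should_surface` are hypothetical names chosen to mirror the three knobs mentioned above, not settings from the actual product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PolicyKnobs:
    """High-level customer choices; the defaults play the role of smart defaults."""
    watch_printing: bool = True
    watch_usb_copy: bool = True
    only_departing_users: bool = False

def should_surface(event, knobs=PolicyKnobs()):
    """Decide whether a detector should consider this event at all."""
    if knobs.only_departing_users and not event.get("user_departing", False):
        return False
    if event["kind"] == "print":
        return knobs.watch_printing
    if event["kind"] == "usb_copy":
        return knobs.watch_usb_copy
    return True  # other event kinds are always in scope in this sketch

# A printing house switches off the printing knob and keeps every other default.
printing_house = PolicyKnobs(watch_printing=False)
print(should_surface({"kind": "print"}))                  # out-of-the-box behavior
print(should_surface({"kind": "print"}, printing_house))  # customized behavior
```

The detector's internals (features, distance measures) stay hidden; the customer only flips coarse switches that the detector has to respect.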
Rob McCann:But you don't expose the knobs down to the level of the anomaly detection algorithm and how it's defining distance and all the features it's using to define normal behavior, but you have to design your algorithm to be able to respect those higher level choices that the u- that the user made. And then as far as the smart default, what you try to do as you pr- you try to present a product where out of the box, like it's gonna detect some things that most people agree are sort of risky, and you probably wanna take a look at, but you just give the, you offer the ability to customize as, as people wanna tweak it and say, nah, that's too much. I don't like that. Or printing, it's no big deal for us. We do it. We're printing house, right? Nic Fillingham:Does a solution like this, is it geared towards much larger organizations because they would therefore have more signal to allow you to build a high fidelity model and see where there are anomalies. So, for example, could the science of the insider risk management work for a small, you know, multi hundred, couple hundred person organization? Or is it sort of geared to much, much larger entities, sort of more of the size of a, of a Microsoft where there are tens of thousands employees and therefore there's tens of thousands of types of signal and sort of volume of signal.Rob McCann:Well, you've talked to enough scientists. I look at your guys's guest list. I mean, you know, the answer, right, more data is better, right? But it's not limiting. So, of course, if you have tons and tons of employees in a rich sorta like dichotomy of roles in the company, and you have all this structure around a large company, if you have all that, we can leverage it to do very powerful things. But if you just have a few hundred employees, you can still go in there and you can still say, okay, your typical employees, they have this kind of activity. 
Weird, the one guy out of a 100 that's about ready to leave suddenly did something strange, uh, or you can still do that, right? So, you, you got to make it work for all, all spectrums. But more data is always better, man. Um, more signals, more data, bring it on. Let's go. Give me some computers. Let's get this done. Natalia Godyla:Spoken like a true applied scientist. So, I know that you mentioned that there's a customized component in insider risk management, but when you look across all of the different customers, are you seeing any commonalities? Are there clear indicators of insider threats that most people would recognize across organizations like seeing somebody exfiltrate X volume of data, or a certain combination of indicators happening at once? I'm assuming those are probably feeding your smart defaults? Rob McCann:Correct. So, there's actually a lot of effort to go. So, I s- I said that we're sort of a bridge between external academic type research and product research. So, that's actually a large focus and it happened in external security too. As you get industry to sort of agree like on these threat matrices, and what's the sort of agreed upon stages of attack or risk in this case. So, yeah, there are things that everybody sort of agrees like, uh, this is fishy. Like, let's make this, let's make this priority. So, that, like you said, it feeds into the smart defaults. At the same time we're trying to, you know, we don't think we know everything. So, we're working with external experts. I mean, you saw past podcasts, we talked to Carnegie Mellon, uh, we talked to MITRE, we talked to these sort of industry experts to try to make this community framework or, uh, language and the smart defaults. Uh, and then we try to take what we can do on top of that. Nic Fillingham:So, Rob, a couple of times now, you've, you've talked about this scenario where an employee's potentially gearing up to leave the, the company. 
And in this hypothetical situation, this is an employee that may be looking to, uh, exfiltrate some, some data on their way out or something, something that falls inside the scope of, of identifying and managing, uh, insider risk. I wonder, how do you determine when a user is potentially getting ready to leave the company? Is that, do you need sort of more manual signals from like an HR system because an employee might've been placed on a, on a, on a review, in a review program or review period? Or, uh, are you actually building technology into the solution to try and see behaviors, and then those behaviors in a particular sort of, uh, collection in a particular shape lead you to believe that it could be someone getting ready to leave the company? Or is it both or something else? Rob McCann:So, quick question, Nic, what are you doing after this podcast? Nic Fillingham:Yeah. Rob McCann:Do you want a job? Because it feels like you're reading some of my notes here (laughter). Uh, we, uh- Nic Fillingham:If you can just wait while I download these 50 gigs of files first- Rob McCann:(laughs). Nic Fillingham:... from this SharePoint that, that I don't normally go to, and then I sort of print everything and then I can talk to you about a job. No, I'm being silly. Rob McCann:No, I mean, I mean, you hit the nail on the head there. It's, uh, there are manual signals. This is the same case with say asset labels, like file labels, uh, highly sensitive stuff and not sensitive stuff. So, in both cases, like we want the clear signals. When the customers use our plugins or a compliance solution to tell us that, you know, here's an HR event that's about ready to happen. Like the person's leaving or this file's important. We are definitely gonna take that and we're gonna use it. But that's sort of like the scientists wanna go further. Like what about the stuff they're not labeling? Does that mean they just haven't got around to it? Or does that mean that it's really not important? 
Or like you just said, like, this guy is starting to email recruiters a lot, this is like, is he getting ready to leave? So, there's definitely behavioral type detection and inference that, uh, we're working on behind the scenes to try to augment what the users are already telling us explicitly. Natalia Godyla:So, what's the reality of insider risk management programs? How mature is this practice? Are folks paying attention to insider risk? Is there a gap here or is there still education that needs to happen? Rob McCann:Yeah. So, there has been people working on this a lot longer than I have, but I do have to say that things are escalating quickly. I mean, especially with modern workforce, right? The perimeter is destroyed and everybody's at home and it's easier to do damage, right? And risk is everywhere, but some, you know, cold, hard numbers, like the number of incidents are going up, b- like, over the last two years. But I think Gartner just came out and said in, in the last two years, the number of incidents have gone up by about half. So, the number of incidents are happening more probably, maybe 'cause of the way we work now. The amount of money that people, uh, companies are spending to address this problem is going up. I think Gartner's number was, when, uh, the average went up several million over the last couple of years, um, they just sort of released an insider risk survey and more people are concerned about it. So, all the metrics are pointing up and it just makes sense with the way the world is right now. Nic Fillingham:Where did sort of insider risk start? What's sort of the, the beginning of this solution... what did the sort of incubation technology look like? Where did it start? Uh, are you able to talk to that? Rob McCann:I mean, sure. A little bit. So, this was before me, so a lot of this came out of, uh, DSRE, which is our, our sort of internal security team for, at Microsoft babysitting our own network. 
So, they had to develop tools to address these very real issues, and the guys that I did a podcast with before Tyler Mirror and, and Robin, they, um, they sorta, you know, brought this out and started making it a proper product to take all these technologies that we were using in-house and try to help turn them into a product to help other people. So, it sort of organically grew out of just necessity, uh, in-house. But as far as like industry, like, uh, Carnegie Mellon, uh, CERT National Insider Threat Center and I think they've been, uh, studying this problem for over a decade. Nic Fillingham:And as a solution, as a technical solution, did it start with like, sort of basic heuristics and just looking for like hard coded flags and logs, or did it actually start out as a sort of a data science problem and, you know, the sort of basic models that have gotten more sophisticated over time? Rob McCann:Yeah. So, it did start, start out with some data science at the beginning as well. Uh, so of course you always have the heuristics. We do that in external attack too. Heuristics are very precise, they, uh, allow us to write down things that are very specific. And they're a very, very important part of the arsenal. A lot of people diss on heuristics, but it's a very im- very important part of that, that thing. But it also has, it started out with some data science in it, you know, the anomaly detection is a big one. Um, and so there were already some models that they brought right from, uh in-house to detect when stuff was suspicious. Natalia Godyla:So, what's the future of IRM look like? What are you working on next? Rob McCann:Well, I mean, we could, you could go several ways. You know, there could be broadness of different types of risk. The thing that I enjoy the most is sort of the more sophisticated ways of doing newer algorithms, maybe for existing charters, or maybe broad charters. 
Rob McCann:Uh, one thing that, I- I'm very interested in lately is the sort of interplay between supervised learning and, and anomaly detection. So you can think of as, uh, semi-supervised. That's a thing that we've actually been playing with at Microsoft for, for a long time. Rob McCann:I've had this awesome, awesome journey here. I've, I've always been on teams that were sorta, like ... It's kinda like I've been an ML evangelist. Like, I always get to the teams right when they're starting to do the really cool tech, and then I get to help usher that in. So, I got to do that in the past with spam filtering, when that was important. Remember when Bill Gates promised that we were gonna solve spam in a, in two years or whatever. Those were some of the first ML models we ever did i- in Microsoft products, and even back then we're playing with this intersection of, you know, things look strange, but I know that certain spam looks like this, so how do you combine that sort of strangeness into sort of a semi-supervised stuff ... Rob McCann:That's the stuff that really floats my boat is ho- how do you, how do you take this existing technology that some people think of as very different ... There's unsupervised, there's supervised, uh, there's anomaly detection. How do you take that kinda stuff and get it to actually talk to each other and do something cooler than you could do on one set or the other? That's where I see the future from a technical standpoint behind the scene for smarter detectors, is how we do that kind of stuff. Rob McCann:Product roadmap, it's related to what we're, we talked about earlier about the industry agreeing on threat matrices and customers telling us what's the most important to them. That, that's stuff's gonna guide, guide the product roadmap. 
Um, but the technical piece, there's so much interesting work to do.Natalia Godyla:When you're trying to make a hybrid of those different models, the unsupervised and supervised machine learning models, what are you trying to achieve? What are the benefits of each that you're trying to capture by combining them?Rob McCann:Oh, it's the story of semi-supervised, right? I have tons and tons of data that can tell me things about the distribution of activity, I just o-, d-, only have labels on a little bit of it. So, how do I leverage the distributions of activity that's unlabeled with the things that I can learn from my few labeled examples? And how do I get those two things to make a better decision than, than either way on its own? Rob McCann:It's gonna be better than training on just a few things in a supervised fashion, 'cause you don't have a lot of data with labels. So you don't wanna throw away all that distributional information, but if you go over to the distributional information, then you might just detect weirdness. But you never actually get to the target which is risky weirdness, which is two different things.Nic Fillingham:Is the end goal, though, supervised learning, so if you, if you have unsupervised learning with a small set of labels, can you use that small set of labels to create a larger set of labels, and then ultimately get to ... I'm horribly paraphrasing all this here, but, is that sort of the path that you're on?Rob McCann:So, we're gonna try to make the best out of the labels that we can get, right? But, I don't think you ever throw away the unsupervised side. Because, uh, I mean, this c-, this has come up in the external security stuff, as well, is if you're always only learning how to catch the things that you've already labeled, then you're never gonna really s-, be super good at detecting brand new things that you don't have anything like it. Right?Rob McCann:So, you have to have the ... It's sorta like the Explore-exploit Paradigm. 
You can think of it, at a very high level you can think of supervised as you're exploiting what you already know, and you're finding stuff similar to it. But the explore side is like, "This thing's weird. I don't know what it is, but I wanna show it to a person and see if they can tell me what it is. I wanna see if they like that kinda stuff." Rob McCann:Uh, that's sorta synergy. That's, that's a powerful thing. Nic Fillingham:What's the most sophisticated thing that the IRM solution can do? Like, have you been sort of surprised by the types of, sort of, anomalies that can be both detected and then sort of triaged and then flagged, or even have automated actions taken? Is there, is there a particular example that you think is a paramount sort of example of what, what this tech can do? Rob McCann:Well, it's constantly increasing in complexity. First of all, anybody who's done applied science knows how hard it is to get data together. So when I work with the IRM team, first of all, I'm blown away at the level of the breadth of signals they've managed to put together into a place that we can reason over. That is such a strong thing. So the, their data collection is super strong. And they're always doing more. I mean, these guys are great. If I come up with an idea, and I say, "Hey, if we only had these signals," they'll go make it happen. It is super, super cool. Rob McCann:As far as sophistication, I mean, you know, we start, we start with heuristics, and then you start doing, like, very obvious anomaly detection, like, "Hey, these, this guy just blew us out of the water by copying all these files." I mean, that's sort of the next level. And then the next level is, uh, "Okay, this guy's not so obvious. He tries to fly under the radar and sort of stay low and slow. But can we detect in aggregate that over time he's doing a lot of damage?" So those more subtle long-term risks. 
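The "low and slow" case can be sketched in Python as a trailing-window total checked alongside a per-day limit, so that activity which never spikes on any one day still gets flagged in aggregate. This is a minimal illustration, not the product's detector: the function name `low_and_slow_alerts`, the window length, and both limits are assumed values.

```python
from collections import deque

def low_and_slow_alerts(daily_counts, window=30, daily_limit=100, window_limit=600):
    """Yield (day_index, reason) when a single day or the trailing window total exceeds its limit."""
    recent = deque(maxlen=window)  # only the last `window` days are kept
    for day, count in enumerate(daily_counts):
        recent.append(count)
        if count > daily_limit:
            yield day, "single-day spike"
        elif sum(recent) > window_limit:
            yield day, "cumulative volume over window"

# Thirty files a day never trips the daily limit, but the running total eventually does.
alerts = list(low_and_slow_alerts([30] * 30))
print(alerts[0])
```

A fixed threshold is the simplest possible choice here; a real detector would more plausibly compare the window total against the user's or a peer group's baseline.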
That's actually something we're releasing right now.Rob McCann:Another very powerful paradigm that we're releasing right now is, not just individual actions, but very precise sequences of actions. So you could think of it in a external as kill chain. Like, "They did this, and then they did this, and then they did this." That can be much more powerful than, "They did all three of those separately and then added together," if you know what I mean.Rob McCann:So that sort of interesting sequences thing, that's a very powerful thing. And once you sorta got these frameworks up, like, you can get arbitrarily sophisticated under the hood. And so, it's not gonna stop.Nic Fillingham:Rob, you talked about working on spam detection and spam filters as previous sort of projects you were working on. I wonder if you could tell us a little bit about that work, and I wonder if there's any connective tissue between what you did back then and, and IRM.Rob McCann:Yeah, so I've worked on a lot more than spam. So, I got hired to do spam, to do the research around the spam team, but it quickly, uh, it was this newfangled ML stuff that we were doing, and, uh, it started working on lots of different problems, if you can imagine that. And so we started working on spam detection, and, and phish detection. We started working on Microsoft accounts. We would, we would look at how they behave and try to detect when it looks like suddenly they've been compromised, and help people, you know, sort of lock down their accounts and get, and get protection.Rob McCann:All those things it's been cool to watch. We sorta, we sorta had a little incubation-like science team, and we would put these cool techniques on it and it would start working well, and then they've all sort of branched out into their own very mature products over the years. A- and they're all based very heavily on, uh, the sort of techniques that, that have worked along the way.Rob McCann:It's amazing how much reuse there is. 
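The idea that an ordered sequence of actions carries more signal than the same actions counted separately can be illustrated with a small ordered-subsequence check in Python. The event names and the `contains_sequence` helper are hypothetical, and real sequence detection would also weigh timing and the user involved; this only shows the ordering constraint.

```python
def contains_sequence(events, pattern):
    """True if the actions in `pattern` occur in `events` in that order (gaps allowed)."""
    remaining = iter(events)
    # `step in remaining` consumes the iterator up to the first match,
    # so each later step can only match events that come after it.
    return all(step in remaining for step in pattern)

exfil_pattern = ["download", "zip", "upload"]
in_order = ["login", "download", "rename", "zip", "print", "upload"]
out_of_order = ["upload", "zip", "download"]

print(contains_sequence(in_order, exfil_pattern))
print(contains_sequence(out_of_order, exfil_pattern))
```

Both event lists contain all three actions, so a detector that only added up individual actions would treat them alike; the ordering check is what separates them.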
I mean, I mean, let's boil down what we do to just finding patterns in data that support a business objective. That's the same game, uh, in a lot of different domains. So, yes, of course, there's a lot of overlap. Nic Fillingham:What was your first role at Microsoft? Have you always been in, in research or applied research? Rob McCann:I have always been a spoiled brat. I mean, I, I just get to go work on hard problems. Uh, I don't know how I've done it, but they just keep letting me do it, and it's fun. Uh, yeah, I've always been an applied researcher. Nic Fillingham:And that, you said you joined about 14 years ago? Rob McCann:Yep. Yep, yep. That was even back before, uh, the sort of cluster machine learning stuff was hot. So we, I mean, we used to, we used to take, uh, lots of SQL servers and crunch data and get our features that way, and then feed it into some, like, single box, uh, learning algorithms on small samples. And, like, I've got to see this progression to, like, distributed learning over large clusters. In-house first, we used to have a system called Cosmos in-house. I actually got to write some of the first algorithms that did machine learning on that. It was super, super rewarding. And now we have all this stuff that we release to the public and Azure's this big huge ... It's very, very cool to have seen happen. Nic Fillingham:Giving the listener maybe a, uh, a reference point for, for your entry into Microsoft- Rob McCann:(laughs) Nic Fillingham:... is there anything you worked on that's either still around, or that people would have known? I think, like, just the internal Cosmos stuff is, is certainly fascinating. I'm just wondering if there's a, if there's a touchstone on the product side. Rob McCann:Spam filtering for Hotmail. That was my first gig. Nic Fillingham:Nice! I, I cut my teeth on Hotmail. Rob McCann:Yeah, yeah- Nic Fillingham:Yeah, I was a Hotmail guy. 
I was working on the Hotmail team as we transitioned to
Rob McCann: Mm-hmm (affirmative).
Nic Fillingham: And I was, uh, down in Palo Alto, I can't even remember. I was somewhere, where- wherever the Silicon Valley campus is-
Rob McCann: SVC-
Nic Fillingham: We were all in like a boar-, a boardroom waiting for the new domain to go live, and we got, like, a 15 minute heads-up. So I'm just That's, that's my email address, and I got, I got my wife her first name at
Nic Fillingham: Were you there for that, Rob? Do you have a, did you get a super secret email address?
Rob McCann: I was not there for the release, but as soon as it was out, I went and grabbed some for my kids. So I w-, I keep my Hotmail one, 'cause I've had it forever, but, uh-
Nic Fillingham: Yeah.
Rob McCann: ... I got all my kids, like, the, the ones they needed. So.
Rob McCann: It's amazing how much stuff came out of the, that, that service right there. So I talked about identity management that we do for Microsoft accounts now. I, that stuff came from trying to protect people, their Hotmail accounts. So we would build models to try to determine, like, "Oh, this guy's suddenly emailing a bunch of people that he doesn't usually," anomaly detection, if you can imagine, right? The-
Nic Fillingham: Yeah-
Rob McCann: ... same thing works.
Rob McCann: All that stuff, and then it sorta grew in, and then Microsoft had a bigger account, and then that team's kinda like, "Hey, you guys are doing this ML to detect account compromise, can you come, like, do some of that over here," and then it grew out to what it is today. A lot of things came from the OML days, it was very fun.
Natalia Godyla: Thinking of the different policies organizations have and the growing awareness of those policies, over time, employees are going to shift their tactics.
Like you said there are some who are already doing low and slow activities that are evading detection, so, how do you think this is going to impact the way you try to tackle these challenges, or have you already noticed people try to subvert the policies that are in place?
Rob McCann: Yeah, so that's the, that's the next frontier, which is w-, you know, why I said we started just getting into, like, the low and slow stuff. It's gonna be like all other security, it's gonna be, "These guys are watching this thing, I gotta try something different."
Rob McCann: Actually that's a good motivation for the sort of the high-level approach we're taking, which is tons of signals, so there's not very many activities you could do. You could print, copy to USB, you could upload to something, you could get a third-party app that does the uploading for you. There's not very many avenues that you could do that we're not gonna be able to at least see that happening.
Rob McCann: So you couple that with some, that mountain of data with some algorithm that can try to pick out, "This is a strange thing, and this is in the context of somebody leaving." It's gonna be an interesting cat-and-mouse, that's for sure.
Natalia Godyla: Do you have any examples of places where you've already had to shift tactics because you're noticing a user try to subvert the existing policies? Or are you still in the exploration phase trying to figure out what really, what this is really going to look like next?
Rob McCann: So, right now I don't think we've had ... We haven't got to the phase yet where we're affecting people a lot. Uh, this is very early product, we're a year in. So, I don't see the reactions yet, but I, I guarantee it's gonna happen. And then we're gonna learn from that, and we're gonna say, "Okay, I have the Explore-exploit going. The Explorer just told me that something strange that I've never seen before happened."
We're gonna put some people on that that are experts that figure out what that's gonna be. We're gonna figure out how to bring that into the fold of agreed-upon bad stuff, so we're gonna expand this threat matrix, right, as we go along? And we're gonna keep exploring. And that's the same for every single security product.
Nic Fillingham: Rob, as someone that's been able to sort of come into different teams and, and different solutions and, and help them, as you say, sort of bring more academic or theoretical research into, into product, what techniques are you keeping your eye on? Like, what's, what's coming in the next two or three years, maybe not necessarily for IRM, maybe just in terms of, as machine learning, as sort of AI techniques are evolving and, and, and sort of getting more and more mature, like, what, where are you excited? What are you, what are you looking at?
Rob McCann: So you want the secret sauce, is what you're asking for?
Nic Fillingham: That's exactly what I want. I want the secret sauce.
Rob McCann: (laughs) Um, well, I mean, there's two schools of thought. There's one school of thought which is, "You better keep your finger on the pulse, because the, the new up-n-comers, the whippersnappers are gonna bring you some really cool, cool stuff." And then there's the other school of thought which is, "Everything they've brought in the last ten years is a slight change of what they, was before, the previous ... It's a cycle, right, as with s-, i- ... Science is refinement of existing ideas."
Rob McCann: So, I'm a very muted person that way, in that I don't latch on to the next latest and greatest big thing. Um, but I do love to see progress. I s-, just see it as more of a multi-faceted gradual rise of mankind's pattern-recognition ability, right?
Rob McCann: Things that excite me are things that deal with ... Like, big data with big labels? Super, super cool stuff happening there.
I mean, like, you know, who doesn't like the word deep learning, or have used it-
Nic Fillingham: What's a big label? Is there a small label?
Rob McCann: (laughs) No, I mean lots of labeled data. Like, uh-
Nic Fillingham: Okay.
Rob McCann: ... yes.
Nic Fillingham: Big data sets, lots of labels.
Rob McCann: Yes. That stuff, um, that's exciting. There's a lot of cool stuff we couldn't do two decades ago that are happening right now, and that's very, very powerful.
Rob McCann: But a lot of the business problems in security, especially, 'cause we're trying to always get this new thing that the bad guys are doing that we haven't seen before. It's very scarce label-wise. And so the things that excite me are how you inject domain knowledge, right? I talked about, we want customers to be able to sort of control on some knobs that you, like, focus the thing on what they think's important.
Rob McCann: But it also happens with security analysts, because, there's a lot of very smart people that I get to work with, and they have very broad domain knowledge about what risks look like, and various forms of security. How do you get these machines to listen to them, more than them just being a label machine? How do you embed that domain knowledge into there?
Rob McCann: So there's a lot of cool stuff happening. Uh, in that space, weak learning is one that's very popular. Came out of Stanford, actually. But I'm very la-, I'm very, very excited about what we can do with one-shot, or weak supervision, or very scarce labeled examples. I think that's a very, very powerful paradigm.
Nic Fillingham: Doing more with less.
Rob McCann: That's right.
Rob McCann: And transfer learning, I'm sure you guys have talked to a lot of people about that. That's another one. A lot of things we do in IRM ... Well, in, in lots of security is you try to, like, leverage labeled, uh, supervised classification ... Like, think about HR events.
Rob McCann: So, maybe I could, don't have a m-, a bunch of labeled, "These are IRM incidents" that I can train this big supervised classifier on. But what I can do is I can get a bunch more HR events, and I can learn things, like you said, that predict that an HR event is probably happening, right? And I chose that HR event, because that's correlated with the label I care about, right? So, I can use all that supervised machinery to try to predict that proxy thing, and then I can try to use what it learned to get me to what I really want with maybe less labels.
Nic Fillingham: Got it. My final IRM question is, from what I know about IRM, it feels like it's about protecting the organization from an employee who may maliciously or accidentally do something they're not meant to do. And we've used the example of an employee getting ready to leave the company.
Nic Fillingham: What about, though, IRM as a tool to spot well-meaning, but, but practices that, that o-, expose the company to risk? So instead of, like, looking for the employee that's about to leave and exfil 50 gigs of cat meme data that they shouldn't, what about, like, just using it to identify, "You know what, this team's just sort of got some sloppy practices here that's sort of opening us for risk. We can use the IRM tool to go and find the groups that need the, sort of the extra training, and to, need to sort of bring them up to scratch." And so it's almost more of a, um, just thinking of it more in sort of a positive reinforcement sense, as opposed to sort of an avoiding a negative consequence.
Nic Fillingham: Is that a big function of IRM?
Rob McCann: Yeah, I mean, I, I'm sorry if I didn't, uh, communicate that well, but, IRM is definitely intentional and unintentional. In s-, in some of the workflows the way you can do when we detect risky activity is just send an email to the, uh, to the employee and say, "Hey, this behavior is risky, change your ways, please," right?
Rob McCann: So, you're right, it's, it can be a coaching tool as well, it's not just, "Data's gonna leave," right? Intentionally.
Nic Fillingham: Got it. You've been very generous. This has been a great conversation. I wondered, before you leave us, do you have anything you would like to plug? Do you have a blog, do you have a Twitter? Is there a- another podcast? Which one were you on, Rob?
Rob McCann: Uncovering Hidden Risk. I would also like to point you guys to, uh, an insider risk blog. I mean, we, we publish a lot on, on what's coming out and where the product is headed, so it's: That's a great place to sorta keep abreast on the technologies and, and where we wanna go.
Nic Fillingham: That sounds good. Well, Rob McCann, thank you so much for your time. Uh, this has been a great conversation, um, we'll have to have you back on at some point in the future to learn more about weak learning and other th-, other sort of, uh, cool new techniques you hinted at.
Rob McCann: Yeah. I appreciate it. Thanks for having me.
(music)
Natalia Godyla: Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode.
Nic Fillingham: And don't forget to tweet us @msftsecurity or email us at with topics you'd like to hear on a future episode.
Nic Fillingham: Until then, stay safe.
Natalia Godyla: Stay secure.
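
In this episode Rob contrasts scoring individual actions with detecting "very precise sequences of actions." As a rough illustration of that difference only, here is a minimal Python sketch. The event names, the equal weights, and the one-hour window are hypothetical choices made for the example; nothing here is taken from the IRM product or describes how it is implemented.

```python
from datetime import datetime, timedelta

# Hypothetical event names and weights, chosen only for illustration.
RISK_WEIGHTS = {"download_sensitive": 1, "rename_file": 1, "copy_to_usb": 1}
KILL_CHAIN = ["download_sensitive", "rename_file", "copy_to_usb"]


def additive_score(events):
    """Sum per-event weights, ignoring the order in which events happened."""
    return sum(RISK_WEIGHTS.get(name, 0) for _, name in events)


def matches_sequence(events, pattern=KILL_CHAIN, window=timedelta(hours=1)):
    """Return True if `pattern` occurs in order within `window`.

    `events` is a list of (timestamp, name) tuples for a single user.
    Unrelated events may appear between the steps of the pattern.
    """
    ordered = sorted(events)  # sort by timestamp
    for i, (start, name) in enumerate(ordered):
        if name != pattern[0]:
            continue
        step = 1  # index of the next pattern step we are waiting for
        if step == len(pattern):
            return True
        for ts, later_name in ordered[i + 1:]:
            if ts - start > window:
                break  # too late to count as part of the same sequence
            if later_name == pattern[step]:
                step += 1
                if step == len(pattern):
                    return True
    return False


t0 = datetime(2021, 4, 1, 9, 0)
in_order = [
    (t0, "download_sensitive"),
    (t0 + timedelta(minutes=5), "rename_file"),
    (t0 + timedelta(minutes=10), "copy_to_usb"),
]
scattered = [
    (t0, "copy_to_usb"),
    (t0 + timedelta(days=2), "rename_file"),
    (t0 + timedelta(days=5), "download_sensitive"),
]
```

Both logs contain the same three actions, so `additive_score` cannot tell them apart, while `matches_sequence` flags only the log where the actions happen in order and close together.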

The Language of Cybercrime

Ep. 22
How many languages do you speak? The average person only speaks one or two languages, and for most people that’s plenty because even as communities are becoming more global, languages are still very much tied to geographic boundaries. But what happens when you go on the internet where those regions don’t exist the same way they do in real life? Because the internet connects people from every corner of the world, cybercriminals can perpetrate scams in countries thousands of miles away. So how do organizations like Microsoft’s Digital Crimes Unit combat cybercrime when they don’t even speak the language of the perpetrators?
On today’s episode of Security Unlocked, hosts Nic Fillingham and Natalia Godyla sit down with Peter Anaman, Principal Investigator on the Digital Crimes Unit, to discuss how Peter looks at digital crimes in a very interconnected world and how language and culture play into the crimes being committed, who’s behind them, and how to stop them.
In This Episode, You Will Learn:
• Some of the tools the Digital Crimes Unit at Microsoft uses to catch criminals
• How language and culture factor into cybercrime
• Why cybercrime has been on the rise since Covid began
Some Questions We Ask:
• How has understanding a specific culture helped crack a case?
• How does a lawyer who served as an officer in the French Army wind up working at Microsoft?
• Are there best practices for content creators to stay safe from cybercrime?
Resources: Peter Anaman’s LinkedIn: ’s LinkedIn: ’s LinkedIn: Security Blog
[Full transcript can be found at]
(music)
Nic: Hello and welcome to Security Unlocked. A new podcast from Microsoft where we unlock insights from the latest in news and research from across Microsoft's Security Engineering and Operations Teams. I'm Nic Fillingham.
Natalia: And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft's Security.
Deep dive into the newest threat intel, research and data science.
Nic: And profile some of the fascinating people working on artificial intelligence in Microsoft Security.
Natalia: And now, let's unlock the pod.
Natalia: Hello, Nic. How is it going?
Nic: Hello, Natalia. I'm very well, thank you. I'm very excited for today's episode. We talk with Peter Anaman, who is a return guest. Uh, he was on an earlier episode where we talked about business email compromise and some of the findings in the 2020 Microsoft Digital Defense Report. And Peter had such great stories that he shared with us in that conversation, that we thought let's bring him back. And let's, let's get the full picture. And wow, did we cover some topics in this conversation. I don't even know where to begin. How would, what's your TLDR for this one, Natalia?
Natalia: Well, whenever your friends or family think about cyber security, this is it. One of the stories that really stuck out to me is, Peter went undercover, and has actually gone undercover multiple times, but in this one instance he used the cultural context from his family history, as well as the languages that he knows to gain trust with a bad actor group and catch them out. It's incredible. He speaks so many languages and he told so many stories about how he applies that to his day-to-day work in such interesting ways.
Nic: Yeah, I love, for those of you who listened to the podcast, Peter really illustrates how knowledge of multiple cultures, knowledge of multiple languages, understanding how those cultures and languages can sort of intersect and ebb and flow. Peter has used that as powerful tools in his career. I think it's fascinating to hear those examples.
Other listeners of the podcast who, who do have more than one language, who do understand and have experience across multiple cultures, maybe oughta see some, uh, some interesting opportunities for themselves in, in, in cyber security maybe moving forward.
Nic: I also thought it was fascinating to hear Peter talk about working to try and get funds and sort of treasures and I think gold, l-literal gold that was taken during the second world war. And getting them back to its original owner. Sort of like, a repatriation effort. As you say, Natalia, these are all things that I think our friends and family think of when they hear the words cyber security. Oh, I'm in cyber security. I'm an investigator in cyber security. And they have this sort of, visions, these Hollywood visions.
Nic: This is, that's Peter. That's what he's done. And he's, he talks about it in this episode. It's a great episode.
Natalia: And with that, on with the pod.
Nic: On with the pod.
(music)
Natalia: Welcome back to Security Unlocked, Peter Anaman.
Peter: Thank you very much. Thanks for having me back.
Natalia: Well, it was a pleasure to talk to you, first time around. So I'm really excited for the second conversation. And in this conversation we'd really love to chat about your career in cyber security. How you got here? Um, what you're doing? So let's kick it off with a little bit of a refresher for the audience.
Natalia: What do you do at Microsoft and what does your day-to-day look like?
Peter: So in Microsoft, I work within the legal department. Within a group called the Digital Crimes Unit. We are a team of lawyers, investigators and analysts who look at protecting our customers and our online services from, um, organized crime or attacks against the system. And so we, we bring, for example, civil and criminal referrals in order to do that action. On a day-by-day basis, it's very, very varied.
I focus more on business email compromise at present, with some, with some assistance on ransomware attacks and looking at the devs and the affiliates there. As well as looking at some attacks against the infrastructure based on automated systems.
Peter: So it's kind of varied. So on a day, I could, for example, be running some Kusto queries or some specialized database queries in order to look for patterns in unauthorized or illegal activity taking place in order to quickly protect our customers. At the same time, I have to prepare reports. So there's a lot of report writing just to make sure that we can articulate the evidence that we have. And to ensure we respect privacy and all the other rules, you know, when we present the data.
Peter: And also, in addition to that, uh, big part of it is actually learning. So I take my time to look at trends of what's going on. Learn new skills in order to know that I can adapt and automate some of the processes I do.
Nic: Peter, as someone with an accent, uh, I'm always intrigued by other people's accents. May I inquire as to your accent, sir. Um, I'm hearing, I think I'm hearing like, British. I'm hearing French. There's other things there.
Peter: (laughs)
Nic: Would you elaborate for us?
Peter: Yes, of course. Of course. Oh so, I was born in Ghana, West Africa and spent my youth there. And later on went to the UK where I learned that, I had to have elocution lessons to speak like the queen. And so I had lessons and my accent became British. So but at the same time, I'm actually a French national. Um, I've been in the French army as an officer. And so, that's where the French part is. And throughout, I've lived in different countries for work. Uh, so I've learned a bit of German, a bit of Spanish on the way.
Nic: I, I actually cheated. I looked at your, um, LinkedIn profile and I see you have six languages listed.
Peter: Yes.
Nic: The two, the two that you didn't mention, I am embarrassingly ignorant of Fante? And T-Twi, Twi?
What are they?
Peter: Twi and Fante are two of the languages that are spoken in Ghana. They're local languages. And so growing up, I always had that around me. When I went to my father's village where his, we communicate in that language. English is kind of the National Language but within the country, people really speak their own languages. So I've ticked it off now. Can I speak fluently in, in it? No, I've been away for too long. But if you put me there, I would understand everything they're saying.
Nic: What are the roots of those two languages? Are they related at all? Or are they completely separate?
Peter: They are related but one, one person cannot always understand the other. If you look more broadly, you look at for example, the African continent all are, you'll find that there are over, from what we understand, over, what was it? 2,000 languages are spoken on the continent. So sometimes a person, say on the east coast doesn't understand the person in the west coast, you know. And, and it's fascinating because, you know, when we look at cyber crime, we are facing a global environment. Which is actually pretty carved out, right? The physical world is still pretty segmented.
Peter: And so when, for example, investigating some crimes taking place in Nigeria, well they speak pidgin English. And so we have to try and adapt to that to understand, what do they really mean when they say, X or Y? And so, you know, it kind of opens our mind at, as we're doing the investigations. So we have to really try and understand the local reality because the internet is not just one place. And I think, you know, working for, you know, Microsoft and with such an amazing diverse team, we've been able to share knowledge.
Peter: So for example, in the case I mentioned, I went to my colleague in Lagos, Abuja. He went, oh, that's what it means. And we're like, okay great. That one makes a lot more sense. And so we can move on.
So we have this kind of richness in the team that allows us to lean on each other and, you know, sort of drive impact. But yeah, language is very important. (laughs)
Natalia: I was gonna ask, do you have any interesting examples in which the culture was really important to cracking the case or understanding a specific part of a case that you were working?
Peter: Yes. So there was one case I worked on earlier on which was in Lithuania. And in Lithuania, for a very long time, this group had been under investigation but they were very good at their Op Sec and used some, uh, different types of encryption and obsolete, obsolete communication to hide themselves. But what I learned from the chats and when I was, this was in an IRC, it started in IRC channels and then moved out of there afterwards. But I noticed that there was a lot of Italy. There was a lot of Italian references. And my grandfather was Sicilian so I've spent time in Italy. So I kind of understood that they traveled to Italy.
Peter: So in part of the persona, I made reference to Sicily. And I just said, you know, that's where my grandfather's from. And this, didn't give a name obviously, but it kind of brought them closer, right? Because like, oh, yeah we, we get it. And after about two, three months, I was able to get them to send me pictures of them going on vacation in Italy. And unfortunately for them, the picture had geo-location on it. And also, we were able to blow it up to get the background of where they were in the airport and using the camera from the airport, we were able to identify who they were. And then go back to the passport, find their path and they got arrested a few weeks later.
Peter: So but to get that picture, to get that inner information required a kind of, trust that was being built in the virtual world and that comes from trying to understand the culture. By teasing out, asking questions about who are you and what do you like.
So that's just one example.
Nic: N-no pressure in answering this question and we'll even, we'll even cut it out of the edit if it's one you don't wanna go with.
Peter: (laughs) Sure.
Nic: If you're good with it. But um, uh, I heard you now talk about personas and identities and y-you just sort of hinted at it in the answer to the previous question. It sounds like some of the work that you have done in the past has been about creating and adopting personas in order to go and learn more information about bad actors and groups out there in, uh, in cyber land. Is that accurate and are you able to talk about what that role and that sort of, that work look like, when you're performing it?
Peter: Yeah. So before you have a persona, you have to understand where that persona's gonna be acted, right?
Peter: And I'll give you an, an example of a story. Once I had to go to LA to give a presentation and when I got to the airport I got a cab. And in the cab I looked at the guy's, the license plate of the, of the person. And I said, I bet you, I can guess, which country you were born in. He was like, an African American kind of person. He goes, impossible. No one has guessed it, you will never know. I was, all right. Are you ready? You're from Ghana. And his mind was blown. He was like, how, how did you pin that to one country? I was like, well, in your name, you have Kwesi. And I know if you're born in a country, in Ghana and have Kwesi, it means you're born on a Sunday. So the fact that you have your, that name there, that means you were born from Ghana. He goes, you are right. And so that was that.
Peter: And I said, I miss some food, the cuisine from my, from, from Ghana. And he goes, oh, I know a great place. It's in Compton. I said, go. Uh, when? So I went into my restroom, showered, got ready, try to g-got into a taxi and he goes, I'm not going into Compton. I was like, well, why not? I wanna go to that restaurant. And he goes, oh, no, no, no.
I'm going to get robbed or something bad is going to happen to me. I was like, but it- By the way, he left, he went, I had a great meal. Afterwards, I spent two hours in the restaurant 'cause no taxi would come and pick me up. And eventually, the waitress took me to a local casino. And I got a cab there and I got back.
Peter: Where, where I'm going with this story is about the environment. I didn't know what Compton meant, right? So if I created a persona that went there that didn't know the environment, they would not succeed. They would stick out like a sore thumb. They would, they would fail. So the first idea, is always to understand what are the different protocols.
Peter: If I'm looking at, for example, FTP or IRC, the different peer-to-peer networks. Or I'm looking at NNTP and the old internet, you know. All of those work, you need different tools to work there. Different ways to collect evidence and different breadcrumbs you could leave that you need to know it may be needed. Because when you're there, you're there, right? And it's, you're leaving, you're leaving a mark. Also some people say, use proxies. Well, the problem with proxies is that someone could know you got a proxy on. Because well, there's lots of systems out there. So it's about using the system. Understanding how it's interconnected so that when you show up, you show up without too much suspicion.
Peter: The other thing I learned is that the personas have to, have to be kind of, sad. 'Cause what I found is that when they were a bit sad, like, I'm happy with your work and things like that. What I found, that's me, right? I found that people were more interested because people are kind by nature, right? And so when they see that you're sad, they're more likely to communicate with you. While, while if you're too confident, I can do everything. They're like, uh, no, that person.
Peter: So I try to like, psychologically look at ways to make the person as real as possible, based on my experience, right, because if it was based on me, I would be called out. Because I will be inventing a character that's, was not real. If you try to give me a trick question, because it's based on me, the answer's gonna be the same. I've got, the persona is me. It's just different. And so that's how I took my time to understand it. I spend a lot of time learning the internet, the protocols, you know, how does P2P actually work. When I'm going to an IRC channel or when I'm looking at the peer-to-peer network and looking at the net flow. So the data which is passing from my computer upload. What other information is flowing.
Peter: Because if I can see it, they can see it, right? And at the same time I have to have the tools. So I was very fortunate to have, for example, some tools that can switch my IP address with any country, like, every minute. So I could really change personas and change location really rapidly and no one would know better 'cause I'm using different personas in different contexts, right?
Peter: Now, I never lie. One of, one of the clear things is that you never, I never try and do anything illegal because I have to assume that law enforcement is on the other side. And that's not what I'm trying to do. So I'm not gonna commit the crime. I'm not going to encourage you to do the crime. I'm just listening and just being curious about you. But then people make mistakes because they share, they over share sometimes without knowing. Maybe they're too tired or something.
Natalia: I have a bit of a strange question. So with the lockdown, culturally, people are expressing publicly that they feel like they're over sharing. Because they're all locked indoors. They have, their only outlet is to share online. So have you noticed that in your work in security? Do, are people over sharing in that underground world as well?
Or there, there hasn't been an equal shift?
Peter: No, I, I, I, actually think it's getting worse. Um, and part of the reason is, as more people go online, they're speaking more about how to be anonymous. So for example, I've seen a rapid increase in BackConnect. These are residential IP addresses used as proxies. Well 'cause now they're communicating to each other, saying, hey, we're all online and this is how you can get found out. And so there actually there's more sharing going on. You know, I look at this, many more VPN services out there. It just seems, they're better prepared. Now, obviously, we see a lot more, right? So I'm definitely seeing more sophistication because people are spending more time online. So they, they're not walking around waiting for the bus. They're reading, they're learning, they're adapting. They communicate with each other.
Peter: I've even found like, cyber crime as a service, we've found clusters of groups of people. And when you look at that network, you could see. They're saying, oh, I offer phishing pages or I offer VPN. They become specialized. So now you have people that are saying, I am just gonna focus on getting your, for example, some exploits. Or I'm just gonna focus on getting you, um, some red team work so that you can go and drop your ransomware. You know what, they, they've become more specialized actually because they're online. And they've got the time to learn.
Nic: Peter, you mentioned earlier, some time you spent in, I think, was it the French army, is that correct?
Peter: Yes, that's correct.
Nic: Do you want to talk about that? Was that your foray into security? Did it, did it begin with your career in the army? Or did it begin before then?
Peter: Hmm. I think it started probably before then. In a sense that, once I left high school, I decided I wanted to study law. Because I wanted the system that I was gonna be working in. And so I went to law school, uh, in the UK.
And when I came out, unfortunately, the market was not as good. So I couldn't get a job. And when I looked around at what other chances I had. I found there was an accelerated course to become an officer in the French Army. It's a bit like, West Point in the US. Or, and so to do that, it was basically two years, it was a two year program condensed into four months. It was hard. And so (laughs) I-
Nic: It was what? No sleep? Is that what it was? (laughs)
Peter: Ahhh. I've lived through little sleep.
Nic: No sleep before meals.
Peter: Yeah. I had to, you know, even- Well one time, I even had to be evacuated because I got hyperten- you know, uh, hypothermia. (laughs) It was, uh, sort of a character build, character builder, I like to call it that. Uh, but really I think that started the path. Uh, but for the security side was, was after that. So, 'cause of my debts from law school, I, I left the army and I went to, back to the UK. And there, the first job I found was to be a paralegal, photocopying accounts, bank accounts opened between 1933 and 1947. It was part of something called a survey. And it actually had something to do with the Nazi gold.
Peter: So what happened is that during the second world war, a lot of peop- uh, people of Jewish origin, saw that they were gonna be persecuted and took their money to, uh, Switzerland and put them in numbered accounts. And kept the number in their head. While unfortunately, so many of them sadly, uh, were victimized, they died. And the number died with them. Well, the money stayed in the accounts and over time because the accounts were dormant, well, you had charges. And so the money left.
Peter: And so this was something that Paul Volcker, I believe it was, started the survey to get the Swiss banks to comply and give the money back to the families as result. So I was part of a team investigating one of the banks there. And although I started photocopying, I looked at, using my military skills, to be very efficient.
So I was the best photocopier.
Natalia: (laughs)
Peter: And uh, and we were five levels underground. And that's what I did and I worked hard. And then after a few weeks, I got promoted to manage, uh, photocopiers. The people photocopying. We were a great team. And after that, they realized I was still hanging around because everyone was sleeping. 'Cause working five levels underground is a bit depressing sometimes.
Peter: And so eventually, I became a data analyst. And so now I had to do the research on the accounts to try and find someone writing in pen, oh, this number is related to this other main account. Or this other piece of evidence is linked to this name. And so basically, for about, I think about three years, I basically, I eventually ran the French team and we looked at all the French cards opened from that period. And that started the investigations and sort of, trying to think deeper into evidence and how to make it work.
Natalia: I really didn't think of myself as being cool before this, but I'm definitely not cool after hearing this. It's been validated, these stories are way beyond me.
Peter: (laughs) Well, no. Just stories.
Natalia: (laughs) So what brought you to Microsoft? That how did you go from piracy investigation to working at Microsoft as an investigator?
Peter: So what took place was actually, my troubles created by Microsoft. So back in 2000 it was Microsoft who actually saw that the internet was becoming something that could really hurt internet commerce and e-commerce of role and wanted to make sure that they could contribute to it, and participate by building this capacity. And all the way through, they were one of my clients, at, essentially. And at some point, I realized that in my career, working for different customers, clients is great, because you learn, you don't have something different. So, for example, a software company is very different to a games company.
Is different to a publishing company, is different to a mo- motion picture company, although it's digital piracy, it's actually very different in many respects. And I have- I saw how Microsoft was investing more in the cloud at that time, and I saw that as a big opportunity to really help a bigger threat to the system, right? Peter:And when I say to the system, e-commerce, 'cause everything was booming, this was in like 2008. And so, I decided that I would work for them. And actually, they offered me the job. So, I- I didn't, you know, I'm very privileged to be where I am now. But the, the, the way they positioned it is that they were looking for someone to help develop systems to map out, create a heat map of online piracy. I was like, "Wow, this is a global effort." So, uh, that's what I came on board with. And I built actually, a, a system similar to Minority Report, whereby I got basically these crawlers that I built that would go out and visit all these pirate sites. And you'll find this fascinating 'cause... Well, I found it fascinating, in some cases- Natalia:(laughs). Peter:... as we accessed the forums that were offering, you know, download sale, RapidShare was one of the companies at the time, as we shut them down, they had crawlers in the forum, which would go and replace them. So, we had machine-on-machine wars, where we would shut down a URL, and then they would put up another one. The problem is that our system was infinite. That is, we can, the machine can keep clicking. For them, they had about 10 groups of files. And so once they reached number 10, that was it. So, I found a way to automate the systems. And then after that using the, the Kinect, do you remember the Xbox Kinect? Nic:Cer- certainly. Peter:Managed to hack that, and the way it happened is that I built a map on Bing, whereby the Kinect could look at my body structure. And as I moved my hand, it would drill in to a country. 
And when I pushed, it would create, like, a, a table on the window with the number of infringements, what products were offered, when was the last time it was detected. And then, I could just wave it away and it would go, and then I could spin the world, it was a 3D map to go to another country and say, "What are the concentrations of piracy?" In this way, we had a visualized way of looking at crime as they were taking place online, and then zoom in and say, "We need to spend more effort here." Right? Peter:So, as well, just getting data analytics, but in a 3D format. And so, that was part of the excitement when I joined, is how to do that. Another example is, I found that, I read some research where it said that basically humans only spend a minute and a half on any search query. You know, in itself it doesn't mean much. But imagine you have a timer and it's one second, two seconds, three seconds, right? You're waiting for a minute and a half, right? So, 90 seconds, let's double that and say 180 seconds. Basically, let's say three minutes, it means that if you go to anyone you know, and ask them, "Go and search for Britney Spears downloads." And you look too, go, do, do the search, and they will click a link, nothing. Go next, click next, and they'll keep going. Peter:Before the three minute mark, they'll stop. They'll change the query, they'll do something different. Because they wouldn't get a result. Which means that when you do a search, and a search has got a million results, uh, it doesn't really matter. People are not going to go through the million. So, I started to think about the problems that when executives and people were saying, "Oh, I go on the internet, and I can find bad stuff." I was like, "Okay, but you can do like in three minutes. How about I build a robot that will pretend to be you, and go and find the infringements within that three minute window? Which is about 400 URLs. But I'm going to hit it with like send 100 queries, distributed." 
Peter:All of a sudden, we were finding the infringements before anyone could click on it, because we would report it to Google, Bing, Yandex, Baidu. And they would remove it from the, from the search results. And then, we had a measurement system, which would check and see, if I was a human, how many seconds would it take before I found an active download? Right? You could automate it. And so, we had a dashboard that could show that, and it worked. You know, we could, we saw a decline in the number of complaints because, well, it wasn't as visible. Now, if you knew where the pirate bay was, yeah, okay. But that wasn't really what we were doing. We were looking at protecting people from getting downloads which contain malware, or something nefarious, right? And, and, so we built these systems to protect consumers, essentially.Natalia:So, is there a connection, or maybe a community behind the work that you've done in piracy and the world of copyright? Uh, any, any best practices that are shared with content creators who are equally concerned with a malware being in their content, or just the sheer, the sheer fact that someone is pirating their content?Peter:I think from a contents per- perspective, and there are several amazing organizations out there, such as the BSA, Business Software Alliance, you have the MPAA, you know, you have the RIAA, and also IACC, the International Anti-Counterfeiting Coalition. Who have just incredible guidance for their members, which are specialized. So, for example, when you look at counterfeit goods, that's a very different thing to like, say, video, because video is distributed in a diff- different way. But one thing, which I think is important is that you don't just leave your, your house open, you lock it with a key, otherwise, someone will just come in and take your stuff. 
Peter:So, I think the same with contents, that when we create content, we have to find a way to work not only with different organizations that are looking to protect those rights, but also assume your own responsibility of locking your door. For example, what security could you put on it? Right? To maintain it? And how could you work with law enforcement who are there to protect the law, right? There are, I think there are different things that could be considered but most of it really, I would say the best is to start with the industry association, because they are much more specialized, and can give better advice, depending on the nature of the content that the person has. Peter:But, you know, when we were looking at online piracy, it wasn't just online piracy, because, you know, Microsoft participated in something called Operation Pangea. This was an Interpol driven operation where we found that a Russian organization that was distributing software for download in the millions of dollars, we took action to dismantle their payment mechanism. So, Visa and MasterCard would stop the payment on their website. So, they moved to prescription drugs, and they started selling prescription drugs. And so, for certain, it's really not in Microsoft's mandate to do that, right? Peter:But what we did is that we provided the expertise, and the knowledge we have to law enforcement to detect these websites. There were about 10,000 of them, and then drill down to say, "What's the payment gateway?" Because that's a choke point, you know, a criminal, definitely does what he does for the money. You know, you're not gonna rob a bank if there's no money there, right? So, with that in mind, they were able to do really, massively disrupt this organization. And that's because Microsoft looks at providing its expertise, and also learning from other people's expertise, right? 
But to tackle this bigger problem that impacts all of us.Nic:Peter, I'd love to circle back to language for a sec here. And when you were talking about the languages that you speak, and, and the importance of understanding culture. From your perspective, do you think there are countries, language groups, ethnic groups that are disproportionately... Well, I'm trying to think of the most elegant way to say, not protected or not protected as well as they could because they speak a language that is, you know, not as prevalent? So, you know, I looked at, you know, I'd never heard of the two, the two, uh, Ghanaian languages that you had on your- Peter:Mm-hmm (affirmative). Nic:... on your profile there, I'm not even gonna say them right, but Fante and- Peter:(laughs), so, it's Fante and Twi. Nic:Fante and Twi. So- Peter:Perfect. Nic:... native Fante, and Twi, I'm, I'm assuming there's, there's hundreds of thousands, maybe even millions of speakers of those- Peter:Yeah. Yes, absolutely.Nic:... two languages?Peter:Yes, yeah. Nic:Do AI and ML systems allow for supporting people that, you know, either don't speak English, or a sort of major international language?Peter:You're touching on something, which is very near and dear to me, 'cause it's a whole different conversation. And if you look at the history of language, there's, a, a great group of seminars written about it. It's actually I think, I believe, somewhere, I read somewhere that 60% of languages are actually not written. Right? And yes, you can go and see Microsoft has, translates between say, 60 or 100 pairs of languages, and Google the same. But what about the others? What about the thousands of others, that I think there are over 6,000 languages in the world. You're right. I mean, earlier this year, if I may be personal, I'm trying to adopt a baby girl. And so, I went to Ghana to try and manage the situation, which is very slow. 
Peter:And when I was there, I just saw the reality that, you know, they don't have access to resources, right? Because a book costs money. And so even for AI, how would they even know what AI is? So, I think there is an increasing gap, which is taking place. We can't keep build, building bigger walls, because it's just not going to work. We gotta be, we gotta think bigger than that. And so, one of the ideas is that when we look at some of the criminals, like I've had quite a few of them, a lot of them go to the same technical universities, for example, in West Africa. Well, why is that? It's 'cause I think they develop skills, and then they leave, and they can't get a job. And so, they end up being pulled into a life of cybercrime. So, culture, it's I think becoming an important thing is that, there is a bigger and bigger divide 'cause not as many people have access to the resources, and how can we as a community who do have access, sort of proactively contribute to that? 'Cause we can't, there's no way you can, you know, just Nigeria has 190 million people. That's a lot of people, that's a lot of people. The African continent has 1.2 billion. Asia, four billion, was like, um, I think it's like, is it two, three billion? No, two billion? Something like that but it's a lot of people- Nic:It's a lot. Peter:... outside, right? (laughs). And so I think, I'm glad you brought that up 'cause I think it's a- an interesting conversation that we need to develop even, even more. Natalia:So, just trying to distill some of that down. So, are, are you saying then that, uh, at least when we're looking at language, there is a greater diversity of threat actors than there are targets? That those targets are centralized more around English speakers, but because of disproportionate opportunities in other parts of the world, we see threat actors across a number of different languages, across a number of different cultures? Peter:Yes. 
I, I think that's, that's a goo- uh, kind of a good summary of that, but I'll probably take it a step further and say, from my vantage point, again, you know, there are many other more brilliant people out there than me, I can only speak of what I've seen. I still find there are concentrations, right? When you look at business email compromise, and you go and pick up a newspaper and say, "Show me all articles about BEC, the biggest crime right now in the world, and show me all the people who've been arrested." Guess what? They're all from one place, West Africa. Why? Because if you look at the history of that crime, BEC, it was a ruse. Before that it used to be called, it was all under the category of advance-fee fraud, but it used to be a lottery scam. Oh, the Bill and Melinda Gates lottery, you've won $25 million, or, uh, the Nigerian prince, right? Peter:Some people call it 419, which is a criminal code in Nigeria. And then it went further back, they used to send faxes. Or, a lot of people developed a culture called the Yahoo boys, right? They call it Yahoo-Yahoo. And what they do is you go on YouTube, and you search for Yahoo-Yahoo, you'll see them like there's a whole culture behind that. They're dancing, they say, "This is my Monday car, my Tuesday car." And because they're making money and their communities are not, the community helps them because they get money. The stolen money is shared, and so now it becomes harder to break that because it becomes part of a culture. And so, that's why we see a lot more there I think than for example, in the US, or in Russia or in other countries, it's 'cause I think there was, there's a, they have this kind of leeway, that they've been doing it for a lot longer and have a better sense of how to be sly. Nic:It sounds like the, the principles of reducing crime apply just as generally in the cyberspace as they do too in the, the non-cyber space. 
Whereas if you can give opportunities and lu- you know, um, lucrative opportunities to people, to utilize the skills that they've developed, both sort of in an orthodox or in an unorthodox fashion- Peter:Mm-hmm (affirmative). Nic:... then they're gonna put those skills to good use. But if you, if you train them up and then don't give them any way of using those skills to, to go, you know, ma- make a living in a, in a positive sense, they're, they're gonna turn to other, other avenues. Sounds like in, in, in parts of West Africa, that is business email compromise. Peter:Right, it is. And if I could just add two things there, one is that, you know, when I started looking at how to address cyber, online criminality, I have to look at the physical part of it. And in the physical world, there's actually, I call them neighborhoods. You have good neighborhoods, and bad neighborhoods, right? There are some neighborhoods you go to, no one's going to pickpocket you, right? Everyone's got a nice car or whatever. The other neighborhoods you go to, and there are some shady people in the corner, probably selling drugs or something. You know, uh, I'm, I'm being very simplistic, but I'm just trying to say, there are differences in neighborhoods in the physical world, and those need to be looked at as well. Because even if you gave education or a job to someone in a bad neighborhood, because of the environmental pressure, they may not be able to leave that neighborhood because they could be pressured into it. Peter:Online it's the same, I found that you see there are clusters of criminal activities that happen. And in those virtual neighborhoods they're interconnected, it's like, like two, or three levels, they know each other mostly. And so, we can have this kind of, we have to think more holistically, I suppose. 
I'm trying to say, Nic, that, it, we also have to look at the neighborhood and how do you make sure, for example, that neighborhood they have a sports field or the streets are clean because it makes you feel good, right? There's, there are other environmental factors that I think we may need to consider in a more holistic way. We, we can move much faster that way, because there are different factors, uh, which contribute to this.Nic:So, Peter, I honestly feel like we could keep chatting for the next four hours, right? Natalia:(laughs), I know. Peter:(laughs). Nic:We, we, (laughs). We, we've already, (laughs), eaten up a, a lot of your time, and we've covered a lot of ground. I'd love to circle back one final time to, to language and really sort of ask you is, eh, maybe it's not language, but is there something that you sort of feel particularly passionate about in your career at Microsoft? What you've done so far, what you're working on, and what you hope to do moving forward, is language and opening up accessibility through language, and other sort of cultural diversity? You, you, you, spoke a lot about that in the last sort of, you know, 45 minutes. Is that, is that something that you're personally, uh, invested in, and would like to work more on in the future? And, and if not, what other areas are you, are you looking forward to in the future? Peter:It's, it's absolutely something I'm, I'm very passionate about. And within Microsoft, as an example, the company has invested a lot in diversity and inclusion and equity, and it ended last year, but I was the president of the Africans in Microsoft employee resource group, for example, which has close to a thousand people. And all of it is about helping, working in a two way street, where we help our community, who are at times new in the country. And so, don't understand the cultural differences and how do we help them better, not integrate, but be themselves. 
And also, allow others that don't understand that they may be a minority, but there's so much richness to that diversity and how it makes teams stronger, because then you're not all looking through the same lens and you can bring in, you know, different perspectives about it. So, I'm absolutely invested in that, not just here in the US but also, you know, the African continent. Peter:And, and I'm very fortunate to be working in a company that's actually pushing me to do that. You know, the company is, is doing amazing things when it comes to diversity and inclusion. And yes, there's room to be made, but at least they're active. Going back really quickly to what you mentioned about language and AI, when we look at the internet, the internet is still zeros and ones. So, when you look at machine learning models, a lot of it is looking for like over 250 signals, right? In a, in one site. And it's not just about the language, it's about different languages, computer code and human code. And so, the machines are bringing those two together, which can help better secure platforms. Natalia:And just as we wrap up here, is there anything you want to plug? Any resources, any groups that you'd like to share with our audience? Peter:I think for me, you know, always try and keep updated on security. So, you know, the Microsoft Security Bulletin is a, is a great source for, uh, up-to-date information. Also, I think there are many other organizations that people can search for and reach out to me on the antenna. If you're not a bad guy or girl, I'll- Natalia:(laughs). Peter:... I'll share, (laughs), we, we can, um, actually, you know, I try to mentor as many people in our industry because, eh, together we become stronger. So, do reach out if you want to. Natalia:Awesome. Thank you for that, Peter. It was great having you on the show again, and I can honestly say, we'd be happy to have you back, and it was infinitely fascinating. Peter:Thank you very much for the invitation again. 
And, uh, it was a pleasure participating. Natalia:By the way, [foreign language 00:38:17]. Peter:Uh, there you go. Natalia:If you ever want to. Peter:(laughs). Natalia:(laughs). Peter:(laughs). Nic:Natalia, I didn't know you speak Spanish.Natalia:(laughs). Peter:(laughs). Natalia:Well, we had a great time unlocking insights into security from research to artificial intelligence, keep an eye out for our next episode. Nic:And don't forget to tweet us @msftsecurity or mail us at with topics you'd like to hear on a future episode. Until then, stay safe.Natalia:Stay secure.

The Human Element with Valecia Maclin

Ep. 21
For Women’s History Month, we wanted to share the stories of just a few of the amazing women who make Microsoft the powerhouse that it is. To wrap up the month, we speak with Valecia Maclin, brilliant General Manager of Engineering, Customer Security & Trust, about the human element of cybersecurity. In discussion with hosts Nic Fillingham and Natalia Godyla, Valecia speaks to how she transitioned into cybersecurity after originally planning on becoming a mechanical engineer, and how she oversees her teams with a sense of humanity. From understanding that working from home brings unique challenges, to going the extra mile to ensure that no member of the team feels like an insignificant cog in a big machine, Valecia is a shining example of what leadership should look like, and maybe humanity too. In This Episode, You Will Learn: • The importance of who is behind cybersecurity protocols • How Microsoft’s Engineering, Customer Security & Trust team successfully transitioned to remote work under Valecia’s leadership • Tips on being a more inclusive leader in the security space Some Questions We Ask: • What excites Valecia Maclin about the future of cybersecurity? • How does a mechanical engineering background affect a GM’s role in infosec? • How did Valecia Maclin, General Manager of Engineering, Customer Security & Trust, get to where she is today? Resources:Valecia’s LinkedIn: Minorities’ Interest in Engineering:’s TEALS:’sDigiGirlz:’s LinkedIn:’s LinkedIn: Security Blog: [Full transcript can be found at] Nic Fillingham:Hello, and welcome to Security Unlocked, a new podcast from Microsoft, where we unlock insights from the latest in news and research from across Microsoft security engineering and operations teams. I'm Nic Fillingham. Natalia Godyla:And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft security, deep dive into the newest threat intel research and data science. 
Nic Fillingham:And profile some of the fascinating people working on artificial intelligence in Microsoft security. Natalia Godyla:And now let's unlock the pod. Hey Nic, welcome to today's episode. How are you doing today? Nic Fillingham:Hello Natalia, I'm doing very well, thank you. And very excited for today's episode, episode 21. Joining us today on the podcast is Valecia Maclin, general manager of engineering for customer security and trust, someone who we have had on the shortlist to invite onto the podcast since we began. And this is such a great time to have Valecia come and share her story and her perspective, being the final episode for the month of March, where we are celebrating Women's History Month. So many incredible topics covered here in this conversation. Natalia, what were some of your highlights? Natalia Godyla:I really loved how she brought in her mechanical engineering background to cybersecurity. So she graduated with a mechanical engineering degree and the way she described it was that she was a systems thinker. And as a mechanical engineer, she thought about how systems could fail. And now she applies that to cybersecurity and the- the lens of risk, how the systems that she tries to secure might fail in order to protect against attacks. And I just thought that that was such a cool application of a non-security domain to security. What about yourself? Nic Fillingham:Yeah. Well, I think first of all, Valecia has a- an incredibly relatable story up front for how she sort of found herself pointed in the direction of computer science and security. I think people will relate to that, but then also we spent quite a bit of time talking about the importance of the human element in cybersecurity and the work that Valecia does in her engineering organization around championing and prioritizing, um, diversity and inclusion and what that means in the context of cybersecurity. Nic Fillingham:It's a very important topic. It's very timely. 
I think it's one that people have got a lot of questions about, like, you know, we're hearing about D&I and diversity and inclusion, what is it? What does it mean? What does it mean for cybersecurity? I think Valecia covers all of that in thi- in this conversation and her perspective is incredible. Oh, and the great news is, as you'll hear at the end, Valecia is hiring. So if you like me are inspired by this conversation, great news is there's actually a bunch of roles that you can go and, uh, apply for to go and work for Valecia on her team. Natalia Godyla:On with the pod? Nic Fillingham:On with the pod. Valecia Maclin, welcome to the Security Unlocked podcast. Thank you so much for your time. Valecia Maclin:Thank you, Nic and Natalia. Nic Fillingham:We'd love to start to learn a bit about you. You're, uh, the general manager of engineering for customer security and trust. Tell us what that means. Tell us about your team, tell us about the amazing work that you and- and the people on your team do. Valecia Maclin:I am so proud of our customer security and trust engineering team. Our role is to deliver solutions and capabilities that empower us to ensure our customers' trust in our services and our products. So I have teams that build engineering capabilities for the digital crimes unit. We build compliance capabilities for our law enforcement and national security team. And our team makes sure that law enforcement agencies are compliant with their local regulatory responsibilities and that we can meet our obligations to protect our customers. Valecia Maclin:I have another team that provides national security solutions. We do our global transparency centers, where we can ensure that our products are what we say they are. 
I have two full compliance engineering teams that build capabilities to automate our compliance at scale for our Microsoft security development lifecycle, as well as, uh, things like, uh, advancing machine learning, advancing open source security, just a wealth of enterprise-wide, as well as stakeholder community solutions. Um, I could go on and on. We do digital safety engineering, so a very broad set of capabilities all around the focus and the mission of making sure that the products and services that we deliver to our customers are what we intend and say that they are. Nic Fillingham:Got it. And Valecia, so how does your engineering org relate to some of the other larger engineering orgs at Microsoft that are building, uh, security compliance solutions? Valecia Maclin:So our other Microsoft organizations that do that are often building those capabilities within a particular product engineering group. Um, customer security and trust is actually in our corporate, external and legal affairs function. So we don't have that sales obligation. Our full-time responsibility is looking across the enterprise and delivering capabilities that meet those broad regulatory responsibilities. So again, if we think about our digital crimes unit that partners with law enforcement to protect our customers around the world, we're building capabilities for them, or digital safety, right? If you think about the Christchurch Call and what happened in New Zealand, we're building capabilities to help with that in partnership with what those product groups may need to do. So, um, so we're looking at compliance more broadly. Nic Fillingham:Got it. And does your team interface with some of the engineering groups that are developing products for customers? Valecia Maclin:Absolutely. 
So when you think about the work that we do in the open source security space, our team is kinda that pointy end of the spear to do, um, that assessment and identify, here's where some areas are that we need to put some focus, and then the engineering, the product engineering groups will then go and build that resiliency into the systems. Nic Fillingham:Two follow-up questions. One is on the podcast, we've actually spoken to some- some folks that are on your team. Uh, Andrew Marshall was on an earlier episode. We spoke with Scott Christianson, we've had other members of the digital crimes unit come on and talk about that work, just a sort of a sign post for listeners of the podcast. How does Andrew's work, uh, fit in your organization? How does Scott's work fit into your organization? Valecia Maclin:So, um, both Andrew and Scott are in a team, um, within my org, uh, that's called security engineering and assurance, and they're actually able to really focus their time on that thought leadership portion. So again, if you think about the engineering groups and the product teams, they have to, you know, really focus on the resiliency of the products, what our team is doing is looking ahead to think about what new threat vectors are. So if you think about the work that Andrew does, he partnered with Harvard and- and other parts of- of Microsoft to really advance thought leadership in how we can interpret adversarial machine learning. Valecia Maclin:Um, when you think about some of our other work in our open source security space, it is let's look forward at where we need to be on the edge from a thought leadership perspective, let's prototype some capabilities, operationalize them, so that it's tangible for the engineering groups to then apply and then, uh, my guys will go and partner with the engineering groups and gi- and girls, right? 
So- so, um, we will then go and partner with the product groups to operationalize those solutions either as a part of our security, um, development life cycle, or just as general security and assurance practices. Nic Fillingham:Got it. And I think I- I can't remember if it was Scott or Andrew who mentioned this, but on a previous podcast, there was a reference to, I think it's an internal tool, something called Liquid. Valecia Maclin:Liquid, yes, uh, yeah. Nic Fillingham:Is that, can you talk about that? 'Cause we, uh, it was hinted at in the previous episode? Valecia Maclin:Absolutely. Yes. Yeah. So Liquid, um, I actually have a full team that builds and sustains Liquid. It is a, um, custom built capability that allows us to basically have sensors within our build systems. Um, and so when you think about our security development life cycle, and you think about our operational security requirements, it's given us a way to automate not only those requirements, but you know, ISO and NIST standards. Um, and then that way, with those hooks into the build systems, we can get an enterprise-wide look at the compliance state of our builds as they're going on. Valecia Maclin:So a developer in a product group doesn't have to think about, am I compliant with SDL? Um, what they can do is, you know, once the- the data is looked at, we can do predictive and reactive analysis and say, hey, you know, there's critical bugs in this part of the application that haven't been burned down within 30 days. And so rath- rather than a lot of manual attestation, we can do, um, compliance at scale. And I- I just mentioned manual attestation of security requirements. Oh, one of my other teams, um, has recently just launched the capability that we're super excited about that leverages what we call CodeQL, or used to be called Semmle. That again, is automating kind of on the other edge, right? So, with Liquid, it's once we pulled in the build data. 
Um, we're working with the engineering groups in Microsoft now to, um, do the other edge where they don't have to attest that they're compliant with security requirements. Um, we're, we're moving very fast to, um, automate that on behalf of the developer, so that again, we're doing security by design. Nic Fillingham:So, how has your team had to evolve and change, uh, the way that they, they work during this sort of the COVID era, during the sort of work from home? Was your team already set up to be able to securely work remotely or were there sort of other changes you had to make on the fly? Valecia Maclin:So, you know, uh, as we've been in COVID, my team has responded phenomenally. We were actually well positioned to work from home and continue to function from home. You know, there were some instances where from an ergonomic perspective, let's get some resources out to folks because maybe their home wasn't designed for them to be there, you know, five days a week. So, the, the technical component of doing the work, wasn't the challenge. What I, as a leader, continuously emphasized, and it's what, what my team needed, frankly, is making sure we stayed with the connectedness, right? Valecia Maclin:How do we continue to make sure that folks are connected, that they don't feel isolated? That, you know, they feel visibility from their, from their managers? And consider I had, I had 10 new people start in the past year, entirely through COVID, including three new college hires. So, can you imagine starting your professional- Nic Fillingham:Wow. Valecia Maclin:... career onboarding and never being in the office with your peers or colleagues and, and, you know, and the connective tissue you would typically organically have to build relationships. And so through COVID, during COVID, we've had to be very creative about building and sustaining the connective tissue of the team. 
Making sure that we were understanding folks', um, personal needs and creating a safe space for that. You know, I was a big advocate way back in August where I said, Hey folks, you know, 'cause the sch- I knew the school year was starting. And even though we hadn't made any statements yet about when return to work would, you know, would advance to, I made a statement to my team of, Hey, it's August, we've been at this for a few months. It's not going anywhere anytime soon. Valecia Maclin:So, I don't want us carrying ourselves as if we're coming back to the office tomorrow. Let's, you know, give folks some space to reconcile what this is gonna look like if they have childcare, if they have elder care, if they're just frozen from being in- indoors this amount of time. Let's make sure that we're giving each other space for that. Also during the past year, you know, certainly we had, I would say, parallel once in a generation type events, right? Valecia Maclin:So, we had COVID, but we also had, uh, increased awareness, you know, of, of the racial inequities in our country. And for me as a woman of color that's in cybersecurity, I've spent my entire career being a, a series of firsts, um, particularly at the executive table. And so, you know, so it was a, an opportunity we also had in the past year to advance that conversation so that we could extend one another grace, right? So I personally was touched by COVID. I, I lost five people in the past year. Um, and I was also- Nic Fillingham:I'm so sorry. Valecia Maclin:Yeah. (laughs) And you keep showing up, right? And I was personally touched as a black woman who once again, has to be concerned about, you know, I have, uh, I have twin nephews that are 19, one's autistic and the other is not, but we won't allow him to get a driver's license yet 'cause he, my, my sister's petrified because, you know, that's a real fear that a young man who's 6'1", sweetest thing you would ever see, soft-spoken, um, but he's 6'1". 
He has, you know, dreadlocks in his hair or locks. He would hate to hear me say they were dreads. He has locks in his hair. Um, and he dresses like a 19-year-old boy, right? Valecia Maclin:But on spot, that's not what the world sees. And so, um, that's what we're all in. Then you think about what's happening now with our Asian-American community. That's also bundled with folks who are human, having to be isolated and indoors, which that's not how humanity was designed. And so we have to remember that that shows up. And, and when you're in, in the work of security, where you're always thinking about threat actors, and I often say that some of our best security folks have kind of some orthogonal thinking that's necessary to kind of deal with the different nuances. Valecia Maclin:When you, when you are thinking about how do you build resiliency against ever evolving threats, (laughs) notwithstanding the really massive one that, you know, was the next one we, we dealt with at the end of the last calendar year. Those are all things that work in the circle. And I always say that people build systems, they don't build themselves. And in this time more than ever, hopefully, as security professionals, we're remembering the human element. And we're remembering that the work that we do, um, has purpose, which is, you know, why I entered this space in, in the first place and why I've spent my career doing the things I've done is because we have a phenomenal responsibility increasingly in a time of interconnectedness from a technology perspective to secure our way of life. Nic Fillingham:Wow. Well, on, on that note, you talked about sort of why you went into security. I'd love to sort of, I'd love to go there. Would you mind talking us through how you sort of first learnt of security and, and why you're excited about it, and how you made the decision to, to go into that space? Valecia Maclin:Absolutely. So, mine actually started quite a while ago. 
I was majoring in mechanical engineering and material science, uh, at Duke University. I was in my junior year and, um, I should preface it with, I did my four year engineering degree in three and a half years. So, my, my junior year was pretty intense. I worked, was working on a project for mechanical engineering that I'd spent about seven hours on and I lost my data. Nic Fillingham:Ah! Valecia Maclin:I was building a model, literally, I sat at the computer because, you know, you know, back then, you know, there weren't a whole lot of computer resources, so you try to get there early and, and, and snag the computer so that you could use it as long as you needed to. I went in actually, on a holiday because I knew everybody would be gone. So, if I, I could have the full day and not have to give up the computer to someone. So, I'd spend seven hours building this model and it disappeared. Valecia Maclin:And it was the, you know, little five-and-a-quarter floppy, I'm pulling it out, I'm looking at the box (laughs). It's gone. The, the, the model's gone. I was gonna have to start all over. I started my homework over again, but then I said, I will never lose a homework assignment like that again. So, I went and found a professor in the computer science school to agree to do an independent study with me, because as a junior, no one was gonna allow me to change my major from mechanical engineering that far in, at Duke University. So, (laughs) not, not my parents, anyway. So, I, um, did an independent study in computer science and taught myself programming. So, I taught myself programming, taught myself how to understand the hardware with, with my professor's help, of course. But it was the work I did with that independent study that actually led to the job I was hired into when I graduated. Valecia Maclin:So, I've never worked as a mechanical engineer. 
I immediately went into doing national security work, um, where I worked for companies that were in the defense industrial base for the United States. And so I, I started and spent my entire career building large scale information systems for, you know, the DOD, for the intelligence community, and that vectored into my main focus on large, um, security systems that I was developing, or managing, or leading solutions through. So, it started with lost data, right? (laughs) You know, which is so apropos for where we are today, but it started with, you know, losing data on a software, in a software application and me just being so frustrated and said, that's never gonna happen to me again (laughs) that, um, that led me to pursue work in this space. Natalia Godyla:How did your degree in mechanical engineering inform your understanding of InfoSec? As you were studying InfoSec, did you feel like you were bringing in some of that knowledge? Valecia Maclin:One of the beautiful things and that was interesting is I would take on new roles, I'll, I'll never forget. Um, I, I got wonderful opportunities as, as my career was launched and folks would ask me, well, why are you gonna go do that job? You've never done that before, you know, do you know it? (laughs) And so what that taught me is, you know, you don't have to know everything about it going in, you just need to know how to address the problem, right? So, I consider myself a systems thinker, and that's what my mechanical engineering, um, background provided was look at the whole system, right? And so how do you approach the problem? And also because I also had a material science component, we studied failures a lot. So, material failure, how that affected infrastructure, you know, when a bridge collapses or, or starts to oscillate. 
Um, so it was that taking a systems view and then drilling down into the details to predictively identify failures and then build resiliency to not have those things happen again. It's that kind of, that level of thinking that played into when I went into InfoSec. Natalia Godyla:That sounds incredibly fitting. So, what excites you today about InfoSec or, or how has your focus in InfoSec changed over time? What passions have you been following? Valecia Maclin:So, for me, it's the fact that it's always going to evolve, right? And so, you know, obviously the breaches make the headlines, but I'm one, we should never be surprised by breaches, just like we shouldn't be surprised by car thefts or home invasions, or, you know, think about the level of insurance, and infrastructure, and technology, and tools and habits (laughs) that we've, uh, we've developed over time for basic emergency response just for our homes or our life, right? Valecia Maclin:So, for me, it's just part of the evolution that we have, that there's always gonna be something new and there's always gonna be that actor that's gonna look to take a shortcut, that's gonna look to take something from someone else. And so in that regard, it is staying on the offense of building resiliency to protect our way of life. And so I, I am always passionate and again, it's, it's likely how I, you know, spent almost, you know, over 27 years of my career is protecting our way of life. But protecting it in a way where for your everyday citizen, they don't have to go and get the degree in computer science, right? Valecia Maclin:That they can have confidence in the services and the, the things that they rely on. They can have confidence that their car system's gonna brake, that the brakes are gonna hit, you know, activate when they hit it. That's the place I wanna see us get to as it relates to the dependency we now have on our computer systems, and in our internet connected devices and, and IoT and that sort of thing. 
So, that's what makes me passionate. Today it may look like multi-factor authentication and, you know, zero trust networks, but tomorrow is gonna look like something completely different. And what I, where I'd love to see us get is, you know, think about your car. We don't freak out about the new technologies that show up in our car, you know, 'cause we know how, we, we, we get in and we drive and, and we anxiously await some people. Valecia Maclin:I, I'm kind of a control freak, I wanna still drive my car. I don't want it to drive itself (laughter). Um, but nevertheless, with each, you know, generational evolution of the car, we didn't freak out and say, Oh my gosh, it's doing this now. If we can start to get there to where there's trust and confidence. And, and that's why I love, you know, what my org is responsible for doing is, you know, that there's trust and confidence that when Microsoft, when you have a Microsoft product or service, you, you, you can trust that it's doing what you intend for it to do. And, and that's not just for here, but then, you know, when you're again, whether it's the car, or your refrigerator, or your television, that's where I'd love to, that's where I want to see us continue to evolve. Not only in the capabilities we deliver, but as a society, how we expect to interact with them. Natalia Godyla:Are you particularly proud of any projects that you've run or been part of in your career? Valecia Maclin:I am. And it's actually what led me to Microsoft, I had my greatest career success, but it, it came also at, at a time of, of, of my greatest personal loss. Literally they were concurrent on top of each other. And so I was responsible, I was the, the business executive responsible for the cybersecurity version of, of, of the JEDI program. Uh, so I was the business executive architecting our response to that work that was with the Department of Homeland Security. 
I worked for a company that at the time wasn't known for cybersecurity, and so it was a monumental undertaking to get that responsibility. And the role was to take over and then modernize the cybersecurity re- system responsible for protecting the .gov domain. So, it was tremendously rewarding, especially in the optic that we have today. I received the highest award that my prior company gives to an individual. Valecia Maclin:I was super proud of the team that I was able to lead and, and keep together during all the nuances of stop, start, stop, start that government contracting, um, does when there's protests. But during that same time, you know, 'cause it was, so it was one of those once in a career type opportunities, if you've ever done national security work, to actually usher an anchor in a brand new mission is how we would label it, um, that you would be delivering for the government. But at the same time, that, that wonderfully challenging both technically and from a business perspective scenario was going on, I, in successive moments, lost my last grandparent, suddenly lost my sister. 12 months later, suddenly lost my mother, six months later had to have major surgery. So, that all came in succession while I was doing this major once in a career initiative that was a large cybersecurity program to protect our government. Valecia Maclin:And I, I survived, (laughs) right? So, um, the, the program started and did well, but I, I then kind of took a step back, right? Once I, I, uh, I'd promised the company at the time of the government that I would, I would give it a year, right? I would make sure the program transitioned since we'd worked so hard to get there. And then I took a step back and said, Hmm, what do I really wanna do? This was a lot (laughs). And so I did take a step back and got a call from Microsoft, actually, um, amongst some other companies. Uh, I thought I was gonna take a break, but clearly, um, others had, had different ideas. 
And so, um, (laughter) I had, I had multiple opportunities presented to me, but what was so intriguing and, and what drew me to Microsoft was first of all, the values of the company. You know, I'm a values driven person and the values, um, mean a lot and I'm gonna come back to that in a moment. Valecia Maclin:But then also I, I mentioned that the org I lead is in corporate external and legal affairs. It's not within the product group. It's looking at our global obligations to securing our products and services from a, not just a regulatory perspective, but not limited by our, our sales target. And so the ability to be strategic in that way is what was intriguing and what, what drew me. When you think about the commitments the company has made to its employees and to its vendors during a time, um, that we've been in, it says a lot about the fabric of, of who we are to take that fear of employability, insurance and those sorts of things that are basic human needs, to recall how early on we still had our cafeteria services going so that they could then go and provide meals for, for students who would typically get school meals. And at the same time it meant that those vendors that provide food services could continue to do their work. When you think about our response to the racial inequity and, and justice, social justice initiative, and the commitments we've not only, not only made, but are, are keeping, is the fabric of the company and the ability to do the work that I'm passionate about, that, that drew me here. Nic Fillingham:You talked about bringing the human element to security. What does that mean to you and how have you tried to bring that sort of culturally into your organization and, and, and beyond? Valecia Maclin:So, if you think about the human element of security, the operative word is human. And so as humans, we are a kaleidoscope of gender, and colors, and nationalities and experiences. 
Even if you were in the same town, you have a completely different experience that you can bring to bear. So, when I think about how I introduce, um, diversity, equity and inclusion in the organization that I lead, it is making sure that we're more representative of who we are as humans. And sometimes walking around Redmond, that you don't always get that, but it's the, you know, I, I come from the East Coast. So, you know, one of the going phrases I would use a lot is, I'm not a Pacific Northwestner or I don't have this passive aggressiveness down, I'm pretty direct (laughs). And so that's a different approach, right, to how we do our work, how we lean in, how we ask questions. Valecia Maclin:And so I am incredibly passionate about increasing the opportunities and roles for women and underrepresented minorities, underrepresented, uh, minorities in cybersecurity. And so we've been very focused on, you know, not just looking at internal folks that we may have worked on, worked on another team, you know, for years, and making sure that every opportunity in my organization is always opened up both internally and externally. They're always opened up to make sure that we're, we're looking beyond our mirror image to, um, hire staff. And it's powerful having people that think the same way you do, because you can coalesce very quickly. But the flip side of that is sometimes you can lose some innovation because everybody's seeing the same thing you see. And, and it's so important in, in security because we're talking about our threat actors typically having human element, is making sure that we can understand multiple voices and multiple experiences as we're designing solutions, and as we're thinking about what the threats may be. 
Natalia Godyla:So, for women or, uh, members of minority groups, what guidance do you have for them if they're not feeling empowered right now in security, if they don't know how to network, how to find leaders like yourself, who are supporting D&I? Valecia Maclin:One of the things I always encourage folks to do, and, and I mentor a lot is, just be passionate about who you are and what you contribute. But what I would say, uh, Natalia, is for them to take chances, not be afraid to fail, not be afraid to approach people you don't know, um, something that I got comfortable with very early is, if I was somewhere and heard a leader speak on stage somewhere, or I was, uh, you know, I saw someone on a panel internally or externally, I would go up to them afterwards and introduce myself and ask, you know, would you be willing to have a career discussion with me? Can I get 30 minutes on your calendar? And so that was just kind of a normal part of my rhythm, which allowed me to be very comfortable, getting to meet new executive leaders and share about myself and more importantly, hear about their journeys. Valecia Maclin:And the more you hear about others' journeys, you can help cultivate a script for your own. And so, so that's what I often encourage 'cause a lot of times folks are apr- afraid, particularly women and, and minorities are afraid to approach to say, think, well, you know, I don't know enough, or I don't know what to ask. It can be as simple as, I heard you speak, I would love to hear more about your story. Do you have time? Do you have 20 minutes? And then let, you know, relationships start from there and let the learning start from there. Nic Fillingham:As a leader in the security space, as a leader at Microsoft, what are you excited about for the future? What, what's sort of coming in terms of, you know, it could be cultural change, it could be technology innovation. What, what are you sort of looking and seeing in the next three, five, 10 years? 
Valecia Maclin:For me it's the cultural change. I'm looking forward and you heard me kind of allude to a little bit of this of, you now have the public increasingly aware of what happens when there's data loss. I'm so excited to look forward to that moment when that narrative shifts and the public learns and knows more of security hygiene, cybersecurity hygiene. And, and not, you know, both consumer and enterprise, because we take for granted that enper- enterprises have nailed this. And, and we're in a unique footing as a company to have it more part of our DNA, but not every company does. And so that's what I'm looking forward to for the future is the culture of that young person in the midst of schooling, not having to guess about what a cybersecurity or security professional is, much like they don't guess what a lawyer or a doctor is, right? So, that's what I look forward to for the future. Nic Fillingham:Any organizations, groups that you, you know, personally support or fans of that you'd also like to plug? Valecia Maclin:Sure. So, I actually support a, a number of organizations. I support an organization called Advancing Minorities in Engineering, which works directly with historically black colleges and universities to not only increase their learning, but also create opportunities to extend the representation in security. I also am a board member of SAFECode, which is also focused on advancing security, design, hygiene across enterprises, small, midsize and large businesses. And so, so those are, are certainly, uh, a couple of, of organizations that, you know, I dedicate time to. Valecia Maclin:I would just encourage folks, you know, we have TEALS, we have DigiGirlz. Everyone has a role to play to help expand the perception of what we do in the security space. We're not monolithic. The beauty of us as a people is that we can bring our differences together to do some of the most phenomenal, innovative things. 
And so that would be my ask is in, whatever way fits for where someone is, that they reach out to someone and make that connection. I v- I very often will reach down and, uh, I'll have someone, you know, a couple levels down and say, Oh my gosh, I can't believe you called and asked for a one-on-one. Valecia Maclin:So, I don't wait for folks to ask for a one-on-one with me. I, I'll go and ping and just, you know, pick someone and say, Hey, you know, I wanna, I just wanna touch base with you and see how you're doing and see what you're thinking about with your career. All of us can do that with someone else and help people feel connected and seen. Natalia Godyla:And just to wrap here, are you hiring, are there any resources that you want to plug or share with our audience, might be interested in continuing down some of these topics? Valecia Maclin:Absolutely. Thank you so much. Um, so I am hiring, hiring data architects, 'cause you can imagine that we deal with high volumes of data. I'm hiring software engineers, I'm hiring, uh, a data scientist. So, um, data, data, and more data, right?Natalia Godyla:(laughs).Valecia Maclin:And, um, and software engineers that are inquisitive to figure out the, the right ways for us to, you know, make the best use of it. Natalia Godyla:Awesome. Well, thank [crosstalk 00:35:11] you for that. And thank you for joining us today, Valecia.Valecia Maclin:Thank you, Natalia. Thank you, Nic. I really enjoyed it.Natalia Godyla:Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode.Nic Fillingham:And don't forget to tweet us @msftsecurity or email us at with topics you'd like to hear on a future episode. Until then, stay safe.Natalia Godyla:Stay secure.

Identity Threats, Tokens, and Tacos

Ep. 20
Every day there are literally billions of authentications across Microsoft – whether it’s someone checking their email, logging onto their Xbox, or hopping into a Teams call – and while there are tools like Multi-Factor Authentication in place to ensure the person behind the keyboard is the actual owner of the account, cyber-criminals can still manipulate systems. Catching one of these instances should be like catching the smallest needle in the largest haystack, but with the algorithms put into place by the Identity Security team at Microsoft, that haystack becomes much smaller, and that needle, much larger.
On today’s episode, hosts Nic Fillingham and Natalia Godyla invite back Maria Puertos Calvo, the Lead Data Scientist in Identity Security and Protection at Microsoft, to talk with us about how her team monitors such a massive scale of authentications on any given day. They also look deeper into Maria’s background and find out what got her into the field of security analytics and A.I. in the first place, and how her past in academia helped that trajectory.
In This Episode You Will Learn:
• How the Identity Security team uses AI to authenticate billions of logins across Microsoft
• Why fingerprints are fallible security tools
• How machine learning infrastructure has changed over the past couple of decades at Microsoft
Some Questions That We Ask:
• Is the sheer scale of authentications throughout Microsoft a dream come true or a nightmare for a data analyst?
• Do today’s threat-detection models share common threads with the threat-detection of previous decades?
• How does someone become Microsoft’s Lead Data Scientist for Identity Security and Protection?
Resources:
• #IdentityJobs at Microsoft
• Maria’s First Appearance on Security Unlocked, Tackling Identity Threats with A.I.
• Maria’s LinkedIn
• Nic’s LinkedIn
• Natalia’s LinkedIn
• Security Blog
Transcript
[Full transcript can be found at]
Nic Fillingham:Hello, and welcome to Security Unlocked, a new podcast from Microsoft where we unlock insights from the latest in news and 
research from across Microsoft security engineering and operations teams. I'm Nic Fillingham.Natalia Godyla:And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft security, deep dive into the newest threat intel, research, and data science. Nic Fillingham:And profile some of the fascinating people working on Artificial Intelligence in Microsoft security. Natalia Godyla:And now, let's unlock the pod.Nic Fillingham:Hello, Natalia. Welcome to episode 20 of Security Unlocked. This is, uh, an interesting episode. People may notice that your voice is absent from the... This interview that we had with Maria Puertos Calvo. How, how you doing? You okay? You feeling better?Natalia Godyla:I am, thank you. I'm feeling much better, though I am bummed I missed this conversation with Maria. I had so much fun talking with her in episode eight about tackling identity threats with AI. I'm sure this was equally as good. So, give me the scoop. What did you and Maria talk about?Nic Fillingham:It was a great conversation. So, you know, this is our 20th episode, which is kind of crazy, of Security Unlocked, and we get... We're getting some great feedback from listeners. Please, send us more, we want to hear your thoughts on the... On the podcast. But there've been a number of episodes where people contact us afterwards on Twitter or an email and say, "Hey, that guest was amazing," you know, "I wanna hear more." And Maria was, was definitely one of those guests who we got feedback that they'd love for us to invite them back and learn more about their story. So, Maria is on the podcast today to tell us about her journey into security and then her path to Microsoft. I won't give much away, but I will say that, if you're studying and you're considering a path into cyber security, or you're considering a path into data science, I think you're gonna really enjoy Maria's story, how she sort of walks through her academia and then her time into Microsoft. 
We talk about koalas and we talk about the perfect taco.Natalia Godyla:Yeah, to pair with the guac which she covered the first time around. Now tacos. I feel like we're building a meal here. I'm kind of digging the idea of a Security Unlocked recipe book. I, I think we need some kind of mocktail or cocktail to pair with this.Nic Fillingham:Yeah, I do think two recipes might not be enough to qualify for a recipe book. Natalia Godyla:Yeah, I mean, I'm feeling ambitious. I think... I think we could get more recipes, fill out a book. But with that, I, I cannot wait to hear Maria's episode. So, on with the pod?Nic Fillingham:On with the pod.Nic Fillingham: Maria Puertos Calvo, welcome back to the Security Unlocked podcast. How are you doing?Maria Puertos Calvo:Hi, I'm doing great, Nic. Thank you so much for having me back. I am super flattered you guys, like, invited me for the second time.Nic Fillingham:Yeah, well, thank you very much for coming back. The episode that we, we, we first met you on the podcast was episode eight which we called Tackling Identity Threats With AI, which was a really, really popular episode. We got great feedback from listeners and we thought, uh, let's, let's bring you back and hear a bit more about your, your own story, about how you got into security, how you got into identity, how you got into AI. And then sort of how you found your way to Microsoft. Nic Fillingham:But since we last spoke, I want to get the timeline right. Did you have twins in that period of time or had the twins already happened when we spoke to you in episode eight?Maria Puertos Calvo:(laughs) No, the twins had already happened. They-Nic Fillingham:Got it.Maria Puertos Calvo:I think it's been a few months. But they're, they are nine, nine months old now. Yeah.Nic Fillingham:Nine months old. 
And, and the other interesting thing is you're now in Spain. Maria Puertos Calvo:Yes. Nic Fillingham:When we spoke to you last, you were in the Redmond area or is that right? Maria Puertos Calvo:Yes, yes. The... Last time when we, we spoke, I, I was in Seattle. But I was about to make this, like, big trip across the world to come to Spain and, and the reason was, actually, you know, that the twins hadn't met my family. I am originally from Spain, and, and my whole family is, is here. And, you know, because of COVID and everything that happened, they weren't able to travel to the US to see us when they were born. So, my husband and I decided to just, like, you know, do a trip and take them. And, and we're staying here for a few months now. Nic Fillingham:That's awesome. I've been to Madrid and I've been to... I think I've only been to Madrid actually. Where, where... Are you in that area? What part of Spain are you in? Maria Puertos Calvo:Yes, yes. I'm in Madrid. I'm in Madrid. I, I'm from Madrid. Nic Fillingham:Aw- awesome. Beautiful city. I love it. So, obviously, we met you in episode eight, but if you could give us, uh, a little sort of mini reintroduction to who you are, what's your job at Microsoft, what does your... What does your day-to-day look like, that'd be great. Maria Puertos Calvo:Yeah. So, I am the lead data scientist in Identity Security and Protection, the identity security team who... We are in charge of making sure that all of the users who use, uh, Microsoft identity services, either Azure Active Directory or Microsoft account, are safe and protected from malicious, you know, uh, cyber criminals. So, so, my team builds the algorithms and detections that are then put into, uh, protections. Like, for example, we build machine learning for risk-based authentication. So, if we... 
If our models think an authentication is, is probably compromised, then maybe that authentication is challenged with MFA or blocked depending on the configuration of the tenant, et cetera. Maria Puertos Calvo:So, my team's day-to-day activities are, you know, uh, uh, building new detections using new data sets across Microsoft. We have so much data between, you know, logs and APIs and interactions b- between all of our customers with Microsoft systems. Uh, so, so, we analyze the data and, and we build models, uh, apply AI machine learning to detect those bad activities in the ecosystem. It could be, you know, an account compromise, a sign-in that looks suspicious, but also fraud. Let's say, like, somebody, uh, creates millions of spammy email addresses with Microsoft account, for example to do bad things to the ecosystem, we're also in charge of detecting that. Nic Fillingham:Got it. So, every time I log in, or every time I authenticate with either my Azure Active Directory account for work or my personal Microsoft account, that authentication, uh, event flows through a set of systems and potentially a set of models that your team owns. And then if they're... And if that authentication is sort of deemed legitimate, I'm on my way to the service that I'm accessing. And if it's deemed not legitimate, it can go for a challenge through MFA or it'll be blocked? Did, did I get that right? Maria Puertos Calvo:You got that absolutely right. Nic Fillingham:So, that means... And I think we might've talked about this on the last podcast, but I still... I... As a long-term employee of Microsoft, I still get floored by the, the sheer scale of all this. So, there's... I mean, there's hundreds of millions of Microsoft account users, because that's the consumer service. So, that's gonna be everything from Xbox and Hotmail and using the Bing website. So, that's, that's literally in the hundreds of millions realm. Is it... Is it a billion or is it... 
Is it just hundreds of millions?

Maria Puertos Calvo: It depends on how you count them. Uh, if it's per day, it's hundreds of millions, per month I think it's close to a billion. Yes, for... Of users. But the number of authentications overall is much higher, 'cause, you know, the users are authenticating in s- in s- many cases, many, many times a day. A lot of what we evaluate is not only, like, your username and password authentications, there's also the, you know, the modern authe- authentication protocols that have your tokens cached in the application, and those come back to request access. So, the... We evaluate those as well.

Maria Puertos Calvo: So, it's, uh... It's actually tens of billions of authentications a day for both the Microsoft account system and the Azure Active Directory system. Azure Active Directory is also a... Really big, uh, it's almost... It's, it's getting really close to Microsoft account in terms of monthly, monthly active users. And actually, this year, with, you know, COVID, and everybody, you know, the... All the schools, uh, going remote and so many people going to work from home, we have seen a huge increase in, in, in monthly active users for Azure Active Directory as well.

Nic Fillingham: And do you treat those two systems separately? Uh, or, or are they essentially the same? It's the same anomaly detection and it's the same sort of models that you'd use to score and determine if a... If an authentication attempt is, is, uh, is legitimate or, or otherwise?

Maria Puertos Calvo: It's, like, theoretically the same. You know, like, we, we use the same methodology. But then there are different... The, the two systems are different. They live in different places with different architectures. The data that is logged i- is different. So, these, these were initially not, you know... I- identity only, uh, took care of those two systems, like, a few years ago. Before that, they w- used to be owned by different teams.
So, the architecture underneath is still different. So, we still have to build different models and maintain them differently and, you know, uh, uh, tune them differently. So, so it is more work, but, uh, the, the theory and the idea, the... How we build them is, is very similar.

Nic Fillingham: Are there some sort of trends that have, you know, appeared, having these two massive, massive systems sort of running in parallel but with the same sort of approach? What kind of behaviors or what kind of anomalies do you see detected in one versus the other? Do they sort of function sort of s- similar? Like, similar enough? Or do you see some sort of very different anomalies that appear in one system and, and not another?

Maria Puertos Calvo: They're, interestingly, pretty different. Uh, when we see attack spikes and things like that, they don't always reflect one or the other. I think the, the motivation of the people that attack enterprises and organizations, it's, it's definitely different from the, the hackers that are attacking consumer accounts. I think they're, you know, they're sold in the black market separately, and they're priced separately, you know, and, and differently. And I think they're, they're generally used for different purposes. We see sometimes spikes in correlation, but, but not that much.

Nic Fillingham: Before we sort of, uh, jump in to, to your personal story into security, into Microsoft, into, into data science, is the... You know, these... Talking about these sheer numbers, talking about the hundreds of millions of, of authentications, I think you said, like, tens of billions that are happening every day. Is that a dream for a data scientist to just have such a massive volume of data and signals at your fingertips that you can use to go and build models, train models, refine models? Is that, you know... Is this adage of more signal equals better, does that apply?
Or at some point do you now have challenges of too much signal and you're now working on a different set of problems?

Maria Puertos Calvo: That's a great question. It is an absolute dream and it's also a nightmare. (laughs) So, yeah. It is... It... And I'll tell you why for both, right? Like, a... It is a great dream. Like, obviously, you bet... The, the sheer scale of the data, the, you know, the, the fact... There are a lot of things that are easier, because sometimes when you're working with data and statistics, you have to do a lot of things to estimate if, like, the things that you're computing are statistically significant, right? Like, do I have enough data to prove that this sample, it's going to be, uh, a reflection of reality, and things like that. With the amount of data that we have, with the amount of users that we have, it's the, we don't have that, we, we don't really have that problem, right? Like we are able to observe, you know, the whole rollout without having to, to figure out if what we're seeing, you know, it's similar to the whole world or not.

Maria Puertos Calvo: So that's really cool. Also, because we, you know, have so many users, then we also have, you know, we're a big focus for attackers. So, so we can see everything, you know, that happens in, in, in the cybersecurity world and, like, the adversary world, we can find it in, in our data. And, and that is really interesting. Right. It's, it's really cool.

Nic Fillingham: That sounds fascinating. But let, let, let's table that for a second. 'Cause I'd love to sort of go back in time and I'd love to learn about your journey into security, into sort of computer science, into tech, where did it all start? So you grew up in Madrid, is that right?

Maria Puertos Calvo: Yes.
I grew up in Madrid and when I was finishing high school and I was trying to figure out, like, what do I do, I just decided to study telecommunication engineering. It's what it's called in Spain, but, you know, the, the equivalent US degree is electrical engineering. Because I was actually, you know, really, really interested in math and science and physics. They were like my favorite subjects in high school. I was pretty, really good at it actually.

Maria Puertos Calvo: And, but at the same time, I was like, well, this, you know, an engineering degree sounds like something that I could apply all of this to. And the one that seems like the coolest and the future and like I, I, is electrical engineering. Like I, at that time, computer science was also kind of like my second choice, but I knew that in electrical engineering, I could also learn a lot of computer science.

Maria Puertos Calvo: It w- it has like a curriculum that includes a lot of computer science, but also you learn about communication theory and, you know, things like how do cell phones work? And how does television work? And you can learn about computer vision and image processing and all, all kinds of signal processing. I just found it fascinating.

Maria Puertos Calvo: So, so I, I started that in college and then when I finished college, it was 2010. So it was right in the middle of the great recession, which actually hit Spain really, really, really badly when it came to the, the labor market. The unemployment back then, I think it was something like 25%-

Nic Fillingham: Wow.

Maria Puertos Calvo: ... and people who were getting out of school, even in engineering degrees, which were traditionally degrees that would have, you know, great opportunities, they were not really getting good jobs. People, only consulting firms were hiring them, um, and, and really paying really, really little money. It was actually pretty kind of a shame. So I said, what, what, what should I do?
And I, I had been a good student during college, so, and I had a professor that, you know, he, that I had done my kind of thesis with him and his research group.

Maria Puertos Calvo: And he said, "Hey, why don't you just, like, continue studying? Like, you can actually go for your PhD and, because you have really good grades, I'm sure you can just get it fully financed. You can get a scholarship that will, like, finance, you know, four years of PhD. And you know, that way you don't have to pay for your studies, but also you kind of like, you're like a researcher and you have, uh, like, money to live." And I was like, well, that sounds like a really good plan.

Nic Fillingham: Sounds good.

Maria Puertos Calvo: Like I actually, yeah. So, so I ended up doing that. And, and I, you know, then my master's, this master's, say, wasn't computer science, but it was very pick and choose, right? Like, like you could pick your branch and what classes you took. And so the master's was the first half of the PhD, which was basically getting all your PhD qualifying courses, which also are equivalent to, to doing your master's.

Maria Puertos Calvo: So I picked kind of like the artificial intelligence type branch, which had a lot of, you know, classes on machine learning, and I learned a lot of things that apply machine learning, like, uh, natural language processing and speech and speaker recognition and biometrics and computer vision. Basically, all kinds of fields of artificial intelligence were in the courses that I took. And, and I really, really fou- found it fascinating. There wasn't, you know, a data science degree back then, like now everybody has a data science degree, but this is like 10 years ago. Uh, at least, you know, in Spain, there wasn't a data science degree.

Maria Puertos Calvo: But this is like the closest thing, uh, that, and that was my first contact with, uh, you know, artificial intelligence and machine learning. And I, I loved it.
And, and then I did my master's thesis on, uh, kind of like, uh, biometrics in, in terms of applying statistical models to forensic fingerprints to, to understand if a person can be falsely, let's say, accused of a crime because their fingerprint randomly matches a fingerprint that is found in a crime scene.

Maria Puertos Calvo: So kind of try to figure out, like, how likely is that. Because there have been people in the past that have been wrongly convicted, uh, because their fingerprints have been found in a crime scene. And then after the fact they have found the right person and then, you know, like, uh, it's not a very scientific method, what is followed right now. So that, that was a really cool thing too, that then I never did anything related to that in my life, but, but it was a very cool thing to study when I was in, in school.

Nic Fillingham: Well, that, that's fair. I've, I've got some questions about that. That's fascinating. So how did you even stumble upon that as a, as a, as a, as a research focus? Was there a, a particular case you might've read in the, in the news or something like, I, I think I've never heard of people being falsely accused or convicted through having the same fingerprints, I guess, unless you're an identical twin.

Maria Puertos Calvo: Mm-hmm (affirmative). (laughs) Actually, I can tell you, because I have identical twins, but also because I studied a lot about fingerprints, that identical twins do not have the same fingerprints.

Nic Fillingham: Wow.

Maria Puertos Calvo: Uh, because fingerprints are formed when you're in the womb. So they're not, they're not like a genetic thing. They happen kind of like, as a random pattern when, when your body is forming in the womb, and they happen, they're different.
Uh, so, so humans have unique fingerprints and that's true, but the problem with the, the fingerprint recognition is that it's very partial, and is very imperfect because the, the late- latent, it's called the latent fingerprint, the one that is found in a crime scene is then recovered, you know, using like some powder, and it's kind of like, you, you just found some, you know, sweaty thing on a surface, and then you have to lift that from there. Right.

Maria Puertos Calvo: And, and that has imperfections in, and it only, it's not going to be like a full fingerprint. You're going to have a partial fingerprint. And then, then you, basically, the way the matching works is using this like little poin- points and, and bifurcations of the ridges that exist in your fingerprint. And, and then, you know, looking at the, the location and direction of those, then they're matched with other fingerprints to understand if they're the same one or not. But the, because you don't have the full picture, it is possible that you make a mistake.

Maria Puertos Calvo: The one case that it's been kind of really, really famous actually happened with the Madrid bombings that happened in 2004, where, you know, they, they blew up, uh, some trains and, and a couple of hundred people died. Then they, they actually found a fingerprint in one of the, I don't remember, like in the crime scene and it actually matched in the FBI fingerprint database. It matched the fingerprint of a lawyer from Portland, Oregon, I believe it's what it was. And then he was initially, you know, uh, I don't know if he ended up being convicted, but, but you know, it wasn't-

Nic Fillingham: He was a suspect.

Maria Puertos Calvo: ... it was a really famous case. Yes. I think he was initially convicted. And then, but then he was not after they found the right person and they, they actually found that yeah, both fingerprints, like the, the guy whose fingerprint it really was.
And these other guys, they, their fingerprints both match the crime scene fingerprint, but that's only because it was only a piece of it. Right. You, you don't put your finger, like, you don't roll it left to right. Like when you arrive at the airport, right. That they make you roll your finger, and they have the whole thing. It's, you're maybe just, you know, the, the, the criminal fingerprint is, is very small.

Nic Fillingham: Was that a big part of the, the research, trying to understand how much of a fingerprint is necessary for a sort of statistically relevant or sort of accurate determination that it belongs to, to the, to the right person?

Maria Puertos Calvo: Yeah. So the results of the research did have some outcome around, like, depending on how many of those points that are used for identification, which are called minutiae, depending on how, how many of those are available, it changes the probability of a random match with a random person, basically. So the more points you have, the less likely it is that will happen.

Nic Fillingham: The one thing, like, as, as we're talking about this, that I sort of half remember from maybe being a kid, I don't know, growing up in Australia is, don't koalas have fingerprints that are the same as humans? Did I make that up? Do you know anything about this?

Maria Puertos Calvo: (laughs) I'm sure, I have no idea. (laughs) I have never heard such a thing.

Nic Fillingham: I have a-

Maria Puertos Calvo: Now I wanna know.

Nic Fillingham: ... I'm gonna have to look this up.

Maria Puertos Calvo: Yeah.

Nic Fillingham: I have a feeling that koa- koalas, (laughs) have fingerprints that are either very close to or indistinguishable from, from humans. I'm gonna look this one up.

Maria Puertos Calvo: I wonder if like a koala could ever be wrongly convicted of a crime.

Nic Fillingham: Right, right.
So like, if I want to go rob a bank in Australia, all I need to do is like, bring a koala with me and leave the koala in the bank after I've successfully exited the bank with all the gold bars in my backpack. And then the police would show up and they arrest the koala and they'd get the fingerprints and they go, well, it must be the koala.

Maria Puertos Calvo: Exactly.

Nic Fillingham: This is a foolproof plan.

Maria Puertos Calvo: (laughs)

Nic Fillingham: I'm glad I discussed this with you on the podcast. Thank you, Maria, for validating my poses.

Maria Puertos Calvo: Now, now you can't publish this.

Nic Fillingham: Oh, we talked about fingerprints. Oh, crumbs you're right. Yeah. Okay. All right. We have to edit this out of the, (laughs) out of there quick.

Maria Puertos Calvo: (laughs)

Nic Fillingham: Um, okay. I didn't realize we had talked so much about fingerprints. That's my fault, but I found that fascinating. Thank you. So what happens next? Do you then go to Microsoft? Do you come straight out of your education at university in Madrid, straight to Microsoft?

Maria Puertos Calvo: Kind of and no. So what happens next is that while I, I finished the master's part of this PhD, and at this time I'm actually dating my now husband, and he's an American, uh, working in Washington D.C. as an electrical engineer. So I, you know, I finished my master's and I say, well, I kind of wanna go be in the US, uh, so I can be with him. And, you know, I have this, the scholarship that actually lets me go do research abroad and, you know, like, kind of pays for it. So I find, um, another research group in the University of Maryland, College Park, which is really, really close to, to DC. And, and I go there to do research for, uh, six months. So, I spent six months there also doing research. Uh, also using, uh, machine learning for, for a different area, iris recognition.
And, you know, the six months went by and I was like, "Well, I want to stay a little longer," like, "I, you know, I really like living here," and I extended that, like, another six months. I... And at that point, you know, I wasn't really allowed to do that with my scholarship, so I just asked my professor to, you know, finance me for that time. And, and, uh, and at that time, I decided, like, you know, I, I actually don't think I wanna, like, pursue this whole PhD thing.

Maria Puertos Calvo: So, so I stayed six more months working for him, and then I decided I, I, I'm not a really big fan of academia. I went into research in, in grad school in Spain mostly because there weren't other opportunities. I was super, you know, glad I did 'cause I, I love all the research and the knowledge that I gained with all... You know, with my master's where I learned everything about Artificial Intelligence. But at this point, I really, really wanted to go into industry. Uh, so I applied to a lot of jobs in a lot of different companies. You know, figuring out, like, my background is in biometrics and machine learning. Things like that. Data science is not a word that had ever come to my mind that I was or could be, but I was more, like, interested in, like, you know, maybe software roles related to companies that did things that I had a similar background in.

Maria Puertos Calvo: For like a few months, I was looking in... I, I didn't even get calls. And I had no work experience other than, you know, I had been through college and grad school. So, I had... You know, and, and I was from Spain and from a Spanish university, and there was really nothing in my resume that was, like, oh, this is like the person we need to call. So, nobody called me. (laughs) And, and then one day, uh, I, I received a LinkedIn message from a Microsoft recruiter. And she says, "Hey, I have... I'm interested in talking to you about, uh, well, Microsoft." So I said, "Oh, my God. That sounds amazing."
So, she calls me and we talk about it, and she's like, "Yeah, there's like this team at Microsoft that is like run mostly by data scientists and what they do is they help prevent fraud, abuse, and compromise for a lot of Microsoft online services."

Maria Puertos Calvo: So, they, they basically use data and machine learning to do things like stopping spam for, doing, like, family safety like finding, like, things on the web that, that should be, like, not for children. They were also doing, like, phishing detection on the browser. Um, like phishing URL detection on the browser and a co- compromise detection for Microsoft Account. And so I was like, "Sure, that sounds amazing." You know? "I would love to be in the process." And I was actually lying because I did not want to move to Seattle. (laughs) Like, at that time, I was so hopeful that I would find a job at, you know, somewhere in DC on the east coast, which is like closer to Spain and where, where we lived. But at the same time, you know, Microsoft calls and you don't say no, mostly when nobody else is calling you.

Maria Puertos Calvo: Um, so, so I said, "Sure, let's, you know, I, uh... The, the least I can do is, like, see how the interview goes." So, I did the phone screen and then I... They, they flew me to Seattle and I had seven interviews and a lunch inter- and a lunch kind of casual interview. So, it was like an eight hour interview. It was from 9:00 to 5:00. And, you know, everything sounded great, the role sounded great. Um, the, the team were... The things that they were doing sounded super interesting. And, to my surprise, the next day when I'm at the airport waiting for my flight to, to go back to DC, the recruiter calls me and says, "Hey, you, you know, you passed the interview and we're gonna make you an offer. You'll have an offer in the... In the mail tomorrow." I was like, "Oh, my God." (laughs) "What?" Like, I could not... This...
It's crazy to me that this was, like, only seven years ago, it... But yeah.

Nic Fillingham: Oh, this is seven... So, this was 2014, 2013?

Maria Puertos Calvo: Uh, actually, when I did the interview, it was... It was more, more... It was longer. It was 2012.

Nic Fillingham: 2012. Got it.

Maria Puertos Calvo: And then I... And then starting at Microsoft in 2013.

Nic Fillingham: Got it.

Maria Puertos Calvo: I started as a... I think at that time, they called us analysts. But it was funny because the, the team was very proud of the, the fact that they were one of the first teams doing, like, real data science at Microsoft. But there were too many teams at Microsoft calling themselves data scientists, and basically only doing, like, analytics and dashboards and things like that. So, because of that, the team that I was in was really proud, and they didn't want to call themselves data scientists, so they... I don't know. We called ourselves, like, analyst PMs, and then we went from that to decision scientists, uh, which I never understood the, the name. (laughs) Uh, but yeah. So, that's how I started.

Nic Fillingham: Okay, so, so that first role was in... I heard you say So, were you in the sort of consumer email pipeline team? Is that sort of where that, that sat?

Maria Puertos Calvo: Yeah. Yeah, so, uh, the team was actually called safety platform. It doesn't exist anymore, but it was a team that provided the abuse, fraud, and, and, like, malicious detections for other teams that were... At the time, it was called the Windows Live division.

Nic Fillingham: Yes.

Maria Puertos Calvo: So, all the... All the teams that were part of that division, they were like the browser, right? Like, Internet Explorer, Hotmail, which was after named And Microsoft Account, which is the consumer ecosystem, were all part of that. And our team, basically, helped them with detections and machine learning for their, their abusers and fraudsters and, and, you know, hackers that, that could affect their customers.
So, my first role was actually in the spam team, anti-spam team. I was on outbound, outbound spam detection. So, uh, we would build models to detect users who send spam out from accounts so we could stop that mail, basically.

Nic Fillingham: And I'd love to know, like, the models that you were building and training and refining then to detect outbound spam, and then the kinds of sort of machine learning technology that you're, you're playing with today. Is there any similarity? Or are they just worlds apart? I mean, we are talking seven years and, you know, seven years in technology may as well be, like, a century. But, you know, are there common threads, are there common learnings from back there, or is everything just changed?

Maria Puertos Calvo: Yes, both. Like, there, there are, obviously, common threads. You know, the world has evolved, but what really has evolved is the, the, the underlying infrastructure and tools available for people to deploy machine learning models. Like, back then, we... The production machine learning models that were running either in, like, authentication systems, either in off- you know, offline in the background after the fact, or, or even for the... For the mail. The Microsoft developers had to go and, like, code the actual... Let's say that you use, like, I don't know, logistic regression, which is a very typical, easy, uh, machine learning algorithm, right? They had to, like, code that. They had to, you know... There wasn't like a... Like, library that they could call that they would say, "Okay, apply logistic regression to, to this data with these parameters."

Maria Puertos Calvo: Back then, it was, like... People had to code their own machine learning algorithms from, like, the math that backs them, right? So, that was actually... Made things so much, you know, harder. They... There weren't, like, the tools to actually, like, do, like, data manipulation, visualization, modeling, tuning, the way that we have so many things today.
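To make "coding logistic regression from the math" concrete, here is a minimal sketch of what that can look like with no machine learning library at all. It is an illustration only, not the code the team used: the function names, the learning rate, the epoch count, and the one-feature toy data set are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this example: p(x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the averaged gradient
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that example x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy data: one scaled feature (say, mail volume); label 1 = spam sender
X = [[0.1], [0.2], [0.3], [0.4], [1.6], [1.7], [1.8], [1.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic_regression(X, y)
```

With a modern library, all of the above collapses into a single call that fits the model, which is the change in tooling being described here.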
So, that, you know, made things kind of hard. Nothing was... Nothing was, like, easy to use for the data scientists. It... There was a lot of work around, you know, how do you... Like, manual labor. It was like, "Okay, I'm gonna, like, run the model with these parameters," and then, like, you know, b- based on the results, you would change that and tweak it a little bit.

Maria Puertos Calvo: Today, you have programs that do that for you. And, and then show you all the results in, like, a super cool graph that tells you, uh, you know, like, these are the exact parameters you need to use for maximizing this one, uh, you know, output. Like, if you want to maximize accuracy or precision or recall. That, that is just, like, so much easier.

Nic Fillingham: That sounds really fascinating. So, Maria, you now... You now run a team. And I, I would love to sort of get your thoughts on what makes a great data scientist and, and what do you look for when you're hiring into, into your team or into sort of your, your broader organization under, uh, under identity. What perspectives and experience and skills are you trying to sort of add in and how do you find it?

Maria Puertos Calvo: Oh, what a great question. Uh, something that I'm actually... That's... The, the answer to that is something I'm refining every day. The, you know, the more, uh, experience I get and the more people I hire. I, I feel like it's always a learning process. It's like, what works and what doesn't. You know, I try to be open-minded and not try to hire everybody to be like me. So, that's... I'm trying to learn from all the people that I hire that are good. Like, what are their, you know... What's, like, special about them that I should try to look for in other people that I hire. But I would say, like, some common threads, I think, it's like... Really good communication skills.

Maria Puertos Calvo: Like, o- obviously the basics of, you know, being...
Having s- a strong background in statistical modeling and machine learning is key. Uh, but many people these days have that. The, the domain knowledge is really important in our team because when you apply data science to cyber security, there are a lot of things that make the job really hard. One of them is the, the data is... What... It's called really imbalanced because mostly, most of the interactions with, with the system, most of the data represents good activities, and the bad activities are very few and hard to find. They're like maybe less than 1%. So, that makes it harder in general to, to, to get those detections.

Maria Puertos Calvo: And the other problem is that you're in an adversarial environment, which means, you know, you're not detecting, you know, a crosswalk in, in a road. Like, it's a typical problem of, of computer vision these days. A crosswalk's gonna be a crosswalk today or tomorrow, but if I detect an attacker in the data today and then we enforce... We do something to stop that attacker or to... Or to get them detected, then the next day they might do things differently because they're going to adapt to what you're doing. So, you need to build machine learning models or detections that are robust enough, that use, use what we call features or, or that look at data that is not going to be easy... Easily gameable.

Maria Puertos Calvo: And, and it's really easy to just say, "Oh, you know, there's an attack coming from, I don't know, like, pick a country, like, China. Let's just, like, make China more important in our algorithm." But, like, maybe tomorrow that same attacker just fakes IP addresses in, in a bot that, that is not in China. It's in, I don't know, in Spain. So, so, you just have to, you know, really get deep into, like, what it means to do data science in our own domain and, and, and gain that knowledge.
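The imbalance point above can be illustrated with a small sketch. Everything in it is assumed for the example: the 1,000 sign-ins, the 1% malicious rate, and the two made-up detectors. It shows why plain accuracy is misleading when bad activity is under 1% of the data, and why precision and recall are the numbers to watch.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, and recall for binary labels (1 = malicious)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Guard against division by zero when nothing is flagged or nothing is bad
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical data: 1,000 sign-ins, of which 10 (1%) are malicious
y_true = [1] * 10 + [0] * 990

# A useless "detector" that labels every sign-in as good
always_good = [0] * 1000

# A made-up detector that catches 8 of the 10 bad sign-ins and raises 12 false alarms
detector = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

baseline = evaluate(y_true, always_good)
detected = evaluate(y_true, detector)

print(baseline)
print(detected)
```

In this toy setup the detector that finds nothing scores higher on accuracy (990 of 1,000 correct) than the one that catches most of the attacks (986 of 1,000 correct), which is why accuracy alone says little on data this skewed.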
So, that knowledge, for me, is, is important but it's also something that, that you can gain in the job. But then things like the ability to adapt and, and then also the ability to communicate with other stakeholders what the data's actually telling us. Because it's, you know... You, you need to be able to tell a story with the data. You need to be able to present the data in a way that other people can understand it, or present the results of your research in, in a way that other people can understand it and really, uh, kind of buy your ideas or, or what you wanna express. And I think that that is really important as well.

Nic Fillingham: I sort of wanted to touch on what role... Is there a place in data science for people that, that don't have a sort of traditional or an orthodox or a linear path into the field? Can you come from a different discipline? Can you come from sort of an informal education or background? Can you be self-taught? Can you come from a completely different industry? What, what sort of flexibility exists or should there exist for adding in sort of different perspectives and, and sort of diversity in, in this particular space of machine learning?

Maria Puertos Calvo: Yes. There are... Actually, because it's such a new discipline, when I started at Microsoft, none of us started our degrees or our careers thinking that we wanted to go into data science. And my team had people who had, you know, degrees in economics, degrees in psychology, degrees in engineering, and then they had arrived to data science through, through different ways. I think data science is really like a fancy way of saying statistics. It's like big data statistics, right? It's like how do we, uh, model a lot of data to, like, tell us to do predictions, or, or tell us like what, how the data is distributed, or, or how different data based on different data points looks more like it's this category or this other category.
So, it's all really, like, from the field of statistics.

Maria Puertos Calvo: And statistics is used in any type of research, right? Like, when you... When people in medicine are doing studies or any other kind of social sciences are doing studies, they're using a lot of that, and, and they're more and more using, like, concepts that are really related to what we use in, in data science. So, in that sense, it's, it's really possible to come from a lot of different fields. Generally, the, the people who do really well as data scientists are people who have like a PhD and have then this type of, you know, researching... But it doesn't really matter what field. I actually know that there, there are some companies out there that their job is to, like, get people that come out of PhD programs, but they don't have like a... Like a very, you know, like you said, like a linear path to data science, and then, they kind of, like, do like a one year training thing to, like, make them data scientists, because they do have, like, the... All the background in terms of, like, the statistics and the knowledge of the algorithms and everything, but they... Maybe they're, they've been really academic and they're not... They don't maybe know programming or, or things that are more related to the tech or, or they just don't know how to handle the data that is big.

Maria Puertos Calvo: So, they get them ready for... To work in the industry, but the dat- you know, I've met a lot of them in, in, in, in my career, uh, people who have gone through these kind of programs, and some of them are PhDs in physics or any other field. So, that's pretty common. The self-taught route, it's also very possible. I think people who, uh, maybe started as, like, software engineers, for example, and then there's so much content out there that is even free if you really wanna learn data science and machine learning.
You can, you know, go from anything from Coursera to YouTube, uh, things that are free, things that are paid, but that you can actually gain great knowledge from people who are the best in the world at teaching this stuff. So, definitely possible to do it that way as well.Nic Fillingham:Awesome. Before we let you go, we talked about the perfect guacamole recipe last time because you had that in your Twitter profile.Maria Puertos Calvo:Mm-hmm (affirmative). (laughs)Nic Fillingham:Do you recall that? I'm not making this up, right? (laughs)Maria Puertos Calvo:I do. No. (laughs)Nic Fillingham:All right. So, w- so we had the perfect guacamole recipe. I wondered what was your perfect... I- is it like... I wanted to ask about tacos, like, what your thoughts were on tacos, but I, I don't wanna be rote. I don't wanna be, uh, too cliché. So, maybe is there another sort of food that you love that you would like to leave us with, your sort of perfect recipe?Maria Puertos Calvo:(laughs) That's really funny. I, I actually had tacos for lunch today. That is, uh... Yeah. (laughs)Nic Fillingham:You did? What... Tell me about it. What did you have?Maria Puertos Calvo:I didn't make them, though. I, I went out to eat them. Uh-Nic Fillingham:Were they awesome? Did you love them?Maria Puertos Calvo:They were really good, yeah. So, I think it's-Nic Fillingham:All right. Tell us about those tacos.Maria Puertos Calvo:Tacos is one of my favorite foods. But I actually have a taco recipe that I make that it's... I find it really good and really easy. So, it's shrimp tacos.Nic Fillingham:Okay. All right.Maria Puertos Calvo:So, it's, it's super easy. You just, like, marinate your shrimp in, like, a mix of lime, Chipotle... You know those, like, Chipotle chilis that come in a can and with, like, adobo sauce?Nic Fillingham:Yeah, the l- it's got like a little... It's like a half can. And in-Maria Puertos Calvo:Yeah, and it's, like, really dark, the sauce, and-Nic Fillingham:Really dark I think. 
And in my house, you open the can and you end up only using about a third of it and you go, "I'm gonna use this later," and then you put it in the fridge.Maria Puertos Calvo:Yes, and it's like-Nic Fillingham:And then it... And then you find it, like, six months later and it's evolved and it's semi-sentient. But I know exactly what you're talking about.Maria Puertos Calvo:Exactly. So that... You, you put, like, some of those... That, like, very smokey sauce that comes in that can or, or you can chop up some of the chili in there as well. And then lime and honey. And that's it. You marinate your shrimp in that and then you just, like, cook them in a pan. And then you put that in a tortilla, you know, like corn preferably. But you can use, you know, flour if that's your choice. Uh, and then you make your taco with the... That shrimp, and then you put, like... You, you pickle some sliced red onions very lightly with some lime juice and some salt, maybe for like 10 minutes. You put that on... You know, on your shrimp, and then you can put some shredded cabbage and some avocado, and ready to go. Delicious shrimp tacos for a week night.Nic Fillingham:Fascinating. I'm gonna try this recipe. Maria Puertos Calvo:Okay.Nic Fillingham:Sounds awesome.Maria Puertos Calvo:Let me know.Nic Fillingham:Maria, thank you again so much for your time. This has been fantastic having you back. The last question, I think it's super quick, are you hiring at the moment, and if so, where can folks go to learn about how they may end up potentially being on your team or, or being in your group somewhere?Maria Puertos Calvo:Yes, I am actually. Our team is doubling in size. I am hiring data scientists in Atlanta and in Dublin right now. So, we're gonna be, you know, a very, uh, worldly team, uh, 'cause I'm based in Seattle. So, if you go to Microsoft jobs and search in hashtag identity jobs, I think, uh, all my jobs should be listed there. 
Um, looking for, you know, data scientists, as I said, to work on fraud and, and cyber security and it's a... It's a great team. Hopefully, yeah, if you're... If that's something you're into, please, apply.Nic Fillingham:Awesome. We will put the link in the show notes. Thank you so much for your time. It's been a great conversation.Maria Puertos Calvo:Always a pleasure, Nic. Thank you so much. Natalia Godyla:Well, we had a great time unlocking insights into security, from research to Artificial Intelligence. Keep an eye out for our next episode.Nic Fillingham:And don't forget to tweet us @msftsecurity or email us at with topics you'd like to hear on a future episode. Until then, stay safe.Natalia Godyla:Stay secure.

Re: Tracking Attacker Email Infrastructure

Ep. 19
If you use email, there is a good chance you’re familiar with email scams. Who hasn’t gotten a shady chain letter or suspicious offer in their inbox? Cybercriminals have been using email to spread malware for decades and today’s methods are more sophisticated than ever. In order to stop these attacks from ever hitting our inboxes in the first place, threat analysts have to always be one step ahead of these cybercriminals, deploying advanced and ever-evolving tactics to stop them. On today’s podcast, hosts Nic Fillingham and Natalia Godyla are joined by Elif Kaya, a Threat Analyst at Microsoft. Elif speaks with us about attacker email infrastructure. We learn what it is, how it’s used, and how her team is combating it. She explains how the intelligence her team gathers is helping to predict how a domain is going to be used, even before any malicious email campaigns begin. It’s a fascinating conversation that dives deep into Elif’s research and her unique perspective on combating cybercrime. In This Episode, You Will Learn: • The meaning of the terms “RandomU” and “StrangeU” • The research and techniques used when gathering intelligence on attacker email infrastructure • How sophisticated malware campaigns evade machine learning, phish filters, and other automated technology • The history behind service infrastructure, the Necurs takedown, Agent Tesla, Diamond Fox, Dridex, and more Some Questions We Ask: • What is attacker email infrastructure and how is it used by cybercriminals? • How does gaining intelligence on email infrastructures help us improve protection against malware campaigns? • What is the difference between “attacker-owned infrastructure” and “compromised infrastructure”? • Why wasn’t machine learning or unsupervised learning a technique used when gathering intelligence on attacker email campaigns? • What should organizations do to protect themselves? 
What solutions should they have in place?Resources: What tracking an attacker email infrastructure tells us about persistent cybercriminal operations: Kaya:’s LinkedIn:’s LinkedIn: Security Blog:[Full transcript can be found at]Nic Fillingham:Hello, and welcome to Security Unlocked, a new podcast from Microsoft where we unlock insights from the latest in news and research from across Microsoft security engineering and operations teams. I'm Nic Fillingham. Natalia Godyla:And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft security, deep dive into the newest threat intel, research, and data science.Nic Fillingham:And profile some of the fascinating people working on artificial intelligence in Microsoft security. Natalia Godyla:And now, let's unlock the pod.Nic Fillingham:Hello, Natalia. Welcome to episode 19 of Security Unlocked. How are you? Natalia Godyla:I'm doing great. I'm excited to highlight another woman in our series for Woman's History month, so this'll be number two. And I'm excited to talk about email infrastructures.Nic Fillingham:Yes, I am too. Email, we use it every day. We probably use it more than we, we want. We love it. We can't live without it. What's your first memory of email? What was your first email address?Natalia Godyla:I was an AOL-er. First email was I'm super proud of that one. Nic Fillingham:What's the reference to 2002?Natalia Godyla:I'm pretty sure that's when I got my first pair of glasses (laughs).Nic Fillingham:Ah. And you- Natalia Godyla:I was very excited. I threw a cupcake party.Nic Fillingham:Oh, wow. Natalia Godyla:(laughs) Nic Fillingham:So I'm, I'm pretty old. It was sort, sort of the mid 90s, and I remember like, hitting websites where it asked for an email address, and I'm like, what is an email address? Natalia Godyla:(laughs) Nic Fillingham:I probably used the internet the best part of, you know, six months before someone explained it to me. 
And I worked out how to get a Hotmail address, which is called Hotmail because it was actually based on the, the acronym H-T-M-L, and they just put a couple other letters in there to expand it out to say Hotmail. And I remember being, thinking like I was the bee's knees, because I was Natalia Godyla:(laughs) Nic Fillingham:We should have asked our guest Elif Kaya, who you're about to hear from, about her first email address, but we didn't. Instead, we talked about a blog that she helped co-author, uh, that was published beginning of February called, "What tracking an attacker email infrastructure tells us about persistent cybercriminal operations." It's a fascinating conversation, and Elif walks us through all of the research that she did here where we learn about attacker email infrastructure and how it's used and created and managed. Nic Fillingham:There's a bunch of acronyms you're going to hear. The first one, DGA, domain generation algorithm. You're going to hear StrangeU and RandomU, which are sort of collections of these automatically created domains. And if you sort of want to learn a bit more about them, it's obviously in the blog post as well. Natalia Godyla:Yes, and in addition to that, you'll hear reference to Dridex. So, as the RandomU and StrangeU infrastructure was emerging, it was parallel to the disruption of the Necurs botnet, and those same malware operators who were running the botnet were also using malware like Dridex. And Dridex is a type of malware that utilizes macros to deliver the malware. And with that, on with the pod.Nic Fillingham:On with the pod. Nic Fillingham:Elif Kaya, welcome to the Security Unlocked podcast. Thank you for joining us.Elif Kaya:It's great to be here. Thanks for having me.Nic Fillingham:Now, you were part of the, uh, team that authored a blog post on February 1st, 2021. The blog post is "What tracking an attacker email infrastructure tells us about persistent cybercriminal operations." Loved this blog post. 
I've had so many questions over the years about how these malware campaigns work. What's happening behind the scenes? Where are all the, the infrastructure elements? How are they used? And this blog helped answer so much and sort of joined dots. Nic Fillingham:If you are listening to the podcast here and you're not sure what we're talking about, head to the Microsoft Security blog. It is a post from Feb 1st. But Elif, could you sort of give us an overview? What was discussed in this blog post? What was sort of the key takeaway? What was the research that you conducted?Elif Kaya:Sure. So uh, I'm part of a, a email research and threat intelligence team, uh, that supports the Defender product suite at Microsoft, and what we primarily focus on is tracking email campaigns and email trends over a long period of time and documenting those. So, this blog post kind of came out of a long series of documentation, in which we started to bubble up these trends in infrastructure, which is one of my focus areas, starting back in March and running uh, all through the end of the year, where a large series of disparate email campaigns, kind of stretching from very commodity malware that is available for like 15, 20 dollars, to things associated with big name actors, and et cetera, were being delivered with very similar characteristics, despite on the surface the malware being very different, the outcomes being very different, or the cost of the malware targets being very different. Elif Kaya:And so, we were able to see within each of these individual campaigns that the infrastructure supporting the email delivery was a consistent theme. So, it starts with when these domains that were used as email addresses to send these from, uh, started being registered to the current day and kind of what campaigns they helped facilitate, when they were registered, and et cetera. So, when people usually talk about infrastructure that supports malware, a lot of the terms get used overlapping. 
So, when people refer to infrastructure, they generally are referring to the C2 addresses, call back addresses that the attacker that owns the malware owns. Elif Kaya:But what we've been seeing much more frequently, and what we wanted to explain with the blog post, is that in really concrete ways like you said with actual examples, is that the malware and cyber crime infrastructure is very modular. And so, when we say infrastructure we could mean who's sending the emails from their servers, who's hosting the email addresses, who's hosting the phish kits, who's hosting the delivery pages that deliver the malware, and who's writing the malware. And then later, who's delivering the ransomware. Elif Kaya:And so these could, in any particular campaign or any particular incident that a SOC is looking at, be entirely different people. And so, the reason we wanted to do this blog and detail kind of what we did here and go through each of the cam- malware campaigns that was delivered, was to kind of show like, if you're only focusing on each malware campaign, the next one's going to be queued right up and use all the same infrastructure to deliver maybe something maybe more evasive that, that you'll have to get on top of. Elif Kaya:And so, by doing this tracking you can kind of up level it once more, and instead of spending all your time trying to evade one particular malware strain that's going through constant development, you could put a higher focus at stopping kind of the delivery itself, which, we actually detail through the blog, was very consistent over nine months or so, but had a lot less attention focused on it. Elif Kaya:So, some of the cases that we discuss in the blog are cases like Makop, which was used very heavily, especially in South Korea, all throughout April and all throughout the spring, and is still pretty prevalent in terms of direct delivery ransomware in that region. 
It's usually delivered through other means, but what we saw and what we theorized is that whenever the standard delivery mechanisms for those malware are interrupted, they'll kind of sample other infrastructure delivery providers, which is what we describe as StrangeU and RandomU in the blog. Elif Kaya:We use the term StrangeU and RandomU to differentiate two sets of DGA, or domain generation algorithm domain structures that we saw. StrangeU always uses the word strange. Not always, but nearly about 95% of the time. And RandomU, couldn't find a better name, but it's just a standard random DGA algorithm, where it's just a bunch of letters and characters. We don't really have a fancy name to give it, but we were able to kind of coalesce around what that was internally, and track the domains as they were registered there. And then, shortly after they would be registered, they would start sending mail from those domains.Nic Fillingham:Elif, were you and the team surprised by how much interconnected overlap, agility, and sharing, for want of a better term, there was across these different groups and campaigns and techniques? Were you expecting to see lots of disconnected siloed activities, techniques, groups, et cetera, et cetera? Or were you expecting this amount of overlap, which we'll get to when we sort of explain the, the stuff in the blog?Elif Kaya:So, I think it was less that it was a bit of a surprise, and more that we don't often get a pristine example like this. Frequently, when we look at the connected infrastructure, they don't use domains necessarily. They'll use the botnet itself and IP addresses for delivery or other things. So, when we came across this one, we do normally handle and really do a deep dive in individual incidents and cases, so this was a little bit more of a unique example of like, hey, there's really clear patterns here. 
What can we learn by tracking it over a long period of time, in ways that other metrics are a little harder to track?Elif Kaya:But yeah, I, I would say that in general, most email campaigns and phishing campaigns, malware campaigns that you kind of run across, they are gonna have these threads of interconnectivity. They're just going to be at different levels. So, whether that's going to be a level that is kind of more visible for uh, blue teams like the email addresses, the domains themselves, or whether that's going to be something more ephemeral like IP addresses and hosting providers, or whether that's going to be something that's proxied even more so, like a cluster of compromised domains, similar to, to, you know, what Emotet uses, uh, or used to use, collected in a botnet that has a different way of clustering itself. Elif Kaya:And so for these, we were able to just kind of have something that bubbled to the top and made it easy to connect the dots, as well as other items in the header in the malware that we were able to identify. But I think through tracking this, we were able to kind of reaffirm and make a good piece of public example for blue teams that this is a very common method. This is a very common modular technique, Elif Kaya:... And it's very simple for attackers to stand this kind of thing up and offer their services to other places. And that's part of why we reference the Necurs botnet as well. Dridex makes a big appearance in the StrangeU and RandomU deliveries, especially later on in our tracking of them, and Dridex is also a prominent, um, delivery from a lot of other of these types of delivery botnets that have happened in the past, whether that's CutWail or, uh, Necurs or other, um, botnets like that. 
So it, it's very common but it's sometimes very hard to kind of keying in on all of the distinct components of it and evaluate like, is it worth it in this instance to key in on it, um, when our main goal is like, what is the most effective thing we can do to stop the deliveries?Natalia Godyla:I'd love to talk a little bit about the history that was described in the blog for the service infrastructure. So from what I understand, the Necurs takedown created a gap in the market where StrangeU and RandomU were able to step in and provide that in- necessary infrastructure. So why was that the replacement? Was there any connection there? And as a second part to that question, what does the evolution of these infrastructures look like? How are they accessible to operators that want to leverage them?Elif Kaya:Right. So in this one I can delve a little more into kind of just intuition and, and doing that, because my full-time role is not specifically to, you know, track all the, all the delivery botnets there are and active. The reason that we made the connection to Necurs wasn't because there was an actual connection in terms of affirming this is filling the same role that it was, or this is filling a hole. Because we don't have necessarily a clear picture of every delivery botnet there is. Because the timeframe was very close and because we were able to see shortly after, uh, StrangeU and RandomU started delivering, they initially only had pickup from commodity malware that we could find. So very cheap malware for the first few months of their delivery, such as Makop. Uh, we saw some Agent Tesla, we saw some Diamond Fox.Elif Kaya:But as it progressed on, it started picking up the bigger names like Dridex and doing larger campaigns that were more impactful as well. And so by the time that Necurs had ended, we had also seen them doing a lot of those bigger name malwares as well. 
And so the reason why we tried to make that comparison was largely to show that something very simple and kind of perhaps much less sophisticated and lasting for a lot less length of time than Necurs in the environment can get customers quickly. And so while we didn't do a deep dive into any of the amount of like, how is it being advertised, how are they getting the customers, what we wanted to show is that regardless of what methods they're using to get the customers, they're able to get-Elif Kaya:Basically the, the amount of research that was done for Necurs was much more in depth than the amount of research that was necessarily done here. And it was also done from a different angle, that angle was much more operator focused and our angle was much more, what was delivered, what was the impact, what were the trends between all of the different mails? And so we're mostly trying to just position it as, this fulfilled a similar, uh, outcome and got a lot of coverage of something that was very big, lasted for a very long time, many years, and something where somebody just started registering some domains, setting up some mail servers, was able to kind of get off the ground and running in just a few months for relatively low cost.Nic Fillingham:So, Elif, we normally start with an introduction, but I, I got so excited about this topic that I jumped straight into my first question and I didn't give you an opportunity to introduce yourself. And I wondered, could you do that for us? I know you're, I believe you're a threat analyst or a threat hunter, is that correct?Elif Kaya:Yeah, so I'm currently a threat analyst, and you've actually had other people, I think, from my team on here already before. But yeah, I, I'm a threat analyst at Microsoft. 
I've been on this particular team for about a year now, specifically focusing in email threats, web threats, and I do have especially some focus in infrastructure tracking and domain, uh, generation algorithms in general and trying to make sure that our emails and campaigns that we're tracking are properly scoped and that we're able to kind of extract as many TTPs as we can from them. Elif Kaya:And so the role of our team and the role of myself in particular on the team is, when we do these individualized campaigns we look for the IOCs and things like that in it. We scope it, but what we're really looking for is, um, the trends of what's happening so that we can kind of try and pinpoint and escalate to the other teams internally the most impactful changes we could make to the product, or the most impactful changes we could recommend that customers do, if it's something that we don't have a product for or we don't have a protection for, in order to protect against the campaign. And so in this particular instance with this infrastructure, our goal here was to kind of really reiterate to customers that despite all this complexity, the spaghetti-like nature of this, at the end of the day all these different campaigns used kind of a lot of the same both delivery to deliver the email, but the Word documents that they delivered were also very similar.Elif Kaya:There, there were a lot of configurations that can be made on the endpoint to kind of really nullify a lot of these campaigns despite what we were able to see and some really evasive techniques that they were developing, the malware operators, over the time.Nic Fillingham:Yeah, I, I wonder if you could talk a little bit about how the research was actually conducted. A lot of these domains were not hosted by Microsoft infrastructure, as I, as I understand it. I think you sort of cover that a little bit in the blog. So how do you as a, in, you know, in your role, how do you go about conducting this research? 
Are you setting up honey pots to try and, uh, receive some of these, these emails and just sort of be a part of the campaign, and then you, you conduct your analysis from there? What, how do you go about, uh, performing this research?Elif Kaya:So the bulk of the research I think is performed with various, like some of it is honey pots and some of it's that. A lot of the research that is covered in the blog after we, uh, analyze the malware campaigns, which is a service we offer through, um, MTE, which I think there have been people from MTE that have come on as well, as well as analysis that we do, again, based on, uh, the malware samples that we receive and the email samples that we receive from reports, from externally as well as from open source intelligence. A lot of the domain research here, though, is actually done from, uh, open information. So any domain registrations that there are, the registration fingerprint, as I like to call it, which is all the metadata related to the registration, is publicly available. And so we collect a lot of that information and search it internally. Elif Kaya:And this is always something that I like to advise and encourage blue teams at any particular organization, you know, if they have a little bit of extra funding, to try and invest in as well. Because it's definitely, even though it's free and publicly available, you're generally gonna have to get a subscription or set up some kind of collection order to query the WHOIS databases and the passive DNS databases that you'll need in order to do some of these pivots. But it kind of starts with finding the malware campaigns and then finding the emails, and then pivoting up towards everything else we can do. And once you have kind of a net of what you're looking for, sender domains and et cetera, you can then kind of go backwards and say, "Okay, now show, show me all the malware campaigns that we have investigated that, that have these components to them. 
Show me all the phishing campaigns that have these components to them."Elif Kaya:And so it's kind of going up and then going back down, but all clustered around that registration data and that domain data. Uh, because whether an attacker decides to use IP addresses or whether they decide to do domains, there's usually always some component of their campaign that they have to use attacker-owned infrastructure for, if that makes sense. We see a lot and it's very common for attackers to u- use compromised infrastructure, so WordPress sites, things like that, to host a lot of their architecture. But especially for things like C2s for mail delivery and other things, they're gonna want some resilient infrastructure that they'll own themselves. And so at what point in the chain they decide to do that is usually an opportunity for us to be able to see if there's any OPSEC errors on their part, and also see if they've conducted other campaigns with that same infrastructure. Yeah, and so differate- differentiating between attacker-owned infrastructure and compromised infrastructure is an additional critical component.Natalia Godyla:Now I'm trying to decide which question to go forward with. Can you describe the distinction between those two?Elif Kaya:Right. So attacker-owned infrastructure would be something the attacker sets up themselves. So they have to think of the, and populate the data in the domain address and the registration and the tenant themselves. So this encompasses both when attackers use free trial subscriptions for cloud services, it's whenever they go log into Namecheap and they register their own domains, as well as when they have dedicated IP hosting or bulk group hosting as well that they have decided like, "For this portion of my campaign," whether that's command and control, whether that's delivery or et cetera, "I need to make sure that I'm in control of this." 
We have seen examples where compromised infrastructure, which is the reverse of that where especially small businesses, parked domains, and other insecure WordPress sites, sites that have other types of vulnerabilities, will be compromised and used to, again, do any, any component of that kill chain, whether that's sending mails, hosting the malware, and will be used to do those things as well.Elif Kaya:So compromised infrastructure is when the attacker will utilize someone else. The benefit for attackers is it's definitely a lot harder for defenders to identify or take action against that, especially because they don't know how long it'll be compromised for, if it'll ever not be compromised, if the attacker's only leasing access to the compromised domain through a, a kind of, uh, cyber crime as a service provider or not. It becomes harder for the defenders to defend against and detect, because it has less points of contact and familiarity with other compromised domains. If somebody compromises a blog about kittens and a blog about race cars, it's gonna be pretty hard for a lot of things to pick up exactly what's similar about them, because some Elif Kaya:... other human worlds apart has made the whole blog but if one attacker has-Nic Fillingham:Probably Natalia GodylaElif Kaya:... made five to 15 different sites in a day. (laughs) Yeah, it's a, it's going to have a lot more in common. But the downside of compromised domains for attackers is a, they often have to lease them from the people that initially compromise them and c, those compromised domains could become uncompromised, they have to now maintain access to something they didn't make. And we did also see that with Emotet, over the summer when it had come back after being quiet for very long, and people had replaced their payloads on compromised sites with, uh, I think GIFs with cats, something like that. 
We're back to cats.Nic Fillingham:You're speaking our language here, like we're, we're, we're on the edge of our seat, you said cat like twice in like a minute.Natalia Godyla:(laughs)Elif Kaya:But when an attacker comprises a lot of their infrastructure on compromised infrastructure, other attackers could compromise it, defenders could compromise it, anyone can kind of... They have to now protect it, whereas if they made it from nowhere and no one owned it, except for them, it's kind of a lot easier for them to just hang out. Because then the kind of only person that's looking out for them a lot of the time, is if somebody is connecting the dots on the infrastructure or the hosting providers, like I think the ones that we cover here is like, IronNet, Namecheap, et cetera, if they're looking out for somebody hosting on their, their infrastructure. But if somebody is just sitting there, they're just being quiet, they're just sending mail, nobody's going to notice that they're compromised probably. Whereas if you're a small business owner and your site ends up on a block list, you're going to go start asking questions, you're going to start trying to get that fixed or take your site down.Nic Fillingham:Elif, I'd love to come back to what you talked about with the way that you conducted this research and you, you, you said that getting subscriptions to WHOIS services and DNS records, this is all public record. But there is still some tools required to parse through that information and, and create the pivots. We were talking offline, before we started recording, I'll paraphrase here and please correct me, that you didn't utilize really machine learning as a tool to discover these techniques. Is that, is that correct? 
Can you talk more about what techniques you did use and didn't use and why something like machine learning or unsupervised learning was not either necessary in this space or wasn't necessary to discover these techniques?Elif Kaya:Yeah, I mean, I could talk to the, the techniques that I used and well, I can't say explicitly like why machine learning would or would not be helpful here because I'm not an expert on machine learning. I think in the different campaigns that I've worked on in my career in security, whether it's this one or before I came to Microsoft, I did some more independent research on a large set of Chrome extensions that were also connected by various, uh, commonalities to get those taken down. A lot of this research that can be pretty impactful and pretty widespread doesn't require ML in order to parse and to navigate. And I think part of the reason that ML is a bit unsuited for this at the moment, is because there hasn't been as much manually focused research. And there's been a lot of research done by independent researchers and people in the security community but I have seen a lot less focus in terms of data from tech companies in doing and making publicly available some of this infrastructure-centered research. Elif Kaya:And so what I mean by that is that a lot of security companies focus a lot on the actor name. They focus a lot on the reverse engineering of the malware and those are critical components. In part because that's what the products that they're sometimes selling is AV services and things like that and that's the point in time that they are protecting against the threat. But when it comes to the infrastructure, companies that would be the most positioned to protect against that threat or have products to protect against that threat, aren't necessarily doing the manual body of research currently necessary I think, in order to guide ML to kind of identify this work. 
And so right now to say, "Oh, would this be something that ML would be suited to step in?" Elif Kaya:And I think that it could in the future be suited to step in slightly but I also think that the way that this works, is currently operating at a level that actually does benefit from, from manual analysis at this time. In part, because it, it doesn't actually take tools that are generally above or beyond what is in a lot of analysts' tool sets with basic scripting and things like that. Because right now there has been such a non focus from security companies and blue teams, I think on infrastructure and infrastructure commonalities and the way that these campaigns are so modular that, for lack of a better word, there's not a lot of sophistication in it. Most of the sophistication we see in these campaigns is designed to evade automated technology. They're designed to evade ML. They're designed to evade phish filters. They're not really designed to evade humans looking at them, because I think you and me looking at those strange new domains, like you can look at a cluster of them and be like, "These aren't real sites, they're not real."Natalia Godyla:(laughs)Nic Fillingham:Yeah. I'm not, I'm not going to visit a website called, I'm gonna pick one up here like, eninaquilio.u... Maybe I would actually, that, that looks really cool. (laughs) Okay, gonesa.usastethkent, it's got like no vowels, like he replies strange secure world.Elif Kaya:And so we don't actually see a lot of, I guess, advancement in that space from attackers. A lot of the advancement is there in different parts that aren't necessarily bubbled up, but it's happening in the malware itself, in order to evade AV in order to not get alerts that fire on them. It's not necessarily happening to use something other than a macro or send from something other than an obvious phishing email or an obvious phishing source. 
And a lot of times, uh, one part that's one of my favorite parts is these, these registrations frequently use the .us domain. Many top level domains actually prohibit different parts of obfuscation for the registration record. And so when you register a domain, obviously the attacker kind of doesn't want to use real data, it's not the real name. But they'll use like memes and other things in the registration information, because it's fake data but then you can go and pivot and find where they've used the same meme before. And so-Nic Fillingham:Look for old domains registered by Rick Astley. Natalia Godyla:(laughs)Elif Kaya:Yeah, I think there was one-Nic Fillingham:You might be too young for that, me and my friend-Elif Kaya:There was, there was one that I think was used, I forget for which one of these malware campaigns where a lot of the registrations were actually happening under a registered email, that was something like, or something (laughs) or like, youcan' And I was like-Nic Fillingham:Try me. Natalia Godyla:Challenge accepted.Nic Fillingham:It's like a big red, a big red arrow pointing at them. Elif Kaya:What is happening in the infrastructure space for a lot of these things is happening pretty rapidly, it's happening at pretty low costs. And it's also happening and looking a lot different and is in a way a lot less glamorous, than a lot of the reverse engineering that is necessarily done but it's very critical. Or the more nation state tracking that is, uh, very popular when companies are selling threat intelligence products to customers. But when it comes to like security, kind of in a SOC, a lot of what is going to get through the doors is regular phishing emails.Natalia Godyla:So if the campaigns are targeting the automation that's built in, like you said, the phishing filters, what should organizations be doing to protect themselves? 
What solutions should they have in place, processes?Elif Kaya:So some of the big things that I remember from these particular campaigns, um, is if you are running any kind of mail protection service or mail service in general, please periodically check your allow lists. The allow lists will frequently have entire IP ranges, entire domain ranges and so even domains like these ones that are very randomized and they're strange and you've never received an email from before in your life. Sometimes the configurations of your allow lists for emails can completely cause the mails to bypass other filters. So definitely whether you're running Microsoft for your mail protection or not, please periodically check your allow lists and your filters and kind of have a good understanding of like, do I have any instances where phishing or malware would bypass other protections? Have I set that up? So that's one thing that I think does cut down a lot on some of these, making it to inboxes.Elif Kaya:Another as we... And part of the reason why we highlighted it in each of the malware campaigns involved here is, uh, the suite of... I always forget the acronym, ASR rules, advanced security rules or configurations that Microsoft offers for Office in particular for macro executions and malicious Office executions, routinely outside of this blog and other, it's still Office Word documents, it's still Office Excel documents, it's still macro buttons. And so re-evaluating your controls there and your protections there, especially looking at some of the automatic configurations that we have available now to just turn on, that is going to help there a lot as well. I think those are the two biggest like controls that I would recommend people for these kind of items, is checking kind of your allow lists pretty periodically and what your filtering policies are. 
And checking your, specifically, if you are using Office 365 internally, whether you have configurations set up to not necessarily even just restrict but there are more granular configurations now that you can set up to specifically restrict DLL and other execution from Office macros as well.Nic Fillingham:Elif, in the section of the blog where it talks about the Dridex campaigns big and small, June to July and beyond. It reads here, that it feels like you uncovered a section of sort of experimentation and testing of sort of new techniques. There's references to Shakespeare, there's something I've never heard of called, VBA stomping. Can you talk a little bit about what kinds of experimentation and creativity that you stumbled upon as part of this research? First of all, and what is VBA stomping?Elif Kaya:Uh, so VBA stomping, I think we might've actually meant VBA purging in the blog. I'm trying to remember whether, I think it might've been VBA purging, but surprisingly VBA stomping and purging are separate, but they fulfill the same kind of function, which is to try and make that macro, that like spicy button that everybody wants to press, a little harder for malware detection engines to detect. So VBA stomping and purging both operate a little bit differently, but their main goal is to kind of obfuscate the initial VBA code from the actual malicious code in general. So that when antivirus engines try and examine it, they're going to see all that Shakespeare text and they're not going to see the malware. And as for the Shakespeare text, (Laughs) it's actually still on VirusTotal. I think if people go and check for any of the files that reach out to the and DFIR, the blog did a great writeup called I believe "Dridex: From Word to Domain Dominance" which actually covers in their sandbox what happened after they ran this doc. Which eventually moved to a PowerShell Empire attempt within their sandbox. 
Elif Kaya:But yeah, as far as I can tell from the Shakespeare use for this, it's actually not the first time that poetry (Laughs) and kind of Shakespeare has been used to obfuscate malware. There have been other RATs in the past that have used this. Uh, we couldn't find any similarity like this, this was not those. But oddly enough, there is occasionally every now and then poetry or Shakespeare, other things that are used as obfuscation techniques to kind of pad out documents. And in this case, what we actually found is every iteration of the Word document that we could find, had all of the functions and pretty much all the code within the document replaced by different random lines. Elif Kaya:So there weren't actually any contiguous lines within it. So if you looked at two docs, one might have some lines from Hamlet, one might have some lines from some other kind of literature document as well. But I imagine that it was more so just additional stuff to make it. If you're looking for a function in this document, it's gonna look different in this one. If I had to guess, I would say it's probably something similar to an actual defensive technique that we, we being, I guess, myself-Nic Fillingham:(Laughs)Elif Kaya:...had a few talks at conferences before called, I believe Polyverse, the company, um, coined the term, but polyscripting, where each iteration of something is gonna have a different function name and a different code. But it's all internally, um, it's all going to, the interpreter is going to still interpret it, even though it's random text from externally. In order to help protect against, in the case of Polyverse and polyscripting, protect WordPress sites from easy exploit. But in the case of the Shakespeare document, probably to prevent against easy YARA rules and things to detect their code, don't click the spicy button. (Laughs) Nic Fillingham:Elif. What do we know about these domains that have all been identified? 
The StrangeU, the RandomU, are they still active? Have they been shut down? Do they get sent back to the DNS registrar? What's the process? What does it look like?Elif Kaya:So we have made sure that at least on our end, and turn to our products, that these domains and any new iterations of them, of these particular strains that we identify are blocked, as well as the malware we cover in the report. Those are within our products. As for the domains, because they're not hosted on Microsoft infrastructure, we kind of report them and that's, that's about as much as we can do in terms of their activity. I have no doubt that the operators behind this, will probably just create a new strain, but it is also not necessarily set in stone, that the operators behind RandomU and StrangeU are the same operators. It could be that they are just operating in a similar kind of space and time to fulfill similar functions. Elif Kaya:There were a few campaigns where they both sent the same campaign, which lends a bit of credence to them potentially being at least similarly operated, but nothing concrete. So it is very highly likely that, that they'll just continue to operate under new strains. Uh, and probably the next strain that they'll have will either be more of these, uh, or they'll create a new one. And by a new one, I mean, instead of the word strange, maybe they'll use the word, I don't know, doc.Nic Fillingham:How about cat? Elif Kaya:Could be cat.Nic Fillingham:Or has that been exhausted? Elif Kaya:It could be cat. We haven't exhausted the number of cat domains that there could be. Nic Fillingham:So it sounds like, uh, you know, one of the things you said in the blog, and I think you mentioned it earlier, that paying attention to infrastructure can actually allow uh, Defenders, SOCs, Blue teams to get ahead of a new campaign if a campaign is leveraging existing infrastructure. 
And so is that the takeaway from this blog post for those folks listening to the podcast right now and reading the blog, is your one sentence takeaway here, like pay attention to infrastructure? Don't forget about the infrastructure? Is that, is that sort of what you'd like folks to come away with? Elif Kaya:Yeah, absolutely. And that's kind of my secret wish with the blog and my secret wish with most of the work that I do, is that it'll make Defenders and Blue teams focus less on the glamor and less on the kind of actor attribution and more on what is working right now. What do I need in my environment? What do I not need in my environment? And one of the key points I'll hone in on in order to kind of demonstrate that is these .us domains. .us is a top level domain frequently used, uh, maliciously, but it's also frequently used for reasonable good purposes. What some of our tracking internally does and tracking that I've done before I went to Microsoft shows, is that attackers have trends of top level domains that they prefer to use from month to month. Certain malware strains like using some top level domains over others for a variety of reasons. Elif Kaya:But if you are running a SOC and you are running a Blue team, get kind of creative about how you can take different steps to either monitor, track or block infrastructure that is unnecessary to your organization. Not to impede or cause any kind of interference with productivity, but to kind of keep an eye on attacks and trends that you don't know about yet. For example, .su domains or .icu domains, uh, you might not have almost any benign presence for that in your environment. And so you might want to create custom alerts or custom rules to say like, "Hey, if I see this, maybe this could be the next malware campaign that Microsoft or somebody else hasn't written about but I'm a target of." 
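[Editor's illustration: the custom alert Elif describes could be sketched roughly as below. This is a minimal sketch, not a rule from the blog or from any Microsoft product. The watchlist contents, the function names, and the sample addresses are all hypothetical, and a real deployment would express the same idea in its mail gateway's or SIEM's own rule language.]

```python
from collections import Counter

# Hypothetical watchlist: TLDs with little or no benign presence in THIS
# environment. Each organization would have to derive its own list.
WATCHED_TLDS = {"su", "icu"}


def sender_tld(address: str) -> str:
    """Return the lowercase top-level domain of an email address."""
    domain = address.rsplit("@", 1)[-1].strip().lower().rstrip(".")
    return domain.rsplit(".", 1)[-1]


def flag_watched_senders(senders, watched=WATCHED_TLDS):
    """Return senders whose TLD is on the watchlist, for analyst review."""
    return [s for s in senders if sender_tld(s) in watched]


def tld_counts(senders):
    """Count senders per TLD, to see which TLDs are normal in the environment."""
    return Counter(sender_tld(s) for s in senders)


# Made-up sample of inbound sender addresses.
sample = [
    "billing@example.com",
    "invoice@qzxv-payments.icu",
    "hr@example.org",
    "noreply@mail.example.su",
]
flagged = flag_watched_senders(sample)
print(flagged)
```

[The sketch only surfaces mail for review rather than blocking it, which matches the stated goal of watching for unknown campaigns without interfering with productivity.]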
And so kind of get creative about that, uh, especially if you have those kinds of capabilities within your network to filter on mail coming in or mail going out. Natalia Godyla:So just stepping away from the blog for a minute, what about yourself personally speaking, what are you most passionate about in your work right now? What are you looking to achieve? What is your big goal I guess? Elif Kaya:So for myself and the reason that I, I'm still kind of in this field and at Microsoft doing the job that I'm doing right now is, I, I would really like to use these kinds of examples to bubble up what Blue teams that have less funding, that are less glamorous, and individual people can use in order to find threats. So I really want to try and shift the focus away from big groups or big actors or attribution and more towards what I consider the end goal for security. For me, which is how can I stop people from getting impacted. And so for myself and my own passions and interests in security outside of just what I do for work, I'm very focused in web security and browser security, I think there is a big gap that a lot of people focused as well as consumer security. Elif Kaya:A lot of these issues that consistently pop up over and over again, kind of happen in part because of a lack of focus in consumer security. And by consumer, I kind of mean individual non corporations or small corporations. And so kind of the lack of focus in that leaves a lot of people with the knowledge, but without the tools and resources easily available in order to kind of set themselves up for success. That's kind of a state of compromised websites that are used for botnets and et cetera. 
Right now, as well as, you know, privacy and security issues that individual users face in their regular day-to-day life with browser extensions and et cetera, where a lot of times browser extension research and browser research in general might get deprioritized due to its focus on individual consumer privacy versus things like malware, which focus a lot of the time on enterprise. Elif Kaya:But at least from my perspective, I'm very passionate about malvertising and, and the ways that advertising and web security and email security kind of coalesce around using a lot of the success that they have on individual people in order to leverage those attacks against bigger corporations later. That's where I like to focus a lot of my energy and research. Nic Fillingham:Uh, Elif Kaya, thank you so much for your time and thank you for, uh, contributing this great blog post and helping us wrap our heads around email infrastructure. Elif Kaya:Thanks for having me. Natalia Godyla:Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode. Nic Fillingham:And don't forget to tweet us at msftsecurity or email us at with topics you'd like to hear on a future episode. Until then stay safe. Natalia Godyla:Stay secure.

Celebrating Women in Security

Ep. 18
Today is International Women’s Day, and we are celebrating with a very special episode of Security Unlocked. Hosts Nic Fillingham and Natalia Godyla revisit their favorite interviews with some of the prominent women featured previously on the podcast. We speak with Holly Stewart, a Principal Research Lead at Microsoft and known in the Defender organization as “The Queen of AI.” Holly shares how building a security team with different perspectives helps to better understand and stop threats. Next, we talk with Dr. Anna Bertiger, a Senior Applied Scientist at Microsoft. Anna has an incredible passion for math and explains how she’s using math to catch villains and make computer networks safer. Finally, we explore what it’s like to hunt down threats with Sam Schwartz, a Program Manager with Microsoft Threat Experts. She came to Microsoft right out of college and didn’t even know what malware was; now she’s helping coordinate a team of threat hunters on the cutting edge of attack prevention. Security Unlocked will be highlighting female security leaders at Microsoft throughout the month of March. 
Subscribe now to make sure you don’t miss an episode!In This Episode, You Will Learn:• How math is used to help analyze attack trends• How AI and ML help identify patterns that can stop attacks• How threat hunters are tracking down the newest security risks• Why Microsoft Threat Experts are focused on human adversaries, not malwareSome Questions We Ask:• How do AI and ML factor into solving complicated security problems?• What’s next on the horizon for data science?• How do you use math to determine if an action is dangerous or benign?• Why do threat hunters need to limit the scope of their work?• What skills do you need to be a security program manager?Resources:Sam Schwartz’s LinkedIn: Anna Bertiger’s LinkedIn: Stewart’s LinkedIn:’s LinkedIn:’s LinkedIn: Security Blog:[Full transcript can be found at]Nic Fillingham:Hello, and welcome to Security Unlocked, a new podcast from Microsoft, where we unlock insights from the latest in news and research from across Microsoft security engineering, and operations teams. I'm Nic Fillingham.Natalia Godyla:And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft Security. Deep Dive into the newest threat intel, research, and data science. Nic Fillingham:And, profile some of the fascinating people working on artificial intelligence in Microsoft Security. Natalia Godyla:And now, let's unlock the pod.Nic Fillingham:Hello Natalia, welcome to a very special episode of Security Unlocked, how are you doing?Natalia Godyla:I'm doing great, and it is a very special episode. It is International Women's Day today and, we are going to be celebrating that with our compilation episode, pulling together a few of the awesome women that we have been interviewing throughout the course of the podcast.Nic Fillingham:Yeah, we have taken, uh, three interviews that actually went live, uh, in episodes one, four, and seven respectively. 
So, if you haven't made your way back through the archive, if you haven't binged the Security Unlocked series so far, uh, you may have missed these ones. And, they are amazing, uh, interviews so we wanted to sort of, bring them out of the archive and pull them together for this special episode. First up, you're gonna hear from Holly Stewart, who was the first person that we profiled on the, on the podcast on the first episode. Holly is affectionately known inside the Defender Org as the Queen of AI. She gives a sort of, a wonderful perspective on, on ML and AI. Nic Fillingham:Then we hear from Dr. Anna Bertiger, who has a PhD in Math and has this incredible energy and passion for how she uses her math to catch villains, I think you'll, you'll love that perspective. And then, we round it out with Sam Schwartz, who provides a wonderfully fresh viewpoint on security and coming into security as someone that's a little, sort of, newer in career in the cyber security space. I think it's gonna be a great episode. Natalia Godyla:Yes, and it doesn't stop there. We will be highlighting women throughout the month. So, we'll be covering different Deep Dive topics with female security leaders at Microsoft as well as profiling a few women in their careers. Nic Fillingham:On with the pod?Natalia Godyla:On with the pod.Nic Fillingham:Welcome to the podcast, Holly Stewart. Hi Holly, thanks for your time today.Holly Stewart:Hello, thank you for having me. Nic Fillingham:Awesome. So, let's start with if you could just give us your title at Microsoft but, maybe more interestingly, sort of, walk us through what the day to day function is of your role?Holly Stewart:Sure. So, I am a Principal Research Lead at Microsoft, and I work in the endpoint protection side of research. And, I like to say, sort of, our team's superpower is using AI to help protect people. 
Machine learning and data science techniques are used everywhere within our research team, but with our team we have a primary focus on using those techniques to try to help people and keep them safe. Nic Fillingham:That's awesome. And, you run a team is that right Holly, how big's the team?Holly Stewart:It's about 25 now. Nic Fillingham:Yep, and they're all in the, sort of, AI data science, sort of, realm?Holly Stewart:Yeah, actually they're this super interesting mix of researchers, data, and data scientists and they come from all walks of life. We have folks who are security experts, who really understand what threats do, how they work, some of them understand criminal undergrounds and other things like that. And then, we have data scientists that come from many different facets, many of them not particularly experienced in security, but some may be an expert in deep learning, another person may be more on the anomaly detection side. But, you know, you take all these folks with different perspectives and different strengths and you put them together and really cool things happen.Nic Fillingham:So Holly, you talked about learning French and, sort of, what you studied at college, what other things in your, your education, your history pre Microsoft do you, sort of, feel, sort of brought you to where you are now, and that you're, sort of, using in your day? Perhaps things that, that maybe seem a little unorthodox.Holly Stewart:You know, I'll say that I, I grew up with a really strong work ethic, my family actually comes from farming. And, you know, my father has this really strong work ethic, he gets these guilt complexes about... if he's not doing something productive, he hasn't made... the, you know, day is not complete. And, and somehow I'm instilled with that and so when I got into security, I kept seeing so many problems, just sort of the threat du jour, every single day we're just bombarded with information, it's, it's sort of an overload. 
And I always thought, how can we better solve this problem? How can we help people really understand what matters? And when I started getting into data science, I thought, this is the way, this is how we can make better decisions, help people make better decisions, and help protect them in a way where, you know, sort of focusing on the problem du jour, really wasn't getting us anywhere, really wasn't moving the needle.Nic Fillingham:So perhaps that drive that maybe thought you were going to the Peace Corps, you're, you're sort of utilizing a similar motivation there, but now in the data science realm. Holly Stewart:Yeah, absolutely. I mean, I love being able to say that I go to work and the work that my team does, we are trying to help people every single day to keep them safe, keep them protected. It's, it's something that I feel good about. Natalia Godyla:That's great. And how does AI and ML factor into that when you're thinking about all of these big complex problems you want to take on? Holly Stewart:Yeah, it's a great question. Like if you think about how maybe we traditionally approach security research where a researcher might reverse engineer some malicious program, figure out what it does, find some heuristic techniques to be able to detect that in the future, make sure those heuristic techniques don't detect the good things that we want our computers to run. That takes a lot of time. And the truth is that malware has become so complex, that there's literally hundreds of millions of features that feed into what makes malware malware. It's really difficult for the human brain to wrap your mind around all these permutations, but that's the beauty of machine learning and AI, it's built for that. 
Holly Stewart:And so we take this incredible ecosystem diversity from, you know, benign applications to malicious applications, we feed that information into the machine learning systems, we train them how to recognize good from bad, and they can come up with these permutations that the human brain wouldn't be able to wrap their heads around. And that, that's really how I connect all those things together in our day to day. Natalia Godyla:Got it. And so what types of... when we say AI and ML, that's a relatively broad set of acronyms there you know, what type of techniques, what type of approaches do you and your team use, or where are you sort of heavily invested?Holly Stewart:We invest in lots of things, so if I break down, and I'll say AI in quotes because I, I kind of use it interchangeably, to really just mean data science, it means a data science approach. We use many different techniques from what you call supervised machine learning to unsupervised machine learning. With supervised machine learning, you're using signals to help teach the machine how to detect something new. So I may take a set of say, 100 files and 10 of them are bad and 90 of them are good, I extract a bunch of features from those files and then I feed that into a machine learning system to teach it how to detect new things that are similar to those files in the future. So that's what you call supervised. Holly Stewart:Unsupervised, is really good at finding what we call the unknown unknown. So, you know with supervised learning, you're teaching it something that you already know and it just gets better at that. With unsupervised, you're trying to find those pockets of uncertainty that maybe haven't even been classified before, or maybe should be clustered together. Or perhaps you know, using past data you find that, "Hey this is an anomaly, something I haven't seen before that doesn't have a label, but that could indicate that something bad is going on." 
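[Editor's illustration: Holly's supervised example (100 labeled files, 10 bad and 90 good, with features extracted and fed to a learner) could be sketched as below. This is a toy under stated assumptions, not the team's method. The two numeric features per file are invented, and the nearest-centroid classifier is chosen only because it fits in a few lines. Production systems use far more features and far stronger models.]

```python
def centroid(rows):
    """Average each feature column across a list of feature tuples."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


def train(samples):
    """samples: list of (features, label). Returns {label: centroid}."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(model, features):
    """Assign the label whose centroid is closest to the new sample."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], features))


# Hypothetical training set: 90 good files and 10 bad files, each reduced
# to two made-up numeric features. The values are synthetic, not real data.
good = [((1 + (i % 3) * 0.1, 1 + (i % 5) * 0.1), "good") for i in range(90)]
bad = [((5 + (i % 3) * 0.1, 5 + (i % 2) * 0.1), "bad") for i in range(10)]

model = train(good + bad)
print(predict(model, (4.8, 5.2)))  # a new file that resembles the bad set
print(predict(model, (1.1, 0.9)))  # a new file that resembles the good set
```

[The labels are what make this supervised: the model only gets better at a distinction someone has already taught it, which is the contrast Holly draws with unsupervised methods.]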
And so we really use a combination of all of these approaches to help train machines to amplify human knowledge and also find the things that maybe as humans we were not thinking about in the first place. Natalia Godyla:Can you share a couple examples of how this AI and ML is driving some of the Microsoft products, even products that, like Nic said, we use day to day?Holly Stewart:Yeah, absolutely. So there are a lot of files that use what we call social engineering to try to trick people into opening them. So one example that we saw over the past year is these attackers were using local business names and making it look like they were sending an invoice for that local business name, I think it was, uh, a landscaping firm or something like that. And so they were using that invoice that looked like it was from a local landscaper, sending it to these other businesses to try to trick them into opening up this invoice. And so inside it, it led to this phishing site and they would try and collect their credentials. Uh, and so, you know, when you're just looking at this file, you may see that it looks benign, but the-the machine learning system, because it was able to extract all these different features from that file, it was able to see, Hey, this-this is not a normal type of invoice that I would see from a legitimate business, and it was able to flag that as malicious and help keep those customers protected.Natalia Godyla:So Holly, what's next on the horizon, what are you most passionate about trying to solve next?Holly Stewart:Sure. So today we've done a pretty good job of using AI to help discriminate malicious software from benign software, not perfect but we've made a lot of progress in that area. But what's next on the horizon for us is really deeper than that, so it's great to discriminate malicious from benign but what more can I learn from that. 
Say for example, if we understand the entire Kill Chain of-of that malicious activity from how it arrived, to the victim, to what it did after, if the victim installed it or clicked it, to the final, sort of, motive of the attacker. And if we can understand that entire story, we can look at all of the pieces in that, what we call Kill Chain, and be able to provide protective guidance and automate protections to essentially learn from what attackers are doing today, and make our defenses stronger and stronger over time. And that's really the evolution of AI in security, is to help automate that for the customer. Because the amount of threats that we're facing, the amount of security information is an overload. And we have to get better, we have to automate, and we have to use AI to do it, to really get to where we need to go.Natalia Godyla:And how far away do you think this next step in the evolution is?Holly Stewart:I'm sure I'll be working on it for the rest of my life. (laughs).Natalia Godyla:(laughs).Nic Fillingham:Holly, do you have a Twitter account, do you have a blog, do you have anything you want to promote if folks want to learn more about you, your team, if you're hiring?Holly Stewart:So we post all of our content on the Microsoft Security blog, so you can find it there. And we are hiring data scientists, uh, here in the next week or so, we should have the postings up.Nic Fillingham:Great, so you would find them on the Microsoft careers website, probably under data science?Holly Stewart:Under data science or look for Defender and data science, and you'll find us.Natalia Godyla:Thank you, Holly for your time today, it was fantastic to hear about your insights on AI.Nic Fillingham:Yeah thank you Holly, uh, you know, your time is busy, you're running a big team, doing some great work. 
We really appreciate you coming on the podcast.Holly Stewart:Thank you.Nic Fillingham:It was great to revisit that conversation with Holly, I'm really glad we got to pull that one out of the archive and bring it to newer listeners of the podcast. Up next, Dr. Anna Bertiger who tells us about her superpowers, which are utilizing math to catch villains. So I hope you enjoy the conversation.Nic Fillingham:Dr Anna Bertiger, thank you so much for joining us. Welcome to the Security Unlocked podcast.Dr Anna Bertiger:Thank you so much for having me.Nic Fillingham:Um, if we could start with what is your title, and what does that really mean in sort of day to day terms. What do you do with Microsoft?Dr Anna Bertiger:So my title is senior applied scientist, but what I do is I find villains.Nic Fillingham:You find villains, how do you find villains?Dr Anna Bertiger:So I-I find villains in computer networks, it's all the benefits of the job as a superhero with none of the risks. And I do that using a combination of security expertise, and mathematics and statistics.Nic Fillingham:So you find villains with math?Dr Anna Bertiger:Yes, exactly.Nic Fillingham:Got it. And so, let's talk about math, what is your path to Microsoft, because I know it heavily involves math. How did you get here, and maybe what other sort of interesting entries might be on your LinkedIn profile?Dr Anna Bertiger:So, I got here by math, I guess.Nic Fillingham:(laughs).Dr Anna Bertiger:So, I come from academic mathematics, I have a PhD in math, and then I had a postdoctoral fellowship in the department of combinatorics and optimization at the University of Waterloo, in Waterloo Ontario, Canada.Nic Fillingham:Could you explain what that is because I, I heard syllables that I understood, but not words?Dr Anna Bertiger:(laughs). So that is a department unique to the University of Waterloo. 
So, optimization is, you know, maximizing, minimizing type problems.Nic Fillingham:Got it.Dr Anna Bertiger:And combinatorics is a fancy word for counting things.Nic Fillingham:Combinatorics.Dr Anna Bertiger:Yeah, which you can do in fancy and complicated ways, and so-so that's what I did when I was not going to make mathematician, is I counted things in fancy and complicated ways that told me interesting things frequently about geometry. And then I decided that I wanted to see the impact of what I did in mathematics in the real world, in a timeframe that I could see, and not on the sort of like, you think of beautiful thoughts, it's really lovely it's a lot of fun. And then hopefully someone uses them eventually. And so I looked for jobs outside of academia. And then one day, a friend at Microsoft, uh, sent me a note that said, if you like your job that's great but if you don't, my team wants to hire somebody with a PhD in combinatorics. And I said, That's me. (laughs).Nic Fillingham:(laughs).Dr Anna Bertiger:And so, I, uh, you know, it took a while. I flew out for an interview, they asked me lots of questions. I, when I'm interviewing for a job, I evaluate how cool the job is by how cool the questions they asked me are. If they asked me interesting questions, that's a good sign. If they asked me boring questions, maybe I don't want to work there.Natalia Godyla:Was there something that drew you to the cybersecurity industry when your friend showed you this job wo-, did you see security and go, Yeah that's cool?Dr Anna Bertiger:So I didn't actually see security in that job, like that team was, didn't only work on fraud, we worked on, we also worked on a bunch of marketing related problems. But I really loved the fraud related problems, I really loved the adversarial problems, I-I like having an adversary. I view it as this like comforting, friendly thing, like you solve the problem. 
Don't worry, they'll make you a new oneNic Fillingham:(laughs).Dr Anna Bertiger:It's true.Nic Fillingham:So hang on, so you, you go to bed at night and sleep soundly knowing that there are more villains out there?Dr Anna Bertiger:I mean, I would kind of like to get rid of all the villains, but also like, they're building me some really old problems, like on a-Nic Fillingham:Yeah, you-you're a problem solver and they're throwing some good challenges at you.Dr Anna Bertiger:Right, I'm gonna like make the world a better place. School of thought, I would like them all to disappear off the face of the planet. On the like entertaining me portion, problems are pretty good. And so I worked a bunch on-on credit card fraud related problems on that team, and at some point a PM joined that team, who had a, who was a cybersecurity person who had migrated to fraud. And I said, well, you know, I'm not a cybersecurity person. And he said, Oh no, you are. It's a personality type and it's you.Nic Fillingham:(laughs).Dr Anna Bertiger:And then, and then I worked at some other things, you know, worked on some other teams at Microsoft, did some windows quality related things. And it-it just wasn't as much fun, and I found my way back to cybersecurity and I've been here since.Natalia Godyla:How do you use AI or ML tools to solve some of these problems?Dr Anna Bertiger:So, the AI and ML is about learning what's normal. And then when you say, Hey, this isn't normal, that might be malicious. Someone should look at it. So, our AI and ML is human in a loop driven. We don't act on the basis of the AI and ML the way that some other folks might, and there are certainly security teams that have AI and ML that makes decisions, and then acts on them on its own. That is not the case. My team builds AI and ML that powers humans Dr Anna Bertiger:... who work in security operation centers, to look at the results. And so, I use ML to learn what's normal. 
Then, what's not normal, I say, "Hey, you might want to look at this because it's a little squiffy looking." And then a person acts on it.
Dr Anna Bertiger: And so, I use a lot of statistical modeling to figure out what's normal. So, I fit, uh, a statistical distribution to some numerical data about the way the world is working. And then calculate a p-value, that you might remember from Stat 1 if that's something you've done, to say, "Oh, yeah. Well, there's, you know, only a tenth of a percent chance that, like, this many bytes transferred between this pair of machines under normal behavior. Someone should look at that. That's a lot of data moving."
Dr Anna Bertiger: And there, I like to use a group of methods called spectral methods. So, they're about, if I have this graph, I have a bunch of vertices and I can have edges between them, I could make a matrix that has a one in cell (i, j) if there's an edge between vertex i and vertex j. Let me know if I am getting too technically deep here.
Nic Fillingham: You are, but keep going.
Dr Anna Bertiger: And (laughs), and then, now I have a giant matrix. And so, I can apply all the tools of linear algebra class to it. And one of the things I can do is look at its eigenvalues and eigenvectors. And one way you might, sort of, compress this data is to project along the eigenvectors corresponding to the eigenvalues that are large in absolute value. And now, you know, we can say things like, "All the points that are likely to be connected end up close together."
Dr Anna Bertiger: And we can try and learn something about the structure of the network and what's strange. And we've done a bunch of research in that direction. That is stuff I'm particularly proud of.
Natalia Godyla: What are you most interested in solving next? What are you really passionate about?
Dr Anna Bertiger: I'm really passionate about two things. One of which is, sort of, broadly speaking, finding villains. Finding bad guys.
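The p-value check described earlier in this answer can be sketched in a few lines of Python. This is only a minimal illustration, not the method Anna's team actually uses: it assumes that byte counts between one pair of machines are roughly log-normal under normal behavior, and the names `fit_baseline`, `upper_tail_p_value`, and `is_unusual`, the sample history, and the 0.001 threshold (the "tenth of a percent" from the quote) are all hypothetical.

```python
import math
from statistics import mean, stdev


def fit_baseline(byte_counts):
    """Fit a normal distribution to log(bytes) and return (mu, sigma).

    Assumption: traffic volume is roughly log-normal under normal behavior.
    """
    logs = [math.log(b) for b in byte_counts]
    return mean(logs), stdev(logs)


def upper_tail_p_value(baseline, observed_bytes):
    """Probability of seeing at least this many bytes under the baseline."""
    mu, sigma = baseline
    z = (math.log(observed_bytes) - mu) / sigma
    # Upper tail of the standard normal: P(Z >= z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))


def is_unusual(baseline, observed_bytes, threshold=0.001):
    """Flag a transfer for human review; the threshold is a hypothetical choice."""
    return upper_tail_p_value(baseline, observed_bytes) < threshold


# Hypothetical history of bytes transferred between one pair of machines.
history = [1200, 950, 1100, 1300, 1000, 1050, 900, 1150, 1250, 980]
baseline = fit_baseline(history)

for observed in (1100, 5_000_000):
    p = upper_tail_p_value(baseline, observed)
    verdict = "someone should look at this" if is_unusual(baseline, observed) else "looks normal"
    print(f"{observed} bytes: p-value = {p:.3g} ({verdict})")
```

In keeping with the human-in-the-loop approach described above, the sketch only flags a transfer for an analyst to review; it does not act on the result itself.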
So, part of what I do is dictated by what they do. Right? They- They change their- change their games, I have to change mine, too. And then, also, I have a collection of tools that I think are really mathematically beautiful that I'm really passionate about. And, those are spectral methods on graphs and, sort of, graphs in general.Dr Anna Bertiger:And so, I'm really passionate about finding good applications for those. I'm passionate about understanding the structure of how computers, people, what have you, connect with each other and interact. And, how that tells us things about what is typical and what is atypical and potentially ill-behaved on computer networks. And, using that information to find horrible people.Dr Anna Bertiger:I think- I stopped being surprised by what our adversaries can do. Because, they are smart people who work hard. Sometimes, I'm disappointed in the sense of like, "Damn, I thought I solved that problem and they're back." But that's (laughs) I mean, and that's mostly just you feel like the, like, sad balloon three days after the party.Natalia Godyla:At the end of the day, why do you do what you do?Dr Anna Bertiger:I think there are two reasons I do what I do. Uh, the first which is I want to make the world a better place with the ways I spend my time. And, I think that catching horrible people on computer networks makes the world a better place. And, the other of which is that it's really just a ton of fun. I- I really do have a lot of fun. We- We think about really cool things. Neat concepts in computing and beautiful mathematics. And, I get to do that all day, every day, with other smart people. Who wouldn't want to sign up for that?Natalia Godyla:You've called Mathematics beautiful a couple of times. Can you elaborate? What do you find beautiful about Math? What draws you to Math?Dr Anna Bertiger:I find the ideas in Math really beautiful. 
And, I think that's a very common thing for people who have a bunch of exposure to Advanced Mathematics. But, isn't a thing we filter to folks in school as well as I would like. The- If you think about the Pythagorean theorems, that's a theorem that most people learned in high school. Geometry that says that-Nic Fillingham:I know that one.Dr Anna Bertiger:... square of the lengths of the two sides of a right- two legs of a right triangle equals the- sum together equals the square of the hypotenuse lengths. And, if you-Nic Fillingham:Correct.Dr Anna Bertiger:That is-Natalia Godyla:(Laughs)Dr Anna Bertiger:... a fact. Okay. And, if you learn it as a piece of trivia then you go, "Okay, that's a thing that I know for the test. And, you write it down and you put it on a flash card or whatever. But, what I think is really beautiful, is the idea of, "How do you think that up?" And, the, sort of, human ingenuity in figuring out that thats's true. And, the- the beautiful ways you can show that that is true. For sure, there's some really, really beautiful ways to be able to prove to yourself that that is true.Nic Fillingham:Changing topics, sort of, slightly. Are you all Math all the time? You know, do you have a TV show you're binging on Netflix? Do you have computer games you like to play? Are you a rock climber? What's the other side of the Math brain?Dr Anna Bertiger:So, the other side of the Math brain for me is things that force my brain to focus on something that is entirely not work. And so, I really love horses and I have a horse. And, I love spending time with her and I love riding her. She's both a wonderful pet and just a thrill to ride.Nic Fillingham:Awesome.Natalia Godyla:Well, Anna, it was a pleasure to have you on the show today. 
Thank you for sharing your love of Math and horses and hopefully we'll be able to bring you back to the show another time.
Dr Anna Bertiger: Thank you so much for having me.
Natalia Godyla: I'm so thankful we got to re-listen to Anna's episode. Up next, we'll be talking with Sam Schwartz, who is a program manager for the Microsoft Threat Experts team. But she wasn't always targeting security. She started out as a chemical engineer. So, hope you enjoy hearing about her career from chemistry to security.
Natalia Godyla: Hello, everyone. We have Sam Schwartz on the podcast today. Welcome, Sam.
Sam Schwartz: Hi, thanks for having me.
Natalia Godyla: It's great to have you here. So, uh, you are a security PM at Microsoft. Is that correct?
Sam Schwartz: That is correct.
Natalia Godyla: Awesome. Well, can you tell us what that means? What does that role look like? What is your day to day function?
Sam Schwartz: Yeah. So, I support, currently, a product called Microsoft Threat Experts. And what I'm in charge of is ensuring that the incredible security analysts that we have, that are out saving the world every day, have the correct tools and processes and procedures and connections to be the best that they can be.
Natalia Godyla: So, what do some of those processes look like? Can you give a couple examples of how you're helping to shape their day to day?
Sam Schwartz: Yeah. So, what Microsoft Threat Experts does is it is a managed threat hunting service provided by our Microsoft Defender ATP product. And what they do is our hunters will go through our customer data in a compliant, safe way and they will find bad guys, human adversaries, inside of the customer telemetry. And then they notify our customers via a service called the Targeted Attack Notification Service. So, we'll send an alert to our customers and say, "Hey, you have an adversary in your network. Please go do these following things. Also, this is the story about what happened. How they got there and how you can fix it."
Sam Schwartz: So, what I do is I try to make their lives easier by initially providing them with the best amount of data that they can have when they pick up an incident. So, when they pick up an incident, how do they have an experience where they can see all of the data that they need to see? Instead of just seeing one machine that could have potentially been affected, how do they see multiple machines that have been affected inside of a single organization? So, they have an easier time putting together the kill chain of this attack.
Sam Schwartz: So, getting the data and then also having a place to visualize the data and easily make a decision as to whether or not they want to tell a customer about it. Does it fit the criteria? Does it not? Is this worth our time? Is this not worth our time? And then, also, providing them with a path to, with that data, quickly create an alert to our customers so that they know what they're doing.
Sam Schwartz: So, rather than our hunters having to sit and write a five-paragraph essay about what happened and how it happened, have the ability to take the data that we already have, create words in a way that is intuitive for our customers and then send it super quickly, within an hour to two hours of us finding that behavior.
Sam Schwartz: So, all of those little tools and tracking and metrics ... and easier, like, creating ... from data, creating words, sending it to the customers, all of that is what I plan from a higher level to make the hunters be able to do that.
Nic Fillingham: Tell us about how you found yourself in the security space and, maybe it's a separate story, maybe it's the same story, and how you got to Microsoft. We'd love to learn your journey, please.
Sam Schwartz: It is the same story.
Growing up, I loved chemistry.Nic Fillingham:That's too far back, too far back.Sam Schwartz:I know, I know, I know.Nic Fillingham:Oh, sorry, sorry.Sam Schwartz:I loved-Nic Fillingham:No, let's start there.Sam Schwartz:I loved chemistry. I loved like molecules and building things and figuring out how that all works. So when I went to college, I was like, "I want to study chemical engineering." Um, so I, through my education, became a chemical engineer (laughing). But I found that I really liked coding. Uh, we had to take a- a fundamentals class at the beginning and I really enjoyed the immediate feedback that you got from coding, like you did something wrong, it tells you immediately that you messed up. And also when you messed up and you're super frustrated and you're like, "Why didn't this work?" like, "I did it right," you didn't do it right. It messed up for a reason, and I really liked that and I thought it was super interesting. And I found myself like gravitating towards jobs that- that involved coding. So I worked for girls who code for a summer. I worked for Dow Chemical Company, but in their robotics division so I was still like chemical engineering, but I got to do robots.Sam Schwartz:And then when I graduated, I was like, "I think I want to work in- in computer science. I don't like this chemical engineering." It was quite boring. Even though they said it would get more fun, it never did. We ended up watching water boil for a lot of my senior year of college and I was like, "I want- I want to join a tech company." And I looked at Microsoft and they're one of the only companies that provide a program management job for college hires so ... And I interviewed, I was like, "I want to be a PM," sounds fun, get to hang out with people and I ended up getting the job, which is awesome. And I walked on my first day, my team and they're like, "You're on a threat intelligence team." I was like, "What does that mean?" (Laughs) And-Nic Fillingham:Oh, hang on. 
So did you not know what PM role you were actually going to get?Sam Schwartz:Nope. They told me that I was slated for the Windows ... I was going to be on a Windows team. So in my head like that entire summer, I was telling people (laughing) I was going to work on the start button just 'cause like that's what I ... I was like, "If I'm going to get stuck anywhere, I'm going to have to do the start button." Like that's what my-Nic Fillingham:That's all there is. Windows is just now ... It's just a start button. So yeah.Sam Schwartz:I was like that's what ... I was like, "Guaranteed, I'm going to get the start button," or like Paint. Actually, I probably would've enjoyed Paint a lot, but the start button (laughing). And I came and they're like, "You're on a threat intelligence team," and I was like, "Oh, fun." And it was incredible. It was an incredible start of something that I had no idea what anyone was talking about. When they were first trying to explain it to me like in layman's terms, they're like, "Oh, well, there's malware and we have to decide how it gets made and how we stop it." And I was like, "What's malware? Like I don't ... " I was like, "You need to like really dumb it down (laughs). I have no idea what we're talking about." Sam Schwartz:And initially when I started on this threat intelligence team, there were only five of us. So I was a PM and they had been really wanting a PM. They, apparently before they met me, weren't ... were happy to get a PM, but weren't so happy I was a college hire. They're like, "We need ... " They were like, "We need s-Nic Fillingham:Who had never heard of malware.Sam Schwartz:"We need structure." (Laughs)Nic Fillingham:And thought Windows was just a giant anthropomorphic start menu button.Sam Schwartz:They're like, "We need structure and we need a person to help us." And I was like, "Hi. Nice to meet you all." And so we had two engineers who were building tools for our two analysts. Um, and it was ... 
We called ourself like a little startup inside of, uh, security research, inside of the security and compliance team 'cause we were kind of figuring it out. We we're like, "Threat intelligence is a big market. How do we provide this notion of actionable threat intelligence?" So rather than having static indicators of compromise, how do we actually provide a full story and tell customers to configure to harden their machines and tell a story around the acts that you take to initiate all of these- These configurations are going to help you more than just blocking IOCs that are months old. So figuring out how to best provide ... give our analysts tools, our TI analysts and then, allow us to better Microsoft products as a whole. So based on the information that our analysts have, how do we kind of spread that message ac- across the teams in Microsoft and make our products better? Sam Schwartz:So we were kinda figuring it out and I shadowed a lot of analysts and I read a lot of books and watched a lot of talks. I would watch talks and write just like a bunch of questions and finally, as you're around all these incredibly intelligent security people, you start to pick it up. And after about a year or so, I would sit in meetings and I would listen to myself speak and I was like, "Did I say that?" Like, "Was that- was that me that, one, understood the question that was asked of me and then also was able to give an educated answer?" It was very shocking and quite fun, and I still feel that way sometimes. But I guess that's my journey into security.Natalia Godyla:Do you have any other suggestions for somebody who is in their last years of college or just getting out of college and they're listening to this and saying, "Heck, yes. I want to do what Sam's doing?" 
Uh, any other applicable skills or tricks for getting up to speed on the job?Sam Schwartz:I think a lot of the PM job is the ability to work with people and the ability to communicate, and understand what people need and be able to communicate that in a way that maybe they can't communicate, see people's problems and be able to fix them. But I think a lot of the PM skills you can get by working collaboratively in groups and that, you can do that in jobs. You can do that in- in classes. There's ample opportunity to work with different people: volunteering, mentoring, working with people and being able to communicate effectively and connect to people and understand, be empathetic, understand their issues and try to help is something that everyone can do and I think everyone can be an effective PM.Sam Schwartz:On the security side, I think reading and listening. I mean even the fact that ... I mean the hypothetical is someone listening to this podcast are already light years ahead of I was when I- when I started, but just listening, keeping up to date, reading what's going on in the news, understanding the threats, scouring Twitter for all the- all the goodness going on. (Laughing) Sam Schwartz:That's a way to- to stay on- on top.Nic Fillingham:Tell us about your role and how you interface with data scientists that are building machine learning models and sort of AI systems. Where- where are you able to ... Are you a consumer of those models and systems? Are you contributing to them? Are you helping design them? What's ... How do you- how do you fit into that picture?Sam Schwartz:So a little bit of all of the things that you mentioned. Being a part of- of our MTE service, we have so many parts that would love some- some data science, ML, AI help, and we are both consumers and contributors to that. 
So we have data scientists who are creating those traps that I was talking about earlier for us, who are creating the indicators of malicious, anomalous behavior that our hunters then key off of. Our hunters also grade these traps and then, we can provide that back to the data scientists to make their algorithms better. So we provide that grading feedback back to them to have them then make their traps better. And our hope is that eventually, their traps, so these low fidelity signals, become so good and so high fidelity that we actually don't even need them in our service. We can just put them directly in the product. So we work, we start from the- the incubation, we provide feedback and then we, hopefully, see our- our anomaly detection traps grow and- and become product detections, which is an awesome lifecycle to be a part of.Natalia Godyla:Thank you, Sam for joining us on the show today. It was great to chat with you.Sam Schwartz:Thank you so much for having me. I've had such a fun time.Natalia Godyla:Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode.Nic Fillingham:And don't forget to tweet us @msftsecurity or email us at with topics you'd like to hear on a future episode. Until then, stay safe.Natalia Godyla:Stay secure.

Digital Crimes Investigates: Counterfeit Tales

Ep. 17
Digital crime-fighter Donal Keating revisits the podcast, but this time… it’s personal. *cue dramatic crime-fighting music* The Director of Innovation and Research of the Digital Crimes Unit (DCU) at Microsoft joins hosts Nic Fillingham and Natalia Godyla to regale us with the origin story of the DCU and his captivating career exploits. Whether it’s tales of his early days preventing Windows 98 counterfeits in Ireland or the many international law enforcement raids he’s participated in… there’s no shortage of Donal’s crime-fighting adventures.
In This Episode, You Will Learn:
• The mission of Microsoft’s DCU and the techniques used to combat fraud
• The events and needs that led to the creation of a forensic analytic lab at Microsoft
• How counterfeiting and intellectual property crime have evolved over the years with advanced technology
• What it’s like partnering with law enforcement to take down criminals around the world
Some Questions We Ask:
• What does a day in the life of Donal look like in the DCU?
• Was there ever a counterfeit example that shocked Donal at just how good it was?
• With so many shifts in Donal’s work, what in his background has prepared him to stay on top of the changes?
• What does a digital crime fighter do in their time off?
Resources:
Donal’s LinkedIn
Nic’s LinkedIn
Natalia’s LinkedIn
Microsoft Security Blog
Nic Fillingham: Hello, and welcome to Security Unlocked, a new podcast from Microsoft where we unlock insights from the latest in news and research from across Microsoft Security engineering and operations teams. I'm Nic Fillingham.
Natalia Godyla: And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft Security, deep dive into the latest threat intel, research and data science-
Nic Fillingham: And profile some of the fascinating people working on artificial intelligence in Microsoft Security.
Natalia Godyla: And now, let's unlock the pod. Hi, Nic. How's it going?
Nic Fillingham: Hello, Natalia. It's going well.
How are you?Natalia Godyla:It's going well. I am super-excited for this episode, because it will be a trip down memory lane. We're gonna be talking about counterfeiting CDs and Beanie Babies. Well, Beanie Babies aren't covered in this episode, but they're counterfeited.Nic Fillingham:So we were, we were having a conversation before we started recording about, you know, things that have been counterfeited, and one of the examples that we stumbled upon was Beanie Babies, and I said, "What's a Beanie Baby?" And Natalia said "How do you not know what Beanie Babies are?" So 15 minutes ago, you, you educated me on a Beanie Baby, and I've learned something about you, is that you collected Beanie Babies. Is that right? You were in the Beanie Baby fad. You were in the, the trend.Natalia Godyla:Oh, yes. Yes. Beanie Babies and Pokemon cards. I definitely collected them.Nic Fillingham:Do you still have your Pokemon cards?Natalia Godyla:Yes. Yes, I do.Nic Fillingham:And do you still have your Beanie Babies?Natalia Godyla:I've got one Beanie Baby left.Nic Fillingham:Do you know, with certainty, that it is not a counterfeit Beanie Baby?Natalia Godyla:I don't, but I don't think I wanna find out.Nic Fillingham:If only there were some kind of technology. Maybe a, a hologram or something, embedded into the Beanie Baby for you to have a high degree of certainty-Natalia Godyla:(laughs)Nic Fillingham:That it was real.Natalia Godyla:(laughs)Nic Fillingham:And I'm talking about holograms because our guest on the podcast today, Donal Keating from the DCU, walks us through his journey into security, and his path to Microsoft, and how he spent a lot of his career in the anti-counterfeiting space. And we talked about CDs, we talked about counterfeiting CDs and optical discs. This was very exciting for me. 
We talk about the period in time when I was actually joining Microsoft, which was when Windows XP was coming out, and so the whole, you know, hologram on the CD, and you hold it up to the light, and there'd be different colors and pictures, like...Nic Fillingham:That was all very exciting. I guess that must have been early 2000s. That was a-, that was super-exciting when that was happening, so this was a, this was a great conversation, and I think we also talk about chickens at some point, too. I don't, I d-, I'm not sure how we got there, but we cover a lot of ground in this conversation.Natalia Godyla:And with that, I feel like we shouldn't keep people hanging. On to the pod.Nic Fillingham:On to the pod.Natalia Godyla:Hi, Donal. Welcome back to Security Unlocked. Thanks for joining us for a second time.Donal Keating:Thank you. I'm delighted to be here.Natalia Godyla:So Donal, you are the director of research and innovation of the Digital Crime Unit. I know that you've talked a little bit about what you did in our last episode, but would you mind giving the audience a refresher? What does a, a day in the life of Donal in the Digital Crimes Unit?Donal Keating:Well, e-each day is different, obviously, because when you're sort of working on the, on the side of security and crime-fighting, people evolve very rapidly, so there is no set pattern of what I do every day. But I am lucky to have a relatively unique position in the DCU, we call it the Digital Crimes Unit, in that I work across all of the different pillars th-, that we fight, and I also the opportunity to work, uh, work across the company, so... And we're always looking for new techniques, new data sources, and new crime mechanics, and I tend to get involved in, in the things that are new. So it's a very interesting job. As someone said, there's not many jobs where you wake up in the morning and look at the news and say, "What's going to be on my plate today?" 
But-Natalia Godyla:(laughs)Donal Keating:Working in this space tends to be that sort of a job.Natalia Godyla:And how did you end up in this role? What has been your path not just to Microsoft, but security? I know, a big question.Donal Keating:Oh, my. (laughs)Natalia Godyla:(laughs)Donal Keating:Once upon a time, Mammy Keating and Daddy Keating met. E-e-e-, um...Natalia Godyla:(laughs)Donal Keating:So, if I start-Nic Fillingham:And where was that, Donal?Natalia Godyla:(laughs)Donal Keating:Sorry? Wh- where was that?Nic Fillingham:Yeah, where was that?Donal Keating:That was in, that was in Ireland. So I, I grew up in-Nic Fillingham:Paint, paint us the picture. Like, it's, tell... I want beautiful, rolling green countrysides. I want-Donal Keating:(laughs)Nic Fillingham:Paint me that beautiful picture of Ireland.Donal Keating:Uh, well, uh... (laughs)Natalia Godyla:(laughs)Donal Keating:I don't know if I'm gonna go back that far. It's, that's before Moses was a boy.Nic Fillingham:(laughs)Natalia Godyla:(laughs)Donal Keating:So my parents are Irish. Uh, father an engineer, my grandfather an artist. My other grandfather was a blacksmith. So sort of technology had always been in the family. When I was growing up, uh, I guess my parents had been a product of the, of the war, and Ireland, at the best of times, didn't have very much, so the, the ability to make things and figure things out from first principles was always p-pretty important, uh, in my family.Donal Keating:So I grew up. My brother's, uh, an engineer. A c-, a civil engineer, built a very successful company in civil engineering. So I guess I was the black sheep of the family. I became a physicist, and when I graduated from physics, it was in the 1980s. 
I won't say exactly when, but the unemployment rate in Ireland at the time was in the high 20s, I believe, and for new graduates, there was pretty much two, three jobs a year going, and I certainly wasn't in the top two or three percent of the graduates coming out of the country, so I emigrated, like a lot of Irish people do, and my first stop was the UK.Donal Keating:So I got a job as a young, very green physicist. The only advantage I have is I had done applied physics, so I was to run a lathe as well as do some calculations, and I started to work for a, a UK company that was a venture capital-funded start up, looking at some very interesting optical technology. So my major was in opto-electronics, and this company was involved in the research into storage media. And at the time, CD audio had been quite the technology. C-, recordable CD had not been yet invented, but there was a space in the market for what was considered archival media, and this company had some very innovative and patented technology which we called Mothi. It was a, a recordable media that effectively made a mechanical mark. So it wasn't just a change of reflection. There was actually a mechanical mark on the media. And b- (laughs), I won't even go into the capacities of these things in, in today's world.Nic Fillingham:Almost like a vinyl record?Donal Keating:Uh, a-, well, uh, almost like a vinyl record, but at a nano scale. So a laser would... What normally it would do with m-, recordable media is, a dye would absorb, or not absorb and a-allow light through to the reflective layer beneath. 
The trick of this technology, called WORM, uh, write once read many, was a layer that looked a little bit like an egg box, and when the laser hit the texture, it would blow a bubble in the egg box, therefore making it reflective, and the company name was Plasmon, which actually refers to a physical phenomenon that means a surface that the, uh, incident light gets redirected along the surface of the incident plane.Donal Keating:So i-, it was just an interesting piece of technology. I worked for that company for six years, starting out knowing nothing, and worked for an incredible mentor engineer, a guy by the name of Bob Longman, who taught many engineers like me. He was quite a legend.Donal Keating:And through that company, it was like pure R&D work. We knew what the end goal was, but how to get there was entirely uncharted.So we got to work on all sorts of interesting, uh, technologies. But that really was the beginning of a skillset that I think everyone in security needs, and, uh, particular in research innovation. It's, when there aren't train tracks, how can you look at a problem, split it into smaller problems, and do things that you can measure, observe... Uh, basically articulate, "Well, okay, these three things happen. Therefore, what does it mean for the bigger picture?" So that reframing the question was training that I got right when I, when I graduated. So that was the start.Nic Fillingham:I think I i-interrupted you, Donal, but what was the... Did you tell us? What was the capacity? What was the storage capacity of this early CD-Donal Keating:(laughs) -Nic Fillingham:Technology. Nic Fillingham:I'm assuming it was small.Donal Keating:it was-Nic Fillingham:I'm assuming that's, that's Nic Fillingham:... the giggle your-Donal Keating:It was small, yeah.Natalia Godyla:(laughs)Donal Keating:540 megabytes was considered this huge enormous storage capacity.Nic Fillingham:But that's much smaller than the, the theoretical max of uh, cd's. 
Didn't they say you only get to about 714 meg or something? Donal Keating:Yeah. Yeah, but that was a CD-R, and now we've got DVD-R, and yeah, but these are capacities like, if you pick up a USB drive now, the tiny, tiny, tiny surface area will contain ten times that capacity. You know, you look at floppy discs and, you know, you look at the evolution of it. Really, truly, the laws of physics are being, uh... like hard disc drives, which, at one stage I worked for Seagate, I'll come to that in my narrative, but even when I was at Seagate in the '90s, the idea was that you were coming close to the capacity of what a platter could hold. Donal Keating:Hard drives continue to push the limits. They're still, uh, following Moore's law at a phenomenal rate. If you look at a technology like the hard drive, and you had to start that from scratch, people would say that's impossible. That it is absolutely impossible to get that performance, you know, even using a huge design team. Donal Keating:But that's the great thing about evolution: you start off with something small, you tweak it, you tweak it, you tweak it, you put economic pressure on it to make it faster and bigger, and you end up where we are with hard drives today. Same with solid state. In another 20 years' time, there will still be solid state, and it'll be faster and bigger and better and all the rest of it. Nic Fillingham:I thought you were sort of going to be comparing that early technology, that mechanical, I forget the words you used, but that mechanical mark on the disc. I thought you might have been comparing that to sort of later, uh, technologies for writing to a CD. But you were talking about CDs in general. Yes, the capacity of a CD is obviously very, very small. Donal Keating:Yeah. So the sort of people that were interested in it were people who needed archival technology. 
So, uh, they worked with the British Library, for instance, which was one of their, um, audiences. But also company records and, you know, things that needed very good archival life. So, what you might not know is that your CD-Rs, if you've kept them in a drawer for 20 years, will not be producing all the pictures that you thought you'd put onto them. Donal Keating:And those technologies break down relatively quickly. So this was a technology that they said would, um, stay on the shelf for a long time. Nic Fillingham:Why was that? The material is sort of susceptible to pressure change, temperature change? What is it? Donal Keating:Well, a recordable CD, for instance, uses a dye. And dyes tend not to be stable. You know, you look at an old book, even when it's closed up: the pictures in your old books would be faded from what they were. Well, if you need that high contrast and you have fading with your dye, you're gonna lose fidelity. Donal Keating:That's really just comparing this technology and CD-R, but the bit that I'm getting to is, you might have recording mechanisms that store data for a long time, but the drives that read those do not last for a long time. Donal Keating:So, back then it was all SCSI interfaces. To find a PC with a SCSI interface now would be a whole piece of work. So, the reason the cloud is gonna be so much better for storing data is that, whatever the readout technology is, it is going to evolve with the cloud. Donal Keating:I was kinda lucky in my career in that I was at the right place at the right time. So I worked for a number of companies that basically built CD manufacturing in Ireland. I hopped around those companies, being part of the supply chain to Microsoft. 
So the very first indication of security: Microsoft introduced what we called an inner-band hologram on, I want to say, Windows 98. Donal Keating:It was a security feature to try and make counterfeiting of the Windows 98 discs more difficult. Long story short, Microsoft decided themselves that they wanted a CD manufacturing plant. And they recruited me. At the time I didn't really want to work for Microsoft. I had been a supplier to them and they had been pretty aggressive as customers. So I wasn't a terribly keen employee, but they made it worth my while to join Microsoft to build them a CD-ROM plant in Dublin, which I did. Donal Keating:We got that up and running. And just at that time, a team in the US wanted even more secure CD manufacturing. So at the time, one of the great ways of making money very easily was to produce either Office 97 or Windows 98 CDs and sell them. Now, you could make money in different ways. You could just bootleg them and make recordable CDs, but people then knew that they were buying something cheap and cheerful. You'd get a few bucks for it, but you weren't gonna make big dollars. Donal Keating:But what the more sophisticated criminals did is they made visual pass-offs, like very, very good pass-offs of the product, packaged them up, and even got them into the supply chain. So today everyone is conscious of supply chain attacks, SolarWinds being an example, and in the recent past supply chain attacks have been all over the business. But if you go back to those times, people didn't really consider the supply chain attack. And one of the significant vulnerabilities in the software industry back then was that there was this whole world of people prepared to make very, very sophisticated counterfeits. 
Donal Keating:So, I was working for Microsoft at the time and there had been some legal cases chasing down counterfeiters and the, they had a newly appointed attorney in Europe looking after the counterfeiting team and we got talking and it was just one of those things that you know, you suddenly meet someone who knows what they really want to do and I knew how the product was made. And I said, "Look. All, all of the, the way you're going about this identification of counterfeit is all wrong." You know. The, the example I think was that if something was misprinted, it was, if it was badly printed disc it must be counterfeit. Donal Keating:I've run en, enough CD plants to know you can have a bad day in printing discs. So that was the start of the concept of a proper forensic analytic lab that would look at product and say, "This is genuine or counterfeit." And that really was the start of getting into the security space. And then I guess was in the year 2000-2001 maybe.Natalia Godyla:What was your next step within Microsoft. What, what brought you to the role you have today?Donal Keating:Yeah, so actually at the time when, when I met the legal team for the first time I, I was transitioning out from running the CD plant to working on the anti-counterfeiting technologies. In fact I used to, I kinda had a role that was mostly based in the US uh, looking at hologram technology, fingerprinting technology, just a variety of technologies that are going to be used to protect our products. Donal Keating:But it became more and more interesting to me to chase the criminals rather than to try and protect the product. There was lots of people focused on protecting the product. There was very few people uh, focused on, on locking up the crooks. And I think that was from one side, from the traditional counterfeiting side. 
One of the things that you got to learn is the economics of being a criminal. Donal Keating:And they would see themselves as, as people, but what's their motivation? How do they do it? You know, how do they communicate? So, way back then, that seemed to be very interesting and exciting. So I did more and more of that. Like I said, I went around the world. I was in raids all over the world, of plants producing counterfeit discs. Nic Fillingham:Can you share any examples? Natalia Godyla:(laughs) Donal Keating:Yeah, yeah, I can. So, the more recent one, actually, that's back in 2013, because we pretty much stopped 'em physically counterfeiting, but back in 2013, 2014, there was a plant in the Ukraine that had belonged to the old regime. There's a new regime comes in, so they re-raid the plant, and I got called in just because I knew about how to obtain evidence from a CD plant. So they just wanted a kind of an expert from Microsoft to help them obtain the evidence from the plant. But I arrived at this factory, brought there by law enforcement, and they had these huge doors, big, enormous, big steel doors. But the bit that appealed to me was (laughing) two feet to the right of the door, there was actually a hole blown in the wall. The cop said, about doing the raid, "That door is too secure but the wall's not so secure." So they went through the wall. Donal Keating:I've done cases in Russia also. So everyone knows that counterfeiting is a problem, but one of the ways you protect yourself is if you have someone who is on the law enforcement side of the house who will not raid plants that are kind of under their protection. But what happens when you stop paying the protection money? So it turns out that Microsoft got pulled in because someone wasn't paying their protection money, uh, anymore, and law enforcement raided the facility. 
Donal Keating:I went there to analyze the evidence and testify that yes, this in fact was a Microsoft product that was being counterfeited. When the plant that had been raided realized that the law enforcement were taking it seriously, they obviously paid their dues again. So I'm in this police station in the morning, uh, we're taking the evidence, y- you know, gathering up the notes. And when you're handling evidence, you have these tags, so you take something out, do your analysis, and then you seal the bag and- and sign it. Donal Keating:Suddenly, there's an urgent request to go to lunch at, you know, 11:30 or something. Never a man to dodge lunch, we went off to lunch.Natalia Godyla:(laughs).Donal Keating:But the lunch went on about three hours, and when we came back I'm looking at my pile and I see all this stuff that I had already examined, but they're not my seals, it's not my signature. And I said, "Th- this is not what I looked at this morning." (laughs). "Oh yeah, that's- that's what you looked at this morning." (laughs).Donal Keating:It was the sort of environment where you don't- don't go and argue with anyone, so we just stepped away from that. There was some- some follow-up, but there was no confirmation that what that plant had been producing was Microsoft counterfeits and it all got swept under the carpet.Nic Fillingham:Donal, when I hear the word raid though, I think of paramilitary, I think of guns and- and- and all that. Is my mental image accurate? What, how- how sort of scary, how dangerous were these- these raids that you were a part of? Or are they a bit more sort of... Well, yeah, that- that- that's my question.Donal Keating:So generally with counterfeiting, they tend to be, they're not dangerous. So sometimes, mostly I would get called in after the raid had happened, so therefore there's no danger, the environment is secure. Remember, these manufacturers are doing it on behalf of someone else. 
It- it's like malware today, there's a whole bunch of different individuals in the supply chain. My specialization at the time was the- the actual plants themselves, so we were going to sites that it was a regular manufacturer who was just breaking the law. There wasn't that risk.Donal Keating:But since I came to the US, I moved for Microsoft to the US in 2013, I got hauled into a raid where someone was selling product keys, and for some reason the case was a Homeland Security case. And that's the first time that I've ever seen, I actually wrote up a report afterwards, um, I was there with a- a Microsoft colleague and he was ex-FBI, and to him, it was perfectly normal. But to an Irishman who has grown up on American TV, it looked like the real thing. Donal Keating:They had an address and we were going in to the address, but there's a briefing beforehand that has a SWAT and a whole bunch of agents that are going there now. We're invited along as the- the analysts, like to- to analyze what they find. But there's this briefing that starts off with, you know, if there's- if there's shootings, here's where the hospitals are. If it's, you know, serious, here's where the helicopters land. You kinda get this mental image built up that you're going to raid a super-secure and heavily-armed target.Donal Keating:In this case, the entire team arrive up (laughing), and the guy arrives out in his dressing gown. And- and his first words to law enforcement was, "I haven't counterfeited for a year." (laughs). Natalia Godyla:(laughs).Nic Fillingham:(laughs). Donal Keating:Working that closely with law enforcement was quite a buzz, but all of that was sort of intellectual property crime that I was focused on, and since then, since 2013, 2014, I have changed my focus pretty much entirely to protecting Microsoft customers. So taking all of those techniques and, you know, understanding about the way people behave, and looking at behavior of criminals. 
Donal Keating:And using data, in essence, to- to look for, I used to look forensically for evidence of did it come from an authorized supply chain or an unauthorized supply chain? We built some special technology to do that with microscopes and image matching and stuff. So taking a lot of those concepts and then applying it to data streams. Is this a normal behavior for this type of data? Where's the anomaly? What's the cause of it? All of those sorts of things.Natalia Godyla:Was there ever a counterfeit example that shocked you, that was just so close to truth that you were surprised? Like just awed at the counterfeit artistry?Donal Keating:Well, I will say absolutely, I'm- I'm in awe of the ability for people to make things that look so visually identical. And a- a counterfeit never, they never manufacture things in exactly the same way that we did it, so we would emboss a hologram, uh, the counterfeiters by and large produce labels. But boy, were those labels good visual pass-offs. You know, it became, I wouldn't say impossible but it actually became, you know, you need to put your glasses on to look at the th- thing and say, "Oh yeah, that's counterfeit."Donal Keating:But again, that's to someone who has knowledge of the product. Uh, I think a- a thing that a lot of people forget, specialists, people who look at this stuff all the time will look at it and say, "Oh, well, that's, you know, it's missing the T and I've got a small I here. And look, this- this color is a bit off." To someone who buys this product once every three years or once every two years there's no build-up of a reference library of, "You know what? If it looks good, it must be genuine. And in fact, there's a little sticker on it that says this is genuine." (laughs). 
Therefore, you're socially engineered into thinking yes, it's genuine.Donal Keating:Uh, I love- I love when you g- get products from Amazon and you, a little card comes out that says, "This is an authentic product because, you know, we've got the card that says it's an authentic product."Nic Fillingham:The certificate of authenticity, which is a little matchbox-Donal Keating:Uh, yeah.Nic Fillingham:... square of cardboard that, uh, (laughing)-Donal Keating:Yeah, yeah. Nic Fillingham:... has been printed on an inkjet printer. (laughing). And cut out with scissors.Donal Keating:Yeah. One of the things that criminals are very good at is social engineering people into thinking they're doing the right thing, in- in whatever area it is. Like they would give people additional stuff in counterfeit packages, and made them feel even better about themselves getting this really good deal online. Uh, you know, it- it's just the- the psychology of- of people, we're just not designed to be suspicious of everything. Which is great, but unfortunately for people who work in this space, you get suspicious of everything.Nic Fillingham:So we're rapidly moving away from physical media. My Xbox doesn't even have a- doesn't even have a disc drive anymore, so it's, you know, it's entirely- it's entirely online digital distribution. But I see there is still, there are still counterfeiters out there. There are still, you know, it's still probably big business in some parts of the world. Is that, are- are, do you still have your finger on the pulse or have you fully, uh, left that- that space?Donal Keating:I have fully left that space, but absolutely, you know, there- as long as there is a dollar to be made there will be people in that space. But it's just not- it's just not what Microsoft focuses our effort on. You know, there- there will always be people who wanna go and pick up- up Windows on a CD-R. What I would say is then they know the risks that they're taking. 
You know, they're- they're a self-selecting group. Donal Keating:You know, we always talk about make sure that you're patched and have everything updated and use good password security. Well, you can- you can lose all that if you choose to obtain your software on a recordable CD where it says, you know, "This- this is real stuff." You know, e- especially on a, at the OS level. When you're installing an OS from a disc before anything has been turned on and all your signatures have been updated, it- it- Donal Keating:It's really easy to- to build a device with a lot of malware on it. Therefore, that is an area that I have concerns about, is that your supply chain for your hardware is, y- you're not buying the thing that you can get for the cheapest price. You're- you're buying from your authorized channel, you're buying from people that are reputable. Donal Keating:I think one of the really important things in security is the reputation and, you know, trustworthiness of your supply chain. So that's not an area that we spend a huge amount of time in, but it certainly is a thing that, um, is- is of concern to me.Nic Fillingham:And Donal, I think you've already said this, but to reiterate, the- the- the principles and the learning from your time in- in forensics and in physical, uh, disk manufacturing and- and- and anti-counterfeit work is that the sort of human psychology and the social engineering that was a big part of that business continues to this day. And you were sort of bringing a lot of those learnings and principles forward, and you're just now applying them to, uh, new supply chains and- and new technologies. Is that- is that accurate?Donal Keating:That's accurate. The- the one other thing, we did start to get into what I would consider big data in 2013, 2014, when we started to take activation behavior. So as devices touch Microsoft's servers for activation or validation, starting to do analysis on- on, at a large scale. 
So there were a lot of indications back then that you could identify countries that had relatively high rates of what I would consider piracy, and they correlated well with encounter rates of malware coming from Defender and the various AV companies. Donal Keating:So it started out as a narrative, uh, in 2013, 2014, that where you had high piracy rates, you also had high levels of security issues on the devices. I think that has continued to some extent, but now as we move to a more digital, and hopefully more secure, supply chain, that opportunity for people to, you know, put out large volumes of physical product that have malicious doors on them is hopefully being removed. But the skillset that I learned in, you know, analyzing very large volumes of data, that sort of was the start of it. Donal Keating:In fact, the Digital Crimes Unit built some analytic environments, uh, originally on, you know, on-prem servers, and now we've moved over to Azure. That allows us to do very large-scale analytics of huge datasets. That was sort of born of our analysis of activation and validation, um, six, seven years ago. Natalia Godyla:You've had a couple notable shifts. What else, other than your background in analytics, has prepared you, or what have you done to prepare for these changes? Do you have any recommendations for somebody who might be experiencing a similar shift and wants to get up to speed for this type of role? Donal Keating:W- well, if it's in Microsoft, we are incredibly lucky in that we have some very, very smart people. I'd say that the number one skillset that you need in navigating this is your ability to pick up the phone and talk to someone and admit that you know nothing about it. You really do have to talk to people who have expert knowledge in the area. 
Because you can be great at cultivating data, but unless you understand really what it means down to a very, very granular level, not the- not the 101 version of it but the 201 and 301 version of what do these things mean? And in Microsoft, we also have the people from Microsoft Research. I've been helped enormously on the AI and ML side from people who have done this clustering on short strings. Donal Keating:There is no magic to any of this. You've gotta have the data, you've gotta have the right data, you've gotta have the cleaned data, but there are tooling that, once you have everything that you want, allow you to represent it in a way that is easy to manipulate and- and highlight the things that are important. So I would say what have I done? I've talked to a lot of people in Microsoft about how they do what they're specialized at.Nic Fillingham:And what about when you're not working on this stuff? What's, what do you- what do you like to do, Donal, in your- in your spare time? And does any of that, uh, bleed over into your professional life? Do you, uh, do you like to do your thinking when you're climbing walls or- or something? That was a terrible example, but- but what (laughing)- what-Natalia Godyla:(laughs).Nic Fillingham:What do you- what do you do for fun?Donal Keating:Well, when I'm working, my 150-pound dog, who really is a- a- a slobbering sweetheart-Natalia Godyla:(laughs). Nic Fillingham:Type of dog, breed?Donal Keating:He's an Anatolian Shepherd, specifically a Kangal. So-Nic Fillingham:I have a Great Pyrenees, which I believe is a- a distant cousin. Donal Keating:Oh, yes. Yes. Uh, his name is Pamuk, it's a Turkish breed and pamuk means cotton in Turkish. But when I'm working, um, he does kinda, because he's a big dog, I kinda like to think that, you know, hey, if we had a security team that just looked, you know, dangerous would people mess with our product? 
Natalia Godyla:(laughs).Donal Keating:So that's one thing that, you know, I- I- I do like to think about my job when I walk the dog. But I'm also something of an urban farmer. I have three chickens and I like to grow potatoes, because I'm an Irishman, and turnips and leeks and stuff in my tiny little garden. So.Nic Fillingham:Are your chickens laying at the moment? Because we have ducks, and my ducks have gone on strike and I'm not getting any eggs out of them at the moment.Natalia Godyla:(laughs).Nic Fillingham:I'm wondering if- if you're... I know- I know chickens and ducks are a- a different bird, so I am aware of that, but just wondering if it's, what are you seeing in your- in your chickens?Donal Keating:You know, I'm a data guy, so, um, we went from one egg per chicken per day in the summer to kind of nothing in the late fall, and then starting luckily on the 21st of December, we got a- a burst of eggs. And then we now, out of three chickens I get one a day. I'm not exactly sure which one is doing, is... Natalia Godyla:(laughs).Donal Keating:If one is producing all of 'em or they're firing every third day. But, um, yeah, we're- we're- we're production again. Nic Fillingham:I think we need some machine learning algorithms to, uh, monitor the egg producing habits of chickens and/or ducks to see if we can, uh, increase output. Donal Keating:Uh, for- for sure.Natalia Godyla:(laughs). Donal Keating:It- it's- it's the only way to go about it, eh? The problem though with AI is we'd need to get about half a million chickens, and then we'd have a pretty good answer.Natalia Godyla:(laughs).Nic Fillingham:(laughs). Natalia Godyla:Well, we definitely thank you for that, Donal. And thanks for joining us again on Security Unlocked. Donal Keating:You're very welcome. Thank you for having me back. Natalia Godyla:Well, we had a great time unlocking insights into security. 
From research to artificial intelligence, keep an eye out for our next episode. Nic Fillingham:And don't forget to tweet us @msftsecurity, or email us with topics you'd like to hear on a future episode. Until then, stay safe. Natalia Godyla:Stay secure.

Judging a Bug by Its Title

Ep. 16
Most people know the age-old adage, “Don’t judge a book by its cover.” I can still see my grandmother wagging her finger at me when I was younger as she said it. But what if it's not the book cover we’re judging, but the title? And what if it’s not a book we’re analyzing, but instead a security bug? The times have changed, and age-old adages don’t always translate well in the digital landscape. In this case, we’re using machine learning (ML) to identify and “judge” security bugs based solely on their titles. And, believe it or not, it works! (Sorry, Grandma!) Mayana Pereira, Data Scientist at Microsoft, joins hosts Nic Fillingham and Natalia Godyla to dig into the endeavors that are saving security experts’ time. Mayana explains how data science and security teams have come together to explore ways that ML can help software developers identify and classify security bugs more efficiently, a task that, without machine learning, has traditionally produced false positives or led developers to overlook misclassified critical security vulnerabilities. In This Episode, You Will Learn: • How data science and ML can improve security protocols and identify and classify bugs for software developers • How to determine the appropriate amount of data needed to create an accurate ML training model • The techniques used to classify bugs based simply on their title. Some Questions We Ask: • What questions need to be asked in order to obtain the right data to train a security model? • How does Microsoft utilize the outputs of these data-driven security models? • What is AI for Good and how is it using AI to foster positive change in protecting children, data and privacy online? Resources: Microsoft Digital Defense Report; “Identifying Security Bug Reports Based Solely on Report Titles and Noisy Data”; Mayana Pereira’s LinkedIn; Nic Fillingham’s LinkedIn; Natalia Godyla’s LinkedIn; Microsoft Security Blog. Transcript: Nic Fillingham:Hello, and welcome to Security Unlocked, a new podcast from Microsoft where we unlock insights from the latest in 
news and research from across Microsoft Security engineering and operations teams. I'm Nic Fillingham- Natalia Godyla:And I'm Natalia Godyla. In each episode we'll discuss the latest stories from Microsoft Security, deep dive into the newest threat intel, research and data science- Nic Fillingham:And profile some of the fascinating people working on artificial intelligence in Microsoft Security. Natalia Godyla:And now let's unlock the pod. Natalia Godyla:Hello, Nic. How's it going? Nic Fillingham:Hello, Natalia. Welcome back. Well, I guess welcome back to Boston to you. But welcome to Episode 16. I'm confused because I saw you in person last week for the first time. Well, technically it was the first time for you, 'cause you didn't remember our first time. It was the second time for me. But it was- Natalia Godyla:I feel like I just need to justify myself a little bit there. It was a 10 second exchange, so I feel like it's fair that I, I was new to Microsoft. There was a lot coming at me, so, uh- Nic Fillingham:Uh, I'm not very memorable, too, so that's the other part, which is fine. But yeah. You were here in Seattle. We both did COVID tests because we filmed... Can I say? You tell us. What did we do? It's a secret. Is it announced? What's the deal? Natalia Godyla:All right. Well, it's sort of a secret, but everyone who's listening to our podcast gets to be in the know. So in March you and I will be launching a new series, and it's a video series in which we talk to industry experts. But really we're hanging with the industry experts. So they get to tell us a ton of really cool things about SecOps and AppSec while we all play games together. So lots of puzzling. 
Really, we're just getting paid to do puzzles with people cooler than us. Nic Fillingham:Speaking of hanging out with cool people, on the podcast today we have Mayana Pereira, whose name you may have heard a few episodes ago, when Scott Christiansen was on talking about the work that he does. He had partnered with Mayana to build and launch a, uh, machine learning model that looked at the titles of bugs across Microsoft's various code repositories and, using machine learning, determined whether those bugs were actually security related or not, and if they were, what the correct severity rating should be. Nic Fillingham:So this episode we thought we'd experiment with the format. Instead of having two guests, with a deep dive upfront and then a profile on someone in the back half, we thought we would just have one guest. We'd give them a little bit of extra time, uh, about 30 minutes, and allow them to sort of really unpack the particular problem or challenge that they're working on. So, yeah. We hope you like this experiment. Natalia Godyla:And as always, we are open to feedback on the new format, so tweet us, uh, @msftsecurity or send us an email. Let us know what you wanna hear more of, whether you like hearing just one guest. We are super open. And with that, on with the pod? Nic Fillingham:On with the pod. Nic Fillingham:Welcome to the Security Unlocked podcast. Mayana Pereira, thanks for joining us. Mayana Pereira:Thank you for having me. I'm so happy to be here today, and I'm very excited to share some of the things that I have done in the intersection of ML and security. Nic Fillingham:Wonderful. Well, listeners of the podcast will have heard your name back in Episode 13 when we talked to Scott Christiansen, and he talked about, um, a fascinating project about utilizing machine learning to classify bugs based simply on their title, and we'll get to that in a minute. 
But could you please introduce yourself to our audience? Tell us about your title, but sort of what does that look like in terms of day-to-day and the work that you do for Microsoft? Mayana Pereira:I'm a data scientist at Microsoft. I have been working at Microsoft for two and a half years now. And I've always worked inside Microsoft with machine learning applied to security, trust, and safety, and I also do some work in the data privacy world. And this area of ML applications to the security world has always been my passion, so before Microsoft I was also working with ML applied to cyber security, more in the malware world, but still security. And since I joined Microsoft, I've been working on data science projects that kinda look like this project that we're gonna, um, talk about today. So those are machine learning applications to interesting problems where we can either increase the trust and the security of Microsoft products, or the safety for the customer. You know, you would develop machine learning models with that in mind. Mayana Pereira:And my day-to-day work includes trying to understand which are those interesting problems across the company, and talking to my amazing colleagues such as Scott. I have been so blessed with an amazing, great team around me. And thinking about these problems, gathering data, and then getting, you know, heads down and training models, and testing new machine learning techniques that have never been used for a specific application, and trying to understand how well or if they will work for those applications, or if they're gonna get us to better performance, or better accuracy, precision and those metrics that we tend to use in data science work. 
And when we feel like, oh, this is an interesting project and I think it is interesting enough to share with the community, we write a paper, we write a blog, we go to a conference such as RSA and we present it to the community, and we get to share the work and the findings with colleagues internal to Microsoft, but also external. So this is kinda what I do on a day-to-day basis.Mayana Pereira:Right now my team is the data science team inside Microsoft that is called AI For Good, so the AI for Good has this for good in a sense of we want to, to guarantee safety, not only for Microsoft customers, but for the community in general. So one of my line of work is thinking about how can I collaborate with NGOs that are also thinking about the security or, and the safety of kids, for example. And this is another thing that I have been doing as part of this AI for Good effort inside Microsoft.Natalia Godyla:Before we dive into the bug report classification project, can you just share a couple of the projects that your team works for AI for Good? I think it would be really interesting for the audience to hear that.Mayana Pereira:Oh, absolutely. So we have various pillars inside the AI for Good team. There is AI for Health, AI for Humanitarian Action, AI for Earth. We have also been collaborating in an effort for having a platform with a library for data privacy. It is a library where we have, uh, various tools to apply the data and get us an output, data with strong privacy guarantees. So guaranteeing privacy for whoever was, had their information in a specific dataset or contributed with their own information to a specific research and et cetera. So this is another thing that our team is currently doing.Mayana Pereira:And we have various partners inside and outside of Microsoft. Like I mentioned, we do a lot of work in NGOs. 
So you can think of projects like AI for Earth, with several NGOs that are taking care of endangered species and others that use satellite images for understanding problems with deforestation, et cetera. And then Humanitarian Action, I have worked with NGOs that are developing tools to combat child sexual abuse and exploitation. AI for Health has so many interesting projects, and it is a big variety of projects. Mayana Pereira:So this is what the AI for Good team does. We are, I think right now we're over 15 data scientists. All of us are doing this work that it is a- applied research. Somehow it is work that we need to sit down with, with our customers or partners, and really understand where the problem is. It's usually some, some problems that required us to dig a little deeper and come up with some novel or creative solution for that. So this is basically the overall, the AI for Good team.Nic Fillingham:Let's get back in the way back machine to I think it was April of 2020, which feels like 700 years ago.Mayana Pereira:(laughs) Nic Fillingham:But you and Scott (laughs) published a blog that Scott talked about on Episode 13 called Securing the s- the software development lifecycle with machine learning, and the thing that I think both Natalia and I picked up on when Scott was talking about this, is it sounded first-, firstly it sounded like an exceptionally complex premise, and I don't mean to diminish, but I think Natalia and I were both, "Oh wow, you built a model that sort of went through repro steps and parsed all the logs inside of security bugs in order to better classify them." But that's not what this does. This is about literally looking at the words that are the title of the security bug, and then building a model to try and determine whether it was truly security or something else, is that right?Mayana Pereira:That's exactly it. This was such an interesting project. When I started collaborating with Scott, and some other engineers in the team, 
I was a little skeptical about using only titles to make predictions about whether a bug has, is security related or not. And, it seems, now that I have trained several models and tested them and later retrained to- to get more of a variety of data in our model, I have learned that people are really good at describing what is going on in a bug, in the title, it feels like they really summarize it somehow so it's- it's doing a good job because, yes, that's exactly what we're doing, we are using bug titles only from several sources across Microsoft, and then we use that to understand which bugs are security related or not, and how we can have an overall view of everything that is happening, you know in various teams across different products. And, that has given a lot of visibility to some unknown problems and some visibility to some things that we were not seeing before, because now you can scan millions of bugs in a few seconds. Just reading titles, you have a model that does it really fast. And, I think it is a game changer in that sense, in the visibility and how do you see everything that is happening in that bug world.Natalia Godyla:So what drove that decision? Why are we relying only on the titles, why can't we use the- the full bug reports? Mayana Pereira:There are so many reasons for that. I think, the first reason was the fact that the full bug report, sometimes, has sensitive information. And we were a little bit scared about pulling all that sensitive information which could include passwords, could include, you know, maybe things that should not be available to anyone, and include that in a- in a VM to train a model, or, in a data science pipeline. And, having to be extremely careful also about not having our model learning passwords, not having that. So that was one of the big, I think incentives of, let's try titles only, and see if it works. If it doesn't work then we can move on and see how we can overcome the problem of the sensitive information. 
And it did work. When we saw that we had a lot of signal in bug titles only, we decided to really invest in that and get really good models by u- utilizing bug titles only. Nic Fillingham:I'm going to read from the blog just for a second here, because some of the numbers here, uh, are pretty staggering, so, again this was written 2020, uh, in April, so there's obviously, probably updated numbers since then but it said that Microsoft's 47,000 developers generate nearly 30,000 bugs a month, which is amazing, that's coming across over 100 Azure DevOps and GitHub repositories. And then you had it, you, you actually have a count here saying since 2001 Microsoft has collected 13 million work items and bugs, which I just think is amazing. So, do you want to speak to, sort of, the volume of inputs and, sort of, signals here in to building that model and maybe some of the challenges, and then a follow on question is, is this model still active today, is this- is this work still ongoing, has it been incorporated into a product or another, another process?Nic Fillingham:Do you want to start with, with numbers or. Mayana Pereira:Yes, I think that from my data scientist point of view, having such large numbers is absolutely fantastic because it gives us a historical data set, very rich so we can understand how data has evolved over time. And also, if this- the security terminology has changed a lot, or how long will this model last, in a sense. And it was interesting to see that you can have different tools, different products, different things coming up, but the security problems, at least for, I would say for the past six, seven years, when it comes to terminology, because what I was analyzing was the terminology of the security problems. My model was a natural language processing model. It was pretty consistent, so that was really interesting to see from that perspective we have. And by having so much data, you know, this amazing volume. 
It helped us to build better classifiers for sure. So this is my- my data scientist side saying, amazing. I love it, so much data.Nic Fillingham:What's the status of this project on this model now? Is it- is it still going? Has it been embedded into another- another product, uh, or process?Mayana Pereira:Yes, it's still active. It's still being used. So, right now, this product. This, not the product- the product, but the model is mainly used by the Customer Security and Trust team in [Sila 00:16:16], so they use the model in order to understand the security state of Microsoft products in general, and, uh, different products and looking at specific products as well, are using the model to get the- the bugs statistics and security bugs statistics for all these different products across Microsoft. And there are plans on integrating the- this specific model or a variation of the model into other security lifecycle pipelines, but this is a decision that is more on the CST, Customer Security and Trust, side and I have, um, only followed it, but I don't have specific details for that right now. But, I have seen a lot of good interesting results coming out of that model, good insights and security engineers using the results of the model to identify potential problems, and fix those problems much faster.Natalia Godyla:So, taking a step back and just thinking about the journey that your team has gone on to get the model to the state that it's in today. Uh, in the blog you listed a number of questions to figure out what would be the right data to train the model. So the questions were, is there enough data? How good is the data? Are there data usage restrictions? And, can data be generated in a lab? Natalia Godyla:So can you talk us through how you answered these questions like, as a- as a data scientist you were thrilled that there was a ton of data out there, but what was enough data? How did you define how good the data was? Or, whether it was good enough.Mayana Pereira:Great. 
So, those were questions that I asked myself before even knowing what the project was about, and the answer to is there enough data? It seemed very clear from the beginning that, yes, we had enough data, but those were questions that I brought up on the blog, not only for myself but for anyone else that was interested in replicating those experiments in their company or maybe university or s- anywhere any- any data scientist that is interested to train your own model for classification, which questions should be asked once you start a project like this. So the "is there enough data?" for me was clear from the beginning, we had several products so we had a variety of data sources. I think that when you reach the number of millions of samples of data, I think that speaks for itself. It is a high volume. So I felt, we did have enough data.Mayana Pereira:And, when it came to data quality. That was a more complex question. We had data in our hands, bugs. We wanted to be able to train a model that could different- differentiate between security bugs and non security bugs, you know. And, for that, usually what we do with machine learning, is we have data, that data has labels, so you have data that represents security bugs, data that represents non security bugs. And then we use that to train the model. And those labels were not so great. So we needed to understand how the not so great labels were going to impact our model, you know, we're going to train a model with labels that were not so great, so that was gonna happen. So that was one of the questions that we asked ourselves. And I did a study on that, on understanding what is the impact of these noisy labels in the training data set. And how is it gonna impact the classification results that we get once using this, this training data? So this was one of the questions that I asked and we, I did several experiments, adding noise. 
I did that myself, I, I added noise on purpose to the data set to see what were the limits of this noise resilience. You know, when you have noisy labels in training, we published it in a, in an academic conference in 2019, and we understood that it was okay to have noisy labels. So security bugs that were actually labeled as not security and not security bugs labeled as security. There was a limit to that.Mayana Pereira:We kinda understood the limitations of the model. And then we started investigating our own data to see, is our own data within those limits. If yes, then we can use this data confidently to train our models. If no, then we'll have to have some processes for correcting labels and understanding this data set a little bit better. What can we use and what can we not use to train the models. So what we found out is that, we did have noisy labels in the data set. And we had to make a few corrections in our labels, but it was much less work because we understood exactly what needed to be done, and not correct every single data sample or every single label in a, an enormous data set of millions of entries. So that was something that really helped. Mayana Pereira:And then the other question, um, that we asked is, can we generate data in the lab? So we could sometimes force a specific security issue and generate some, some bugs that had that security description in their titles. And why did we include that in the list of questions? Because a lot of bugs that we have in our database are generated by automated tools. So when you have a new tool being included in your ecosystem, how is your model going to recognize the bugs that are coming from this new tool? So those are, ma- automatically generated bugs. And we could wait for the tool to be used, and then after a while we gather the data that the tool provided us and include it in a retraining set. But we can also do that in the lab ecosystem, generate data and then incorporate it in a training set. 
So this is where this comes from.Nic Fillingham:I wanted to ask possibly a, a very rudimentary question, uh, especially to those that are, you know, very familiar with machine learning. When you have a data set, there's words, there is text in which you're trying to generate labels for that text. Does the text itself help the process of creating labels? So for example, if I've got a bug and the name of that bug is the word security is in the, the actual bug name. Am I jump-starting, am I, am I skipping some steps to be able to generate good labels for that data? Because I already have the word I'm looking for. Like I, I think my question here is, was it helpful to generate your labels because you were looking at text in the actual title of the bug and trying to ascertain whether something was security or not?Mayana Pereira:So the labels were never generated by us or by me, the data scientists. The labels were coming from the engineering systems where we collected the data from. So we were relying on what- whatever happened in the, in the engineering team, engineering group and relying that they did, uh, a good job of manually labeling the bugs as security or not security. But that's not always the case, and that doesn't mean that the, the engineers are not good or are bad, but sometimes they have their own ways of identifying it in their systems. And not necessarily, it is the same database that we had access to. So sometimes the data is completely unlabeled, the data that comes to us, and sometimes there are mistakes. Sometimes you have, um, specific engineer that doesn't have a lot of security background. The person sees a, a problem, describes the problem, but doesn't necessarily attribute the problem as a security problem. Well, that can happen as well.Mayana Pereira:So that is where the labels came from. 
The interesting thing about the terminology is that, out of the millions and millions of security bugs that I did review, like manually reviewed, because I kinda wanted to understand what was going on in the data. I would say that for sure, less than 1%, even less than that, had the word security in it. So it is a very specific terminology when you see that. So people tend to be very literal in what the problem is, but not what the problem will generate. In a sense of they will, they will use things like Cross-site Scripting or passwords in clear, but not necessarily, "there's a security pr- there's a security problem." But just what the issue is, so it is more of getting the model to understand that security lingo and what is that vocabulary that constitutes security problems. So that's wh- that's why it is a little bit hard to generate a list of words and see if it matches. If a specific title matches to this list of words, then it's security.Mayana Pereira:It was a little bit hard to do that way. And sometimes you have in the title, a few different words that in a specific order, it is a security problem. In another order, it is not. And then, I don't have that example here with me, but I, I could see some of those examples in the data. For example, I think the Cross-site Scripting is a good example. Sometimes you have site and cross in another place in the title. It has nothing to do with Cross-site Scripting. Both those two words are there. The model can actually understand the order and how close they are in the bug title, and make the decision if it is security or not security. So that's why it is quite a bit easier for the model to distinguish than if we had to use rules to do that.Natalia Godyla:I have literally so many questions. Nic Fillingham:[laughs].Natalia Godyla:I'm gonna start with, uh, how did you teach it the lingo? So what did you feed the model so that it started to pick up on different types of attacks like Cross-site Scripting?Mayana Pereira:Perfect. 
The training algorithm will do that for me. So basically what I need to guarantee is that we're using the correct technique to do that. So the technique will, the machine learning technique will basically identify from this data set. So I have a big data set of titles. And each title will have a label which is security or non-security related to it. Once we feed the training algorithm with all this text and their associated labels, the training algorithm will, will start understanding that, some words are associated with security, some words are associated with non-security. And then the algorithm will, itself will learn those patterns. And then we're gonna train this algorithm. So in the future, we'll just give the algorithm a new title and say, "Hey, you've learned all these different words, because I gave you this data set from the past. Now tell me if this new ti- if this new title that someone just came up with is a security problem or a, a non-security problem." And the algorithm will, based on all of these examples that it has seen before, will make a decision if it is security or non-security.Natalia Godyla:Awesome. That makes sense. So nothing was provided beforehand, it was all a process of leveraging the labels. Mayana Pereira:Yes.Natalia Godyla:Also then thinking about just the dataset that you received, you were working with how many different business groups to get this data? I mean, it, it must've been from several different product teams, right?Mayana Pereira:Right. So I had the huge advantage of having an amazing team that is a data center team that is just focused on doing that. So their business is go around the company, gather data and have everything harmonized in a database. So basically, what I had to do is work with this specific team that had already done this amazing job, going across the company, collecting data and doing this hard work of harvesting data and harmonizing data. And they had it with them. 
So it is a team that does that inside Microsoft. Collects the data, gets everything together. They have their databases updated several times a day, um, collecting data from across the company, so it is a lot of work, yeah.Natalia Godyla:So do different teams treat bug reports differently, meaning is there any standardization that you had to do or anything that you wanted to implement within the bug reports in order to get better data?Mayana Pereira:Yes. Teams across the company will report bugs differently using different systems. Sometimes it's Azure DevOps, sometimes it can be GitHub. And as I mentioned, there is a, there was a lot of work done in the data harmonization side before I touched the data. So there was a lot of things done to get the data in, in shape. This was something that, fortunately, several amazing engineers did before I touched the data. Basically, what I had to do once I touched it, was I just applied the data as is to the model and the data was very well treated before I touched it. Nic Fillingham:Wow. So many questions. I did wanna ask about measuring the success of this technique. Were you able to apply a metric, a score to the ... And I'm, I, I don't even know what it would be. Perhaps it would be the time to address a security bug pre and post this work. So, did this measurably decrease the amount of time for prioritized security bugs to be, to be addressed?Mayana Pereira:Oh, definitely. Yes, it did. So not only it helped in that sense, but it helped in understanding how some teams were not identifying specific classes of bugs as security. Because we would see this inconsistency with the labels that they were including in their own databases. These labels would come to this big database that is harmonized and then we would apply the model on top of these data and see that specific teams were treating their, some data points as non-security and should have been security. 
Or sometimes they were treating as security, but not with the correct severity. So it would, should have been a critical bug and they were actually treating it as a moderate bug. So, that, I think, not only the, the timing issue was really important, but now you have a visibility of behavior and patterns across the company that the model gives us.Nic Fillingham:That's amazing. And so, so if I'm an engineer at Microsoft right now and I'm in my, my DevOps environment and I'm logging a bug and I use the words cross- cross scripting somewhere in the bug, what's the timing with which I get the feedback from your model that says, "Hey, your prioritization's wrong," or, "Hey, this has been classified incorrectly"? Are we at the point now where this model is actually sort of integrated into the DevOps cycle or is that still coming further down the, the, the path?Mayana Pereira:So you have, the main customer is Customer Security and Trust team inside Microsoft. They are the ones using it. But as soon as they start seeing problems in the data or specific patterns and problems in specific teams' datasets, they will go to that team and then have this, they have a campaign where they go to different teams and, and talk to them. And some teams, they do have access to the datasets after they are classified by our model. Right now, there's, they don't have the instant response, but that's, that's definitely coming.Nic Fillingham:So, Mayana, how is Customer Security and Trust, your organization, utilizing the outputs of this model when a, when a, when a bug gets flagged as being incorrectly classified, you know, is there a threshold, and then sort of what happens when you, when you get those flags?Mayana Pereira:So the engineering team, the security engineering team in Customer Security and Trust, they will use the model to understand the overall state of security of Microsoft products, you know, like the products across the company, our products, basically. 
And they will have an understanding of how fast those bugs are being mitigated. They'll have an understanding of the volume of bugs, and security bugs in this case, and they can follow these bugs in, in a, in a timely manner. You know, as soon as the bug comes to the CST system, the bug gets flagged as either security or not security. Once it's flagged as security, there, there is a second model that will classify the severity of the bug and the CST will track these bugs and understand how fast the teams are closing those bugs and how well they're dealing with the security bugs.Natalia Godyla:So as someone who works in the AI for Good group within Microsoft, what is your personal passion? What would you like to apply AI to if it, if it's not this project or, uh, maybe not a project within Microsoft, what is, what is something you want to tackle in your life?Mayana Pereira:Oh, love the question. I think my big passion right now is developing machine learning models for eradication of child sexual abuse media in, across different platforms. So you can think about platforms online from search engines to data sharing platforms, social media, anything that you can have the user uploading content. You can have problems in that area. And anything where you have users visualizing content. You want to protect that customer, that user, from that as well. But most importantly, protect the victims from those crimes and I think that has been, um, something that I have been dedicating s- some time now. I was fortunate to work with an NGO, um, recently in that se- in that area, in that specific area. Um, developed a few models for them that would detect those kinds of media. And these would be my AI for Good passion for now. The other thing that I am really passionate about is privacy, data privacy. 
I feel like we have so much data out there and there's so much of our information out there and I feel like the great things that we get from having data and having machine learning we should not, not have those great things because of privacy compromises. Mayana Pereira:So how can we guarantee that no one's gonna have their privacy compromised? And at the same time, we're gonna have all these amazing systems working. You know, how can we learn from data without learning from specific individuals or without learning anything private from a specific person, but still learn from a population, still learn from data. That is another big passion of mine that I have been fortunate enough to work in such kind of initiatives inside Microsoft. I absolutely love it. When, when I think about guaranteeing privacy of our customers or our partners or anyone, I think that is also a big thing for me. And that, that falls under the AI for Good umbrella as well since that there's so much, you know, personal information in some of these AI for Good projects. Natalia Godyla:Thank you, Mayana, for joining us on the show today.Nic Fillingham:We'd love to have you back especially, uh, folks, uh, on your team to talk more about some of those AI for Good projects. Just, finally, where can we go to follow your work? Do you have a blog, do you have Twitter, do you have LinkedIn, do you have GitHub? Where should, where should folks go to find you on the interwebs?Mayana Pereira:LinkedIn is where I usually post my latest works, and links, and interesting things that are happening in the security, safety, privacy world. I love to, you know, share on LinkedIn. So m- I'm Mayana Pereira on LinkedIn and if anyone finds me there, feel free to connect. I love to connect with people on LinkedIn and just chat and meet new people networking.Natalia Godyla:Awesome. Thank you. Mayana Pereira:Thank you. I had so much fun. 
It was such a huge pleasure to talk to you guys.Natalia Godyla:Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode. Nic Fillingham:And don't forget to Tweet us at MSFTSecurity or email us at with topics you'd like to hear on a future episode. Until then, stay safe. Natalia Godyla:Stay secure.
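
As a rough illustration of the title-only classification discussed in this episode, the sketch below trains a tiny classifier on bug titles and their security / non-security labels. It is not the model Microsoft built: the episode does not say which algorithm was used, so a simple naive Bayes classifier is swapped in here, and the names `features`, `TitleClassifier`, and `TRAIN`, as well as the eight example titles, are hypothetical and invented for this sketch. Adjacent-word pairs are included as features to mirror the point that "cross" and "site" only signal Cross-site Scripting when they appear together and in order.

```python
import math
import re
from collections import Counter


def features(title):
    """Lowercased words plus adjacent-word pairs, so order and proximity count."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


class TitleClassifier:
    """Multinomial naive Bayes with add-one smoothing (an assumed stand-in model)."""

    def fit(self, titles, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = {label: Counter() for label in self.class_counts}
        for title, label in zip(titles, labels):
            self.feature_counts[label].update(features(title))
        self.vocab = set()
        for counts in self.feature_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, title):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for feat in features(title):
                if feat in self.vocab:  # skip features never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy titles; real training data would be millions of labeled bugs.
TRAIN = [
    ("cross-site scripting in search box", "security"),
    ("password stored in clear text in logs", "security"),
    ("sql injection on login form", "security"),
    ("stored cross-site scripting in comments", "security"),
    ("site layout breaks when cross links wrap", "non-security"),
    ("button color wrong on settings page", "non-security"),
    ("typo in login form label", "non-security"),
    ("search box loses focus on page load", "non-security"),
]

model = TitleClassifier().fit(*zip(*TRAIN))
print(model.predict("cross-site scripting in profile page"))
print(model.predict("cross links on site footer"))
print(model.predict("typo on settings page"))
```

With this toy data, only the first title is classified as security; the second contains both "cross" and "site" but not next to each other, which is the kind of case a flat keyword list would get wrong.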

Enterprise Resiliency: Breakfast of Champions

Ep. 15
Prior to the pandemic, workdays used to look a whole lot different. If you had a break, you could take a walk to stretch your legs, shake the hands of your co-workers, or get some 1-on-1 face time with the boss. Ahh... those were the days. That close contact we once had is now something that many of us yearn for as we’ve had to abruptly lift and shift from living in our office to working from our home. But communicating and socializing aren’t the only things that were easier back then. The walls of your office have expanded, and with them, the boundaries of your security protocols. Small in-office tasks like patching a server have now become multi-step processes that require remote management, remote updates, and remote administrative control. With that comes the prioritization of resilience and what it means for enterprises, customers, and security teams alike. That’s where remote enterprise resiliency comes into play. Today on the pod, we explore the final chapter of the MDDR. Irfan Mirza, Director of Enterprise Continuity and Resilience at Microsoft, wraps up the observations from the report by giving hosts Nic Fillingham and Natalia Godyla the rundown on enterprise resiliency and discusses how we can ensure the highest levels of security while working from home. Irfan explains the Zero Trust model and how Microsoft is working to extend security benefits to your kitchen or home office, or... that make-shift workspace in your closet. In the second segment, Andrew Paverd, Senior Researcher on the Microsoft Security Response Center Team and jack of all trades, stops by… and we’re not convinced he’s fully human. He’s here to tell us about the many hats he wears, from safe systems programming to leveraging AI to help with processes within the MSRC, and shares how he has to think like a hacker to prevent attacks. 
Spoiler alert: he’s a big follower of Murphy’s Law. In This Episode, You Will Learn: • How classical security models are being challenged • What the Zero Trust Model is and how it works • The three critical areas of resilience: extending the enterprise boundary, prioritizing resilient performance, and validating the resilience of our human infrastructure. • How hackers approach our systems and technologies Some Questions We Ask: • How has security changed as a product of the pandemic? • Do we feel like we have secured the remote workforce? • What frameworks exist to put a metric around where an organization is in terms of its resiliency? • What is Control Flow Guard (CFG) and Control-Flow Integrity? • What’s the next stage for the Rust programming language? Resources: Microsoft Digital Defense Report; Security Blog [Full transcript can be found at]Nic Fillingham:Hello, and welcome to Security Unlocked, a new podcast from Microsoft, where we unlock insights from the latest in news and research from across Microsoft security, engineering and operations teams. I'm Nic Fillingham.Natalia Godyla:And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft Security, deep dive into the newest threat intel, research and data science. Nic Fillingham:And profile some of the fascinating people working on artificial intelligence in Microsoft Security. Natalia Godyla:And now let's unlock the pod. Hi Nic, I have big news.Nic Fillingham:Big news. Tell me a big news.Natalia Godyla:I got a cat. Last night at 8:00 PM, I got a cat. Nic Fillingham:Did it come via Amazon Prime drone? Natalia Godyla:No.Nic Fillingham:Just, that was a very specific time. Like 8:00 PM last night is not usually the time I would associate people getting cats. Tell me how you got your cat. Natalia Godyla:It was a lot more conventional. 
So I had an appointment at the shelter and found a picture of this cat with really nubby legs and immediately-Nic Fillingham:(laughs).Natalia Godyla:... fell in love obviously. And they actually responded to us and we went and saw the cat, got the cat. The cat is now ours. Nic Fillingham:That's awesome. Is the cat's name nubby. Natalia Godyla:It's not, but it is on the list of potential name changes. So right now the cat's name is tipper. We're definitely nervous about why the cat was named tipper. Nic Fillingham:(laughs).Natalia Godyla:We're hiding all of the glass things for right now. Nic Fillingham:How do we get to see the cat? Is there, will there be Instagram? Will there be Twitter photos? This is the most important question.Natalia Godyla:Wow. I haven't planned that yet.Nic Fillingham:You think about that and I'll, uh, I'll start announcing the first guest on this episode.Natalia Godyla:(laughs).Nic Fillingham:On today's episode, we speak with Irfan Mirza, who is wrapping up our coverage of the Microsoft Digital Defense Report with a conversation about enterprise resiliency. Now, this is really all of the chapters that are in the MDDR, the nation state actors, the increase in cyber crime sophistication, business email compromise that you've heard us talk about on the podcast, all gets sort of wrapped up in a nice little bow in this conversation where we talk about all right, what does it mean, what does it mean for customers? What does it mean for enterprises? What does it mean for security teams? And so we talk about enterprise resiliency. And we actually recorded this interview in late 2020, but here we are, you know, two months later and those findings are just as relevant, just as important. It's a great conversation. And after that, we speak with-Natalia Godyla:Andrew Paverd. So he is a senior researcher on the Microsoft Security Response Center team. And his work is well, well, he does a ton of things. 
I honestly don't know how he has time to pull all of this off. So he does everything from safe systems programming to leveraging AI, to help with processes within MSRC, the Microsoft Security Response Center. And I just recall one of the quotes that he said from our conversation was hackers don't respect your assumptions, or something to that effect, but it's such a succinct way of describing how hackers approach our systems and technology. So another really great conversation with a, a super intelligent researcher here at Microsoft.Nic Fillingham:On with the pod.Natalia Godyla:On with the pod. Today, we're joined by Irfan Mirza, Director of Enterprise Continuity and Resilience, and we'll be discussing the Microsoft Digital Defense Report and more specifically enterprise resilience. So thank you for being on the show today, Irfan.Irfan Mirza:Thanks so much glad to be here. And hope we have a, a great discussion about this. This is such an important topic now. Natalia Godyla:Yes, absolutely. And we have been incrementally working through the Microsoft Digital Defense Report, both Nic and I have read it and have had some fantastic conversations with experts. So really looking forward to hearing about the summation around resilience and how that theme is pulled together throughout the report. So let's start it off by just hearing a little bit more about yourself. So can you tell us about your day-to-day? What is your role at Microsoft? Irfan Mirza:Well, I lead the enterprise continuity and resilience team and we kind of provide governance overall at the enterprise. We orchestrate sort of all of the, the risk mitigations. We go and uncover what the gaps are, in our enterprise resilience story, we try to measure the effectiveness of what we're doing. We focus on preparedness, meaning that the company's ready and, you know, our critical processes and services are always on the ready. It's a broad space because it spans a very, very large global enterprise. 
And it's a very deep space because we have to be experts in so many areas. So it's a fun space by saying that.Natalia Godyla:Great. And it's really appropriate today that we're talking about the MDDR and enterprise resilience. So let's start at a high level. So can you talk a little bit about just how security has changed as a product of the pandemic? Why is resilience so important now? Irfan Mirza:Yeah, it's a great question. A lot of customers are asking that, our field is asking that question, people within the company are asking. Look, we've been 11 months under this pandemic. Maybe, you know, in some places like China, they've been going through it for a little bit longer than us, you know, a couple of months more. What we're finding after having sort of tried to stay resilient through this pandemic, uh, one obviously is on the human side, everyone's doing as much as we possibly can there. But the other part of it is on the enterprise side. What is it that we're having to think about as we think of security and as we think of enterprise resilience?Irfan Mirza:There are a couple of big things that I think I would note, one is that, look, when this pandemic hit us, our workforce lifted and shifted. By that, I mean that we got up out of our offices and we all left. I mean, we took our laptops and whatever we could home. And we started working remotely. It was a massive, massive lift and shift of personnel, right? We got dispersed. Everybody went to their own homes and most of us have not been back to the office. And it's not just at Microsoft, even, even a lot of our customers and our partners have not gone back to the office at all, right? So that, that's a prolonged snow day, if you want to call it that.Irfan Mirza:The other thing that happened is our workload went with us. It wasn't just that, "Hey, you know, I'm taking a few days off, I'm going away or going on vacation and, and I'll be checking email periodically." 
No, we actually took our work with us and we started doing it remotely. So what that's done is it's created sort of a, a need to go back and look at what we thought was our corporate security boundary or perimeter.Irfan Mirza:You know, in the classical model, we used to think of the corporation and its facilities as the, the area that we had to go and secure. But now in this dispersed workforce model, we have to think about my kitchen as part of that corporate perimeter. And all of a sudden we have to ensure that my kitchen is as secure as the corporate network or as the facilities or the office that I was working from. That paradigm is completely different than anything we'd thought about before. Nic Fillingham:And so Irfan, in the MDDR, uh, this section, um, and if you've got the report open, you're playing along at home, I believe it's page 71. This enterprise resiliency is sort of a wrap-up of, of a lot of the observations that are in the MDDR report. It's not a new section. It's as you're getting towards the end of the report, you're looking for, okay, now what does this mean to me? I'm a CSO. I need to make new security policies, security decisions for my organization. This concept of enterprise resiliency is sort of a wrap up of everything that we've seen across cyber crime, across the nation state, et cetera, et cetera. Is that, is that accurate? Is that a good way to sort of read that section in the report? Irfan Mirza:Yeah. It is really the, the way to think of it, right? It's sort of like a, the conclusion, so what, or why is this relevant to me and what can I do about it? When you think about the report and the way that it's structured, look, we, you know, the report goes into great detail about cyber crime as you called out Nic. And then it talks about nation state threats.Irfan Mirza:These are newer things to us. 
We've certainly seen them on the rise, actors that are well-trained, they're well-funded, they play a long game, not necessarily a short game, they're looking, they're watching and they're waiting, they're waiting for us to make mistakes or to have gaps, they look for changes in tactics, either ours, uh, they themselves are quite agile, right? Irfan Mirza:So when you think about the environment in which we have to think about resilience, and we have to think about security, that environment itself has got new vectors or new threats that are, that are impacting it, right? In addition to that, our workforce has now dispersed, right? We're all over the, all over the globe. We see emerging threats that are, that are, non-classical like ransomware. We see attacks on supply chain. We continue to see malware and malware growing, right? Irfan Mirza:And, and so when you think about that, you have to think if I need to secure now my, my dispersed corporate assets and resources, my people, the workload, the data, the services and the processes that are all there, what are the, the sort of three big things I would need to think about? And so this report sort of encapsulates all, all of that. It gives the details of what, what's happening. And, and then on page 71, as you say, that resilience piece sort of comes back and says, "Look, your security boundary's extended. Like it or not, it is extended at this point. You've got to think beyond that on-site perimeter that we were thinking about before."Irfan Mirza:So we have to start thinking differently. And th- there's three critical areas that are sort of called out, acknowledging the security boundary has increased, thinking about resilience and performance, and then validating the resilience of our human infrastructure. These are like new ideas, but these are all becoming imperatives for us. We're having to do this now, whether we like it or not. 
Irfan Mirza:And so this report sort of gives our customers, and, and it's a reflection of what we're doing in the company. It's an open and honest conversation about how we propose to tackle these challenges that we're facing.Nic Fillingham:And so Irfan if we can move on to that critical area, number two, that prioritizing resilient performance. When I say the word performance and resilient performance, is that scoped down just to sort of IT infrastructure, or does that go all the way through to the humans, the actual people in the organization and, um, how they are performing their own tasks, their own jobs and the tasks that are part of their, their job and et cetera, et cetera? What's the, I guess what's the scope of that area two?Irfan Mirza:As we were thinking about resilience, as you know, shortly after we dispersed the workforce, we started thinking about, about what should be included in our classical understanding of resilience. But when you think about, about typical IT services and online services, and so on, a lot of that work is already being done with the live site reviews that we do and people are paying very close attention to service performance. We have SLAs, we have obligations, we have commitments that we've made that our services will be performing to a certain degree, but there are also business processes that are associated with these services very closely. Irfan Mirza:When you think about all of the processes that are involved and services that are involved from the time a customer thinks of buying Office, uh, 365, as an example, to the time that they provision their first mailbox, or they receive their first email, there are dozens of business processes. Irfan Mirza:Every single service in that chain could be working to 100% efficiency. 
And yet if the business processes aren't there, for instance, to process the deal, to process the contract, to process, uh, the customer's payment or, uh, acknowledge receipt of the payment in order to be able to provision the service, all of these processes, all of a sudden have to, we have to make sure that they're also performing.Irfan Mirza:So when we start thinking about resilience, up to now, business continuity has focused on, are you ready? Are you prepared? Are your dependencies mapped? Have you, have you done a business impact analysis? Are you validating and testing your preparedness? You know, are you calling down your call tree for instance? But I think where we're going now with true enterprise resilience, especially in this sort of modern day, we're, we're looking at performance, right? What, what is your preparedness resulting in? So if you stop and you think about a child at school, they get homework. Well, the homework really, they bring it home. They do it. They take it back to the teacher. They get graded on it. That's wonderful. This means that the child is ready. But at some point in time, the class or the teacher is going to give them a test, and that test is going to be the measure of performance, right? Irfan Mirza:So we need to start thinking of resilience and continuity in the same way. We're prepared. We've done all our homework. Now let's go and see how many outages did you have? How critical were the outages? How long did they last? How many of them were repeat outages? How many of the repeat outages were for services that are supposed to have zero downtime, like services that are always supposed to be on like your DNS service or your identity auth- authentication service, right? So, when you start thinking about, uh, resilience from that perspective, now you've got a new set of data that you have to go and capture, or data that you're capturing, you now have to have insights from it. 
You've got to be able to correlate your preparedness, meaning the homework that you've done with your actual performance, your outage and your, and your gap information. All right?Irfan Mirza:So that, that's what prioritizing resilient performance is all about. It's about taking real-time enterprise preparedness and mapping it to real-time enterprise performance. That tells you if your preparedness is good enough or not, or what it is that you need to do. There's a loop here, a feedback loop that has to be closed. You can't just say that, well, you know, we've done all the exercises theoretically. We're good and we're ready to take on any sort of a crisis or, or, or disaster. Yeah, that's fine. Can we compare it to what you're doing in real time? Can we break glass and see what that looks like? Can we shut you down, or shut down parts of your operation as in the event of an earthquake for instance, or a hurricane wiping out, uh, access to a data center, right? Can we do those things and still be resilient when that happens? So this is where performance and resilience come together in that space.Natalia Godyla:So am I right in understanding that beyond, like you said, the theoretical where you think about the policies that you should have in place, and the frameworks that you should have in place, you have the analytics on, you know, the state of, the state of how performant your systems are to date. And then in addition, is there now the need for some sort of stress testing? Like actually figuring out whether an additional load on a system would cause it to break, to not be resilient? Is that now part of the new approach to resilience?Irfan Mirza:Yeah. There are, there are several, several things to do here, right? You absolutely said it. There's a stress test. Actually, this pandemic is already a stress test in and of itself, right? It's stressing us in many ways. 
It's stressing, obviously the psyche and, and, you know, our whole psychology, and our ability to sustain in quarantine, in isolated, in insulated environments and so on. But it's also testing our ability to do the things that we just so, uh, so much took for granted, like the ability to patch a server that's sitting under my desk in the office whenever I needed to, right? That server now has to become a managed item that somebody can manage remotely, patch remotely, update remotely when needed, control administrative access and privileges remotely. But yes, for resilience, I think we need to now collect all of the data that we have been collecting or looking at and saying, can we start to create those correlations between our preparedness and between our real performance? Irfan Mirza:But there's another area that this dovetails into which is that of human resilience, right? We talked a little bit earlier about, you know, sort of the whole world enduring this hardship. We need to first and foremost look at our suppliers, subcontractors, people that we're critically dependent on. What does their resilience look like? That's another aspect that we have to go back. In the areas where we have large human resources or, or workforces that are working on our behalf, we need to make sure that they're staying resilient, right? Irfan Mirza:We talked a lot about work/life balance before. Now I think the new buzzword in HR conference rooms is going to be work/life integration. It's completely integrated, and so we need to start thinking about the impact that would have. Are we tracking attrition of our employees, of certain demographics within the employees? Are we looking at disengagement? People just sort of, "Yeah, I'm working from home, but I'm not really being fully engaged." Right? The hallway conversations we used to have are no longer there. And we need to start thinking, are people divesting? Our resources, are they divesting in the workplace? 
Are they divesting in their, in their work or work/life commitment? These measures are all now having to be sort of like... Irfan Mirza:We used to rely on intuition, a look, a hallway gaze, look at the, the snap in somebody's walk as they walked away from you or out of your office. We don't have that anymore. Everybody's relatively stagnant. We're, we're, we're seated. We don't get to see body language that much. We don't get to read that. There's a whole new set of dynamics that are coming into play, and I think smart corporations and smart companies will start looking at this as a very important area to pay attention to.Nic Fillingham:How are we measuring that? What tools or sort of techniques, or, or sort of frameworks exist to actually put a metric around this stuff, and determine sort of where, where an organization is in terms of their level of resiliency?Irfan Mirza:This question is actually the whole reason why we brought this enterprise resilience sort of a conclusion to this fourth chapter, and, and, you know, the summation of this, of this report. Irfan Mirza:What we're doing now is we're saying, look. Things that used to be fundamentally within the domain of IT departments, or used to be fundamentally with, within the domain of live site, or used to be fundamentally in the domain of human resource departments are now all floating up to be corporate imperatives, to be enterprise imperatives. I think the thinking here is that we need to make sure that the data that we've been collecting about, as an example to answer your question, attrition, right? A certain demographic. Millennials, uh, changing jobs, leaving the company, just to pick an example more than anything else. This is no longer just data that the HR Department is interested in, or that recruiting would be interested in, or, or retention would be interested. 
This is data that's about to significantly impact the enterprise, and it needs to be brought into the enterprise purview.Irfan Mirza:Our classical and traditional models of looking at things in silos don't allow us to do that. What we're recommending is that we need to have a broader perspective and try to derive insights from this that do tell a more comprehensive story about our ent- enterprise resilience. That story needs to include the resilience of our services, our business processes, our suppliers, our human capital, our infrastructure, our extended security boundary, our data protection, uh, prevention of data loss, our intrusion detection. I mean, there's such a broad area that we have to cover. That's what we're saying. And, and as we implement this new sort of zero trust model, I think the, the effectiveness of that model, how much progress we're making is becoming an enterprise priority, not just something that the IT department is going to go around on its own.Nic Fillingham:Irfan, I wonder if I could put you on the spot, and were there any interesting bits of data that you saw in those first couple months of the shift to remote work where like, yeah, the number of unique devices on the Microsoft corporate network quadrupled in 48 hours. Like any, anything like that? I'm just wondering what, what little stats you may have in hand.Irfan Mirza:Yeah. The number of devices and sort of the flavors of devices, we've always anticipated that that's going to be varied. We're cognizant of that. Look, we have, you know, people have PCs. They have Macs. They have Linux machines, and, and they have service o- operating software. There's a lot of different flavors. And, and it's not just the device and the OS that matters, it's also what applications you're running. Some applications we can certify or trust, and others perhaps we can't, or that we still haven't gotten around to, to verifying, right? 
And all of these sit, and they all perform various functions including intruding and potentially exfiltrating data and spyware and malware and all of that. So when you think about that, we've always anticipated it. Irfan Mirza:But the one thing that, that we were extremely worried about, and I think a lot of our enterprise customers were worried about, is the performance of the workforce. What we found very early on in, in the, in the lift and shift phase was that we needed to have a way of measuring, are our, our build processes working? Are we checking in the same amount of code as we were before? And we noted a couple of interesting things. We looked at our, our VPN usage and said, what do those numbers look like? Are they going up and down?Irfan Mirza:And I think what we found is that initially, the effect was quite comparable to what we had, uh, when we experienced snow days. Schools are shut down. People don't go to work. They're slipping and sliding over here. We're just not prepared for snow weather in, in this state like some of the others. So what happened is, we saw that we were, we were sort of seeing the same level of productivity as snow days. We saw that we had the same level of VPN usage as snow days, and we were worried because that, you know, when, when it snows, people usually take the day off, and then they go skiing. Irfan Mirza:So what happened? Well, after about a week things started picking back up. People got tired of sort of playing snow day and decided that, you know what? It's time to, to dig in, and human nature, I think, kicked in, the integrity of the workforce kicked in. And sure enough, productivity went up, VPN usage went up, our number of sessions, the duration of sessions. Meetings became shorter.Nic Fillingham:Can I tell you hallelujah? (laughs) Irfan Mirza:(laughs) Nic Fillingham:That's one of the, that's one of the great-Irfan Mirza:Absolutely.Nic Fillingham:... upsides, isn't it? 
To this, this new culture of remote work is that we're all meeting for, for less amount of time, which I think, I think is fantastic.Irfan Mirza:Look, you know, in times of crisis, whether it's a natural disaster, or a pandemic, or, or a manmade situation such as a war or a civil war, or whatever, I, I think what happens is the amount of resources that you are customarily used to having access to gets limited. The way in which you work shifts. It changes. And so the, the true test of resilience, I think, is when you are able to adapt to those changes gracefully without requiring significant new investment and you're able to still meet and fulfill your customer obligations, your operational expectations. That really is.Irfan Mirza:So what you learn in times of hardship are to sort of live, you know, more spartan-like. And that spartan-ism, if there's such a word as that, that's what allows you to stay resilient, to say what are the core things that I need in order to stay up and running? And those fundamental areas become the areas of great investment, the areas that you watch over more carefully, the areas that you measure the performance of, the areas that you look for patterns and, and trends in to try to predict what's happening, right?Irfan Mirza:So that is something that carries over from experiences of being in the front lines of a, uh, a war or, or from being, uh, you know, in the midst of a hurricane trying to recover a data center, or an earthquake, or any other, uh, type of power outage, right? These are all the sort of key scenarios that we would be going to look at. And that's one of the things they all have in common. It's really that you don't have the resources or access to the resources that you thought you did, and now you've got to be able to do some things slightly differently.Natalia Godyla:Thank you for joining us on the podcast today. It's been great to get your perspective on enterprise resilience. Really fascinating stuff. 
So, thank you.Irfan Mirza:Thank you, Natalia. And, and thank you, Nic. It's been a great conversation. As I look back at this discussion that we had, I feel even, even stronger now that the recommendations that we're making, and the guidance that we're giving our customers and sharing our experiences, becomes really, really important. I think this is something that we're learning as we're going along. We're learning on the journey. We're uncovering things that we didn't know. We're looking at data in a different way. We're, we're trying to figure out how do we sustain ourselves, not just through this pandemic, but also beyond that. And I think the, whatever it is that we're learning, it becomes really important to share. And for our customers and people who are listening to this podcast to share back with us what they've learned, I think that becomes incredibly important because as much as we like to tell people what we're doing, we also want to know what, what people are doing. And so learning that I think will be a great, great experience for us to have as well. So thank you so much for enabling this conversation. Natalia Godyla:And now let's meet an expert from the Microsoft security team to learn more about the diverse backgrounds and experiences of the humans creating AI and tech at Microsoft. Welcome back to another episode of Security Unlocked. We are sitting with Andrew Paverd today, senior researcher at Microsoft. Welcome to the show, Andrew. Andrew Paverd:Thanks very much. And thanks for having me. Natalia Godyla:Oh, we're really excited to chat with you today. So I was just doing a little research on your background and it looks like you've had a really varied experience in terms of security domains, consulting for mobile device security. I saw some research on system security. And it looks like now you're focused on confidential computing at Microsoft. So let's start there. 
Can you talk a little bit about what a day in the life of Andrew looks like at Microsoft? Andrew Paverd:Absolutely. I think I have one of the most fascinating roles at Microsoft. On a day-to-day basis, I'm a researcher in the confidential computing group at the Microsoft Research Lab in Cambridge, but I also work very closely with the Microsoft Security Response Center, the MSRC. And so these are the folks who, who are dealing with the frontline incidents and responding to reported vulnerabilities at Microsoft. But I work more on the research side of things. So how do we bridge the gap between research and what's really happening on the, on the front lines? And so I, I think my position is quite unique. It's, it's hard to describe in any other way than that, other than to say, I work on research problems that are relevant to Microsoft security. Natalia Godyla:And what are some of those research problems that you're focused on? Andrew Paverd:Oh, so it's actually been a really interesting journey since I joined Microsoft two years ago now. My background, as you mentioned, was actually more in systems security. So I had, I previously worked with technologies like trusted execution environments, but since joining Microsoft, I've worked on two really, really interesting projects. The, the first has been around what we call safe systems programming languages. Andrew Paverd:So to give a bit more detail about it in the security response center, we've looked at the different vulnerabilities that Microsoft has, has patched and addressed over the years and seen some really interesting statistics that something like 70% of those vulnerabilities for the pa- past decade have been caused by a class of vulnerability called memory corruption. And so the, the question around this is how do we try and solve the root cause of problem? How do we address, uh, memory corruption bugs in a durable way? 
Andrew Paverd:And so people have been looking at both within Microsoft and more broadly at how we could do this by transitioning to a, a different programming paradigm, a more secure programming language, perhaps. So if you think of a lot of software being written in C and C++ this is potentially a, a cause of, of memory corruption bugs. So we were looking at what can we do about changing to safer programming languages for, for systems software. So you might've heard about new languages that have emerged like the Rust programming language. Part of this project was investigating how far we can go with languages like Rust and, and what do we need to do to enable the use of Rust at Microsoft.Natalia Godyla:And what was your role with Rust? Is this just the language that you had determined was a safe, viable option, or were you part of potentially producing that language or evolving it to a place that could be safer? Andrew Paverd:That's an excellent question. So in, in fact it, it was a bit of both, first determining is this a suitable language? Trying to define the evaluation criteria of how we would determine that. But then also once we'd found Rust to be a language that we decided we could potentially run with, there was an element of what do we need to do to bring this up to, let's say to be usable within Microsoft. And actually I, I did quite a bit of work on, on this. We realized that, uh, some Microsoft security technologies that are available in our Microsoft compilers weren't yet available in the Rust compiler. One in particular is, is called Control Flow Guard. It's a Windows security technology and this wasn't available in Rust. Andrew Paverd:And so the team I, I work with looked at this and said, okay, we'd like to have this implemented, but nobody was available to implement it at the time. So I said, all right, let me do a prototype implementation and, uh, contributed this to the open source project. And in the end, I ended up following through with that. 
And so I've, I've been essentially maintaining the, the Microsoft Control Flow Guard implementation for the, the Rust compiler. So really an example of Microsoft contributing to this open source language that, that we hope to be using further.Nic Fillingham:Andrew, could you speak a little bit more to Control Flow Guard and control flow integrity? What is that? I know a little bit about it, but I'd love to, for our audience to sort of like expand upon that idea. Andrew Paverd:Absolutely. So this is actually an, an example of a technology that goes back to a collaboration between the MSRC, the, the security response center and, and Microsoft Research. This technology, Control Flow Guard, is really intended to enforce a property that we call control flow integrity. And that simply means that if you think of a program, the control flow of a program jumps through to different functions. And ideally what you want in a well-behaved program is that the control always follows well-defined paths. Andrew Paverd:So for example, you start executing a function at the beginning of the function, rather than halfway through. If for example, you could start executing a function halfway through this leads to all kinds of possible attacks. And so what Control Flow Guard does is it checks whenever your, your program's going to do a branch, whenever it's going to jump to a different place in the code, it checks that that jump is a valid call target, that you're actually jumping to the correct place. And this is not the attacker trying to compromise your program and launch one of many different types of attacks.Nic Fillingham:And so how do you do that? What's the process by which you do en- ensure that control flow?Andrew Paverd:Oh, this is really interesting. So this is a technology that's supported by Windows, at the moment it's only available on, on Microsoft Windows. And it works in conjunction between both the compiler and the operating system. 
So the compiler, when you compile your program, gives you a list of the valid call targets. It says, "All right, here are the places in the program where you should be allowed to jump to." And then as the program gets loaded, the, the operating system loads this list into a highly optimized form so that when the program is running it can do this check really, really quickly to say, is this jump that I'm about to do actually allowed? And so it's this combination of the Windows operating system, plus the compiler instrumentation that, that really make this possible. Andrew Paverd:Now this is quite widely used in Windows. Um, we want in fact as much Microsoft software as possible to use this. And so it's really critical that we enable it in any sort of programming language that we want to use. Nic Fillingham:How do you protect that list though? So now you, isn't that now a target for potential attackers?Andrew Paverd:Absolutely. Yeah. And, and it becomes a bit of a race to, to-Nic Fillingham:Cat and mouse.Andrew Paverd:... protect different-Natalia Godyla:(laughs).Andrew Paverd:A bit of, a bit of a cat, cat and mouse game. But at least the nice thing is because the list is in one place, we can protect that area of memory to a much greater degree than, than the rest of the program. Natalia Godyla:So just taking a step back, can you talk a little bit about your path to security? What roles have you had? What brought you to security? What's informing your role today? Andrew Paverd:It's an interesting story of how I ended up working in security. It was when I was applying for PhD programs, I had written a PhD research proposal about a topic I thought was very interesting at the time on mobile cloud computing. And I still think that's a hugely interesting topic. And what happened was I sent this research proposal to an academic at the University of Oxford, where I, I was looking to study, and I didn't hear anything for, for a while. 
Andrew Paverd: And then, a fe- a few days later I got an email back from a completely different academic saying, "This is a very interesting topic. I have a project that's quite similar, but looking at this from a security perspective, would you be interested in doing a PhD in security on, on this topic?" And so this was a very mind-blowing experience for me. I hadn't considered security in that way before, but I, I took a course on security and found that this was something I was, I was really interested in and ended up accepting the, the PhD offer and did a PhD in system security. And that's really how I got into security. And as they say, the rest is history.
Natalia Godyla: Is there a particular part of security, a particular domain within security that is most near and dear to your heart?
Andrew Paverd: Oh, that's a good question.
Natalia Godyla: (laughs).
Andrew Paverd: I think, I, I think for me, security it- itself is such a broad field that we need to ensure that we have security at, at all levels of the stack, at all places within the chain, in that it's really going to be the weakest link that an attacker will, will go for. And so I've actually changed field perhaps three times so far. This is what keeps it interesting. My PhD work was around trusted computing. And then as I said, I, since joining Microsoft, I've been largely working in both safe systems programming languages and more recently AI and security. And so I think that's what makes security interesting. The, the fact that it's never the same thing two days in a row.
Natalia Godyla: I think you hit on the secret phrase for this show. So AI and security. Can you talk a little bit about what you've been doing in AI and security within Microsoft?
Andrew Paverd: Certainly.
So about a year ago, many people in the industry realized that AI is being very widely used and is having great results in so many different products and services, but that there is a risk that AI algorithms and systems themselves may be attacked. For example, I, I know you had some, some guests on your podcast previously, including Ram Shankar Siva Kumar who discussed the Adversarial ML Threat Matrix. And this is primarily the area that I've been working in for the past year. Looking at how AI systems can be, can be attacked from a security or a privacy perspective in collaboration with researchers from MSR Cambridge.
Natalia Godyla: What are you most passionate about? What's next for a couple of these projects? Like with Rust, is there a desire to make that ubiquitous beyond Microsoft? What's the next stage?
Andrew Paverd: Ab- absolutely.
Natalia Godyla: Lots of questions. (laughs).
Andrew Paverd: Yeah. There's a lot of interest in this. So, um, personally, I'm, I'm not working on the SSPL project myself, or I'm, I'm not working on the safe systems programming languages project myself any further, but I know that there's a lot of interest within Microsoft. And so hopefully we'll see some exciting things e- emerging in that space. But I think my focus is really going to be more on the, both the security of AI, and now we're also exploring different areas where we can use AI for security. This is in collaboration, more with the Security Response Center. So looking into different ways that we can automate different processes and use AI for different types of, of analysis. So certainly a lot more to, to come in that space.
Nic Fillingham: I wanted to come back to Rust for, for a second there, Andrew. So you talked about how the Rust programming language was specifically designed for, correct me on taxonomy, memory integrity. Is that correct?
Andrew Paverd: For, for memory safety, yeah.
Nic Fillingham: Memory safety. Got it. What's happening on sort of ...
and sort of the, the flip side of that coin in terms of instead of having to choose a programming language that has memory safety as sort of a core tenet. What's happening with the operating system to ensure that languages that maybe don't have memory safety sort of front and center can be safer to use, and that threats or risks to memory integrity are, are sort of mitigated? So what's happening on the operating system side, is that what Control Flow Guard is designed to do? Or are there other things happening to ensure that memory safety is, is not just the responsibility of the programming language?
Andrew Paverd: Oh, it's, that's an excellent question. So Control Flow Guard certainly helps. It helps to mitigate exploits once there's been an, an initial memory safety violation. But I think that there's a lot of interesting work going on both in the product space, and also in the research space about how do we minimize the amount of software that, that we have to trust. If you accept that software is going to have bugs, it's going to have vulnerabilities. What we'd like to do, is we'd like to trust as little software as possible.
Andrew Paverd: And so there's a really interesting effort which is now available in, in Azure under the, the heading of Confidential Computing. Which is this idea that you want to run your security sensitive workloads in a hardware enforced trusted execution environment. So you actually want to take the operating system completely out of what we call the trusted computing base. So that even if there are vulnerabilities in, in the OS, they don't affect your security sensitive workloads. So I think that there's this, this great trend towards confidential computing around compartmentalizing and segmenting the software systems that we're going to be running.
Andrew Paverd: So removing the operating system from the trusted computing base.
And, and indeed taking this further, there's already something available in Azure, you can look up Azure Confidential Computing. But there's a lot of research coming in from the, the academic side of things about new technologies and new ways of, of enforcing separation and compartmentalization. And so I think it's part of this full story of, of security that we'll need memory safe programming languages. We'll need compartmentalization techniques, some of which, uh, rely on new hardware features. And we need to put all of this together to really build a, a secure ecosystem.
Nic Fillingham: I only heard of Confidential Computing recently. I'm sure it's not a new concept. But for me as a sort of a productized thing, I only sort of recently stumbled upon it. I did not realize that there was this gap, there was this delta in terms of data being encrypted at rest, data being encrypted in transit. But then while the data itself was being processed or transformed, that that was a, was a gap. Is that the core idea around Confidential Computing, to ensure that at no stage is the data unencrypted? Is, is that sort of what it is?
Andrew Paverd: Absolutely. And it's one of the key pieces. So we call that isolated execution in the sense that the data is running in a, a trusted environment where only the code within that environment can access that data. So if you think about the hypervisor and the operating system, all of those can be outside of the trusted environment. We don't need to trust those for the correct computation of, of that data. And as soon as that data leaves this trusted environment, for example if it's written out of the CPU into the DRAM, then it gets automatically encrypted.
Andrew Paverd: And so we have that really, really strong guarantee that only our code is gonna be touching our data.
And the second part of this, and this is the really important part, is a, a protocol called remote attestation where this trusted environment can prove to a remote party, for example the, the customer, exactly what code is going to be running over that data. So you have a, a very high degree of assurance of, "This is exactly the code that's gonna be running over my data. And no other code will, will have access to it."
Andrew Paverd: And the incredibly interesting thing is then, what can we build with these trusted execution environments? What can we build with Confidential Computing? And to bring this back to the, the keyword of your podcast, we're very much looking at confidential machine learning. How do we run machine learning and AI workloads within these trusted execution environments? And, and that unlocks a whole lot of new potential.
Nic Fillingham: Andrew, do you have any advice for people that are m- maybe still studying or thinking about studying? Uh, I see so you, your initial degree was in, not in computer engineering, was it?
Andrew Paverd: No. I, I actually did electrical engineering. And then electrical and computer engineering. And by the time I did a PhD, they put me in a computer science department, even though-
Nic Fillingham: (laughs).
Andrew Paverd: ... I was doing software engineering.
Nic Fillingham: Yeah. I, so I wonder if folks out there that, that don't have a software or a computer engineering degree, maybe they have a, a different engineering focus or a mathematics focus. Any advice on when and how to consider computer engineering, or sort of the computing field?
Andrew Paverd: Yeah. Uh, absolutely. Uh, I think, eh, in particular if we're talking about security, I'd say have a look at security. It's often said that people who come with the best security mindsets haven't necessarily gone through the traditional programs. Uh, of course it's fantastic if you can do a, a computer science degree.
But if you're coming at this from another area, another, another aspect, you bring a unique perspective to the world of cyber security. And so I would say, have a look at security. See if it's something that, that interests you. You, you might find like I did that it's a completely fascinating topic.
Andrew Paverd: And then from there, it would just be a question of seeing where your skills and expertise could best fit in to the broad picture of security. We desperately need people working in this field from all different disciplines, bringing a diversity of thought to the field. And so I, I'd highly encourage people to have a look at this.
Natalia Godyla: And you made a, quite a hard turn into security through the PhD suggestion. It, like you said, it was one course and then you were off. So, uh, what do you think from your background prepared you to make that kind of transition? And maybe there's something there that could inform others along the way.
Andrew Paverd: I think, yes, it, it's a question of looking at, uh, of understanding the system in as much detail as you possibly can. And then trying to think like, like an attacker. Trying to think about what could go wrong in this system? And as we know, attackers won't respect our assumptions. They will use a system in a different way than it was designed. And that ability to, to think out of the box, which, which comes from understanding how the system works. And then really just a, a curiosity about security. They call it the security mindset, of perhaps being a little bit cautious and cynical. To say-
Natalia Godyla: (laughs).
Andrew Paverd: ... "Well, this can go wrong, so it probably will go wrong." But I think that's, that's the best way into it.
Natalia Godyla: Must be a strong follower of Murphy's Law.
Andrew Paverd: Oh, yes.
Natalia Godyla: (laughs).
Nic Fillingham: What are you watching? What are you binging? What are you reading?
Either of those questions, or anything in that flavor.
Andrew Paverd: I'll, I'll have to admit, I'm a, I'm a big fan of Star Trek. So I've been watching the new Star Trek: Discovery series on, on Netflix. That's, that's great fun. And I've recently been reading a, a really in- interesting book called Atomic Habits. About how we can make some small changes, and, uh, how these can, can help us to build larger habits and, and propagate through.
Nic Fillingham: That's fascinating. So that's as in looking at trying to learn from how atoms and atomic models work, and seeing if we can apply that to like human behavior?
Andrew Paverd: Uh, no. It's just the-
Nic Fillingham: Oh, (laughs).
Andrew Paverd: ... title of the book.
Natalia Godyla: (laughs).
Nic Fillingham: You, you had me there.
Natalia Godyla: Gotcha, Nic.
Nic Fillingham: I was like, "Wow-"
Natalia Godyla: (laughs).
Nic Fillingham: "... that sounds fascinating." Like, "Nope, nope. Just marketing." Marketing for the win. Have you always been Star Trek? Are you, if, if you had to choose team Star Trek or team Star Wars, or, or another? You, it would be Star Trek?
Andrew Paverd: I think so. Yeah.
Nic Fillingham: Yeah, me too. I'm, I'm team Star Trek. Which m- may lose us a lot of subscribers, including Natalia.
Andrew Paverd: (laughs).
Nic Fillingham: Natalia has her hands over her mouth here. And she's, "Oh my gosh." Favorite Star Trek show or-
Andrew Paverd: I, I have to say, it, it would've been the first one I watched, Deep Space Nine.
Nic Fillingham: I love Deep Space Nine. I whispered that. Maybe that-
Natalia Godyla: (laughs).
Nic Fillingham: ... it's Deep Space Nine's great. Yep. All right, cool. All right, Andrew, you're allowed back on the podcast. That's good.
Andrew Paverd: Thanks.
Natalia Godyla: You're allowed back, but I-
Nic Fillingham: (laughs).
Natalia Godyla: ... (laughs).
Andrew Paverd: (laughs).
Nic Fillingham: Sort of before we close, Andrew, is there anything you'd like to plug? I know you have a, you have a blog.
I know you work on a lot of other sorta projects and groups. Anything you'd like to, uh, plug to the listeners?
Andrew Paverd: Absolutely, yeah. Um, we are actually hiring. Eh, well, the team I work with in Cambridge is, is hiring. So if you're interested in privacy preserving machine learning, please do have a look at the website and submit an application to, to join our team.
Natalia Godyla: That sounds fascinating. Thank you.
Nic Fillingham: And can we follow along on your journey and all the great things you're working on, at your website?
Andrew Paverd: Eh, absolutely, yeah. And if you follow along the, the Twitter feeds of both Microsoft Research Cambridge, and the Microsoft Security Response Center, we'll, we'll make sure to tweet about any of the, the new work that's coming out.
Nic Fillingham: That's great. Well, Andrew Paverd, thank you so much for joining us on the Security Unlocked Podcast. We'd love to have you come back and talk about some of the projects you're working on in a deep-dive section on a future episode.
Andrew Paverd: Thanks very much for having me.
Natalia Godyla: Well, we had a great time unlocking insights into security, from research to artificial intelligence. Keep an eye out for our next episode.
Nic Fillingham: And don't forget to tweet @MSFTSecurity. Or email us with topics you'd like to hear on a future episode. Until then, stay safe.
Natalia Godyla: Stay secure.