Security Unlocked

Judging a Bug by Its Title

Ep. 16

Most people know the age-old adage, “Don’t judge a book by its cover.” I can still see my grandmother wagging her finger at me when I was younger as she said it. But what if it's not the book cover we’re judging, but the title? And what if it’s not a book we’re analyzing, but instead a security bug? The times have changed, and age-old adages don’t always translate well in the digital landscape. In this case, we’re using machine learning (ML) to identify and “judge” security bugs based solely on their titles.  And, believe it or not, it works! (Sorry, Grandma!) 


Mayana Pereira, Data Scientist at Microsoft, joins hosts Nic Fillingham and Natalia Godyla to dig into the endeavors that are saving security experts’ time. Mayana explains how data science and security teams have come together to explore ways that ML can help software developers identify and classify security bugs more efficiently. It is a task that, without machine learning, has traditionally produced false positives or led developers to overlook misclassified critical security vulnerabilities.

 

In This Episode, You Will Learn:

• How data science and ML can improve security protocols and identify and classify bugs for software developers 

• How to determine the appropriate amount of data needed to create an accurate ML training model 

• The techniques used to classify bugs based simply on their title 

 

Some Questions We Ask:

• What questions need to be asked in order to obtain the right data to train a security model? 

• How does Microsoft utilize the outputs of these data-driven security models?  

• What is AI for Good and how is it using AI to foster positive change in protecting children, data and privacy online? 

 

Resources: 

Microsoft Digital Defense Report 

https://www.microsoft.com/en-us/security/business/security-intelligence-report 

 

Article: “Identifying Security Bug Reports Based Solely on Report Titles and Noisy Data” 

https://docs.microsoft.com/en-us/security/engineering/identifying-security-bug-reports 

 

Mayana’s LinkedIn 

https://www.linkedin.com/in/mayana-pereira-2aa284b0 

 

Nic’s LinkedIn    

https://www.linkedin.com/in/nicfill/    

    

Natalia’s LinkedIn    

https://www.linkedin.com/in/nataliagodyla/    

    

Microsoft Security Blog:     

https://www.microsoft.com/security/blog/ 


Related:

Security Unlocked: CISO Series with Bret Arsenault

https://SecurityUnlockedCISOSeries.com


Transcript

(Full transcript can be found at https://aka.ms/SecurityUnlockedEp16)


Nic Fillingham:

Hello, and welcome to Security Unlocked, a new podcast from Microsoft where we unlock insights from the latest in news and research from across Microsoft Security engineering and operations teams. I'm Nic Fillingham-


Natalia Godyla:

And I'm Natalia Godyla. In each episode we'll discuss the latest stories from Microsoft Security, deep dive into the newest threat intel, research, and data science-


Nic Fillingham:

And profile some of the fascinating people working on artificial intelligence in Microsoft Security.


Natalia Godyla:

And now let's unlock the pod.


Natalia Godyla:

Hello, Nic. How's it going?


Nic Fillingham:

Hello, Natalia. Welcome back. Well, I guess welcome back to Boston to you. But welcome to Episode 16. I'm confused because I saw you in person last week for the first time. Well, technically it was the first time for you, 'cause you didn't remember our first time. It was the second time for me. But it was-


Natalia Godyla:

I feel like I just need to justify myself a little bit there. It was a 10 second exchange, so I feel like it's fair that I, I was new to Microsoft. There was a lot coming at me, so, uh-


Nic Fillingham:

Uh, I'm not very memorable, too, so that's the other, that's the other part, which is fine. But yeah. You were, you were here in Seattle. We both did COVID tests because we filmed... Can I say? You, you tell us. What did we do? It's a secret. It is announced? What's the deal?


Natalia Godyla:

All right. Well, it, it's sort of a secret, but everyone who's listening to our podcast gets to be in the know. So in, in March you and I will be launching a new series, and it's a, a video series in which we talk to industry experts. But really we're, we're hanging with the industry experts. So they get to tell us a ton of really cool things about SecOps and AppSec while we all play games together. So lots of puzzling. Really, we're just, we're just getting paid to do puzzles with people cooler than us.


Nic Fillingham:

Speaking of hanging out with cool people, on the podcast today we have Mayana Pereira, whose name you may have heard a few episodes ago when Scott Christiansen was on talking about the work that he does. And he had partnered with Mayana to build and launch a, uh, machine learning model that looked at the titles of bugs across Microsoft's various code repositories, and using machine learning determined whether those bugs were actually security related or not, and if they were, what the correct severity rating should be.


Nic Fillingham:

So this episode we thought we'd experiment with the format. And instead of having two guests, instead of having a, a deep dive up front and then a, a profile on someone in the back half, we thought we would just have one guest. We'd give them a little bit extra time, uh, about 30 minutes and allow them to sort of really unpack the particular problem or, or challenge that they're working on. So, yeah. We, we hope you like this experiment.


Natalia Godyla:

And as always, we are open to feedback on the new format, so tweet us, uh, @msftsecurity or send us an email at securityunlocked@microsoft.com. Let us know what you wanna hear more of, whether you like hearing just one guest. We are super open. And with that, on with the pod?


Nic Fillingham:

On with the pod.


Nic Fillingham:

Welcome to the Security Unlocked podcast. Mayana Pereira, thanks for joining us.


Mayana Pereira:

Thank you for having me. I'm so happy to be here today, and I'm very excited to share some of the things that I have done in the intersection of ML and security.


Nic Fillingham:

Wonderful. Well, listeners of the podcast will have heard your name back in Episode 13 when we talked to Scott Christiansen, and he talked about, um, a fascinating project about looking for or, uh, utilizing machine learning to classify bugs based simply on, on their title, and we'll get to that in a minute. But could you please introduce you- yourself to our audience. Tell us about your title, but sort of what does that look like in terms of day-to-day and, and, and the work that you do for Microsoft?


Mayana Pereira:

I'm a data scientist at Microsoft. I've been, I have been working at Microsoft for two and a half years now. And I've always worked inside Microsoft with machine learning applied to security, trust, safety, and I also do some work in the data privacy world. And this area of ML applications to the security world has always been my passion, so before Microsoft I was also working with ML applied to cyber security, more in the malware world, but still security. And since I joined Microsoft, I've been working on data science projects that kinda look like this project that we're gonna, um, talk about today. So those are machine learning applications to interesting problems where we can either increase the trust and the security of Microsoft products, or the safety for the customer. You know, you would develop m- machine learning models with that in mind.


Mayana Pereira:

And my day-to-day work includes trying to understand which are those interesting problems across the company, talking to my amazing colleagues such as Scott. And I have a, I have been so blessed with an amazing, great team around me. And thinking about these problems, gathering data, and then getting, you know, heads down and training models, and testing new machine learning techniques that have never been used for a specific application, and trying to understand how well or if they will work for those applications, or if they're gonna get us to better performance, or better accuracy, precision, and those, those metrics that we tend to use in data science work. And when we feel like, oh, this is an interesting project and I think it is interesting enough to share with the community, we write a paper, we write a blog, we go to a conference such as RSA and we present it to the community, and we get to share the work and the findings with colleagues internal to Microsoft, but also external. So this is kinda what I do on a day-to-day basis.


Mayana Pereira:

Right now my team is the data science team inside Microsoft that is called AI for Good, so the AI for Good has this "for good" in a sense of we want to, to guarantee safety, not only for Microsoft customers, but for the community in general. So one of my lines of work is thinking about how can I collaborate with NGOs that are also thinking about the security and the safety of kids, for example. And this is another thing that I have been doing as part of this AI for Good effort inside Microsoft.


Natalia Godyla:

Before we dive into the bug report classification project, can you just share a couple of the projects that your team works on for AI for Good? I think it would be really interesting for the audience to hear that.


Mayana Pereira:

Oh, absolutely. So we have various pillars inside the AI for Good team. There is AI for Health, AI for Humanitarian Action, AI for Earth. We have also been collaborating in an effort for having a platform with a library for data privacy. It is a library where we have, uh, various tools to apply to the data and get, as an output, data with strong privacy guarantees. So guaranteeing privacy for whoever had their information in a specific dataset or contributed with their own information to a specific research, et cetera. So this is another thing that our team is currently doing.


Mayana Pereira:

And we have various partners inside and outside of Microsoft. Like I mentioned, we do a lot of work with NGOs. So you can think of projects like AI for Earth, with several NGOs that are taking care of endangered species, and others using satellite images for understanding problems with deforestation, et cetera. And then Humanitarian Action, I have worked with NGOs that are developing tools to combat child sexual abuse and exploitation. AI for Health has so many interesting projects, and it is a big variety of projects.


Mayana Pereira:

So this is what the AI for Good team does. We are, I think right now we're over 15 data scientists. All of us are doing this work that it is a- applied research. Somehow it is work that we need to sit down with, with our customers or partners, and really understand where the problem is. It's usually some, some problems that required us to dig a little deeper and come up with some novel or creative solution for that. So this is basically the overall, the AI for Good team.


Nic Fillingham:

Let's get back in the way back machine to I think it was April of 2020, which feels like 700 years ago.


Mayana Pereira:

(laughs)


Nic Fillingham:

But you and Scott (laughs) published a blog, which Scott talked about on Episode 13, called securing the s- the software development lifecycle with machine learning, and the thing that I think both Natalia and I picked up on when Scott was talking about this is it sounded, firstly it sounded like an exceptionally complex premise, and I don't mean to diminish, but I think Natalia and I were both, "Oh wow, you built a model that sort of went through repro steps and parsed all the logs inside of security bugs in order to better classify them." But that's not what this does. This is about literally looking at the words that are the title of the security bug, and then building a model to try and determine whether it was truly security or something else, is that right?


Mayana Pereira:

That's exactly it. This was such an interesting project. When I started collaborating with Scott and some other engineers in the team, I was a little skeptical about using only titles to make predictions about whether a bug is security related or not. And now that I have trained several models and tested them and later retrained to- to get more of a variety of data in our model, I have learned that people are really good at describing what is going on in a bug in the title. It feels like they really summarize it somehow, so it's- it's doing a good job. Because, yes, that's exactly what we're doing: we are using bug titles only, from several sources across Microsoft, and then we use that to understand which bugs are security related or not, and how we can have an overall view of everything that is happening, you know, in various teams across different products. And that has given a lot of visibility to some unknown problems and some visibility to some things that we were not seeing before, because now you can scan millions of bugs in a few seconds. Just reading titles, you have a model that does it really fast. And I think it is a game changer in that sense, in the visibility and how you see everything that is happening in that bug world.


Natalia Godyla:

So what drove that decision? Why are we relying only on the titles, why can't we use the- the full bug reports?


Mayana Pereira:

There are so many reasons for that. I think the first reason was the fact that the full bug report, sometimes, has sensitive information. And we were a little bit scared about pulling all that sensitive information, which could include passwords, could include, you know, maybe things that should not be available to anyone, and including that in a- in a VM to train a model, or in a data science pipeline. And having to be extremely careful also about not having our model learning passwords, not having that. So that was one of the big, I think, incentives of, "Let's try titles only, and see if it works. If it doesn't work then we can move on and see how we can overcome the problem of the sensitive information." And it did work. When we saw that we had a lot of signal in bug titles only, we decided to really invest in that and get really good models by u- utilizing bug titles only.


Nic Fillingham:

I'm going to read from the blog just for a second here, because some of the numbers here, uh, are pretty staggering. So, again, this was written in 2020, uh, in April, so there's obviously, probably updated numbers since then, but it said that Microsoft's 47,000 developers generate nearly 30,000 bugs a month, which is amazing, and that's coming across over 100 Azure DevOps and GitHub repositories. And then you had it, you, you actually have a count here saying since 2001 Microsoft has collected 13 million work items and bugs, which I just think is amazing. So, do you want to speak to, sort of, the volume of inputs and, sort of, signals here into building that model and maybe some of the challenges? And then a follow-on question is, is this model still active today, is this- is this work still ongoing, has it been incorporated into a product or another, another process?


Nic Fillingham:

Do you want to start with, with numbers or.


Mayana Pereira:

Yes, I think that from my data scientist point of view, having such large numbers is absolutely fantastic because it gives us a historical data set, very rich, so we can understand how data has evolved over time. And also, if this- the security terminology has changed a lot, or how long will this model last, in a sense. And it was interesting to see that you can have different tools, different products, different things coming up, but the security problems, at least for, I would say for the past six, seven years, when it comes to terminology, because what I was analyzing was the terminology of the security problems (my model was a natural language processing model), it was pretty consistent, so that was really interesting to see from that perspective. And by having so much data, you know, this amazing volume, it helped us to build better classifiers for sure. So this is my- my data scientist side saying, amazing. I love it, so much data.


Nic Fillingham:

What's the status of this project and this model now? Is it- is it still going? Has it been embedded into another- another product, uh, or process?


Mayana Pereira:

Yes, it's still active. It's still being used. So, right now, this product, not the product, but the model, is mainly used by the Customer Security and Trust team in [Sila 00:16:16], so they use the model in order to understand the security state of Microsoft products in general, and, uh, different products and looking at specific products as well. They are using the model to get the- the bug statistics and security bug statistics for all these different products across Microsoft. And there are plans on integrating the- this specific model or a variation of the model into other security lifecycle pipelines, but this is a decision that is more on the CST, Customer Security and Trust, side and I have, um, only followed it, but I don't have specific details for that right now. But I have seen a lot of good, interesting results coming out of that model, good insights and security engineers using the results of the model to identify potential problems, and fix those problems much faster.


Natalia Godyla:

So, taking a step back and just thinking about the journey that your team has gone on to get the model to the state that it's in today. Uh, in the blog you listed a number of questions to figure out what would be the right data to train the model. So the questions were, is there enough data? How good is the data? Are there data usage restrictions? And, can data be generated in a lab?


Natalia Godyla:

So can you talk us through how you answered these questions like, as a- as a data scientist you were thrilled that there was a ton of data out there, but what was enough data? How did you define how good the data was? Or, whether it was good enough.


Mayana Pereira:

Great. So, those were questions that I asked myself before even knowing what the project was about, and the answer to "is there enough data?" seemed very clear from the beginning: yes, we had enough data. But those were questions that I brought up on the blog, not only for myself but for anyone else that was interested in replicating those experiments in their company or maybe university or s- anywhere, any- any data scientist that is interested in training their own model for classification. Which questions should be asked once you start a project like this? So the "is there enough data?" question, for me, was clear from the beginning. We had several products, so we had a variety of data sources. I think that when you reach the number of millions of samples of data, I think that speaks for itself. It is a high volume. So I felt we did have enough data.


Mayana Pereira:

And when it came to data quality, that was a more complex question. We had data in our hands, bugs. We wanted to be able to train a model that could different- differentiate between security bugs and non-security bugs, you know. And for that, usually what we do with machine learning is we have data, and that data has labels, so you have data that represents security bugs, data that represents non-security bugs. And then we use that to train the model. And those labels were not so great. So we needed to understand how the not-so-great labels were going to impact our model, you know, we're going to train a model with labels that were not so great. So that was gonna happen. So that was one of the questions that we asked ourselves. And I did a study on that, on understanding what is the impact of these noisy labels in the training data set. And how is it gonna impact the classification results that we get once using this, this training data? So this was one of the questions that I asked and we, I did several experiments, adding noise. I did that myself, I, I added noise on purpose to the data set to see what were the limits of this noise resilience. You know, when you have noisy labels in training. We published it in a, in an academic conference in 2019, and we understood that it was okay to have noisy labels. So security bugs that were actually labeled as not security and not security bugs labeled as security. There was a limit to that.
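
The noise experiment described here can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's code: the titles are invented, `flip_labels`, `train_word_votes` and `noise_sweep` are hypothetical names, and the word-vote classifier is a deliberately tiny stand-in for a real model. The idea is to flip a chosen fraction of training labels on purpose, retrain, and see how accuracy on a clean test set changes.

```python
import random
from collections import Counter

SECURITY, NON_SECURITY = "security", "non-security"

def flip_labels(labels, noise_rate, seed=0):
    """Return a copy of `labels` with a `noise_rate` fraction flipped."""
    rng = random.Random(seed)
    n_flip = round(noise_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [
        (NON_SECURITY if y == SECURITY else SECURITY) if i in flip_idx else y
        for i, y in enumerate(labels)
    ]

def train_word_votes(titles, labels):
    """Toy stand-in model: each word votes +1 (security) or -1 (non-security)."""
    votes = Counter()
    for title, label in zip(titles, labels):
        for tok in title.lower().split():
            votes[tok] += 1 if label == SECURITY else -1

    def predict(title):
        score = sum(votes[tok] for tok in title.lower().split())
        return SECURITY if score > 0 else NON_SECURITY

    return predict

def noise_sweep(train_fn, train, test, rates, seed=0):
    """Train on increasingly noisy labels; report accuracy on clean test data."""
    titles, labels = zip(*train)
    results = {}
    for rate in rates:
        predict = train_fn(titles, flip_labels(labels, rate, seed))
        hits = sum(predict(title) == label for title, label in test)
        results[rate] = hits / len(test)
    return results

# Invented example titles, for illustration only.
TRAIN = [
    ("cross-site scripting in login form", SECURITY),
    ("password stored in clear text in config", SECURITY),
    ("sql injection in search endpoint", SECURITY),
    ("buffer overflow when parsing header", SECURITY),
    ("button misaligned on settings page", NON_SECURITY),
    ("typo in welcome dialog", NON_SECURITY),
    ("crash when resizing window", NON_SECURITY),
    ("slow page load on dashboard", NON_SECURITY),
]
TEST = [
    ("sql injection in report page", SECURITY),
    ("typo on dashboard page", NON_SECURITY),
]

results = noise_sweep(train_word_votes, TRAIN, TEST, rates=[0.0, 0.25, 1.0])
```

With clean labels the toy model gets both test titles right, and with every label flipped it gets both wrong. The interesting region is in between, and on a data set this small the intermediate numbers mean very little; a real study would use far more data and repeat each noise level over many seeds.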


Mayana Pereira:

We kinda understood the limitations of the model. And then we started investigating our own data to see, is our own data within those limits? If yes, then we can use this data confidently to train our models. If no, then we'll have to have some processes for correcting labels and understanding this data set a little bit better. What can we use and what can we not use to train the models? So what we found out is that we did have noisy labels in the data set. And we had to make a few corrections in our labels, but it was much less work because we understood exactly what needed to be done, and did not have to correct every single data sample or every single label in a, an enormous data set of millions of entries. So that was something that really helped.


Mayana Pereira:

And then the other question, um, that we asked is, can we generate data in the lab? So we could sometimes force a specific security issue and generate some, some bugs that had that security description in their titles. And why did we include that in the list of questions? Because a lot of bugs that we have in our database are generated by automated tools. So when you have a new tool being included in your ecosystem, how is your model going to recognize the bugs that are coming from this new tool? So those are, ma- automatically generated bugs. And we could wait for the tool to be used, and then after a while gather the data that the tool provided us and include it in a retraining set. But we can also do that in the lab ecosystem, generate data and then incorporate it in a training set. So this is where this comes from.


Nic Fillingham:

I wanted to ask possibly a, a very rudimentary question, uh, especially to those that are, you know, very familiar with machine learning. When you have a data set, there's words, there is text in which you're trying to generate labels for that text. Does the text itself help the process of creating labels? So for example, if I've got a bug and the name of that bug is the word security is in the, the actual bug name. Am I jump-starting, am I, am I skipping some steps to be able to generate good labels for that data? Because I already have the word I'm looking for. Like I, I think my question here is, was it helpful to generate your labels because you were looking at text in the actual title of the bug and trying to ascertain whether something was security or not?


Mayana Pereira:

So the labels were never generated by us or by me, the data scientists. The labels were coming from the engineering systems where we collected the data from. So we were relying on what- whatever happened in the, in the engineering team, engineering group and relying that they did, uh, a good job of manually labeling the bugs as security or not security. But that's not always the case, and that doesn't mean that the, the engineers are not good or are bad, but sometimes they have their own ways of identifying it in their systems. And not necessarily, it is the same database that we had access to. So sometimes the data is completely unlabeled, the data that comes to us, and sometimes there are mistakes. Sometimes you have, um, specific engineer that doesn't have a lot of security background. The person sees a, a problem, describes the problem, but doesn't necessarily attribute the problem as a security problem. Well, that can happen as well.


Mayana Pereira:

So that is where the labels came from. The interesting thing about the terminology is that, out of the millions and millions of security bugs that I did review, like manually reviewed, because I kinda wanted to understand what was going on in the data, I would say that for sure less than 1%, even less than that, had the word security in it. So it is a very specific terminology when you see that. So people tend to be very literal in what the problem is, but not what the problem will generate. In a sense of they will, they will use things like Cross-site Scripting or passwords in clear, but not necessarily, "there's a security pr- there's a security problem." But just what the issue is, so it is more of getting the model to understand that security lingo and what is that vocabulary that constitutes security problems. So that's wh- that's why it is a little bit hard to generate a list of words and see if it matches. If a specific title matches to this list of words, then it's security.


Mayana Pereira:

It was a little bit hard to do it that way. And sometimes you have in the title a few different words that, in a specific order, it is a security problem. In another order, it is not. And then, I don't have that example here with me, but I, I could see some of those examples in the data. For example, I think the Cross-site Scripting is a good example. Sometimes you have site and cross in another place in the title. It has nothing to do with Cross-site Scripting, but those two words are there. The model can actually understand the order and how close they are in the bug title, and make the decision if it is security or not security. So that's why it is quite a bit easier for the model to distinguish than if we had to use rules to do that.
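
A small sketch of why a word list falls short here. The episode does not say which features the model uses, so the word-pair (bigram) features below are an assumption chosen only to illustrate order and proximity, and both titles are invented. A keyword rule fires on any title containing "cross" and "site", while a feature that keeps adjacent word pairs tells the two titles apart.

```python
def tokens(title):
    """Lowercase the title and split on whitespace and hyphens."""
    return title.lower().replace("-", " ").split()

def keyword_rule(title, keywords=("cross", "site")):
    """Order-blind rule: true if every keyword appears anywhere in the title."""
    words = tokens(title)
    return all(k in words for k in keywords)

def bigrams(title):
    """Order-aware features: the set of adjacent word pairs in the title."""
    words = tokens(title)
    return set(zip(words, words[1:]))

# Invented titles for illustration.
xss_title = "Cross-site scripting in comment field"
other_title = "Site map link does not cross reference the help page"

rule_hits = (keyword_rule(xss_title), keyword_rule(other_title))
pair_hits = (
    ("cross", "site") in bigrams(xss_title),
    ("cross", "site") in bigrams(other_title),
)
```

Here `rule_hits` is `(True, True)` because both titles contain the two words, while `pair_hits` is `(True, False)` because only the first title has them adjacent and in that order. A trained model can pick up many such patterns from labeled titles without anyone writing the list by hand.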


Natalia Godyla:

I have literally so many questions.


Nic Fillingham:

[laughs].


Natalia Godyla:

I'm gonna start with, uh, how did you teach it the lingo? So what did you feed the model so that it started to pick up on different types of attacks like Cross-site Scripting?


Mayana Pereira:

Perfect. The training algorithm will do that for me. So basically what I need to guarantee is that we're using the correct technique to do that. So the technique will, the machine learning technique will basically identify from this data set. So I have a big data set of titles. And each title will have a label which is security or non-security related to it. Once we feed the training algorithm with all this text and their associated labels, the training algorithm will, will start understanding that, some words are associated with security, some words are associated with non-security. And then the algorithm will, itself will learn those patterns. And then we're gonna train this algorithm. So in the future, we'll just give the algorithm a new title and say, "Hey, you've learned all these different words, because I gave you this data set from the past. Now tell me if this new ti- if this new title that someone just came up with is a security problem or a, a non-security problem." And the algorithm will, based on all of these examples that it has seen before, will make a decision if it is security or non-security.
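
As a rough sketch of the train-then-predict loop just described: the episode does not name the algorithm, so this uses a small multinomial Naive Bayes classifier as a stand-in, written with the standard library only. The class name `TitleClassifier` and all titles are hypothetical.

```python
import math
from collections import Counter

def tokenize(title):
    return title.lower().split()

class TitleClassifier:
    """Multinomial Naive Bayes over the words in a bug title."""

    def __init__(self):
        self.word_counts = {}          # label -> Counter of word frequencies
        self.label_counts = Counter()  # label -> number of training titles
        self.vocab = set()

    def fit(self, titles, labels):
        for title, label in zip(titles, labels):
            self.label_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(title):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, title):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_titles in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_titles / total)  # log prior for this label
            for tok in tokenize(title):
                if tok in self.vocab:           # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)  # add-one smoothing
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training titles with their labels.
titles = [
    "cross-site scripting in login form",
    "password stored in clear text in config",
    "sql injection in search endpoint",
    "buffer overflow when parsing header",
    "button misaligned on settings page",
    "typo in welcome dialog",
    "crash when resizing window",
    "slow page load on dashboard",
]
labels = ["security"] * 4 + ["non-security"] * 4

model = TitleClassifier().fit(titles, labels)
new_title_label = model.predict("sql injection in report page")
```

Nothing about security vocabulary is supplied up front. The association between "injection" and the security label comes entirely from the labeled examples, which is why label quality matters so much.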


Natalia Godyla:

Awesome. That makes sense. So nothing was provided beforehand, it was all a process of leveraging the labels.


Mayana Pereira:

Yes.


Natalia Godyla:

Also then thinking about just the dataset that you received, you were working with how many different business groups to get this data? I mean, it, it must've been from several different product teams, right?


Mayana Pereira:

Right. So I had the huge advantage of having an amazing team that is a data center team that is just focused on doing that. So their business is to go around the company, gather data and have everything harmonized in a database. So basically, what I had to do is work with this specific team that had already done this amazing job, going across the company, collecting data and doing this hard work of harvesting data and harmonizing data. And they had it with them. So it is a team that does that inside Microsoft. Collects the data, gets everything together. They have their databases updated several times a day, um, collecting data from across the company, so it is a lot of work, yeah.


Natalia Godyla:

So do different teams treat bug reports differently, meaning is there any standardization that you had to do or anything that you wanted to implement within the bug reports in order to get better data?


Mayana Pereira:

Yes. Teams across the company will report bugs differently using different systems. Sometimes it's Azure DevOps, sometimes it can be GitHub. And as I mentioned, there is a, there was a lot of work done in the data harmonization side before I touched the data. So there was a lot of things done to get the data in, in shape. This was something that, fortunately, several amazing engineers did before I touched the data. Basically, what I had to do once I touched it, was I just applied the data as is to the model and the data was very well treated before I touched it.


Nic Fillingham:

Wow. So many questions. I did wanna ask about measuring the success of this technique. Were you able to apply a metric, a score to the ... And I'm, I, I don't even know what it would be. Perhaps it would be the time to address a security bug pre and post this work. So, did this measurably decrease the amount of time for prioritized security bugs to be, to be addressed?


Mayana Pereira:

Oh, definitely. Yes, it did. So not only it helped in that sense, but it helped in understanding how some teams were not identifying specific classes of bugs as security. Because we would see this inconsistency with the labels that they were including in their own databases. These labels would come to this big database that is harmonized and then we would apply the model on top of these data and see that specific teams were treating their, some data points as non-security and should have been security. Or sometimes they were treating as security, but not with the correct severity. So it would, should have been a critical bug and they were actually treating it as a moderate bug. So, that, I think, not only the, the timing issue was really important, but now you have a visibility of behavior and patterns across the company that the model gives us.
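
One simple way to surface the kind of team-level inconsistency described above is to compare each team's own label with the model's prediction and compute a disagreement rate per team. This is a hypothetical sketch: the record layout, team names, and labels are all invented, and a disagreement only marks a bug for human review since the model can be wrong too.

```python
from collections import defaultdict

def disagreement_by_team(records):
    """records: iterable of (team, team_label, model_label) tuples.

    Returns a dict mapping team -> fraction of bugs where the labels differ.
    """
    totals = defaultdict(int)
    mismatches = defaultdict(int)
    for team, team_label, model_label in records:
        totals[team] += 1
        if team_label != model_label:
            mismatches[team] += 1
    return {team: mismatches[team] / totals[team] for team in totals}

# Invented records for illustration.
records = [
    ("team-a", "non-security", "security"),
    ("team-a", "security", "security"),
    ("team-a", "non-security", "non-security"),
    ("team-a", "non-security", "non-security"),
    ("team-b", "security", "security"),
    ("team-b", "non-security", "non-security"),
]

rates = disagreement_by_team(records)
```

In this made-up data, `rates` is `{"team-a": 0.25, "team-b": 0.0}`. The same comparison could be run on severity labels to find bugs treated as moderate that the model rates as critical.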


Nic Fillingham:

That's amazing. And so, so if I'm an engineer at Microsoft right now and I'm in my, my DevOps environment and I'm logging a bug and I use the words cross- cross scripting somewhere in the bug, what's the timing with which I get the feedback from your model that says, "Hey, your prioritization's wrong," or, "Hey, this has been classified incorrectly"? Are we at the point now where this model is actually sort of integrated into the DevOps cycle or is that still coming further down the, the, the path?


Mayana Pereira:

So you have, the main customer is Customer Security and Trust team inside Microsoft. They are the ones using it. But as soon as they start seeing problems in the data or specific patterns and problems in specific teams' datasets, they will go to that team and then have this, they have a campaign where they go to different teams and, and talk to them. And some teams, they do have access to the datasets after they are classified by our model. Right now, there's, they don't have the instant response, but that's, that's definitely coming.


Nic Fillingham:

So, Mayana, how is Customer Security and Trust, your organization, utilizing the outputs of this model when a, when a, when a bug gets flagged as being incorrectly classified, you know, is there a threshold, and then sort of what happens when you, when you get those flags?


Mayana Pereira:

So the engineering team, the security engineering team in Customer Security and Trust, they will use the model to understand the overall state of security of Microsoft products, you know, like the products across the company, our products, basically. And they will have an understanding of how fast those bugs are being mitigated. They'll have an understanding of the volume of bugs, and security bugs in this case, and they can follow these bugs in, in a, in a timely manner. You know, as soon as the bug comes to the CST system, the bug gets flagged as either security or not security. Once it's flagged as security, there, there is a second model that will classify the severity of the bug and the CST will track these bugs and understand how fast the teams are closing those bugs and how well they're dealing with the security bugs.
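
The two-stage flow described here (first security or not, then severity only for security bugs) can be sketched as below. The two `toy_*` functions are placeholder rules standing in for the trained models, which the episode does not describe in detail; the `Triage` type, the keyword lists, and the example titles are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Triage:
    title: str
    is_security: bool
    severity: Optional[str]  # None when the bug is not flagged as security

def triage(title: str,
           is_security_fn: Callable[[str], bool],
           severity_fn: Callable[[str], str]) -> Triage:
    """Stage 1 flags security bugs; stage 2 runs only on flagged bugs."""
    if not is_security_fn(title):
        return Triage(title, False, None)
    return Triage(title, True, severity_fn(title))

# Placeholder "models": simple rules used only to make the sketch runnable.
def toy_is_security(title: str) -> bool:
    text = title.lower()
    return any(term in text for term in ("injection", "scripting", "password"))

def toy_severity(title: str) -> str:
    return "critical" if "remote" in title.lower() else "moderate"

flagged = triage("Remote SQL injection in billing API", toy_is_security, toy_severity)
ignored = triage("Typo in tooltip", toy_is_security, toy_severity)
```

Keeping the stages separate means the severity model only ever sees titles already judged to be security related, and either model can be retrained or swapped without touching the other.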


Natalia Godyla:

So as someone who works in the AI for Good group within Microsoft, what is your personal passion? What would you like to apply AI to if it, if it's not this project or, uh, maybe not a project within Microsoft, what is, what is something you want to tackle in your life?


Mayana Pereira:

Oh, love the question. I think my big passion right now is developing machine learning models for eradication of child sexual abuse media in, across different platforms. So you can think about platforms online from search engines to data sharing platforms, social media, anything that you can have the user uploading content. You can have problems in that area. And anything where you have users visualizing content. You want to protect that customer, that user, from that as well. But most importantly, protect the victims from those crimes and I think that has been, um, something that I have been dedicating s- some time now. I was fortunate to work with an NGO, um, recently in that se- in that area, in that specific area. Um, developed a few models for them that would detect those kinds of media. And these would be my AI for Good passion for now. The other thing that I am really passionate about is privacy, data privacy. I feel like we have so much data out there and there's so much of our information out there and I feel like the great things that we get from having data and having machine learning we should not, not have those great things because of privacy compromises.


Mayana Pereira:

So how can we guarantee that no one's gonna have their privacy compromised? And at the same time, we're gonna have all these amazing systems working. You know, how can we learn from data without learning from specific individuals or without learning anything private from a specific person, but still learn from a population, still learn from data. That is another big passion of mine, and I have been fortunate enough to work on these kinds of initiatives inside Microsoft. I absolutely love it. When, when I think about guaranteeing privacy of our customers or our partners or anyone, I think that is also a big thing for me. And that, that falls under the AI for Good umbrella as well since there's so much, you know, personal information in some of these AI for Good projects.


Natalia Godyla:

Thank you, Mayana, for joining us on the show today.


Nic Fillingham:

We'd love to have you back especially, uh, folks, uh, on your team to talk more about some of those AI for Good projects. Just, finally, where can we go to follow your work? Do you have a blog, do you have Twitter, do you have LinkedIn, do you have GitHub? Where should, where should folks go to find you on the interwebs?


Mayana Pereira:

LinkedIn is where I usually post my latest works, and links, and interesting things that are happening in the security, safety, privacy world. I love to, you know, share on LinkedIn. So m- I'm Mayana Pereira on LinkedIn and if anyone finds me there, feel free to connect. I love to connect with people on LinkedIn and just chat and meet new people networking.


Natalia Godyla:

Awesome. Thank you.


Mayana Pereira:

Thank you. I had so much fun. It was such a huge pleasure to talk to you guys.


Natalia Godyla:

Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode.


Nic Fillingham:

And don't forget to Tweet us at @MSFTSecurity or email us at securityunlocked@microsoft.com with topics you'd like to hear on a future episode. Until then, stay safe.


Natalia Godyla:

Stay secure. 

More Episodes

7/14/2021

Securing the Internet of Things

Ep. 36
There used to be a time when our appliances didn’t talk back to us, but it seems like nowadays everything in our home is getting smarter. Smart watches, smart appliances, smart lights, smart everything! This connectivity to the internet is what we call the Internet of Things (IoT). It’s becoming increasingly common for our everyday items to be “smart,” and while that may provide a lot of benefits, like your fridge reminding you when you may need to get more milk, it also means that all of those devices become susceptible to cyberattacks.

On this episode of Security Unlocked, hosts Nic Fillingham and Natalia Godyla talk to Arjmand Samuel about protecting IoT devices, especially with a zero trust approach. Listen in to learn not only about the importance of IoT security, but also what Microsoft is doing to protect against such attacks and how you can better secure these devices.

In This Episode You Will Learn:

• What the techniques are to verify explicitly on IoT devices

• How to apply the zero trust model in IoT

• What Microsoft is doing to protect against attacks on IoT

Some Questions We Ask:

• What is the difference between IoT and IT?

• Why is IoT security so important?

• What are the best practices for protecting IoT?

Resources:

Arjmand Samuel’s LinkedIn: https://www.linkedin.com/in/arjmandsamuel/

Nic Fillingham’s LinkedIn: https://www.linkedin.com/in/nicfill/

Natalia Godyla’s LinkedIn: https://www.linkedin.com/in/nataliagodyla/

Microsoft Security Blog: https://www.microsoft.com/security/blog/

Related:

Security Unlocked: CISO Series with Bret Arsenault

https://thecyberwire.com/podcasts/security-unlocked-ciso-series

Transcript:

[Full transcript can be found at https://aka.ms/SecurityUnlockedEp36]

Nic Fillingham:

(music) Hello and welcome to Security Unlocked, a new podcast from Microsoft where we unlock insights from the latest in news and research from across Microsoft's security, engineering and operations teams. I'm Nic Fillingham.


Natalia Godyla:

And I'm Natalia Godyla.
In each episode, we'll discuss the latest stories from Microsoft Security, deep dive into the newest threat intel, research and data science.


Nic Fillingham:

And profile some of the fascinating people working on artificial intelligence in Microsoft Security.


Natalia Godyla:

And now, let's unlock the pod. (music)


Natalia Godyla:

Welcome everyone to another episode of Security Unlocked. Today we are joined by first time guest, Arjmand Samuel, who is joining us to discuss IoT Security, which is fitting as he is an Azure IoT Security leader at Microsoft. Now, everyone has heard the buzz around IoT. There's been constant talk of it over the past several years, and, but now we've all also already had some experience with IoT devices in our personal life. What about you, Nic? What do you use in your everyday life? What types of IoT devices?


Nic Fillingham:

Yeah. I've, I've got a couple of smart speakers, which I think a lot of people have these days. They seem to be pretty ubiquitous. And you know what? I sort of just assumed that they automatically update and they've got good security in them. I don't need to worry about it. Uh, maybe that's a bit naïve, but, but I sort of don't think of them as IoT. I just sort of, like, tell them what music I want to play and then I tell them again, because they get it wrong. And then I tell them a third time, and then I go, "Ugh," and then I do it on my phone.


Nic Fillingham:

I also have a few cameras that are pointed out around the outside of the house. Because I live on a small farm with, with animals, I've got some sheep and pigs, I have to be on the lookout for predators. For bears and coyotes and bobcats. Most of my IoT, though, is very, sort of, consumery. Consumers have access to it and can, sort of, buy it or it comes from the utility company.


Natalia Godyla:

Right. Good point. Um, today, we'll be talking with Arjmand about enterprise grade IoT and OT, or Internet of Things and operational technology.
Think the manufacturing floor of, uh, plants. And Arjmand will walk us through the basics of IoT and OT through to the best practices for securing these devices.


Nic Fillingham:

Yeah. And we spent a bit of time talking about zero trust and how to apply a zero trust approach to IoT. Zero trust, there's sort of three main pillars to zero trust. It's verify explicitly, which for many customers just means sort of MFA, multi-factor authentication. It's about utilizing least privilege access and ensuring that accounts, users, devices just have access to the data they need at the time they need it. And then the third is about always, sort of, assuming that you've been breached and, sort of, maintaining this philosophy of, of let's just assume that we're breached right now and let's engage in practices that would, sort of, help root out a, uh, potential breach.


Nic Fillingham:

Anyway, so, Arjmand, sort of, walks us through what is IoT, how does it relate to IT, how does it relate to operational technology, and obviously, what that zero trust approach looks like. On with the pod.


Natalia Godyla:

On with the pod. (music) Today, we're joined by Arjmand Samuel, principal program manager for the Microsoft Azure Internet of Things Group. Welcome to the show, Arjmand.


Arjmand Samuel:

Thank you very much, Natalia, and it's a pleasure to be on the show.


Natalia Godyla:

We're really excited to have you. Why don't we kick it off with talking a little bit about what you do at Microsoft. So, what does your day to day look like as a principal program manager?


Arjmand Samuel:

So, I am part of the Azure IoT Engineering Team. I'm a program manager on the team. I work on security for IoT and, uh, me and my team, uh, we are responsible for making sure that, uh, IoT services and clients like the software and runtimes and so on are, are built securely. And when they're deployed, they have the security properties that we need them to have and that our customers demand.
So, so, that's what I do all day long.


Nic Fillingham:

And, uh, we're going to talk about, uh, zero trust and the relationship between a zero trust approach and IoT. Um, but before we jump into that, Arjmand, uh, we, we had a bit of a look at your, your bio here. I've got a couple of questions I'd love to ask, if that's okay. I want to know about your, sort of, tenure here at Microsoft. Y- y- you've been here for 13 years. Sounds like you started in, in 2008 and you started in the w- what was called the Windows Live Team at the time, as the security lead. I wonder if you could talk a little bit about your, your entry into Microsoft and being in security in Microsoft for, for that amount of time. You must have seen some, sort of, pretty amazing changes, both from an industry perspective and then also inside Microsoft.


Arjmand Samuel:

Yeah, yeah, definitely. So, uh, as you said, uh, 2008 was the time, was the year when I came in. I came in with a, a, a degree in, uh, security, in- information security. And then, of course, my thinking and my whole work there when I was hired at Microsoft was to be, hey, how do we actually make sure that our product, which was Windows Live at that time, is secure? It has all the right security properties that, that we need that product to have. So, I- I came in, started working on a bunch of different things, including identity and, and there was, these are early times, right? I mean, we were all putting together this infrastructure, reconciling all the identity on times that we had. And all of those were things that we were trying to bring to Windows Live as well.


Arjmand Samuel:

So, I was responsible for that as well as I was, uh, working on making sure that, uh, our product had all the right diligence and, and security diligence that is required for a product to be at scale. And so, a bunch of, you know, things like SDL and threat modeling and those kind of things.
I was leading those efforts as well at, uh, Windows Live.


Natalia Godyla:

So, if 2008 Arjmand was talking to 2021 Arjmand, what would he be most surprised about, about the evolution over the past 13 years, either within Microsoft or just in the security industry?


Arjmand Samuel:

Yeah. Yeah. (laughs) That's a great, great question, and I think in the industry itself, e- evolution has been about how all around us. We are now engulfed in technology, connected technology. We call it IoT, and it's all around us. That was not the landscape 10, 15 years back. And, uh, what really is amazing is how our customers and partners are taking on this and applying this in their businesses, right? This meaning the whole industry of IoT and, uh, Internet of Things, and taking that to a level where every data, every piece of data in the physical world can be captured or can be acted upon. That is a big change from the last, uh, 10, 15 to where we are today.


Nic Fillingham:

I thought you were going to say TikTok dance challenges.


Arjmand Samuel:

(laughs)


Natalia Godyla:

(laughs)


Nic Fillingham:

... because that's, that's where I would have gone.


Arjmand Samuel:

(laughs) That, too. That, too, right? (laughs)


Nic Fillingham:

That's a (laughs) digression there. So, I'm pretty sure everyone knows what IoT is. I think we've already said it, but let's just, sort of, start there. So, IoT, Internet of Things. Is, I mean, that's correct, right? Is there, is there multiple definitions of IoT, or is it just Internet of Things? And then, what does the definition of an Internet of Things mean?


Arjmand Samuel:

Yeah, yeah. It's a... You know, while Internet of Things is a very recognized acronym these days, but I think talking to different people, different people would have a different idea about how Internet of Things could be defined. And the way I would define it, and again, not, not, uh, necessarily the authority or the, the only definition.
There are many definitions, but it's about having these devices around us. Us is not just people but also our, our manufacturing processes, our cars, our, uh, healthcare systems, having all these devices around, uh, these environments. They are, these devices, uh, could be big, could be small. Could be as small as a very small temperature sensor collecting data from an environment or it could be a robotic arm trying to move a full car up and down an assembly line.


Arjmand Samuel:

And first of all, collecting data from these devices, then bringing them, uh, uh, using the data to do something interesting and insightful, but also beyond that, being able to control these devices based on those insights. So, now there's a feedback loop where you're collecting data and you are acting on that, that data as well. And that is where, how IoT is manifesting itself today in, in, in the world. And especially for our customers who are, who tend to be more industrial enterprises and so on, it's a big change that is happening. It's, it's a huge change that, uh, they see and we call it the transformation, the business transformation happening today. And part of that business transformation is being led or is being driven through the technology which we call IoT, but it's really a business transformation.


Arjmand Samuel:

It's really that our customers are finding that in order to remain competitive and in order to remain in business really, at the end of the day, they need to invest. They need to bring in all these technologies to bear, and Internet of Things happens to be that technology.


Nic Fillingham:

So, Arjmand, a couple other acronyms. You know, I think, I think most of our audience are pretty familiar with IoT, but we'll just sort of cover it very quickly. So, IoT versus IT. IT is, obviously, you know, information technology, or I think that's the, that's the (laughs) globally accepted-


Arjmand Samuel:

Yeah, yeah.


Nic Fillingham:

... definition.
You know, do we think of IoT as a subset of IT? What is the relationship of, of those two? I mean, clearly, there are three letters versus two letters, (laughs) but there is a relationship there. Wh- wh- what are your thoughts?


Arjmand Samuel:

Yeah. There's a relationship as well as there's a difference, and, and it's important to bring those two out. Information technology is IT, as we know it now for many years, is all about enterprises running their applications, uh, business applications mostly. For that, they need the network support. They need databases. They need applications to be secured and so on. So, all these have to work together. The function of IT, information technology, is to make sure that the, there is availability of all these resources, applications, networks and databases as well as you have them secured and private and so on.


Arjmand Samuel:

So, all of that is good, but IoT takes it to the next level where now it's not only the enterprise applications, but it's also these devices, which are now deployed by the enterprise. I mentioned robotic arms. Imagine in a conference room you have all this equipment in there, projection and temperature sensors and occupancy sensors and so on. So, all of those beco- are now the, the add on to what we used to call IT and we are calling it the IoT.


Arjmand Samuel:

Now, the interesting part here is in the industrial IoT space. Th- this is also called OT, operation technology. So, you know, within an organization there'll be IT and OT. OT's operation technology and these are the people or the, uh, function within an organization who deal with the, with the physical machines, the physical plant. You know, the manufacturing line, the conveyor belts, the robotic arms, and these are called OT functions.


Arjmand Samuel:

The interesting part here is the goal of IT is different from the goal of OT. OT is all about availability. OT's all about safety, safety so that it doesn't hurt anybody working on the manufacturing line.
OT's all about environmental concerns. So, it should not leak bad chemicals and so on. And while, if you talk about security, and this is, like, a few years back when we would talk about security with an OT person, the, the person who's actually... You know, these are people who actually wear those, uh, hard hats, you know, on, uh, a manufacturing plant. And if you talk about security to an OT person, they will typically refer to that guard standing outside and, and, uh, the-


Nic Fillingham:

Physical security.


Arjmand Samuel:

The physical security and the, the walls and the cameras, which would make sure that, you know, and then a key card, and that's about all. This was OT security, but now when we started going in and saying that, okay, all these machines can be connected to, to each other and you can collect all this data and then you can actually start doing something interesting with this data. That is where the definition of security and the functions of OT evolved. And not evolved, I mean different companies are at different stages, but they're now evolving where they're thinking, okay, it's not only about the guard standing outside. It's also the fact that the robotic arm could be taken over remotely and somebody outside, around the world, around the globe could actually be controlling that robotic arm to do something bad. And that realization and the fact that now you actually have to control it in the cyber sense and not only in the physical sense is the evolution that happened between OT.


Arjmand Samuel:

Now, IT and OT work together as well because the same networks are shared typically. Some of the applications that use the data from these devices are common. So, IT and OT, this is the other, uh, thing that has changed and, and we are seeing that change, is starting to work and come closer. Work together more. IoT's really different, but at the same time requires a lot of stuff that IT has traditionally done.


Natalia Godyla:

Hmm.
So, what we considered to be simple just isn't simple anymore.


Arjmand Samuel:

That's life, right? (laughs) Yeah.


Natalia Godyla:

(laughs)


Arjmand Samuel:

(laughs)


Natalia Godyla:

So, today we wanted to talk about IoT security. So, let's just start with, with framing the conversation a little bit. Why is IoT security important and what makes it more challenging, different than traditional security?


Arjmand Samuel:

As I just described, right, I mean, we are now infusing compute in every environment around us. I mean, we talked a little bit about the conveyor belt. Imagine the conference rooms, the smart buildings and, and all the different technologies that are coming in. These are technologies, while they're good, they serve a scenario. They, they make things more efficient and so on, but they're also now a point of, uh, of failure for that whole system as well as a way for malicious actors to bring in code if possible. And to either, uh, imagine a scenario where or an attack where a malicious actor goes into the conveyor belt and knows exactly the product that is passing through. And imagine that someone either takes the data and sells it to somebody or, worse case, stops the conveyor belt. That is millions of dollars of loss very, uh, that data that the company might be incurring.


Arjmand Samuel:

So, now that there's infused compute all around us, we are now living in a target, in an environment which can be attacked, and which can be used for bad things much more than what it was when we were only applications, networks and databases. Easy to put a wall around. Easy to understand what's going on. They're easy to lock down. But with all these devices around us, it's becoming much and much harder to do the same.


Nic Fillingham:

And then what sort of, if, if we think about IoT and IoT security, one of the things that, sort of, makes it different, I- I th- think, and here I'd love you to explain this, sort of...
I- I'm thinking of it as a, as a, as a spectrum of IoT devices that, I mean, they have a CPU. They have some memory. They have some storage. They're, they're running an operating system in some capacity all the way through to, I guess, m- much more, sort of, rudimentary devices but do have some connection, some network connection in order for instruction or data to, sort of, move backwards and forwards. What is it that makes this collection of stuff difficult to protect or, you know, is it difficult to protect? And if so, why? And then, how do we think about the, the, the potential vectors for attack that are different in this scenario versus, you know, protecting laptops and servers?


Arjmand Samuel:

Yeah, yeah. That's a good one. So, uh, what happens is you're right. Uh, IoT devices can be big and small, all right. They could be a small MCU class device with a real-time operating system on it. Very small, very, uh, single purpose device, which is imagine collecting temperature or humidity only. Then we have these very big, what we call the edge or heavy edge devices, which are like server class devices running a robotic arm or, or even a gateway class device, which is aggregating data from many devices, right, as a, a, and then take, taking the data and acting on it.


Arjmand Samuel:

So, now with all this infrastructure, one of the key things that we have seen is diversity and heterogeneity of these devices. Not just in terms of size, but also in terms of who manufactured them, when they were manufactured. So, many of the temperature sensors in environments could be very old. Like, 20 years old and people are trying to use the same equipment and not have to change anything there. And which they can. Technically they could, but then those devices were never designed in for a connected environment for these, this data to actually, uh, be aggregated and sent on the network, meaning they per- perhaps did not have encryption built into it.
So, we have to do something, uh, additional there.


Arjmand Samuel:

And so now with the diversity of devices, when they came in, the, the feature set is so diverse. Some of them were, are more recent, built with the right security principles and the right security properties, but then some of them might not be. So, this could raise a, a challenge where how do you actually secure an infrastructure where you have this whole disparity and many different types of devices, many different manufacturers, many different ages for these devices. Security properties are different and as we all know talking about security, the attack would always come from the weakest link. So, the attacker would always find, within that infrastructure, the device which has the least security as an entry point into that infrastructure. So, we can't just say, "Oh, I'll just protect my gateway and I'm fine." We have to have some mitigation for everything on that network. Everything. Even the older ones, older devices. We call them brownfield devices because they tend to be old devices, but they're also part of the infrastructure.


Arjmand Samuel:

So, how do we actually think about brownfield and the, the newer ones we call greenfield devices? Brownfield and greenfield, how do we think about those given they will come from different vendors, different designs, different security properties? So, that's a key challenge today that we have. So, they want to keep those devices as well as make sure that they are secure because the current threat vectors and threat, uh, the, and attacks are, are much more sophisticated.


Natalia Godyla:

So, you have a complex set of devices that the security team has to manage and understand. And then you have to determine at another level which of those devices have vulnerabilities or which one is the most vulnerable, and then, uh, assume that your most vulnerable, uh, will be the ones that are exploited. It, so, is that, that typically the attack vector?
It's going to be the, the weakest link, like you said? And h- how does an attacker try to breach the IoT device?


Arjmand Samuel:

Yeah, yeah. And, and this is where we, we started using the term zero trust IoT.


Natalia Godyla:

Mm-hmm (affirmative).


Arjmand Samuel:

So, IoT devices are deployed in an environment which cannot be trusted, should not be trusted. You should assume that there is zero trust in that environment, and then all these devices, when they are in there, you will do the right things. You'll put in the right mitigations so that the devices themselves are robust. Now, another example I always give here is, and, uh, I, your question around the attack vectors and, and how attacks are happening, typically in the IT world, now that we, we have the term defined, in the IT world, you will always have, you know, physical security. You will always put servers in a room and lock it, and, and so on, right, but in an IoT environment, you have compute devices. Imagine these are powerful edge nodes doing video analytics, but they're mounted on a pole next to a camera outside on the road, right? So, which means the physical access to that device cannot be controlled. It could be that edge node, again, a powerful compute device with lots of, you know, CPU and, and so on, is deployed in a mall looking at video streams and analyzing those video streams, again, deployed out there where any attacker physically can get a hold of the device and do bad things.


Arjmand Samuel:

So, again, the attack vectors are also different between IT and OT or IoT in the sense that the devices might not be physically contained in a, in an environment.
So, that puts another layer of what do we do to protect such, uh, environments?


Nic Fillingham:

And then I want to just talk about the role of, sort of, if we think about traditional computing or traditional, sort of, PC based computing and PC devices, a lot of the attack vectors and a lot of the, sort of, weakest link is the user and the user account. And that's why, you know, phishing is such a massive issue that if we can socially engineer a way for the person to give us their user name and password or whatever, we, we, we can get access to a device through the user account. IoT devices and OT devices probably don't use that construct, right? They probably, they're userless. Is that accurate?


Arjmand Samuel:

Yeah. That's very accurate. So, again, all of the attack vectors which we know from IT are still relevant because, you know, if you, there's a phishing attack and the administrator password is taken over you can still go in and destroy the infrastructure, both IT and IoT. But at the same time, these devices, these IoT devices typically do not have a user interacting with them, typically in the compute sense. You do not log into an IoT device, right? Except in sensor with an MCU, it doesn't even have a user experience, uh, a screen on it. And so, there is typically no user associated with it, and that's another challenge. So you need to still have an identity of the device, not on the device, but of the device, but that identity has to be intrinsic to the device. It has to be part of the device and it has to be stable. It has to be protected, secure, and o- on the device, but it is not typically a user identity.


Arjmand Samuel:

And, and that's not only true for temperature sensors. You know, the smaller MCU class devices. That's true for edge nodes as well. Typically, an edge node, and by the way, when I say the edge node, edge node is a full blown, rich operating system.
CPU, tons of memory, even perhaps a GPU, but does not typically have a user screen, a keyboard and a mouse. All it has is a video stream coming in through some protocol and it's analyzing that and then making some AI decisions, decisions based on AI. And, and, but that's a powerful machine. Again, there might never ever be a user interactively signing into it, but the device has an identity of its own. It has to authenticate itself and its workload to other devices or to the Cloud. And all of that has to be done in a way where there is no user attached to it.


Natalia Godyla:

So, with all of this complexity, how can we think about protecting against IoT attacks? You discussed briefly that we still apply the zero trust model here. So, you know, at a high level, what are best practices for protecting IoT?


Arjmand Samuel:

Yeah, yeah. Exactly. Now that we, we just described the environment, we described the devices and, and the attacks, right? The bad things that can happen, how do we do that? So, the first thing we want to do, talk about is zero trust. So, do not trust the environment. Even if it is within a factory and you have a guard standing outside and you have all the, you know, the physical security, uh, do not trust it because there are still vectors which can allow malicious actors to come into those devices. So, that's the first one, zero trust.


Arjmand Samuel:

Uh, do not trust anything that is on the device unless you explicitly trust it, you explicitly make sure that you can go in and you can attest the workload, as an example. You can attest the identity of the device, as an example. And you can associate some access control policies and you have to do it explicitly and never assume that this is, because it's a, uh, environment in a factory you're good. So, you never assume that. So, again, that's a property or a principle within zero trust that we always exercise.


Arjmand Samuel:

Uh, the other one is you always assume breach.
You always assume that bad things will happen. I- it's not if they'll happen or not. It's about when they're s- uh, going to happen. So, for the, that thinking, then you're putting in place mitigations. You are thinking, okay, if bad things are going to happen, how do I contain the bad things? How do I contain? How do I make sure that first of all, I can detect bad things happening. And we have, and we can talk about some of the offerings that we have, like Defender for IoT as an example, which you can deploy on to the environment. Even if it's brownfield, you can detect bad things happening based on the network characteristics. So, that's Defender for IoT.


Arjmand Samuel:

And, and once you can detect bad things happening then you can do something about it. You get an alert. You can, you can isolate that device or take that device off the network and refresh it and do those kind of things. So, the first thing that needs to happen is you assume that it's going to breach. You always assume that whatever you are going to trust is explicitly trusted. You always make sure that there is a way to explicitly trust, uh, uh, uh, either the workload or the device or the network that is connected onto the device.


Nic Fillingham:

So, if we start with verify explicitly, in the traditional compute model where it's a user on a device, we can verify explicitly with, usually, multi-factor authentication. So, I have my user name and password. I add an additional layer of authentication, whether it's an, you know, app on my phone, a key or something, some physical device, there's my second factor and I'm, I'm verified explicitly in that model. But again, no users or the user's not, sort of, interacting with the device in, sort of, that traditional sense, so what are those techniques to verify explicitly on an IoT device?


Arjmand Samuel:

Yeah. I, exactly.
So, we, in that white paper, which we are talking about, we actually put down a few things that you can actually do to, to, en- ensure that you have all the zero trust requirements together. Now, the first one, of course, is you need, uh, all devices to have strong identity, right? So, because identity is core. If you cannot identi- identify something you cannot, uh, give it an access control policy. You cannot trust the data that is coming out from that, uh, device. So, the first thing you do is you have a strong identity. By a strong identity we mean identity which is rooted in hardware, and so, what we call the hardware-based root of trust. It's technologies like TPM, which ensure that you have the private key, which is secured in the hardware, and you cannot get to it, and so on. So, you, you ensure that you have a, a strong identity.
Arjmand Samuel: You always have least-privileged access so you do not... And these principles have been known to our IT operations forever, right? So, many years they have been refined and, uh, people know about those, but we're applying them to the IoT world. So, least-privileged access: if a device is required to access another device or data or to push out data, it should only do that for the function it is designed for, nothing more than that. You should always have some level of, uh, device health check. Perhaps you should be able to do some kind of attestation of the device. Again, there is no user to assess the device health, but you should be able to do it, and there are ways, there are services which allow you to measure something on the device and then say yes, it's good, or not.
Arjmand Samuel: You should be able to do a continuous update. So, in case there is a device which, uh, has been compromised, you should be able to reclaim that device and update it with a fresh image so that now you can start trusting it. And then finally you should be able to securely monitor it.
And not just the device itself, but now we have technologies which can monitor the data which is passing through the network, and based on those characteristics can see if a device is being attacked or not. So, those are the kind of things that we would recommend for a zero trust environment to take into account and, and make those requirements a must for, for IoT deployments.
Natalia Godyla: And what's Microsoft's role in protecting against these attacks?
Arjmand Samuel: Yeah, yeah. So, uh, a few products that we always recommend. If somebody is putting together a new IoT device right from the silicon and putting that device together, we have a great secure-by-design device, which is called Azure Sphere. Azure Sphere has a bunch of different things that it does, including identity, updates, cert management. All these are important functions that are required for that device to function. And so, a new device could use the design that we have for Azure Sphere.
Arjmand Samuel: Then we have a gateway software that you put on a gateway which allows you to secure the devices behind that gateway for on-prem deployments. We have Defender for IoT, again as I mentioned, but Defender for IoT is on-prem, so you can actually monitor all the traffic on the network and on the devices. You could also put an agent, a Micro Agent, on these devices, but then it also connects to Azure Sentinel. Azure Sentinel is an enterprise-class user experience for security administrators to know what bad things are happening on, on-prem. So, it, the whole end-to-end thing works all the way from the network, brownfield devices to the Cloud.
Arjmand Samuel: We also have things like, uh, IoT Hub Device Provisioning Service. Device Provisioning Service is an interesting concept. I'll try to briefly describe that.
So, what happens is when you have an identity on a device and you want to actually put that device, deploy that device in your environment, it has to be linked up with a service in the Cloud so that it can, it knows the device, there's an identity which is shared and so on. Now, you could do it manually. You could actually bring that device in, read a code, put it in the Cloud and you're good to go because now the Cloud knows about that device, but then what do you do when you have to deploy a million devices? And we're talking about IoT scale, millions. A fleet of millions of devices. If you take that same approach of reading a key and putting it in the Cloud, one, you'd make mistakes. Second, you will probably need a lifetime to take all those keys and put them in the Cloud.
Arjmand Samuel: So, in order to solve that problem, we have the Device Provisioning Service, which is a service in the Cloud. It is, uh, linked up to the OEMs or manufacturing devices. And when you deploy a device in your field, you do not have to do any of that. Your credentials are passed between the service and the, and the device. So, so, that's another service. IoT Hub Device Provisioning Service.
Arjmand Samuel: And then we have, uh, a work, the, uh, a piece of work that we have done, which is the certification of IoT devices. So, again, you need the devices to have certain security properties. And how do you do that? How do you ensure that they have the right security properties, like identity and cert management and updatability and so on? We have what we call the Edge Secured-core Certification as well as Azure Certified Device Program. So, any device which is in there has been tested by us and we certify that that device has the right security properties. So, we encourage our customers to actually pick from those devices so that they, they actually get the best security properties.
Natalia Godyla: Wow. That's a lot, which is incredible.
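The provisioning problem described above (no one can hand-enter a million device keys) can be illustrated with a small sketch. This is not the IoT Hub Device Provisioning Service API: the class name, the hub name, and the choice of a shared enrollment-group key with per-device keys derived by HMAC are all assumptions made for illustration, and a production deployment would more likely anchor identity in hardware such as a TPM, as discussed earlier in the episode.

```python
import base64
import hashlib
import hmac
from typing import Optional


def derive_device_key(group_key: bytes, registration_id: str) -> str:
    """Derive a per-device key from one group key (hypothetical scheme)."""
    digest = hmac.new(group_key, registration_id.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


class ProvisioningService:
    """Toy cloud-side registry: it stores one group key, not one key per device."""

    def __init__(self, group_key: bytes, hub_name: str) -> None:
        self._group_key = group_key
        self._hub_name = hub_name
        self.assigned = {}  # registration_id -> hub the device was assigned to

    def register(self, registration_id: str, presented_key: str) -> Optional[str]:
        # Recompute the key the device should hold and compare in constant time.
        expected = derive_device_key(self._group_key, registration_id)
        if hmac.compare_digest(expected, presented_key):
            self.assigned[registration_id] = self._hub_name
            return self._hub_name
        return None  # unknown or forged device: no assignment


# At the factory, each device is flashed with its own derived key.
service = ProvisioningService(b"factory-secret", "hub-west")
device_key = derive_device_key(b"factory-secret", "camera-0001")
print(service.register("camera-0001", device_key))
```

The point of the design is that the cloud side holds a single secret per fleet, so adding the millionth device needs no manual step.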
What's next for Microsoft's, uh, approach to IoT security?
Arjmand Samuel: Yeah, yeah. So, uh, one of the key things that we have heard our customers, anybody who's going into IoT ask the question, what is the risk I'm taking? Right? So, I'm deploying all these devices in my factories, and robotic arms, connecting them, and so on, but there's a risk here. And how do I quantify that risk? How do I understand th- that risk and how do I do something about that risk?
Arjmand Samuel: So, we, we got those questions many years back, like four, five years back. We started working with the industry and together with the Industrial Internet Consortium, IIC, which is a consortium out there and there are many companies part of that consortium, we led something called The Security Maturity Model for IoT. So, so, we put down a set of principles and a set of processes you follow to evaluate the maturity of your security in IoT, right? So, it's an actionable thing. You take the document, you evaluate, and then once you have evaluated, it actually gives you a score. It says you're level one, or two, or three, or four. Four, that's the authentication. All else is controlled management. And then based on th- that level, you know where you are, first of all. So, you know what your weaknesses are and what you need to do. So, that's a very actionable thing. But beyond that, if you're at level two and you want to be at level four, and by want to I mean your scenario dictates that you should be at level four, it is actionable. It gives you a list of things to do to go from level two to level four. And then you can reevaluate yourself and then you know that you're at level four. So, that's a maturity model.
Arjmand Samuel: Now, in order to operationalize that program in partnership with IIC, we also have been, and IIC's help, uh, has been instrumental here, we have been working on a training program where we have been training auditors.
These are IoT security auditors, third party, independent auditors who are now trained on SMM, the Security Maturity Model. And we tell our customers, if you have a concern, get yourself audited using SMM, using the auditors and that will tell you where you are and where you need to go. So, it's evolving. Security for IoT's evolving, but I think we are at the forefront of that evolution.
Nic Fillingham: Just to, sort of, finish up here, I'm thinking of some of the recent IoT security stories that were in the news. We won't mention any specifically, but there, there have been some recently. My takeaway hearing those stories, reading those stories in the news is that, oh, wow, there's probably a lot of organizations out here and maybe individuals at companies that are using IoT and OT devices that maybe don't see themselves as being security people or having to think about IoT security, you know, OT security. I just wonder, do you think there is a, a population of folks out here that don't think of themselves as IoT security people, but they really are? And then therefore, how do we sort of go find those people and help them go get educated about securing IoT devices?
Arjmand Samuel: Yeah, that's, uh, that's exactly what we are trying to do here. So, uh, people who know security can obviously know the bad things that can happen and can do something about it, but the worst part is that in OT, people are not thinking about all the bad things that can happen in the cyber world. You mentioned that example with that treatment plant. It should never have been connected to the network, unless required. And if it was connected to the, uh, to the network, to the internet, you should have had a ton of mitigations in place in case somebody was trying to come in, and they should have been stopped. And in that particular case, y- there was a phishing attack and the administrative password was, was taken over.
But even with that, with the, some of our products, like Defender for IoT, can actually detect the administrative behavior and can, can detect if an administrator is trying to do bad things. It can still tell other administrators there's bad things happening.
Arjmand Samuel: So, there's a ton of things that one could do, and it all comes down, what we have realized is it all comes down to making sure that this word gets out, that people know that there are bad things that can happen with IoT and it's not only your data being stolen. It's very bad things as in that example. And so, get the word out, uh, so that we can, uh, we can actually make IoT more secure.
Nic Fillingham: Got it. Arjmand, again, thanks so much for your time. It sounds like we really need to get the word out. IoT security is a thing. You know, if you work in an organization that employs IoT or OT devices, or think you might, go and download this white paper. Um, we'll put the link in the, uh, in the show notes. You can just search for it also probably on the Microsoft Security Blog and learn more about cyber security for IoT, how to apply the zero trust model. Share it with your, with your peers and, uh, let's get as much education as we can out there.
Arjmand Samuel: Thank you very much for this, uh, opportunity.
Nic Fillingham: Thanks, Arjmand, for joining us. I think we'll definitely touch on cyber security for IoT, uh, in future episodes. So, I'd love to talk to you again. (music)
Arjmand Samuel: Looking forward to it. (music)
Natalia Godyla: Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode.
Nic Fillingham: And don't forget to Tweet us @MSFTSecurity or email us at securityunlocked@Microsoft.com with topics you'd like to hear on a future episode. (music) Until then, stay safe.
Natalia Godyla: Stay secure. (music)
7/7/2021

Looking a Gift Card Horse in the Mouth

Ep. 35
Is it just me, or do you also miss the good ole days of fraudulent activity? You remember the kind I’m talking about, the emails from princes around the world asking for just a couple hundred dollars to help them unfreeze or retrieve their massive fortune which they would share with you. Attacks have grown more nuanced, complex, and invasive since then, but because of the unbelievable talent at Microsoft, we’re constantly getting better at defending against it.

On this episode of Security Unlocked, hosts Nic Fillingham and Natalia Godyla sit down with returning champion, Emily Hacker, to discuss Business Email Compromise (BEC), an attack that has perpetrators pretending to be someone from the victim’s place of work and instructs them to purchase gift cards and send them to the scammer. Maybe it’s good to look a gift card horse in the mouth?

In This Episode You Will Learn:

• Why BEC is such an effective and pervasive attack
• What are the key things to look out for to protect yourself against one
• Why BEC emails are difficult to track

Some Questions We Ask:

• How do the attackers mimic a true-to-form email from a colleague?
• Why do we classify this type of email attack separately from others?
• Why are they asking for gift cards rather than cash?

Resources:

Emily Hacker’s LinkedIn: https://www.linkedin.com/in/emilydhacker/
FBI’s 2020 Internet Crime Report: https://www.ic3.gov/Media/PDF/AnnualReport/2020_IC3Report.pdf
Nic Fillingham’s LinkedIn: https://www.linkedin.com/in/nicfill/
Natalia Godyla’s LinkedIn: https://www.linkedin.com/in/nataliagodyla/
Microsoft Security Blog: https://www.microsoft.com/security/blog/

Related:

Security Unlocked: CISO Series with Bret Arsenault
https://SecurityUnlockedCISOSeries.com

Transcript:

[Full transcript can be found at https://aka.ms/SecurityUnlockedEp35]

Nic Fillingham: Hello, and welcome to Security Unlocked, a new podcast from Microsoft, where we unlock insights from the latest in news and research from across Microsoft security engineering and operations teams.
I'm Nic Fillingham.
Natalia Godyla: And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft security, deep dive into the newest threat intel, research and data science.
Nic Fillingham: And profile some of the fascinating people working on artificial intelligence in Microsoft security.
Natalia Godyla: And now, let's unlock the pod.
Nic Fillingham: Hello listeners, hello, Natalia, welcome to episode 35 of Security Unlocked. Natalia, how are you?
Natalia Godyla: I'm doing well as always and welcome everyone to another show.
Nic Fillingham: It's probably quite redundant, me asking you how you are and you asking me how you are, 'cause that's not really a question that you really answer honestly, is it? It's not like, "Oh, my right knee's packing at the end a bit," or "I'm very hot."
Natalia Godyla: Yeah, I'm doing terrible right now, actually. I, I just, uh-
Nic Fillingham: Everything is terrible.
Natalia Godyla: (laughs)
Nic Fillingham: Well, uh, our guest today is, is a returning champ, Emily Hacker. This is her third, uh, appearance on Security Unlocked, and, and she's returning to talk to us about a, uh, new business email compromise campaign that she and her colleagues helped unearth focusing on some sort of gift card scam.
Nic Fillingham: We've covered business email compromise before, or BEC, on the podcast. Uh, we had, uh, Donald Keating join us, uh, back in the early days of Security Unlocked on episode six. The campaign itself, not super sophisticated as, as Emily sort of explains, but so much more sort of prevalent than I think a lot of us sort of realize. BEC was actually the number one reported source of financial loss to the FBI in 2020. Like by an order of magnitude above, sort of, you know, second place, third place, fourth place. You know, I think the losses were in the billions, this is what was reported to the FBI, so it's a big problem.
And thankfully, we've got people like, uh, Emily on it.
Nic Fillingham: Natalia, can you give us the TLDR on the, on the campaign that Emily helps describe?
Natalia Godyla: Yeah, as you said, it's, uh, a BEC gift card campaign. So the attackers use typosquatted domains, and socially engineered executives to request from employees that they purchase gift cards. And the request is very vague. Like, "I need you to do a task for me," or "Let me know if you're available." And they used that authority to convince the employees to purchase the gift cards for them. And they then converted the gift cards into crypto at, at scale to collect their payout.
Nic Fillingham: Yeah, and we actually discuss with Emily that, that between the three of us, Natalia, myself and Emily, we actually didn't have a good answer for how the, uh-
Natalia Godyla: Mm-hmm (affirmative).
Nic Fillingham: ... these attackers are laundering these gift cards and, and converting them to crypto. So we're gonna, we're gonna go and do some research, and we're gonna hopefully follow up on a, on a future episode to better understand that process. Awesome. And so with that, on with the pod.
Natalia Godyla: On with the pod.
Nic Fillingham: Welcome back to the Security Unlocked podcast. Emily Hacker, how are you?
Emily Hacker: I'm doing well. Thank you for having me. How are you doing?
Nic Fillingham: I'm doing well. I'm trying very hard not to melt here in Seattle. We're recording this at the tail end of the heat wave apocalypse of late June, 2021. Natalia, are you all in, I should have asked, have you melted or are you still in solid form?
Natalia Godyla: I'm in solid form partially because I think Seattle stole our heat. I'm sitting in Los Angeles now.
Nic Fillingham: Uh huh, got it. Emily, thank you for joining us again. I hope you're also beating the heat. You're here to talk about business email compromise.
And you were one of the folks that co-authored a blog post from May 6th, talking about a new campaign that was discovered utilizing gift card scams. First of all, welcome back. Thanks for being a return guest. Second of all, do I get credit or do I get blame for the tweet that enabled you to, to-
Emily Hacker: (laughs) It's been so long, I was hoping you would have forgotten.
Nic Fillingham: (laughs) Emily and I were going back and forth on email, and I basically asked Emily, "Hey, Emily, who's like the expert at Microsoft on business email compromise?" And then Emily responded with, "I am."
Emily Hacker: (laughs)
Nic Fillingham: As in, Emily is. And so I, I think I apologized profusely. If I didn't, let me do that now for not assuming that you are the subject matter expert, but that then birthed a very fun tweet that you put out into the Twitter sphere. Do you wanna share that with the listeners or is this uncomfortable and we need to cut it from the audio?
Emily Hacker: No, it's fine. You can share with the listeners. I, uh-
Nic Fillingham: (laughs)
Emily Hacker: ... I truly was not upset. I don't know if you apologized or not, because I didn't think it was the thing to apologize for. Because I didn't take your question as like a, "Hey, can you like get out of the way?" I did not take it that way at all. It was just like, I've been in this industry for five years and I have gotten so many emails from people being like, "Hey, who's the subject matter expert in X?" And I'm always having to be like, "Oh, it's so and so," you know, or, "Oh yeah, I've talked to them, it's so-and-so." And for once I was like, "Oh my goodness, it me."
Natalia Godyla: (laughs)
Emily Hacker: Like I'm finally a subject matter expert in something. It took a long time.
So the tweet was, was me being excited that I got to be the subject matter expert, not me being upset at you for asking who it was.
Nic Fillingham: No, I, I took it in, I did assume that it was excitement and not crankiness at me for not assuming that it would be you. But I was also excited because I saw the tweet, 'cause I follow you on Twitter and I'm like, "Oh, that was me. That was me." And I got to use-
Emily Hacker: (laughs)
Nic Fillingham: ... I got to use the meme that's the s- the, the weird side eye puppet, the side, side eye puppet. I don't know if that translates. There's this meme where it's like a we-weird sort of like H.R. Pufnstuf sort of reject puppet, and it's sort of like looking sideways to the, to the camera.
Emily Hacker: Yes.
Nic Fillingham: Uh, I've, and I've-
Emily Hacker: Your response literally made me laugh out loud, alone in my apartment.
Nic Fillingham: (laughs) I've never been able to use that meme in like its perfect context, and I was like, "This is it."
Emily Hacker: (laughs) We just set that one up for a comedy home run basically.
Nic Fillingham: Yes, yes, yes. And I think my dad liked the tweet too-
Natalia Godyla: (laughs)
Nic Fillingham: ... so I think I had that, so that was good.
Emily Hacker: (laughs)
Nic Fillingham: Um, he's like my only follower.
Emily Hacker: Pure success.
Nic Fillingham: Um, well, on that note, so yeah, we're here to talk about business email compromise, which we've covered on the, on the podcast before. You, as I said, uh, co-authored this post from May 6th. We'll have a, a broader conversation about BEC, but let's start with this post. Could you give us a summary, what was discussed in this, uh, blog post back on, on May 6th?
Emily Hacker: Yeah, so this blog post was about a specific type of business email compromise, where the attackers are using lookalike domains and lookalike email addresses to send emails that are trying, in this particular case, to get the user to send them a gift card.
And so this is not the type of BEC that a lot of people might be thinking of in terms of conducting wire transfer fraud, or, you know, you read in the news like some company wired several million dollars to an attacker. That wasn't this, but this is still creating a financial impact in that the recipient is either gonna be using their own personal funds or in some cases, company funds to buy gift cards, especially if the threat actor is pretending to be a supervisor and is like, "Hey, you know, admin assistant, can you buy these gift cards for the team?" They're probably gonna use company funds at that point.
Emily Hacker: So it's still something that we keep an eye out for. And it's actually, these gift card scams are far and away the most common, I would say, type of BEC that I am seeing when I look for BEC type emails. It's like, well over, I would say 70% of the BEC emails that I see are trying to do this gift card scam, 'cause it's a little easier, I would say, for them to fly under the radar maybe, uh, in terms of just like, someone's less likely to report like, "Hey, why did you spend $30 on a gift card?" than like, "Hey, where did those like six billion dollars go?" So like in that case, this is probably a little easier for them to fly under the radar for the companies.
But in terms of impact, if they send, you know, hundreds upon hundreds of these emails, the actors are still gonna be making a decent chunk of change at the end of the day.
Emily Hacker: In this particular instance, the attackers had registered a couple hundred lookalike domains that aligned with real companies, but were just a couple of letters or digits off, or were using a different TLD, or use like a number instead of a letter or something, something along the lines to where you can look at it and be like, "Oh, I can tell that the attacker is pretending to be this other real company, but they are actually creating their own."
Emily Hacker: But what was interesting about this campaign that I found pretty silly honestly, was that normally when the attacker does that, one would expect them to impersonate the company that their domain is looking like, and they totally didn't in this case. So they registered all these domains that were lookalike domains, but then when they actually sent the emails, they were pretending to be different companies, and they would just change the display name of their email address to match whoever they were impersonating.
Emily Hacker: So in one of the examples in the blog, they're impersonating a guy named Steve, and Steve is a real executive at the company that they sent this email to. But the email address that they registered here was not Steve, and the domain was not for the company that Steve works at. So they got a little bit, I don't know if they like got their wires crossed, or if they just were using the same infrastructure that they were gonna use for a different attack, but these domains were registered the day before this attack. So it definitely doesn't seem opportunistic, in that it doesn't seem like some actors were like, "Oh, hey look, free domains. We'll send some emails." Like they were brand new and just used for strange purposes.
Natalia Godyla: Didn't they also fake data in the headers?
Why would they be so careless about connecting the company to the language in the email body but go through the trouble of editing the headers?
Emily Hacker: That's a good question. They did edit the headers in one instance that I was able to see, granted I didn't see every single email in this attack because I just don't have that kind of data. And what they did was they spoofed one of the headers, which is an In-Reply-To header, which is the header that would let us know that it's a real reply. But I worked really closely with a lot of email teams and we were able to determine that it was indeed a fake reply.
Emily Hacker: My only guess, honestly, guess as to why that happened is one of two things. One, the domain thing was like a, a mess up, like if they had better intentions and the domain thing went awry. Or number two, it's possible that this is multiple attackers conducting. If one guy was responsible for the emails with the mess of domains, and a different person was responsible for the one that had the email header, like maybe the email header guy is just a little bit more savvy at his job of crime than the first guy.
Natalia Godyla: (laughs)
Nic Fillingham: Yeah, I li- I like the idea of, uh, sort of a ragtag group. I don't mean to make them an attractive image, but, you know, a ragtag group of people here. And like, you've got a very competent person who knows how to go and sort of spoof domain headers, and you have a less competent person who is-
Emily Hacker: Yeah. It's like Pinky and the Brain.
Nic Fillingham: Yeah, it is Pinky and the Brain. That's fantastic. I love the idea of Pinky and the Brain trying to conduct a multi-national, uh-
Emily Hacker: (laughs)
Nic Fillingham: ... BEC campaign as their way to try and take over the world. Can we back up a little bit? We jumped straight into this, which is totally, you know, we asked you to do that. So, but let's go back to a little bit of basics. BEC stands for business email compromise.
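The fake-reply trick discussed above can be made concrete with a minimal sketch. It assumes the defender keeps a record of the Message-ID values of mail the recipient has actually sent (the `sent_message_ids` set below is hypothetical); the episode does not say how the email teams established that the reply was fake, so this is one possible check rather than their method.

```python
from email import message_from_string


def reply_status(raw_message: str, sent_message_ids: set) -> str:
    """Classify a message by whether its In-Reply-To points at mail we really sent."""
    msg = message_from_string(raw_message)
    parent = msg.get("In-Reply-To")
    if parent is None:
        return "not-a-reply"
    if parent.strip() in sent_message_ids:
        return "reply-to-known-message"
    # The header claims a conversation that the recipient never started.
    return "reply-to-unknown-message"


# Hypothetical data: one message the employee really sent, and an inbound
# email whose In-Reply-To header references a Message-ID nobody sent.
sent_ids = {"<real-123@contoso.example>"}
inbound = (
    "From: Steve <steve@lookalike.example>\n"
    "To: employee@contoso.example\n"
    "Subject: Re: Task\n"
    "In-Reply-To: <made-up-999@contoso.example>\n"
    "\n"
    "Are you available? I need you to do a task for me.\n"
)
print(reply_status(inbound, sent_ids))
```

A "Re:" subject combined with a `reply-to-unknown-message` result is a cheap signal worth surfacing to an analyst, although legitimate mail can also trip it when sent-mail records are incomplete.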
It is distinct from, I mean, do you say CEC for consumer email compromise? Like what's the opposite side of that coin? And then can you explain what BEC is for us and why we sort of think about it distinctly?
Emily Hacker: Mm-hmm (affirmative), so I don't know if there's a term for the non-business side of BEC other than just scam. At its basest form, what BEC is, is just a scam where the threat actors are just trying to trick people out of money or data. And so it doesn't involve any malware for the most part at the BEC stage of it. It doesn't involve any phishing for the most part at the BEC stage of it. Those things might exist earlier in the chain, if you will, for more sophisticated attacks. Like an attacker might use a phishing campaign to get access before conducting the BEC, or an attacker might use like a RAT on a machine to gain access to emails before the actual BEC. But the business email compromise email itself, for the most part, is just a scam. And what it is, is when an attacker will pretend to be somebody at a company and ask for money or data, which can include, you know, like W-2's, in which case that would still kind of be BEC.
Emily Hacker: And when I say that they're pretending to be this company, there's a few different ways that that can happen. And so, the most, in my opinion, sophisticated version of this, but honestly the term sophisticated might be loaded and arguable there, is when the attacker actually uses a real account. So business email compromise, the term might imply that sometimes you're actually compromising an email.
And those are the ones that I think are what people are thinking of when they're thinking of these million, billion dollar losses, where the attacker gains access to an email account and basically replies as the real individual.
Emily Hacker: Let's say that there was an email thread going on between accounts payable and a vendor, and the attacker has compromised the, the vendor's email account, well, in the course of the conversation, they can reply to the email and say, "Hey, we just set up a new bank account. Can you change the information and actually wire the million dollars for this particular project to this bank account instead?" And if the recipient of that email is not critical of that request, they might actually do that, and then the money is in the attacker's hands. And it's difficult to be critical of that request because it'll sometimes literally just be a reply to an ongoing email thread with someone you've probably been doing business with for a while, and nothing about that might stand out as strange, other than them changing the account. It can be possible, but difficult to get it back in those cases. But those are definitely the ones that are, I would say, the most tricky to spot.
Emily Hacker: More common, I would say, what we see is the attacker is not actually compromising an email, not necessarily gaining access to it, but using some means of pretending or spoofing or impersonating an email account that they don't actually have access to. And that might include registering lookalike domains as in the case that we talked about in this blog. And that can be typosquatted domains or just lookalike domains, where, for example, I always use this example, even though I doubt this domain is available, but instead of doing microsoft.com, they might do Microsoft with a zero, or like Microsoft using R-N-I-C-R-O-S-O-F-T.com. So it looks like an M at first glance, but it's actually not.
Or they might do something like microsoft-com.org or something, which that obviously would not be available, but you get the point. Where they're just getting these domains that kind of look like the right one so that somebody, at first glance, will just look up and be like, "Oh yeah, that looks like Microsoft. This is the right person."
Emily Hacker: They might also, more commonly, just register emails using free email services and do one of two things. One is to make the email specific to the person they're targeting. So let's say that an attacker was pretending to be me. They might register emilyhacker@gmail.com, or more recently and maybe a little bit more targeted, they might register like emily.hacker.microsoft.com@gmail.com, and then they'll send an email as me. And then on the, I would say, less sophisticated end of the spectrum, is when they are just creating an email address that's like bob@gmail.com. And then they'll use that email address for like tons of different targets, like different victims. And they'll either just change the display name to match someone at the company that they're targeting, or they might just change it to be like executive or like CEO or something, which, like, the least believable of the bunch in my opinion is when they're just reusing the free emails.
Emily Hacker: So that's kind of the different ways that they can impersonate or pretend to be these companies, but I see all of those being used in various ways. But for sure the most common is the free email service. And I mean, it makes sense, because if you're gonna register a domain name, that costs money and it takes time and takes skill, same with compromising an email account, but it's quick and easy just to register a free email account. So, yeah.
Nic Fillingham: So just to sort of summarize here. So business email compromise i-is obviously very complex.
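The impersonation styles described above (a lookalike domain, the brand embedded in another domain, and a free-mail address with or without the brand in it) can be sketched as a simple sender classifier. Everything here is an assumption for illustration: the protected domain, the short free-mail list, and the tiny confusables table are hypothetical, and a real detector would need a far larger homoglyph set and many more signals.

```python
import re

# Hypothetical configuration, not taken from the episode or the blog post.
PROTECTED_DOMAIN = "microsoft.com"
FREE_MAIL = {"gmail.com", "outlook.com", "yahoo.com"}

# Character sequences that read like another letter at a glance.
CONFUSABLES = [("rn", "m"), ("vv", "w"), ("0", "o"), ("1", "l")]


def skeleton(label: str) -> str:
    """Collapse visually confusable sequences so lookalikes compare equal."""
    label = label.lower()
    for fake, real in CONFUSABLES:
        label = label.replace(fake, real)
    return label


def classify_sender(address: str, protected: str = PROTECTED_DOMAIN) -> str:
    """Label a sender address relative to the domain we want to protect."""
    local, _, domain = address.lower().rpartition("@")
    brand = protected.split(".")[0]
    if domain == protected:
        return "exact-domain"
    if domain in FREE_MAIL:
        # e.g. emily.hacker.microsoft.com@gmail.com versus bob@gmail.com
        return "free-mail-embedding-brand" if brand in local else "free-mail"
    if skeleton(domain.split(".")[0]) == skeleton(brand):
        # Same look after normalization, or the same name on another TLD.
        return "lookalike-domain"
    if brand in re.split(r"[-.]", domain):
        return "brand-in-other-domain"
    return "unrelated"


for sender in ("ceo@rnicrosoft.com", "ceo@micr0soft.com",
               "ceo@microsoft-com.org", "emily.hacker.microsoft.com@gmail.com"):
    print(sender, classify_sender(sender))
```

Because free-mail senders are by far the most common case mentioned in the conversation, a classifier like this is only useful next to a display-name check: the address `bob@gmail.com` says nothing on its own until the display name claims to be an executive.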
There's lots of facets to it.
Emily Hacker: Mm-hmm (affirmative).
Nic Fillingham: It sounds like, first of all, it's targeted at businesses as opposed to targeted at individuals. Targeting individuals is just more simple scams. We can talk about those, but business email compromise, targeted at businesses-
Emily Hacker: Mm-hmm (affirmative).
Nic Fillingham: ... and the end goal is probably to get some form of compromise, and which could be in different ways, but some sort of compromise of a communication channel or a communication thread with that business to ultimately get some money out of them?
Emily Hacker: Yep, so it's a social engineering scheme to get whatever their end goals are, usually money. Yeah.
Nic Fillingham: Got it. Like if I buy a gift card for a friend or a family member for their birthday, and I give that to them, the wording on the bottom says pretty clearly, like not redeemable for cash. Like it's-
Emily Hacker: So-
Nic Fillingham: ... so what's the loophole they're taking advantage of here?
Emily Hacker: Criminals gonna crime. Apparently-
Natalia Godyla: (laughs)
Emily Hacker: ... there are sites, you know, on the internet specifically for cashing out gift cards for cryptocurrency.
Nic Fillingham: Hmm.
Emily Hacker: And so they get these gift cards specifically so that they can cash them out for cryptocurrency, which then is a lot, obviously, less traceable as opposed to just cash. So that is the appeal of gift cards, easier to switch for, I guess, cryptocurrency in a much less traceable manner for the criminals in this regard. And there are probably, you know, you can sell them. Also, you can sell someone a gift card and be like, "Hey, I got a $50 iTunes gift card. Give me $50 and you got an iTunes gift card." I don't know if iTunes is even still a thing.
But like that is another means of, it's just, I think, a way of, like, especially the cryptocurrency one, it's just a way of distancing themselves one step from the actual payout that they end up with.

Nic Fillingham: Yeah, I mean, it's clearly a laundering tactic.

Emily Hacker: Mm-hmm (affirmative).

Nic Fillingham: It's just, I'm trying to think of like, someone's eventually trying to get cash out of this gift card-

Emily Hacker: Mm-hmm (affirmative).

Nic Fillingham: ... and instead of going into Target with 10,000 gift cards, and spending them all, and then turning right back around and going to the returns desk and saying like, "I need to return these $10,000 that I just bought."

Emily Hacker: Mm-hmm (affirmative).

Nic Fillingham: I guess I'm just puzzled as to how, at scale-

Emily Hacker: Yeah.

Nic Fillingham: ... and I guess that's the key word here, at scale, at a criminal scale, how are they, what's the actual return? Are they getting 50 cents on the dollar? Are they getting five cents on the dollar? Are they getting 95 cents on the dollar? Um, maybe I don't know how to ask that question, but I think it's a fascinating one, and I'd love to learn more about it.

Emily Hacker: It is a good question. I would imagine that the sites where they exchange them for cryptocurrency are set up in a way where, rather than one person ending up with all the gift cards, which is where you'd have an issue like what you're talking about with like, "Hey, uh, can I casually return these six million gift cards?", it's more distributed. But there probably is a surcharge in terms of they're not getting a one-to-one, but it's-

Nic Fillingham: Yeah.

Emily Hacker: ... I would not imagine that it's very low. Or like I would not imagine that they're getting five cents on the dollar, I would imagine it's higher than that.

Nic Fillingham: Got it.

Emily Hacker: But I don't know. 
So, that's a good question.

Natalia Godyla: And we're talking about leveraging this cryptocurrency model to cash them out. So has there been an increase in these scams because they now have this ability to cash them out for crypto? Like, was that a driver?

Emily Hacker: I'm not sure. I don't know how long the crypto cash out method has been available.

Natalia Godyla: Mm-hmm (affirmative).

Emily Hacker: I've only recently learned about it, but that's just because I don't spend, I guess I don't spend a lot of time dealing with that end of the scam. For the most part, my job is looking at the emails themselves. So learning what they're doing once they get the gift cards was relatively new to me, but I don't think it's new to the criminals. So it's hard for me to answer that question, not knowing how long the crypto cash out method has been available to them. But I will say that it does feel like, in the last couple of years, gift card scams have just been either increasing or coming to light more, but I think increasing.

Nic Fillingham: Emily, what's new about this particular campaign that you discussed in the blog? It doesn't look like there's something very new in the approach here. This feels like it's a very minor tweak on techniques that have been employed for a while. Tell me what's new about this campaign? (laughs)

Emily Hacker: (laughs) Um, so I would agree that this is not a revolutionary campaign.

Nic Fillingham: Okay.

Emily Hacker: And I didn't, you know, choose to write this one into the blog necessarily because it's revolutionary, but rather because this is so pervasive that I felt like it was important for Microsoft customers to be aware that this type of scam is so, I don't know what word, now we're both struggling with words, I wanna say prolific, but suddenly the definition of that word seems like it doesn't fit in that sentence.

Nic Fillingham: No, yeah, prolific, that makes sense. 
Emily Hacker: Okay.

Nic Fillingham: Like, it sounds like what you're saying is, this blog exists not because this campaign is very unique and some sort of cutting-edge new technique, it exists because it's incredibly pervasive.

Emily Hacker: Yes.

Nic Fillingham: And lots and lots of people and lots and lots of businesses are probably going to get targeted by it.

Emily Hacker: Exactly.

Nic Fillingham: And we wanna make sure everyone knows about it.

Emily Hacker: Yes, and the only real thing that I would say set this one apart from some of the other ones was the use of the lookalike domains. So many of the gift card scams that I see are free email accounts, Gmail, AOL, Hotmail, but this one was using the lookalike domains. And that kind of gave us a little bit more to talk about because we could look into when the domains were registered. I saw that they were registered, I think, one to two days before the attack commenced. And that also gave us a little bit more to talk about in terms of BEC in the blog, because this kind of combined a couple of different methods of BEC, right? It has the gift card scam, which we see just all the time, but it also had that kind of lookalike domain, which could help us talk about that angle of BEC.

Emily Hacker: But Microsoft is definitely starting to focus in on BEC, I don't know, starting to focus in, but increasing our focus on BEC. And so, I think that a lot of the stuff that happens in BEC isn't new. Because it's so successful, there's really not much in the way of reason for the attackers to shift their tactics so dramatically. I mean, even with the more sophisticated attacks, such as the ones where they are compromising an account, those are still just like basic phishing emails, logging into an account, setting up forwarding rules, like this is the stuff that we've been talking about in BEC for a long time. 
But I think Microsoft is talking about these more now because we are trying to get the word out, you know, about this being such a big problem and wanting to shift the focus more to BEC so that more people are talking about it and solving it.

Natalia Godyla: It seemed like there was A/B testing happening with the cybercriminals. They had occasionally a soft intro where someone would email and ask like, "Are you available?" And then when the target responded, they then tried to get money from that individual, or they just immediately asked for money.

Emily Hacker: Mm-hmm (affirmative).

Natalia Godyla: Why the different tactics? Were they actually attempting to be strategic to test which version worked, or was it just, like you said, different actors using different methods?

Emily Hacker: I would guess it's different actors using different methods or another thing that it could be was that they don't want the emails to say the same thing every time, because then it would be really easy for someone like me to just identify them-

Natalia Godyla: Mm-hmm (affirmative).

Emily Hacker: ... in terms of looking at mail flow for those specific keywords or whatever. If they switch them up a little bit, it makes it harder for me to find all the emails, right? Or anybody. So I think that could be part of the case in terms of just sending the exact same email every time is gonna make it really easy for me to be like, "Okay, well here's all the emails." But I think there could also be something strategic to it as well. I just saw one just yesterday actually, or what day is it, Tuesday? Yeah, so it must've been yesterday where the attacker did a real reply.

Emily Hacker: So they sent the soft opening, as you said, where it just says, "Are you available?" And then they had sent a second one that asked that full question in terms of like, "I'm really busy, I need you to help me, can you call me or email me," or something, not call obviously, because they didn't provide a phone number. 
Sometimes they do, but in this case, they didn't. And they had actually responded to their own email. So the attacker replied to their own email to kind of get that second push to the victim. The victim just reported the email to Microsoft so they didn't fall for it. Good for them. But it does seem that there might be some strategy involved or desperation. I'm not sure which one.

Natalia Godyla: (laughs) Fine line between the two.

Emily Hacker: (laughs)

Nic Fillingham: I want to ask a question that I don't know if you can answer, because I don't wanna ask you to essentially, you know, jeopardize any operational security or sort of tradecraft here, but can you give us a little glimpse of your job, and how you sort of do this day-to-day? Are you going and registering new email accounts and intentionally putting them in dodgy places in hopes of being the recipient? Or are you just responding to emails that have been reported as phishing from customers? Are you doing other things? Like, again, I don't wanna jeopardize any of your operational security or, you know, the processes that you use, but how do you find these?

Emily Hacker: Mm-hmm (affirmative).

Nic Fillingham: And how do you then sort of go and follow the threads and uncover these campaigns?

Emily Hacker: Yeah, there's a few ways, I guess, that we look for these. We don't currently have any kind of like honey accounts set up or anything like that, where we would be hoping to be targeted and find them this way. I know there are different entities within Microsoft who do different things, right? So my team is not the entity that would be doing that. So my team's job is more looking at what already exists. 
So we're looking at stuff that customers have reported, and we're also looking at open source intelligence if anyone else has tweeted or released a blog or something about an ongoing BEC campaign, that might be something that then I can go look at our data and see if we've gotten.

Emily Hacker: But the biggest way outside of those, those are the two, like I would say smaller ways. The biggest way that we find these campaigns is we do technique tracking. So we have lots of different, we call them traps basically, and they run over all mail flow, and they look for certain either keywords or there are so many different things that they run on. Obviously not just keywords, I'm just trying to be vague here. But like they run on a bunch of different things and they have different names. So if an email hits on a certain few items, that might tell us, "Hey, this one might be BEC," and then that email can be surfaced to me to look into.

Emily Hacker: Unfortunately, BEC is very, is a little bit more difficult to track just by the nature of it not containing phishing links or malware attachments or anything along those lines. So it is a little bit more keyword based. And so, a lot of times it's like looking at 10,000 emails and looking for the one that is bad when they all kind of use the same keywords. And of course, we don't just get to see every legitimate email, 'cause that would be like a crazy customer privacy concern. 
So we only get to really see certain emails that are suspected malicious by the customer, in which case it does help us a little bit because they're already surfacing the bad ones to us.

Emily Hacker: But yeah, that's how we find these, is just by looking for the ones that already seem malicious kind of and applying logic over them to see like, "Hmm, this one might be BEC or," you know, we do that, not just for BEC, but like, "Hmm, this one seems like it might be this type of phishing," or like, "Hmm, this one seems like it might be a BazaCall," or whatever, you know, these types of things that will surface all these different emails to us in a way that we can then go investigate them.

Nic Fillingham: So for the folks listening to this podcast, what do you want them to take away from this? What do you want us to know on the SOC side?

Emily Hacker: Mm-hmm (affirmative).

Nic Fillingham: Like, is there any additional sort of, what are some of the fundamentals and sort of basics of BEC hygiene? Is there anything else you want folks to be doing to help protect the users in their organizations?

Emily Hacker: Yeah, so I would say not to just focus on monitoring what's going on in the endpoint, because BEC activity is not going to have a lot, if anything, that's going to appear on the endpoint. So making sure that you're monitoring emails and looking for not just emails that contain malicious links or attachments, but also looking for emails that might contain BEC keywords. Or even better, if there's a way for you to monitor your organization's forwarding rules, if a user suddenly sets up a slew of new forwarding rules from their email account, see if there's a way to turn that into a notification or an alert, I mean, to you in the SOC. 
And that's a really key indicator that that might be BEC, not necessarily a gift card scam, but BEC.

Emily Hacker: Or see if there is a way to monitor, uh, not monitor, but like, if your organization has users reporting phishing mails, if you get one that's like, "Oh, this is just your basic low-level credential phishing," don't just toss it aside and be like, "Well, that was just one person and it's a really crappy voicemail phish, no one's going to actually fall for that." Actually, look and see how many people got the email. See if anybody clicked, and force password resets on the people that clicked, or, if you can't tell who clicked, on everybody, because it really only takes one person to have clicked on that email and you not reset their password, and now the attackers have access to your organization's email and they can be conducting this kind of wire transfer fraud.

Emily Hacker: So like, and I know we're all overworked in this industry, and I know that it can be difficult to try and focus on everything at once. And especially, you know, if you're being told, like, our focus is ransomware, we don't want to have ransomware. You're just constantly monitoring endpoints for suspicious activity, but it's important to try and make sure that you're not neglecting the stuff that only exists in email as well.

Natalia Godyla: Those are great suggestions. And I'd be remiss not to note that some of those suggestions are available in Microsoft Defender for Office 365, like the suspicious forwarding alerts or attack simulation training for user awareness. But thank you again for joining us, Emily, and we hope to have you back on the show many more times.

Emily Hacker: Yeah, thanks so much for having me again.

Natalia Godyla: Well, we had a great time unlocking insights into security from research to artificial intelligence. 
Keep an eye out for our next episode.

Nic Fillingham: And don't forget to tweet us @msftsecurity, or email us at securityunlocked@microsoft.com with topics you'd like to hear on a future episode. Until then, stay safe.

Natalia Godyla: Stay secure.
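
The email-side checks Emily describes in this episode (lookalike sender domains, free-mail addresses dressed up to look like a colleague, and a sudden slew of new forwarding rules) can be roughly sketched in code. The following is a minimal Python illustration only, not Microsoft's detection logic: every function name, the free-mail provider list, the edit-distance threshold, and the one-hour window are assumptions chosen for the example.

```python
import re
from datetime import datetime, timedelta

# Assumed, incomplete list of free email providers (illustrative only).
FREE_MAIL = {"gmail.com", "aol.com", "hotmail.com", "outlook.com", "yahoo.com"}


def _squash(text):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def edit_distance(a, b):
    """Classic Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def is_lookalike_domain(sender_domain, trusted_domain, max_distance=2):
    """Flag domains that resemble, but are not, the trusted domain."""
    sender_l, trusted_l = sender_domain.lower(), trusted_domain.lower()
    # The trusted domain and its own subdomains are not lookalikes.
    if sender_l == trusted_l or sender_l.endswith("." + trusted_l):
        return False
    sender, trusted = _squash(sender_l), _squash(trusted_l)
    # Either the trusted name is embedded (microsoft-com.org) or it is
    # within a few character edits (micros0ft.com).
    return trusted in sender or edit_distance(sender, trusted) <= max_distance


def impersonates_via_free_mail(address, display_name, employee_names):
    """Flag free-mail senders whose display name or local part mimics an employee."""
    local, _, domain = address.lower().rpartition("@")
    if domain not in FREE_MAIL:
        return False
    targets = {s for s in map(_squash, employee_names) if s}
    local_squashed = _squash(local)
    return _squash(display_name) in targets or any(t in local_squashed for t in targets)


def forwarding_rule_burst(events, window=timedelta(hours=1), threshold=3):
    """Return users who created `threshold` or more forwarding rules within `window`.

    `events` is an iterable of (user, created_at) pairs.
    """
    by_user = {}
    for user, created_at in events:
        by_user.setdefault(user, []).append(created_at)
    flagged = set()
    for user, times in by_user.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Shrink the window from the left until it spans at most `window`.
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(user)
                break
    return flagged
```

In practice these would be weak signals to combine and review rather than verdicts on their own; string similarity alone will both miss lookalikes and flag legitimate domains, which is part of why, as Emily notes, BEC is harder to track than mail carrying links or attachments.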
6/30/2021

Simulating the Enemy

Ep. 34
How does that old saying go? Keep your friends close and keep your understanding of a threat actor’s underlying behavior and functionality of tradecraft closer? As new tools are developed and implemented for individuals and businesses to protect themselves, wouldn’t it be great to see how they hold up against different attacks without actually having to wait for an attack to happen? Microsoft’s new open-source tool, SimuLand, allows users to simulate attacks on their own infrastructure to see where their own weaknesses lie.

In this episode of Security Unlocked, hosts Natalia Godyla and Nic Fillingham sit down with Roberto Rodriguez, Principal Threat Researcher for the Microsoft Threat Intelligence Center (MSTIC) and SimuLand’s developer, to understand how the project came to life, and what users can expect as they use it.

In This Episode You Will Learn:

• How community involvement will help SimuLand grow

• How individuals can use SimuLand to see examples of actions threat actors can take against their infrastructure

• What other projects and libraries went into SimuLand’s development

Some Questions We Ask:

• What exactly is being simulated in SimuLand?

• What does Roberto hope for users to take away from SimuLand?

• What is next for the SimuLand project?

Resources:

Roberto Rodriguez’s LinkedIn: https://www.linkedin.com/in/roberto-rodriguez-96b86a58/

Roberto’s blog post, SimuLand: Understand adversary tradecraft and improve detection strategies: https://www.microsoft.com/security/blog/2021/05/20/simuland-understand-adversary-tradecraft-and-improve-detection-strategies/

Roberto’s Twitter: Cyb3rWard0g https://twitter.com/Cyb3rWard0g

Nic Fillingham’s LinkedIn: https://www.linkedin.com/in/nicfill/

Natalia Godyla’s LinkedIn: https://www.linkedin.com/in/nataliagodyla/

Microsoft Security Blog: https://www.microsoft.com/security/blog/

Related:

Security Unlocked: CISO Series with Bret Arsenault https://SecurityUnlockedCISOSeries.com

Transcript:

[Full transcript can be found at https://aka.ms/SecurityUnlockedEp34]

Nic Fillingham: Hello and welcome 
to Security Unlocked, a new podcast from Microsoft where we unlock insights from the latest in news and research from across Microsoft Security Engineering and Operations teams. I'm Nic Fillingham.

Natalia Godyla: And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft Security, deep dive into the newest threat intel, research and data science.

Nic Fillingham: And profile some of the fascinating people working on artificial intelligence in Microsoft Security.

Natalia Godyla: And now let's unlock the pod.

Nic Fillingham: Hello listeners. Hello, Natalia. Welcome to episode 34 of Security Unlocked. Natalia, how are you?

Natalia Godyla: I'm doing well, thanks for asking. And hello everyone.

Nic Fillingham: On today's episode, we have Principal Threat Researcher from the MSTIC Group, Roberto Rodriguez, who is here to talk to us about SimuLand, which is a new open source initiative, uh, that Roberto, uh, announced and discussed in a blog post from May the 20th, 2021. Natalia, you've got an overview here of SimuLand. Can you give us the TLDR?

Natalia Godyla: Of course. So SimuLand is, like you said, an open source initiative at Microsoft that helps security researchers test real attack scenarios, and determine the effectiveness of the detections in products such as Microsoft 365 Defender, Azure Defender and Azure Sentinel, with the intent of expanding it beyond those products in the future.

Nic Fillingham: And Roberto, obviously, will sort of expand upon that in the interview. Uh, one of the questions we asked Roberto is how did this all begin? And it began with an email from someone in Roberto's team saying, "Hey Roberto, could you write a blog post that sort of explains the steps needed to go and, uh, deploy a lab environment that reproduces some of these techniques?" And Roberto said, "Sure." And started writing. And he got to about page 80. 
Uh, he got 80 pages in and decided, "You know what, I think I can probably turn this into, uh, a set of scripts or into a tool." And that's sort of the kickoff of the SimuLand project. There's obviously more to it than that, which Roberto will go into, uh, in the interview. The other thing we learned, Natalia, is Roberto might have taken the crown as the busiest person in security.

Natalia Godyla: He certainly does. And, uh, lucky us, we get to ask him questions about all of the open source projects that he's been working on. So we'll do a little bit of a harbor cruise through those projects in addition to SimuLand in this episode.

Nic Fillingham: And with that, on with the pod.

Natalia Godyla: On with the pod.

Nic Fillingham: Welcome to the Security Unlocked podcast, Roberto Rodriguez. Thanks for your time.

Roberto Rodriguez: Yeah. Thank you. Thank you. Thank you for having me here.

Nic Fillingham: Yeah. We'd love to start with a quick intro. If you could tell the audience, uh, about yourself, about your role at Microsoft and what does your day-to-day look like?

Roberto Rodriguez: Sure. Yeah. So my name is Roberto Rodriguez. Um, I'm a Principal Threat Researcher for the Microsoft Threat Intelligence Center, known as MSTIC, and I'm part of the R&D team. And my day-to-day, uh, is very interesting. There's a lot of things going on. So my role primarily is to empower other security researchers in my organization to do, for example, some of their development of detections, performing research in general. So I tend to... I kind of like breaking my day-to-day down into a couple of pieces. Like the whole research methodology has several different steps.

Roberto Rodriguez: So what I do is I try to innovate in some of those steps in order to expedite the process, trying to maybe come up with some new tools that they could use. 
And at the same time, I like to dissect adversary tradecraft, and then try to take that knowledge and share it with others, and try to collaborate with other teams as well. Not only in MSTIC, but across other teams at Microsoft as well.

Natalia Godyla: Thank you for that. And today we're here to talk about one of the blogs you authored on the Microsoft Security blog, "SimuLand: Understand adversary tradecraft and improve detection strategies." So, um, can we just start with defining SimuLand? What is SimuLand?

Roberto Rodriguez: Yep. So SimuLand is an open source initiative. It's a project that started just as a blog post to talk about, for example, an end-to-end scenario where we can start mapping detections to it. So we decided to take that idea and started sharing more scenarios with the community, showing them a little bit how, for example, a threat actor could go about it and try to compromise specific, you know, resources either in Azure or on-prem. And then try to map all that with some of the detections that we have, trying to validate detections and alerts from different products from Microsoft 365 Defender, Azure Defender.

Roberto Rodriguez: And of course, Azure Sentinel at the end, trying to bring all those data sources together and then allow not only people at Microsoft, but outside, right? Customers or people even trying to use trial licenses to understand the, you know, the power of all this technology together. Because usually, you know, when you start thinking about all these security products, we always try to picture them as isolated products. So the idea is how we can start providing documentation to deploy lab environments, walk them through a whole scenario, map the... For example, attack behavior to detections, and then just showcase what you can do with, you know, with all these products.

Roberto Rodriguez: Um, that's kind of like the main idea. 
And of course, some of the output could be understanding, you know, the adversary in general, trying to go deep beyond just alerts. Because our goal also is not just to say, "Oh, this attack action happens. And then this alert triggers." The idea is to say first, you know, let's validate those alerts, but then second, we want you to go through and analyze the additional data, additional context that gets created in every single step, because at the same time, you know, it will be nice to see what people can come up with.

Roberto Rodriguez: You know, there's a lot of different data sets being showcased through this, you know, type of lab environment that, you know, for example, we believe that there could be other use cases that you can create on top of all that telemetry. So that's why we want to expose all that documentation that has helped us, for example, to do internal research. When I joined Microsoft, there was not much, I would say, in terms of a lab environment that was fully documented to deploy and then just try to use right away when there is an incident, for example, or just trying to do research in general. So my idea was why can't we share all this with the community and see if they could also benefit, because we're using this also internally.

Nic Fillingham: I'd love to actually just quickly look at the name. So SimuLand, I'm assuming that's a portmanteau, or is it an acronym? Tell me how you got to SimuLand. Because I think that may actually also help, you know, further clarify what this is.

Roberto Rodriguez: Yeah. So, yeah, SimuLand, uh, I believe, you know, it comes from... Well, it has also some context around Spanish. Uh, so in Spanish we say simulando. 
So simulando means simulating something.

Nic Fillingham: Okay.

Roberto Rodriguez: But at the same time, I feel that SimuLand, the idea was to say, deploy this environment, which could turn into a, let's say like a land out there that is primarily to simulate stuff and to start, you know, learning about adversary tradecraft. So it's kind of like the SimuLand, like the simulating land or the land of the simulation. And then also in Spanish, we say simulando. So it has a couple of different meanings, but the main one is this is the land where you can simulate something and then learn about that simulation in general.

Roberto Rodriguez: So that was kind of like the thought that, you know, went behind it, not probably too much, but, uh, (laughs) that was the idea. And I think that people liked it. I think it just stayed with the project. So-

Nic Fillingham: And given that you're simulating sort of the threat space, is this land that's being simulated, is this your sort of sovereign, uh, land to protect? Or is this the actual sort of theater of cyber war? Like what are you simulating here? Are you simulating the attacker's environment? Are you simulating your environment? Are you simulating both?

Roberto Rodriguez: Yeah, it's a great question. So we're trying to, primarily of course, simulate, let's say, an organization that has, for example, on-prem resources that are trying to connect to an Azure cloud infrastructure. So simulating that environment first, but then at the same time, trying to execute some of those, for example, actions that a threat actor could take in order to compromise the environment. And of course that could come with some of the tools that are used also by, you know, known threat actors. We're trying to stay with public tools. 
So things that are already out there, things that have been also identified by a few threat reports out there as well.

Roberto Rodriguez: So we're trying to use what others also could use right away. You know, we don't want to, you know, of course share code or applications that no one has ever seen out there. So the idea is to primarily simulate the full organization environment, like an example of what that environment will look like, but then at the same time use public tools to perform some actions in the environment.

Natalia Godyla: So, as you said before, you're exposing a lab environment that you had been leveraging internally at Microsoft so the community can benefit from it. What was the community using before in order to either test these products or do further research?

Roberto Rodriguez: Sure. So I would say that there is a lot of different communities that were building, let's say, for example, some Active Directory environments, uh, trying to simulate the creation of different, you know, Windows endpoints, um, on a specific domain. And then they were using a lot of open source tools, for example, like, you know, things such as Sysmon from a Windows perspective, like osquery, also on Windows, but then on other platforms. But at the same time, what I wanted to do is why can't we use that, which people are used to, trying to use open source tools or just open tools.

Roberto Rodriguez: And then at the same time trying to use, uh, for example, enterprise security controls or products in general. That type of, uh, simulation of a full end-to-end scenario, I have not seen it before. I have seen, for example, some basic examples of one, let's say, um, you know, scenario from Microsoft Defender evaluation labs. For example, they have a service where you can simulate two to four computers with MDE, which is Microsoft Defender for Endpoint. Those scenarios existed, but there was nothing out there that could have everything in one place. 
Roberto Rodriguez: So we're talking about Microsoft Defender for Endpoint, for Identity, Microsoft Cloud App Security, Azure Defender. And then on top of that, Azure Sentinel detections. All that together was not out there. Once again, there were just a couple of scenarios, lab environments that were touching a few things, but they were not covering the whole framework or the whole platform to test all these different detections. But at the same time, how you can work with everything at once, because that's also one of the goals of the project. We always hear, for example, once again, detections from one product only, but then there is a lot that you can do when you have one detection from MDE, one detection from Azure Sentinel, MDI, et cetera. All that additional context was not public yet before SimuLand.

Roberto Rodriguez: So that's what I was trying to do. Is to bring all this in one place and, you know, bringing everything to the SimuLand. (laughs)

Nic Fillingham: Is there a particular scenario, Roberto, that you can sort of walk us through that's sort of gonna fully cover the gamut of what SimuLand can do?

Roberto Rodriguez: Yes, yes. Definitely. So there is one scenario in there. We're trying, of course, you know, to add more scenarios to this, uh, platform. So the only one that we have in there is what I call Golden SAML, to, you know, steal, for example, or forge a SAML token, and then use that in order to, for example, modify Azure AD applications in order to then use those applications to access mail data, for example. So that's one scenario. The main part is Golden SAML. With that scenario, for example, what we're trying to do with SimuLand is to first make sure that we prepare whoever is using SimuLand to understand what it is that you need before you even try to do anything.

Roberto Rodriguez: Right? 
Because usually we try to jump directly to the simulation and try to, let's say, attack an environment, but there is a lot of pieces that need to happen before, right? So SimuLand gives you what is called preparation. So in preparation, you understand all the licensing that you might need. Not every scenario will need, let's say, an enterprise license, and there's going to be a couple of scenarios that are going to be simple, so not too much going on in there. But the next step is how to deploy an environment. So once you take care of the licensing, once you take care of, for example, the additional resources that you might need to stand up before you deploy a full environment, now we can deploy it.

Roberto Rodriguez: We provide also Azure Resource Manager templates. So ARM templates to, let's say, first document the environment as code, and then be able just to deploy it with a few commands, um, rather than trying to do everything manually, which is time consuming and is too complex to figure out. The next step, once we have the environment, is we can start, for example, running a few actions. So if we go to Golden SAML, Golden SAML starts with, for example, using a compromised account, the one handling the Active Directory Federation Services, for example, in the organization on-prem. Then we take that and then we start, for example, accessing the database where we can steal the certificate to sign tokens.

Roberto Rodriguez: Once we get that, then we can go through that whole scenario step-by-step. As we go executing every single action, we can start identifying detections, images of what it would look like on MDI, MDE, MCAS, Azure Sentinel, all the way to even showing you some additional settings that you might need to potentially enable if you want to collect more telemetry. And then at the end, we, you know, close the scenario with, you know, showing you what it is that you did. 
And then, at the same time, all the alerts that triggered or the telemetry that was available.

Roberto Rodriguez: And since we are sharing a full environment where everything is running, you can just go back to the environment and go deeper. Maybe do some forensics, maybe do some additional incident response actions. So that would be, I would say, the end-to-end thing with SimuLand, what you can do once you jump into the project.

Natalia Godyla: And so for users who've jumped into SimuLand and gone through some of the scenarios, what is your intent for the users once they have these results? What's the use case for them, and how do you want them to interact with your team as well? How do you want the community to get involved?

Roberto Rodriguez: Yes, that's a great question. So initially, what we want from people using SimuLand is, once again, to go beyond just the alerts. Because alerts are one thing that will trigger, and we're taking care of all that. So whoever is using, for example, the Microsoft 365 Defender products in general, you know, they are protected with all these detections, right? But my goal is for a researcher or a security analyst to go deeper into that telemetry, once again, around specific alerts, so that they can learn more about the adversary behavior in general.

Roberto Rodriguez: Usually we just see the alert and then we stop, and then we just start the incident and pass it to somebody else. I want people to dive into all this telemetry that is being collected and start putting together that whole adversary tradecraft, for example. Understanding the behavior, to me, is very important. There are a lot of different things that you can do with the telemetry already in SimuLand. So that's just one of the goals. The second goal is to see if you're even ready for those types of, you know, alerts. For example, what do you do if you get all these four or five alerts in your environment?
How do you respond to that?

Roberto Rodriguez: So these could also be part of a training exercise, for example. So there are a couple of things that you can do in there. Another scenario could be, you know, exporting all the data that is being collected and then probably using it for some demos. Once again, also for some training, focusing a lot on trying to understand and learn the adversary tradecraft. For me, that's very important, once again, because we don't just want to learn about one specific indicator of compromise. We want to make sure that we're covering scenarios that would allow us to, you know, respond and understand at the technique or tactical level.

Roberto Rodriguez: And then, for collaboration with us, I believe that, you know, one way could be giving us some feedback and seeing what else we could do with these scenarios. There are a couple of people in the community, for example, who are sharing some cool detections on top of the stuff that we already developed. There are a lot of detections being shared through the Azure Sentinel GitHub, through the M365 advanced queries GitHub. And there are people just building things on top of that. So we would like to hear more of those scenarios and maybe include all of those in SimuLand, so that we can make SimuLand also a place where we can share those cool detection ideas that people might have.

Roberto Rodriguez: And that could also be shared with others using the environment. Everything, I would say, from a communication perspective happens through GitHub, through issues: anything that anybody would like to add or request, any features. It would be nice. We had one person asking us, can we add, for example, MDO, which is Microsoft Defender for Office 365, I think it is. And so that's, you know, for example, a product, something that I have not added yet. So that's something that is coming.
So that's the type of collaboration that I expect from the community as well.

Natalia Godyla: And what's on the roadmap for SimuLand? What's next for evolving the project?

Roberto Rodriguez: Yeah. So SimuLand has a couple of things that are coming out. One is going to be automation, automation of the execution of attacker actions. Right now the deployment is automated. I would say 90% of the deployment is automated. There are a few things that are kind of hard to automate right now, and it's simple, just a few more clicks on top of the deployment. But from the attacker's perspective, we wanted to make SimuLand a project where you can walk someone through the whole process. These are the actions that take place in the whole simulation, and then you can start exploring one by one.

Roberto Rodriguez: So it's a very manual process to go through the SimuLand labs, for example. So one thing that we wanted to do is to automate those steps, those attacker actions, because, you know, we have, for example, a few people who are taking advantage of how modular SimuLand is, in that they do not want to deal with preparation and deployment. All they want to do is take the execution of the actions and then just plug them into their own environment. Because they say, I already have the same deployment. Well, yeah, a similar deployment with all the tools that you ask to be deployed. Why not? Can I just take the attacker actions and then just start learning, or maybe do it on a scheduled basis, right?

Roberto Rodriguez: Like, every Friday we execute a few scenarios. So that turned into a new project, which I'm going to be releasing at Black Hat 2021 in August. That project is called Cloud Katana. And that's a project where I will be using Azure Functions to execute actions automatically. And then the other thing that we have for SimuLand is data export.
So what I also want to do is share the data that gets generated after going through the whole SimuLand scenarios, and then just give it to the community. Because we have also had a few conversations with people from the community who say, you know what, I don't have the environment to deploy this.

Roberto Rodriguez: You know, for example, I don't have the resources to learn about all of this; my company doesn't want to, I don't know, support these types of projects, right? So, you know, people are facing some obstacles as well, right, when trying to use these things. Even having a subscription in Azure might be an obstacle or constraint for a lot of people. So why not just give them the data with all the actions that were taken, all the alerts that were collected by Azure Sentinel, and then allow them to use, for example, plain Python code or PowerShell or Jupyter notebooks on top of that, you know, to analyze the data and build visualizations on top?

Roberto Rodriguez: So we want to empower those who also, you know, might want to use it but do not have the resources to do it. So that's the second thing on the list for SimuLand. The other thing is going to be, so we have a lot of things going on, but, (laughs) the other thing is going to be, how can we provide a CI/CD pipeline for the deployment? That's critical because we want to make sure that people can plug this into, for example, Azure DevOps, and then they can just have the environment running, and maybe, you know, bring the deployment down, bring it up every week and then run a few scenarios, and bring it down again.

Roberto Rodriguez: So we wanted to make sure that it's also flexible for those to work with. And what else? I think that the last thing that we have would be trying to see if we can integrate more products from Microsoft, and just share more scenarios.
We have two or three coming, hopefully in the next couple of months, and it's going to be fun. Yeah. We have a lot of stuff in there. (laughs)

Nic Fillingham: Tell me how you built SimuLand and then worked a full-time job on the MSTIC team. Was this actually a special project that you were assigned to, or was this all extracurricular? A little column A, a little column B?

Roberto Rodriguez: (laughs) Yeah. So once again, when I started, right, these conversations, I mentioned that my role is also to empower others and help to, you know, develop environments for research, because I love to do research as well. Like, dissecting, yeah, adversary tradecraft is pretty cool. And then the question was just, "Hey, can you build this environment?" Just a simple email. And I was like, "Yeah, I can do that." And, to be honest, it took me maybe a week or two to figure out the infrastructure, and then it took me probably close to a month to write down the whole scenario and make sure that I had PowerShell scripts that were actually working.

Roberto Rodriguez: So let's say it probably took me two months to do this. It was an extracurricular activity. (laughing) Definitely on top of what I was doing already. And it was fun. I mean, it was fun because that's what I love to do. My boss is super cool, you know, letting me do all this research and also allowing me to spend some time trying to get some feedback from our internal team and other teams as well. So yeah. It started as just a question: can you do this? And I love those questions. When somebody says, "Can you do this?" I would say yes, but then I don't know what I'm getting myself into. And that's the fun part of it. (laughs)

Nic Fillingham: Before we sort of wrap up here, Roberto, are there any projects that you're working on right now or that you're contributing to that you can talk about?

Roberto Rodriguez: Yeah.
So I would say, from an Open Threat Research perspective, there's a project called Mordor. Mordor is a project where I decided that every time I execute or go through my research process, and, let's say, learn about a specific attack technique, I can collect the data. And then I share those datasets through that project. So other people who would like to learn about those techniques can just access the data directly. You can learn about adversaries through the data instead of trying to go through a whole process to emulate or simulate an adversary.

Roberto Rodriguez: Which, for a lot of people, is not that easy. So, you know, I wanted to find ways to expedite that process. That project is something that I'm, you know, revamping soon. So I'm collecting more datasets from the cloud. Most of my datasets were Windows based. I have a couple from Linux. I have some from AWS, but I wanted to get more from, you know, Azure. So SimuLand datasets are going to live in the Mordor project. So, you know, anything that gets out of SimuLand is contributed directly to an open source project as well.

Roberto Rodriguez: So that's one of them. And the other one is Cloud Katana, which is the one that I talked about a couple of minutes ago. So Cloud Katana, the automation of SimuLand attack actions, is the one I'm spending a lot of time on. That one will be released under Azure, but it is still going to be open source. So that's also something that we want to provide to the community to use. And, let's see, are there other projects too? Yes, I have another project. It is a project called OSSEM, O-S-S-E-M. And OSSEM is a project that I started to document telemetry that I use during research.
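
As a rough illustration of the dataset-sharing idea described above (learning about a technique from exported events rather than re-running a simulation), here is a minimal sketch in plain Python. It assumes the exported data arrives as JSON Lines, one event per line. The sample events and the field names `EventID`, `Hostname`, and `Channel` are hypothetical and are not taken from any published dataset.

```python
import json
from collections import Counter

# Hypothetical sample: three events in JSON Lines form (one JSON object per line).
# The field names are illustrative assumptions, not a real dataset schema.
SAMPLE_JSONL = """\
{"EventID": 4624, "Hostname": "adfs01", "Channel": "Security"}
{"EventID": 4688, "Hostname": "adfs01", "Channel": "Security"}
{"EventID": 4624, "Hostname": "ws05", "Channel": "Security"}
"""


def load_events(text):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def count_by(events, field):
    """Count how many events share each value of the given field."""
    return Counter(event.get(field) for event in events)


events = load_events(SAMPLE_JSONL)
event_id_counts = count_by(events, "EventID")
host_counts = count_by(events, "Hostname")

# A first look at which event types and hosts dominate the recording.
print(event_id_counts.most_common())
print(host_counts.most_common())
```

With a real export, `SAMPLE_JSONL` would be replaced by the contents of the downloaded file, and the same counting step gives a quick overview before any deeper analysis.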
Roberto Rodriguez: So I believe that a lot of people who want to dive into detections and are starting in the, you know, defender world need to understand the data before they make decisions like building detections. So my goal with that project was to first document events that I use from different platforms. At the same time, I wanted to create a standardization, like a common data model for datasets, and, by the way, Azure Sentinel is building their common data models through this project, OSSEM. So it's also one of the interesting collaboration opportunities that we have: Azure Sentinel reaching out to the community and saying, "Hey, instead of reinventing the wheel, can we explore your project?" Which is OSSEM.

Roberto Rodriguez: And then the third part of OSSEM is also a way to document, for example, you know, relationships that we identify in data. So when you want to build, for example, detections, most of the time you want to understand: what events can I use to build a chain of events that would actually give me context around an attack behavior? So what we do is we explore the data, we identify relationships, and we just document them through that project. That way somebody else could actually use it and understand what they can do with that telemetry.

Roberto Rodriguez: So I would say, once again, you learn about that telemetry, you standardize your telemetry, and at the same time, we give you some ideas about what you can do with that telemetry to build detections. So that's another project. The last one would be, (laughs) yeah, the last one would be another-

Nic Fillingham: There's more?

Roberto Rodriguez: Yes. There's one more. (laughing)

Nic Fillingham: Do you sleep, man? When do you sleep?

Roberto Rodriguez: It has been hard, but I try to manage my time for sure and do that. But it is another project. It's private right now, but it's going to be public soon.
It's going to be through the Open Threat Research community as well. This project is a way to collaborate with, for example, researchers in the community who build offensive security tools, or just tools to do, for example, you know, red teaming, and who want to use those tools to perform certain actions in a specific environment.

Roberto Rodriguez: So we want to, you know, collaborate and partner with them and start documenting those tools in a way that we can share with others in the community. For example, for me as a researcher, dissecting adversary tradecraft, like all the techniques and the behavior behind a specific tool or a specific technique, takes time. For me, it would probably take a couple of weeks to dissect all the modules of one tool. So the goal is: why don't we partner with the authors of those tools, document those tools, and then also start sharing some potential ideas about how to detect those scenarios?

Roberto Rodriguez: That way, you know, we expedite the research, right? We do it, let's say, in a private setting with a lot of researchers from the community, and then we just distribute that knowledge across the world. So that way we also help and expedite that whole process. So with Open Threat Research, we have data, now we have knowledge, we have infrastructure, and then we have a way to share it with our community. So it's kind of like the main parts of your, you know, research process, but we want to give a community touch to, you know, all this. And that's it. So I have a couple more, but that's (laughing) kind of like another project that's coming soon. So-

Nic Fillingham: I think we're going to have to let you go, Roberto.
'Cause you're just going to get back into today's projects and start submitting some more contributions. (laughing) But before we do that, I want to circle back to SimuLand. And again, for folks listening, to learn about SimuLand they can go read the blog post. We'll put the link in the show notes. Tell me, what is your dream contribution? What is the first scenario that you want contributed back into this project?

Nic Fillingham: Or, where are you really hoping that the community will come and rally around either a particular scenario or some sort of other... Who is the person you want to be listening to this podcast right now and go, "Oh yeah, I can do that"? What's that one thing you need, or that you're really looking for?

Roberto Rodriguez: Well, actually, two things. So one is the automation of the attacker actions. It would be a dream, I would say, because I'm building it on top of Azure infrastructure. So it will be easier to plug into your environments to, you know, periodically do some testing and then map it to SimuLand scenarios. So you have the full end-to-end environment. You have the labs, preparation, infrastructure as code, all the way to even automating the, you know, validation of analytics, for example.

Roberto Rodriguez: That's one that, even though it's something that has been done in other places, I think the way it's going to be done through Azure Functions is going to be very, very interesting, because we're going to have potentially not only attacker actions being automated, but we could maybe have some detections being automated on top of that. So instead of releasing a tool that will only be used, let's say, to attack a specific environment, we can release a tool that can do both: attack and defend the environment.

Roberto Rodriguez: So usually you see one or the other.
One tool to attack or one to defend. With the automation that I'm planning to release, one of the dreams is to be able to attack and defend automatically. And I think that would also link very nicely with projects like CyberBattleSim. So that's also one of the dreams: how can we, for example, document SimuLand in a way that could help us create synthetic scenarios that CyberBattleSim can use, and then drop in an agent and learn about the most efficient path to take? Because that's, you know, CyberBattleSim, right?

Roberto Rodriguez: They build environments, synthetic environments, to then, you know, teach an agent to take the most efficient path through, you know, rewards and all this stuff. So with SimuLand, the dream would be to also connect those projects. How can you have this nice process where SimuLand can provide the adversary tradecraft knowledge, all the, for example, preconditions and all the context that is needed to create a CyberBattleSim scenario, and then improve a model to, for example, automate some of that execution of attacks?

Roberto Rodriguez: And then that model can be used through Cloud Katana to execute those paths automatically. And then at the end, you can have some detections on top where you can apply a similar context. Because SimuLand comes with the attack and the detections. So we might find a way to create a data model where we could say, here's the attack, here are the detections. So we can maybe build something with CyberBattleSim the same way as well.
And the other one, so the other dream, a big one for me in SimuLand, would be, since I was talking to a few coworkers today about this, that it would be nice to maybe provide SimuLand as a service for customers or also for, you know, people in the community.

Roberto Rodriguez: It would be nice to have a platform that people can just access and start learning about these tools, this data, not necessarily giving somebody, of course, control to execute something. We take care of the execution, but then just expose all this telemetry in a way that is easier for those who, you know, might not have the resources. I love to build things that would help others to, you know, do more. So I think that would also be one of the dreams: how can we just take SimuLand and make it a service for, you know, the community?

Roberto Rodriguez: That would be pretty cool. So if anybody is listening (laughs) and, you know, would like to make that happen, it would be amazing to have SimuLand as a service for those that don't have the resources, like schools, or, you know, anybody in general in the community who would like to, you know, learn more about this.

Natalia Godyla: Wow. Roberto, you're going to be busy.

Roberto Rodriguez: Yes. (laughs)

Natalia Godyla: For anyone who hasn't watched episode 26, we did discuss CyberBattleSim there. So if that piqued your interest, definitely check out that episode. And Roberto, as we wrap up here, are there any resources or Twitter handles that folks can follow to continue to watch your work or maybe join the threat research community?

Roberto Rodriguez: Yes, yes. Yes. So my Twitter handle is Cyb3rWard0g, with a three and a zero instead of the E and the O. So Cyb3rWard0g on Twitter. That is where I share everything that I do. If you want to join the community, we would love to, you know, learn from you and collaborate. Go to the Twitter handle OTR.
So OT and then R_community. And then there, in the profile description of the Twitter handle, you have a link for the Discord invite. So the moment you join that Discord, all you have to do is accept the code of conduct. We want to make sure that we're inclusive, which is welcoming everybody.

Roberto Rodriguez: And if you agree with that, just click the 100% emoji, and then you have access to (laughing) all these channels where you can, you know, ask questions about open source projects. So that's the best way to collaborate.

Natalia Godyla: Awesome. Thank you. We'll definitely drop those links in the show notes. And thank you again for joining us on the show today, Roberto.

Roberto Rodriguez: No, thank you for having me. This was amazing. I had never had the opportunity to talk about a lot of projects. Usually it's one project and that's what we talk about. So this has been nice. Thank you very much. I really appreciate it. And I hope to see you guys in another episode.

Nic Fillingham: We hope so too. Thanks, Roberto.

Roberto Rodriguez: Thank you.

Natalia Godyla: Well, we had a great time unlocking insights into security, from research to artificial intelligence. Keep an eye out for our next episode.

Nic Fillingham: And don't forget to tweet us @msftsecurity, or email us at securityunlocked@microsoft.com, with topics you'd like to hear on a future episode. Until then, stay safe.

Natalia Godyla: Stay secure.