Reducing MTTR with SOAR

Key Points

  • Effective incident response is essential to stop a breach from “sinking” an organization, much like a ship needs many hands and buckets to stop taking on water.
  • The attack timeline includes reconnaissance, the breach event (“boom”), a long mean‑time‑to‑detect (≈200 days) and mean‑time‑to‑resolution (≈70 days), which give attackers ample time in the network.
  • Threat hunting can shorten detection time by proactively seeking threats before alerts fire, addressing the gap between attack and awareness.
  • SOAR (Security Orchestration, Automation, and Response) is used to cut resolution time by automating repeatable tasks while coordinating human actions for complex or novel incidents.
  • Full automation is limited to known scenarios; unknown “black‑swans” require orchestration and human judgment because automation can only handle what it has seen before.

**Source:** [https://www.youtube.com/watch?v=k7ju95jDxFA](https://www.youtube.com/watch?v=k7ju95jDxFA)
**Duration:** 00:07:05

## Sections

- [00:00:00](https://www.youtube.com/watch?v=k7ju95jDxFA&t=0s) **Incident Response Timeline and Delays** - The speaker uses a ship‑flood analogy to explain breach phases, emphasizing the long mean‑time‑to‑detect (~200 days) and mean‑time‑to‑resolution (~70 days) that highlight the need for robust incident response.
- [00:03:11](https://www.youtube.com/watch?v=k7ju95jDxFA&t=191s) **Semi‑Automated Incident Orchestration with SOAR** - The speaker explains how SOAR platforms coordinate human‑guided, semi‑automated responses by linking SIEM, XDR, and case management to detect breaches, capture artifacts, and streamline remediation.
- [00:06:18](https://www.youtube.com/watch?v=k7ju95jDxFA&t=378s) **SOAR System as Bilge Pump** - The speaker likens a SOAR platform to a powerful bilge pump that automates incident response, offering dashboards and metrics to track tickets, resolution times, and analyst workload.

## Full Transcript

0:00 You've just been hacked--again.
0:03 Now, if you were a ship at sea, you're basically taking on water faster than you can get rid of it.
0:09 What you need is a lot of hands on deck and a lot of buckets, but you don't have enough of those.
0:16 How are you going to keep from sinking to the bottom of the sea?
0:19 Well, you now need an incident response capability.
0:23 And this is the incident.
0:24 Let's talk about an IT example, though, that would map to this.
0:28 So we have a timeline that occurs like this.
0:32 And the first thing that happens is the "bad guy" does reconnaissance.
0:37 So, he's out here looking and casing the joint, if you will, trying to figure out what he wants to do, where the weaknesses are, this sort of thing.
0:46 Then we have this moment in time, the "boom".
0:51 That's when the attack occurs, or in our ship example, that's when we start taking on water.
0:56 And then, ultimately, we try to get the incident resolved.
0:59 And that's going to be some point over here, when we actually figure out what's going on.
1:04 Well, it turns out there are a number of other things happening here.
1:07 We've got this time between when the attack occurs and when we're actually aware of it.
1:13 And we call this the mean-time-to-detect.
1:16 And according to the Ponemon Institute "Cost of a Data Breach" survey, we know that this is on the order of 200 days.
1:25 That's a long time. A long time for the bad guys to be in your system.
1:29 Then the mean-time-to-resolution, it turns out, is on the order of 70 days.
1:36 So, again, a lot of time is lost in this.
1:39 What can we do to reduce these time frames?
1:41 Well, in a previous video, I talked about using threat hunting in order to reduce the time frame here.
1:48 With threat hunting, you're trying to get out in front of the problem and discover a situation even before you've got an alarm.
1:54 Now we're going to talk about what you can do to reduce this mean-time-to-resolution.
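
To make the two intervals on the timeline concrete, here is a minimal sketch of how mean-time-to-detect and mean-time-to-resolution could be computed from incident records. The incident dates and the `mean_days` helper are hypothetical and chosen only so the averages land on the roughly 200-day and 70-day figures quoted above; they are not data from the survey.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical incidents: (attack "boom", detection, resolution).
# The delays are made up so the averages match the figures in the talk.
boom_a = date(2023, 1, 1)
boom_b = date(2023, 3, 1)
incidents = [
    (boom_a, boom_a + timedelta(days=180), boom_a + timedelta(days=180 + 60)),
    (boom_b, boom_b + timedelta(days=220), boom_b + timedelta(days=220 + 80)),
]

def mean_days(pairs):
    """Average number of days between each (start, end) pair."""
    return mean((end - start).days for start, end in pairs)

# Mean-time-to-detect: from the attack until we are aware of it.
mttd = mean_days((boom, detected) for boom, detected, _ in incidents)
# Mean-time-to-resolution: from awareness until the incident is resolved.
mttr = mean_days((detected, resolved) for _, detected, resolved in incidents)

print(f"MTTD: {mttd} days, MTTR: {mttr} days, total exposure: {mttd + mttr} days")
```

Threat hunting targets the first number; SOAR, discussed next, targets the second.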
1:58 And the thing we're going to talk about is a technology called SOAR.
2:02 Now, what is SOAR?
2:04 Well, SOAR is security orchestration, automation, and response.
2:11 SOAR here is this incident response capability, but on steroids, with much more capability.
2:18 Now, you notice I use the words orchestration and automation. The question might come up: why wouldn't you just automate everything?
2:26 Well, it turns out that's a little harder to do than you might think.
2:29 So, for instance, let's think of a different continuum, where at one end we do everything manually.
2:35 We manually do all of our response.
2:38 Or, in a perfect world, we would have an automated capability that automates all of the response.
2:44 The reality is, I can only automate what I've seen before.
2:48 If I've never seen this before, then it's going to be hard to automate.
2:52 That's particularly true of first-of-a-kind events, or what in security we often refer to as "black swan" incidents.
3:02 Swans are normally white, but every once in a while you run across a black one, and you can only automate what you already know how to handle.
3:11 So this is why we orchestrate: to handle these first-of-a-kind, black swan incidents and things like that.
3:17 Orchestration means that we still have a human involved, but it's somewhere in between--
3:22 the human is guiding the actions, but they're not doing every single action.
3:26 Another way of thinking of it: it's like semi-automated.
3:29 Okay, so the idea then with SOAR is ultimately to try to move as much as
3:34 we can from the manual to a more automated, or somewhere along the way, orchestrated response.
3:41 What does this look like?
3:42 Well, let's take an example.
3:44 Let's say we've got a database down here.
3:47 And this database also has a security information and event management (SIEM) system where it can send alerts.
3:54 We also have an extended detection and response (XDR) system that can take those real-time alerts.
4:01 And then we have our SOAR system.
4:08 Now, how would this work in an incident?
4:10 Let's say this guy gets breached.
4:12 Now what happens next?
4:14 It sends an indication up to the SIEM.
4:16 The SIEM then takes that information, a real-time alarm,
4:19 and either manages it within the SIEM or sends it up to an XDR.
4:27 The XDR then sends a message over to the SOAR to open a case.
4:32 A case, then, is the thing that we're going to use to manage this all the way through to completion and track it along the process.
4:40 Another thing that we would do is capture artifacts.
4:45 This is information about the attack.
4:47 So we could have lots of artifacts, indicators of compromise, this kind of information.
4:52 We're going to take all of that information, attach it to the case, and then we're going to assign that case to an analyst.
4:59 And this analyst is going to be responsible for following through.
5:03 The analyst then is going to use the SOAR system, which has this case management and has attached the appropriate artifacts to the detected incident.
5:11 Now, they have the information they need.
5:13 They can go do the investigation, and they need something to guide them along the way.
5:18 What are they going to use for that?
5:20 This is what we call a dynamic playbook.
5:22 A dynamic playbook is basically a set of steps where we have said, in advance, this is what we want to do in certain cases.
5:30 It's dynamic in the sense that it's not fixed.
5:33 So what you do as the second step will depend on what the outcome of the first step was.
5:38 So we may have a step that says go off and gather certain information,
5:41 and then based upon what comes back from that, you do certain things, take certain actions.
5:45 You may go run this automated procedure, you may go off and manually do a certain other thing.
5:51 This is the orchestration process.
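
The case, artifact, and dynamic playbook ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular SOAR product works: the `Case` and `Step` classes, the step names, the `KNOWN_BAD_IPS` set, and the example IP addresses are all hypothetical. The point it shows is that each step returns an outcome, the outcome selects the next step, and some steps are automated while others are handed to the analyst.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical threat-intelligence list, for illustration only.
KNOWN_BAD_IPS = {"203.0.113.7"}

@dataclass
class Case:
    alert: str
    analyst: str  # the analyst the case is assigned to
    artifacts: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

@dataclass
class Step:
    automated: bool
    action: Callable[[Case], str]  # returns an outcome label
    next_step: Dict[str, str] = field(default_factory=dict)  # outcome -> next step

def check_source_ip(case: Case) -> str:
    # "Go off and gather certain information": have we seen this indicator before?
    seen = case.artifacts.get("source_ip") in KNOWN_BAD_IPS
    return "known_bad" if seen else "unknown"

def block_ip(case: Case) -> str:
    return "blocked"  # stand-in for an automated containment procedure

def analyst_review(case: Case) -> str:
    return "escalated"  # stand-in for a human decision on a novel incident

PLAYBOOK = {
    "check_source_ip": Step(True, check_source_ip,
                            {"known_bad": "block_ip", "unknown": "analyst_review"}),
    "block_ip": Step(True, block_ip),
    "analyst_review": Step(False, analyst_review),
}

def run_playbook(case: Case, playbook: Dict[str, Step], start: str) -> Case:
    current: Optional[str] = start
    while current is not None:
        step = playbook[current]
        outcome = step.action(case)
        mode = "automated" if step.automated else f"manual by {case.analyst}"
        case.log.append(f"{current} ({mode}): {outcome}")
        # Dynamic part: the outcome of this step decides what runs next.
        current = step.next_step.get(outcome)
    return case

known = run_playbook(
    Case("database breach", "alice", {"source_ip": "203.0.113.7"}),
    PLAYBOOK, "check_source_ip")
novel = run_playbook(
    Case("database breach", "alice", {"source_ip": "198.51.100.9"}),
    PLAYBOOK, "check_source_ip")
print(known.log)
print(novel.log)
```

In this sketch the previously seen indicator is handled end to end by automated steps, while the unseen one is routed to the analyst, which mirrors the "I can only automate what I've seen before" point made earlier.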
5:53 So we have this analyst in charge running the whole shooting match, figuring out what needs to be done and guiding all of those actions.
6:02 Now, in a good SOAR system, you can design these playbooks very easily
6:07 with a drag-and-drop graphical user interface and put in the actions and tie all of this together.
6:14 That way this person is not left to their own devices trying to reinvent the wheel,
6:18 to figure out, in other words, where is the fire extinguisher now that my hair is on fire?
6:23 We'd rather know that in advance.
6:25 Here's what the procedure is.
6:26 And here's how we're guided.
6:28 And ultimately a good SOAR system will probably also have a nice dashboard that could visualize what's happening here,
6:34 and show that we have a certain number of tickets opened, a certain number of cases, how long it takes us to resolve those cases,
6:41 who has the most cases assigned to them, and all of this kind of statistical information so that we can do further analysis.
6:48 So back to our original analogy about the sinking ship: a SOAR system is like a really powerful bilge pump.
6:55 It will automate, and in some cases orchestrate, the process of getting all the water that's in the boat back into the sea, which is where it belongs.
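
The dashboard statistics mentioned above (open cases, time to resolve, and cases per analyst) reduce to simple aggregations over case records. The sketch below assumes a hypothetical list of case dictionaries with made-up field names and values; a real SOAR platform would expose this data through its own reporting interface.

```python
from collections import Counter
from statistics import mean

# Hypothetical case records, for illustration only.
cases = [
    {"id": 1, "analyst": "alice", "status": "closed", "days_to_resolve": 3},
    {"id": 2, "analyst": "bob", "status": "open", "days_to_resolve": None},
    {"id": 3, "analyst": "alice", "status": "closed", "days_to_resolve": 5},
    {"id": 4, "analyst": "alice", "status": "open", "days_to_resolve": None},
    {"id": 5, "analyst": "bob", "status": "closed", "days_to_resolve": 4},
]

# How many cases are still open.
open_cases = sum(1 for c in cases if c["status"] == "open")

# How long it takes, on average, to resolve a case (closed cases only).
avg_days_to_resolve = mean(
    c["days_to_resolve"] for c in cases if c["status"] == "closed"
)

# Who has the most cases assigned to them.
workload = Counter(c["analyst"] for c in cases)
busiest_analyst, busiest_count = workload.most_common(1)[0]

print(f"open cases: {open_cases}")
print(f"average days to resolve: {avg_days_to_resolve}")
print(f"busiest analyst: {busiest_analyst} ({busiest_count} cases)")
```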