Reducing MTTR with SOAR
Key Points
- Effective incident response is essential to stop a breach from “sinking” an organization, much like a ship needs many hands and buckets to stop taking on water.
- The attack timeline includes reconnaissance, the breach event (“boom”), a long mean‑time‑to‑detect (≈200 days) and mean‑time‑to‑resolution (≈70 days), which give attackers ample time in the network.
- Threat hunting can shorten detection time by proactively seeking threats before alerts fire, addressing the gap between attack and awareness.
- SOAR (Security Orchestration, Automation, and Response) is used to cut resolution time by automating repeatable tasks while coordinating human actions for complex or novel incidents.
- Full automation is limited to known scenarios; unknown “black‑swans” require orchestration and human judgment because automation can only handle what it has seen before.
Sections
- Incident Response Timeline and Delays - The speaker uses a ship‑flood analogy to explain breach phases, emphasizing the long mean‑time‑to‑detect (~200 days) and mean‑time‑to‑resolution (~70 days) that highlight the need for robust incident response.
- Semi‑Automated Incident Orchestration with SOAR - The speaker explains how SOAR platforms coordinate human‑guided, semi‑automated responses by linking SIEM, XDR, and case management to detect breaches, capture artifacts, and streamline remediation.
- SOAR System as Bilge Pump - The speaker likens a SOAR platform to a powerful bilge pump that automates incident response, offering dashboards and metrics to track tickets, resolution times, and analyst workload.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=k7ju95jDxFA](https://www.youtube.com/watch?v=k7ju95jDxFA)
**Duration:** 00:07:05
**Section start times:** 00:00:00 (Incident Response Timeline and Delays), 00:03:11 (Semi‑Automated Incident Orchestration with SOAR), 00:06:18 (SOAR System as Bilge Pump)
You've just been hacked--again.
Now, if you were a ship at sea, you're basically taking on water faster than you can get rid of it.
What you need is a lot of hands on deck and a lot of buckets, but you don't have enough of those.
How are you going to keep from sinking to the bottom of the sea?
Well, what you need now is an incident response capability.
And this is the incident.
Let's talk about an IT example, though, that would map to this.
So we have a timeline that occurs like this.
And the first thing that happens is the "bad guy" does reconnaissance.
So, he's out here looking and casing the joint, if you will, trying to figure out what he wants to do, where the weaknesses are, this sort of thing.
Then we have this moment in time, the "boom".
That's when the attack occurs, or in our ship example, that's when we start taking on water.
And then, ultimately, we try to get the incident resolved.
And that's going to be some point over here, when we actually figure out what's going on.
Well, it turns out there's a number of other things that are happening here.
We've got this time between when the attack occurs and when we're actually aware of it.
And we call this the mean-time-to-detect.
And according to the Ponemon Institute "Cost of a Data Breach" survey, we know that this is on the order of 200 days.
That's a long time. A long time for the bad guys to be in your system.
Then the mean-time-to-resolution, turns out, is on the order of 70 days.
So, again, a lot of time is lost in this.
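The two intervals on this timeline can be written down as simple arithmetic over incident timestamps. The sketch below is only an illustration: the `Incident` record, its field names, and the two sample incidents are hypothetical, with values chosen so the averages match the roughly 200-day and 70-day figures quoted above. Resolution time is measured here from detection to resolution, which is an assumption about how the interval is defined.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    """Hypothetical record of one breach; field names are illustrative."""
    attacked_at: datetime   # the "boom"
    detected_at: datetime   # when the defenders became aware of it
    resolved_at: datetime   # when the incident was closed out


def mean_time_to_detect(incidents):
    """Average number of days between the attack and its detection."""
    return mean([(i.detected_at - i.attacked_at).days for i in incidents])


def mean_time_to_resolve(incidents):
    """Average number of days between detection and resolution (assumed definition)."""
    return mean([(i.resolved_at - i.detected_at).days for i in incidents])


def make_incident(boom, days_to_detect, days_to_resolve):
    """Build a sample incident from the boom date and two durations in days."""
    detected = boom + timedelta(days=days_to_detect)
    resolved = detected + timedelta(days=days_to_resolve)
    return Incident(boom, detected, resolved)


# Made-up sample data, not real breach records.
incidents = [
    make_incident(datetime(2023, 1, 1), 190, 60),
    make_incident(datetime(2023, 3, 15), 210, 80),
]

print("MTTD (days):", mean_time_to_detect(incidents))
print("MTTR (days):", mean_time_to_resolve(incidents))
```

With these sample values the detection average is 200 days and the resolution average is 70 days. Threat hunting targets the first number; SOAR targets the second.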
What can we do to reduce these time frames?
Well, in a previous video, I talked about using threat hunting in order to reduce the time frame here.
Threat hunting, you're trying to get out in front of the problem and discover a situation even before you've got an alarm.
Now we're going to talk about what can you do to reduce this mean-time-to-resolution?
And the thing we're going to talk about is a technology called SOAR.
Now, what is SOAR?
Well, SOAR is security orchestration, automation, and response.
SOAR here is this incident response capability, but taken on steroids, with much more capability.
Now, you notice I use the word orchestration and automation-- the question might come up, why wouldn't you just automate everything?
Well, it turns out that's a little harder to do than you might think.
So, for instance, let's think of a different continuum where we do everything manually.
We manually do all of our response.
Or, in a perfect world, we would have an automated capability that automates all of the response.
The reality is, I can only automate what I've seen before.
If I've never seen this before, then it's going to be hard to automate.
That's particularly true of first-of-a-kind events, or what in security we often refer to as "black swan" incidents.
Swans are normally white, but every once in a while you run across a black one, and you can only automate what you already know how to handle.
So this is why we orchestrate-- to handle these first-of-a-kind, black swan incidents and things like that.
Orchestration means that we still have a human involved, but it's somewhere in between--
the human is guiding the actions, but they're not doing every single action.
Another way of thinking of it, it's like semi-automated.
Okay, so the idea then with SOAR is ultimately to try to move as much as
we can from the manual to a more automated, or somewhere along the way, orchestrated response.
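One way to picture the manual-to-automated continuum is as a decision made per incident. The following is a minimal sketch, not how any particular SOAR product decides: the `ResponseMode` enum, the `AUTOMATED_PROCEDURES` set, and the incident type names are all hypothetical.

```python
from enum import Enum


class ResponseMode(Enum):
    MANUAL = "manual"              # a human performs every action
    ORCHESTRATED = "orchestrated"  # a human guides, tooling performs many actions
    AUTOMATED = "automated"        # tooling handles a scenario seen before


# Hypothetical catalogue of incident types that have been seen before
# and for which a trusted automated procedure exists.
AUTOMATED_PROCEDURES = {"phishing_email", "known_malware_hash"}


def choose_response_mode(incident_type, has_soar=True):
    """Pick a point on the continuum for one incident."""
    if not has_soar:
        # Without a SOAR capability, everything is done by hand.
        return ResponseMode.MANUAL
    if incident_type in AUTOMATED_PROCEDURES:
        # Only what has been seen before can be automated.
        return ResponseMode.AUTOMATED
    # First-of-a-kind ("black swan") incidents keep a human in the loop.
    return ResponseMode.ORCHESTRATED


print(choose_response_mode("phishing_email"))
print(choose_response_mode("novel_supply_chain_attack"))
```

In this sketch, moving work toward the automated end means growing the catalogue as new incident types become familiar.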
What does this look like?
Well, let's take an example.
Let's say we've got a database down here.
And this database also has a security information and event management (SIEM) system where it can send alerts.
We also have an extended detection and response (XDR) system that can take those real-time alerts.
And then we have our SOAR system.
Now, how would this work in an incident?
Let's say this guy gets breached.
Now what happens next?
It sends an indication up to the SIEM.
The SIEM then takes that information-- a real-time alarm
--and says, I need to either manage this within the SIEM or send it up to an XDR.
The XDR then sends a message over to the SOAR to open a case.
A case then, is the thing that we're going to use to manage this all the way through to completion and track it along the process.
Another thing that we would do is capture artifacts.
This is information about the attack.
So we could have lots of artifacts, indicators of compromise, this kind of information.
We're going to take all of that information, attach it to the case, and then we're going to assign that case to an analyst.
And this analyst is going to be responsible for following through.
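The flow just described (alert arrives, case is opened, artifacts are attached, an analyst is assigned) can be sketched as a small data model. Everything named here is hypothetical: the `Alert` and `Case` classes, their fields, and the sample values are illustrative, and assigning the case to the least-loaded analyst is an assumption made for this example, not something the talk specifies.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_case_ids = count(1)  # simple in-memory case number generator


@dataclass
class Alert:
    """Hypothetical alert as forwarded to the SOAR by the SIEM or XDR."""
    source: str                                    # e.g. "SIEM" or "XDR"
    asset: str                                     # the system that was breached
    description: str
    indicators: list = field(default_factory=list)  # indicators of compromise


@dataclass
class Case:
    """The record used to track the incident through to completion."""
    case_id: int
    alert: Alert
    artifacts: list = field(default_factory=list)
    analyst: Optional[str] = None
    status: str = "open"


def open_case(alert, workload):
    """Open a case, attach the alert's artifacts, and assign an analyst.

    `workload` maps analyst name to number of open cases; picking the
    least-loaded analyst is an assumption made for this sketch.
    """
    case = Case(case_id=next(_case_ids), alert=alert)
    case.artifacts.extend(alert.indicators)
    case.analyst = min(workload, key=workload.get)
    workload[case.analyst] += 1
    return case


# Made-up example values (203.0.113.7 is a documentation-only IP address).
alert = Alert("XDR", "customer-db", "unusual bulk export", ["203.0.113.7", "export.tmp"])
workload = {"alice": 2, "bob": 0}
case = open_case(alert, workload)
print(case.case_id, case.analyst, case.artifacts)
```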
The analyst then is going to use the SOAR system, which has this case management, where the incident has been detected and the appropriate artifacts attached.
Now, they have the information they need.
They can go do the investigation, and they need something to guide them along the way.
What are they going to use for that?
This is what we call a dynamic playbook.
A dynamic playbook is basically a set of steps where we have said, in advance, this is what we want to do in certain cases.
It's dynamic in the sense that it's not fixed.
So what you do as the second step will depend on what the outcome of the first step was.
So we may have a step that says go off and gather certain information,
and then based upon what comes back from that, you do certain things, take certain actions.
You may go run this automated procedure, you may go off and manually do a certain other thing.
This is the orchestration process.
So we have this analyst in charge running the whole shooting match, figuring out what needs to be done and guiding all of those actions.
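The "dynamic" part of a dynamic playbook, where the next step depends on the outcome of the previous one, can be sketched as a table of steps that each return the name of the step to run next. This is a minimal illustration, not any vendor's playbook format: the step names, the `malware_found` flag, and the `run_playbook` helper are all hypothetical.

```python
def gather_info(ctx):
    """Collect information, then branch on what came back."""
    # Hypothetical outcome check: the flag stands in for real findings.
    return "isolate_host" if ctx.get("malware_found") else "manual_review"


def isolate_host(ctx):
    """Stands in for an automated procedure."""
    ctx["host_isolated"] = True
    return "close_case"


def manual_review(ctx):
    """Stands in for an action the analyst performs by hand."""
    ctx["reviewed_by_analyst"] = True
    return "close_case"


def close_case(ctx):
    ctx["status"] = "closed"
    return None  # no next step: the playbook is finished


PLAYBOOK = {
    "gather_info": gather_info,
    "isolate_host": isolate_host,
    "manual_review": manual_review,
    "close_case": close_case,
}


def run_playbook(playbook, start, ctx, max_steps=20):
    """Run steps until one returns None; return the names of the steps executed."""
    executed = []
    step = start
    while step is not None:
        if len(executed) >= max_steps:
            raise RuntimeError("playbook did not terminate")
        executed.append(step)
        step = playbook[step](ctx)
    return executed


print(run_playbook(PLAYBOOK, "gather_info", {"malware_found": True}))
print(run_playbook(PLAYBOOK, "gather_info", {"malware_found": False}))
```

The same playbook produces two different paths depending on what the first step finds, mixing an automated action in one branch with a manual one in the other.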
Now, in a good SOAR system, you can design these playbooks very easily
with a drag-and-drop graphical user interface and put in the actions and tie all of this together.
That way this person is not left to their own devices trying to reinvent the wheel,
or, in other words, trying to figure out: where is the fire extinguisher now that my hair is on fire?
We'd rather know that in advance.
Here's what the procedure is.
And here's how we're guided.
And ultimately a good SOAR system will probably also have a nice dashboard that could visualize what's happening here,
and show that we have a certain number of tickets opened, a certain number of cases, how long it takes us to resolve those cases,
who has the most cases assigned to them, and all of this kind of statistical information so that we can do further analysis.
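The dashboard figures mentioned here are simple aggregations over case records. As a rough sketch under stated assumptions: the record layout (`analyst`, `status`, `days_to_resolve`) and the sample cases below are made up for illustration and do not come from any real SOAR product.

```python
from collections import Counter
from statistics import mean

# Made-up case records; the keys are hypothetical.
cases = [
    {"analyst": "alice", "status": "closed", "days_to_resolve": 4},
    {"analyst": "bob", "status": "closed", "days_to_resolve": 10},
    {"analyst": "alice", "status": "open", "days_to_resolve": None},
    {"analyst": "alice", "status": "open", "days_to_resolve": None},
]


def dashboard_summary(cases):
    """Compute the kind of statistics a SOAR dashboard might display."""
    closed = [c for c in cases if c["status"] == "closed"]
    per_analyst = Counter(c["analyst"] for c in cases)
    return {
        "open_cases": sum(1 for c in cases if c["status"] == "open"),
        "closed_cases": len(closed),
        # Only closed cases have a resolution time to average.
        "mean_days_to_resolve": mean([c["days_to_resolve"] for c in closed]) if closed else None,
        "cases_per_analyst": dict(per_analyst),
        "busiest_analyst": per_analyst.most_common(1)[0][0] if per_analyst else None,
    }


summary = dashboard_summary(cases)
for name, value in summary.items():
    print(name, value)
```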
So back to our original analogy about the sinking ship, what a SOAR system is, is like a really powerful bilge pump.
It will automate, and in some cases orchestrate, the process of getting all the water that's in the boat back into the sea, which is where it belongs.