AI Knowledge Graphs for Cyber Investigation

Key Points

  • Hundreds of thousands of cybersecurity positions are open and cannot be filled fast enough, so organizations must rely on “force multipliers” like automation and artificial intelligence to make the staff they have more effective.
  • AI can serve as a powerful investigative tool by building knowledge graphs that model relationships between domains, IP addresses, URLs, files, malware signatures, and user activity.
  • Such a knowledge graph lets analysts trace the exact path a user took to become infected, quickly revealing other potentially compromised users, assets, or malicious sites through inference.
  • AI‑driven analysis of extensive system log records enables more precise identification and characterization of security events, turning raw timestamps, user actions, and source data into actionable insights.
  • By combining automation for efficiency with AI for intelligent reasoning, security teams can investigate, identify, and report threats far faster than traditional manual methods allow.

Full Transcript

# AI Knowledge Graphs for Cyber Investigation

**Source:** [https://www.youtube.com/watch?v=4QzBdeUQ0Dc](https://www.youtube.com/watch?v=4QzBdeUQ0Dc)
**Duration:** 00:06:13

## Sections

- [00:00:00](https://www.youtube.com/watch?v=4QzBdeUQ0Dc&t=0s) **AI as Force Multiplier in Cybersecurity** - The speaker explains how AI, particularly knowledge graphs, can help address the cybersecurity talent shortage by enabling faster investigation, identification, and reporting of threats.
- [00:03:07](https://www.youtube.com/watch?v=4QzBdeUQ0Dc&t=187s) **Detecting Insider Threats via Log Analysis** - The speaker explains how aggregating detailed system logs and applying time‑decay functions with machine learning can identify rapid, suspicious sequences (such as privileged logins, data copying, and account deletions) as anomalous insider attacks.

## Full Transcript

0:00 Right now there are hundreds of thousands of jobs open in the cybersecurity space.
0:04 And we can't fill those positions fast enough and we can't make experts fast enough to fill them either.
0:10 So what are we going to do?
0:11 With the people we have, we're going to have to use force multipliers in order to be more effective and meet the need.
0:18 And two of the things that we can do for force multipliers is we can use automation.
0:22 That allows us to work more efficiently, or we can use artificial intelligence--that allows us to work more intelligently.
0:32 I'm going to specifically focus on this one in the video today--to talk about how we can use AI to investigate a problem,
0:41 to identify an issue, to report on a particular problem, and ultimately to research and find out more about a particular problem.
0:52 So let's start with this first one: investigate.
0:55 How could we use AI to investigate a particular issue, if we become aware that there might be an issue?
1:02 Well, we can use a construct called a knowledge graph,
1:05 which is a way of representing information about the physical or logical world, but representing it as a data structure.
1:12 And the way this works is--to give you an example.
1:15 Let's say we have a domain.
1:17 And this would be like the name of a web domain.
1:21 And that domain then resolves to a particular IP address.
1:28 Also we--so this is what we normally have with a website.
1:32 Now, what else do we have?
1:33 Well, we might also have a URL.
1:35 That's the actual link that you're going to type into your browser.
1:38 And that is going to link to a particular file on the file system.
1:44 Now, let's take, for instance, if that file on the file system ends up pointing--
1:49 because we know through an AV signature, an antivirus signature--what if this points to malware?
1:56 Then this is some information that we can now connect together.
2:01 Then, if we say that this URL is in fact contained by that domain, and then I add a user out here
2:12 unsuspecting--who connects then to this IP address.
2:17 Then, all of a sudden I have a path that goes all the way through from this user to this malware.
2:23 And now I have this data structure that has represented, in fact, the connection that occurred.
2:29 I now know this user has been infected by this malware, and here's the path it took to get there.
2:34 And in fact, if this knowledge graph is good enough,
2:38 I'll be able to look and see what other users might also be affected and what other malware and what other sites.
2:43 So this is a way of representing information and then we can do some reasoning over that in order to do inference.
2:51 Now, this is how an AI system might do this internally.
2:56 Now, so that's one way we could do investigation.
2:59 How about to identify in more detail a particular problem?
3:04 So systems will typically write out lots of log records.
3:07 Once an event occurs on a system, then we cut a log record.
3:12 We put out information about--here's the time, the date, here's who did it.
3:16 Here's what they did, here's the system they did it to.
3:19 Here's where they did it from.
3:20 Those kinds of bits of information would be contained in these log records.
3:25 And we have loads and loads of these.
3:27 So it's very difficult to sort through all of that and find where are the anomalous activities.
3:33 Where are the outliers?
3:35 Well, in particular, what we'll find is, in this case, let's go with an example
3:39 and say here is a record where a privileged user logged into the system and created a new account.
3:47 Then, almost immediately afterward, in almost no time, they copied all the contents of a database.
3:54 And then, almost directly immediately, they deleted the account.
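
The knowledge graph walked through from 1:02 to 2:51 can be sketched in a few lines of Python. This is only an illustration, not how any particular product stores its graph: the entity names, the relation labels (`resolves_to`, `contains`, `links_to`, `flagged_as`, `connects_to`), and the helper functions are all hypothetical, and the "inference" here is plain graph reachability, which a real system would need to refine to avoid over-reporting.

```python
from collections import defaultdict, deque

# Each triple is (subject, relation, object). Every name below is made up
# for illustration; the IPs come from ranges reserved for documentation.
TRIPLES = [
    ("domain:example.test", "resolves_to", "ip:203.0.113.7"),
    ("domain:example.test", "contains", "url:http://example.test/invoice"),
    ("url:http://example.test/invoice", "links_to", "file:invoice.exe"),
    ("file:invoice.exe", "flagged_as", "malware:Trojan.Sample"),
    ("user:alice", "connects_to", "ip:203.0.113.7"),
    ("user:bob", "connects_to", "ip:203.0.113.7"),
    ("user:carol", "connects_to", "ip:198.51.100.9"),
]


def build_graph(triples):
    """Index triples in both directions so a path can cross any relation."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((f"inverse_{rel}", subj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: shortest node path from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _rel, neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


def exposed_users(graph, malware):
    """Users with any path to the malware node (a deliberately crude inference)."""
    users = [n for n in list(graph) if n.startswith("user:")]
    return sorted(u for u in users if find_path(graph, u, malware) is not None)


if __name__ == "__main__":
    g = build_graph(TRIPLES)
    print(find_path(g, "user:alice", "malware:Trojan.Sample"))
    print(exposed_users(g, "malware:Trojan.Sample"))
```

The path for `user:alice` runs user, IP address, domain, URL, file, malware, which mirrors the chain drawn in the video, and `user:bob` is surfaced as another potentially affected user because he connected to the same IP address.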
3:59 Now, each one of these activities independently wouldn't represent necessarily a problem,
4:04 but if you do all of these within a very short period of time, then we could use a time decay function and something like machine learning,
4:15 which is essentially pattern matching on steroids,
4:18 to look at all of these things and look at multiple factors across multiple records and realize we have an outlier, we have an anomaly.
4:27 We have what may be an attack scenario where an insider has taken advantage of the system.
4:33 So that's another use of AI and machine learning, in particular, in order to diagnose a problem.
4:39 What else could we do?
4:40 Well, we could report.
4:42 There's a requirement in security circles that you report against: Are you complying with regulatory requirements or not?
4:49 And some of the things that we might do in those cases is gather the log records and process those.
4:56 We might also use information that we've gained here to enrich our reporting data.
5:01 So that's another example where enriching the report with the information we have from the AI system,
5:08 and that's also allowing us to report, we're spending less time.
5:12 And then finally, to do research. Imagine I'm investigating, I'm identifying, I'm doing all these kinds of things.
5:19 And what I'd like to be able to do is find out, what is this bit of malware?
5:25 And I'd like to know more about it.
5:27 I want to know more about any of these systems.
5:30 So it would be nice if I had a natural language processing system--a chatbot
5:35 that I could go and talk to and ask it questions and it has a knowledge base that it draws on.
5:40 So, in fact, we're going to see more and more of this kind of capability going forward
5:44 where a chatbot becomes essentially another member of the staff to answer questions as we're trying to do investigations.
5:54 So you can see now, AI can help us a lot in the cybersecurity space.
5:58 And that's in fact why IBM, 100% of our security software products include AI.
6:06 Thanks for watching.
6:07 If you found this video interesting and would like to learn more about cybersecurity, please remember to hit like and subscribe to this channel.
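
The time decay idea from the log example (3:35 to 4:33) can be illustrated with a small Python sketch. It is not the machine learning approach the speaker alludes to: it swaps in a hand-written scoring rule so the decay function is easy to see. The action names, the five-minute half-life, the threshold of 2.0, and the function names are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical action labels; real log formats will differ.
SENSITIVE = {"create_account", "copy_database", "delete_account"}


def decay(seconds, half_life=300.0):
    """Exponential time decay: 1.0 at zero gap, 0.5 after one half-life."""
    return 0.5 ** (seconds / half_life)


def burst_score(events, half_life=300.0):
    """Sum decay weights over every pair of distinct sensitive actions.

    `events` is a list of (timestamp, action) tuples for a single user.
    Actions close together in time contribute nearly 1.0 per pair;
    actions spread over days contribute almost nothing.
    """
    sensitive = sorted((ts, a) for ts, a in events if a in SENSITIVE)
    score = 0.0
    for (t1, a1), (t2, a2) in combinations(sensitive, 2):
        if a1 != a2:
            score += decay((t2 - t1).total_seconds(), half_life)
    return score


def flag_users(log, threshold=2.0, half_life=300.0):
    """Group (timestamp, user, action) records by user and flag high scores."""
    by_user = {}
    for ts, user, action in log:
        by_user.setdefault(user, []).append((ts, action))
    return sorted(
        user
        for user, events in by_user.items()
        if burst_score(events, half_life) >= threshold
    )


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 9, 0, 0)
    log = [
        # admin1 does all three actions within a minute.
        (t0, "admin1", "create_account"),
        (t0 + timedelta(seconds=20), "admin1", "copy_database"),
        (t0 + timedelta(seconds=45), "admin1", "delete_account"),
        # admin2 does the same actions spread over several days.
        (t0, "admin2", "create_account"),
        (t0 + timedelta(days=2), "admin2", "copy_database"),
        (t0 + timedelta(days=9), "admin2", "delete_account"),
    ]
    print(flag_users(log))
```

Each action on its own scores nothing; only the combination within a short window pushes a user over the threshold, which is the point the speaker makes about looking at multiple factors across multiple records. A learned model would replace the fixed action list and threshold with patterns inferred from historical logs.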