Self-Driving Storage with Mobile Partitions

Key Points

  • The speaker introduces “self‑driving storage,” drawing an analogy to self‑driving cars to illustrate a new, automated approach to data‑center storage management.
  • Traditional block storage is static, so the concept hinges on making storage “mobile” by encapsulating volumes and containers into a single, movable unit called a **storage partition**.
  • A storage partition functions like an LPAR for storage, allowing many partitions to coexist on a single array and be shifted between arrays as needed.
  • To enable autonomous movement, an AIOps/AI engine is fed detailed descriptions of each partition—including capacity, performance, and protection metrics—so it can make informed decisions.
  • The AI‑driven system can then automatically relocate data across the infrastructure, optimizing placement based on workload demands and data importance.

**Source:** [https://www.youtube.com/watch?v=OdO8TL9MX0M](https://www.youtube.com/watch?v=OdO8TL9MX0M)
**Duration:** 00:18:18

## Sections

- [00:00:00](https://www.youtube.com/watch?v=OdO8TL9MX0M&t=0s) **Self-Driving Storage Explained** - The speaker introduces “self‑driving storage,” comparing it to autonomous cars and proposing a shift from static block storage to mobile, dynamically relocating storage resources within a data‑center architecture.
- [00:03:08](https://www.youtube.com/watch?v=OdO8TL9MX0M&t=188s) **Describing Storage Partitions for AIOps** - The speaker outlines how to inform an AIOps engine by detailing a mobile storage partition’s capacity, performance metrics (IOPS, bandwidth, latency) and protection options, enabling the AI to make optimal data‑movement decisions.
- [00:06:18](https://www.youtube.com/watch?v=OdO8TL9MX0M&t=378s) **Temporal Tagging for AIOps‑Driven Storage** - The speaker describes attaching timestamps to storage partition attributes so an AIOps platform can apply machine learning to historic data, enabling automated decisions about when, where, and how to move data, analogous to gradually trusting AI in car navigation before full self‑driving control.
- [00:09:25](https://www.youtube.com/watch?v=OdO8TL9MX0M&t=565s) **Generative AI for Storage Migration** - The speaker explains how AIOps uses generative AI to forecast storage shortages, issue alerts with compatibility‑scored migration options, and allow users to manually or automatically relocate data to the optimal array.
- [00:12:29](https://www.youtube.com/watch?v=OdO8TL9MX0M&t=749s) **Self-Driving Storage Automation** - The AIOps engine autonomously selects a storage system, provisions or expands partitions, and delivers ready‑to‑use volumes, enabling on‑demand performance without any user intervention.
- [00:15:43](https://www.youtube.com/watch?v=OdO8TL9MX0M&t=943s) **Agentic AI Drives Autonomous Data Migration** - The AIOps platform leverages agentic AI to automatically decide when and where to relocate data across storage systems, executing the moves without interrupting application operations.

## Full Transcript

[0:00] Many of us know all about self-driving cars. Some of you have self-driving cars roaming around your city. Some of you have vehicles with this capability built in. I'm here to explain a concept called self-driving storage.

[0:18] Now, I know you're wondering: how is this guy going to explain a connection between a self-driving car and the storage infrastructure in my data center? Well, I'm here to connect the dots and explain how we can accomplish this.

[0:36] To have a self-driving car, you obviously need a car. And on the self-driving storage side, you need a car that moves. We're going to use block storage as our example here. Typically, block storage is not very mobile. You allocate your storage, you place your data on that storage, and it doesn't really move anywhere. But for us to take full advantage of self-driving storage, we need to provide block storage that is mobile, that can move around our storage infrastructure.

[1:10] So let's first draw out how we typically provision resources for block storage. First, we have volumes. Volumes store the data that we have from our servers or compute. Here, our hosts are able to access all of our data, using hosts and volumes. Optionally, you can put containers around these: volume groups and host clusters. This is the basics of block storage. All of us who know block storage use this paradigm to allocate and provision our resources.

[1:58] Where we're going to add something new is that we're going to organize all of these resources, all of these objects, together into one simple container that makes them mobile. So we're going to draw a box around this whole thing, and we're going to call this a storage partition. We use the word partition in many of our server applications.

[2:29] We call them LPARs. But in this case we're using a storage partition to describe a part of a storage array, a subset. Just as we do with LPARs in our servers, we do the same here in our storage. Each of our storage arrays can contain a multitude of these storage partitions. Providing that gives us the opportunity to move the storage around in our storage infrastructure. It's the basis for all that we're going to do in self-driving storage.

[3:08] So now that we have a mobile car and a mobile storage partition, the next thing we need to do is feed our AIOps brain with information. The way we do that is to describe the storage partition in multiple ways. That helps the machine learning in our AI platform make good determinations, good decisions, and eventually move the actual data from place to place.

[3:36] So we have some things to describe a storage partition. We have metrics that describe capacity and performance, and we have protection: ways that we can protect the data, depending on the importance of the data you're trying to store on the devices.

[3:52] In the metrics section we have things like capacity. How much data are we storing on this device, and how much will that increase or decrease as we write data from the hosts? We also have performance metrics, such as IOPS, or I/Os per second. We have bandwidth. And we have latency. Bandwidth describes how many cars are on the road: how congested is the pipe between the server and the storage? And latency is how fast the round trip is from point A to point B. Some applications require very, very low latency, and some are more lenient in terms of the latency they require.

[4:43] These are all metrics that the AIOps machine learning will look at while the storage partitions are in place.

[4:51] The second section talks about protection, and there are different types of protection depending on the importance of your data. We have snapshots, which are local to the storage array and protect the data from cyberattacks, ransomware, and logical corruption. They're located on the storage array itself. We have replication technologies: ways to replicate the data between two different systems in the data center, and also outside of the data center at a different site. We have a capability called disaster recovery, which is the ability to replicate data to a different region. So if you have a regional disaster, a power outage, or the data center goes down, you can still access your data at a different site, which could be hundreds of miles away.

[5:49] We also have this concept called high availability, which is within the data center. You have two storage arrays that are synchronously replicating between each other, and if either one of those storage arrays goes down, you are still able to maintain access to all of your applications. Some businesses also want to have a local replicated copy and an offsite DR copy. We call that HA plus DR.

[6:21] So all of these things can describe one of these storage partitions, which, again, are parts of the storage arrays in our storage interconnect. These are attributes of the storage partitions.

[6:37] Now, the other important part of this whole concept is that we have to take this information and apply time to it. A point-in-time copy of all of this information is interesting, but not that interesting, because you don't have a historical reference for it.

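
The attributes described so far (capacity, IOPS, bandwidth, latency, protection, and a timestamp) could be captured in a record like the one below. This is a minimal sketch, not the schema of any real AIOps product: the class names, field names, and units are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Protection(Enum):
    """Protection options named in the talk."""
    SNAPSHOT = "snapshot"
    REPLICATION = "replication"
    DR = "disaster_recovery"
    HA = "high_availability"


@dataclass(frozen=True)
class PartitionSample:
    """One timestamped observation of a storage partition (hypothetical schema)."""
    partition: str
    taken_at: datetime
    capacity_used_tb: float
    iops: float
    bandwidth_mbps: float
    latency_ms: float
    protection: frozenset  # set of Protection values, e.g. HA plus DR


def history_for(samples, partition):
    """Return one partition's samples ordered oldest to newest."""
    return sorted(
        (s for s in samples if s.partition == partition),
        key=lambda s: s.taken_at,
    )


# Illustrative values only; they are not measurements from the talk.
samples = [
    PartitionSample("p1", datetime(2024, 3, 2, tzinfo=timezone.utc),
                    2.4, 31000, 900, 0.8,
                    frozenset({Protection.SNAPSHOT, Protection.DR})),
    PartitionSample("p1", datetime(2024, 3, 1, tzinfo=timezone.utc),
                    2.3, 29000, 850, 0.7,
                    frozenset({Protection.SNAPSHOT, Protection.DR})),
]
```

Keeping `taken_at` on every sample is what turns a single point-in-time description into the historical reference the speaker asks for; `history_for(samples, "p1")` returns the March 1 sample before the March 2 one.
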
[6:57] So, as we're feeding this information up to our AIOps platform, we have to apply time to all of these items. That will help us make decisions, and it will help the AI make correct decisions as to which data to move, and when and where to move it. So we take all this information, we attach it to the storage partitions, and then we send it up to our AIOps platform. By doing so, AIOps is able to do machine learning and gather all this information for later use.

[7:35] When we first think about jumping into a self-driving car, we have to give trust to the AI engines that are running in the car. Sometimes we don't trust the AI quite yet to give it complete control, so we want to take baby steps. A good example of that is the navigation software in your car. This navigation software is actually AI. It's giving you the best route from point A to point B. It's even connecting to your calendar to tell you where you're going to be going. The same goes for self-driving storage. We want to take baby steps. We don't want to give AI immediate control of your whole IT infrastructure.

[8:18] So let's talk about a use case in which we can take this first step toward self-driving storage. Capacity is a major factor in storage. When you run out of capacity, your storage goes down. As time goes on, we are obviously using more storage. We're writing more data to our storage arrays. And at some point, we're going to run out of storage really quickly. The AIOps engine is able, at that point, to tell you that you're going to run out of storage and give you an alert. So when you run out of storage, it's going to alert you. That's a reactive thing.

[9:01] Now AIOps, with all the machine learning that we just talked about, is also able to determine, as a trend, how much data you're actually writing, and give you a predictive analysis: a forecast that you're going to run out in 30 to 60 days.

[9:25] So what we want to do here is use this technology, and the movement of that data, to better use AIOps to get you out of the situation before it's too late. Again, we have our storage partitions in our storage arrays. This time, AIOps is going to say: in 30 to 60 days, you're going to run out of space. You need to do something about this. So it sends you an alert, and at that point it gives you some options. This is using generative AI to make a determination and a recommendation of where you want the data to go.

[10:08] To the user, the alert will say: I can go to system A. I can go to system B. I can also give a percentage compatibility score. Using all of the metrics that we talked about before, I can determine which of the storage arrays is best suited for this. So I can give this one 90%, and I can give this one 80%. At this point, the user is able to choose which of these storage arrays to move the data to.

[10:40] That's a key difference. We're giving the user the ability to make a choice, and we're also letting the user actually move the data themselves. So we take that and we make the decision. Let's say I chose B, for whatever reason. I can then manually move my data from A to B.

[11:04] The next step is to give AI a little bit more leeway. For our next use case, we can let the AIOps engine do a little bit more than just give a recommendation or a suggestion.

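
The predictive capacity alert from the first use case can be illustrated with a simple trend line. The talk does not say which forecasting method the AIOps platform uses, so the sketch below swaps in an ordinary least-squares fit over daily readings; the function names and the 60-day horizon are assumptions.

```python
def days_until_full(daily_used_tb, capacity_tb):
    """Estimate days until capacity runs out from a least-squares linear trend.

    daily_used_tb: used capacity in TB, one reading per day, oldest first.
    Returns None with fewer than two readings or when usage is flat or shrinking.
    """
    n = len(daily_used_tb)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(daily_used_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_used_tb))
    slope = sxy / sxx  # growth in TB per day
    if slope <= 0:
        return None
    remaining = capacity_tb - daily_used_tb[-1]
    return max(remaining / slope, 0.0)


def needs_alert(days_left, horizon_days=60):
    """Raise a predictive alert when the forecast falls inside the horizon."""
    return days_left is not None and days_left <= horizon_days


# Illustrative readings: usage grows by 1 TB per day on a 100 TB array.
forecast = days_until_full([80, 81, 82, 83, 84], capacity_tb=100)
alert = needs_alert(forecast)
```

A reactive alert only fires once `remaining` reaches zero. The trend-based check instead fires while there is still time to move a partition to another array.
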
[11:16] We're going to call this one workload placement. The idea here is that I have a new application, or a new set of applications, that I want to add to my storage. But where do I put it? So the first thing we want to do is ask the AI: where is the best place to put my storage? I have my application, and I know it has certain requirements, very similar metrics to what we had before. Let's say I want 30K IOPS. Let's say I need four terabytes of storage. Let's say I want snapshots, and let's say I want DR for this particular application. This describes the application that I want.

[12:01] I can take this application and feed it into our AIOps engine, and the AIOps engine can then make determinations about which of the storage arrays to put it on. Here, we are still giving you a recommendation. So again, we're going to lay out for the user the different choices they can make. Let's say the choices are system B again, and system C this time. And we give the user the opportunity to make that selection. Let's say this time I'm going to select system C.

[12:37] Now, the difference between this and what we just talked about is what happens once we get to this point. What is the next thing the AIOps engine is going to do? This time, the AIOps engine is actually going to perform the operation of provisioning all of that storage. It's going to pick system C, and it's going to take this storage partition: it's either going to create it or add to an existing storage partition. And it's going to provision all of that same stuff we talked about at the very beginning. It's going to create a storage partition and provision all of those resources.

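
The workload-placement request above (30K IOPS, four terabytes, snapshots, and DR) can be matched against candidate arrays as sketched below. The talk does not define how the compatibility percentage is computed, so the scoring rule here (average headroom left after placement) is purely an assumption, as are the class names and the free-capacity figures for each system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    iops: int
    capacity_tb: float
    protection: frozenset  # e.g. {"snapshot", "dr"}


@dataclass(frozen=True)
class Array:
    name: str
    free_iops: int
    free_tb: float
    protection: frozenset  # protection features the array supports


def compatibility(req, array):
    """Return a 0-100 score, or None when the array cannot host the workload.

    Assumed rule: the score is the average headroom that remains for IOPS
    and capacity once the workload has been placed.
    """
    if array.free_iops <= 0 or array.free_tb <= 0:
        return None
    if not req.protection <= array.protection:
        return None  # a required protection feature is missing
    if req.iops > array.free_iops or req.capacity_tb > array.free_tb:
        return None  # not enough performance or space
    iops_headroom = 1 - req.iops / array.free_iops
    tb_headroom = 1 - req.capacity_tb / array.free_tb
    return round(100 * (iops_headroom + tb_headroom) / 2)


def recommend(req, arrays):
    """Rank the arrays that can host the workload, best score first."""
    scored = [(compatibility(req, a), a.name) for a in arrays]
    return sorted((s for s in scored if s[0] is not None), reverse=True)


req = Requirements(30_000, 4, frozenset({"snapshot", "dr"}))
arrays = [
    Array("A", 200_000, 3, frozenset({"snapshot", "dr"})),          # too little space
    Array("B", 150_000, 20, frozenset({"snapshot", "dr", "ha"})),
    Array("C", 300_000, 40, frozenset({"snapshot", "dr"})),
    Array("D", 300_000, 40, frozenset({"snapshot"})),               # no DR
]
choices = recommend(req, arrays)
```

With these made-up figures, `choices` ranks system C ahead of system B and drops A and D. In the recommendation stage the user picks from that list; in the provisioning stage the engine acts on the selection itself.
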
[13:18] And at the very end of it, it's going to give you a data store, or a storage volume, or a set of volumes with all of the same attributes. And it's going to hand that to you to use in your application or operating system.

[13:35] So now is the pivotal moment. It's time to let go of the wheel and let the self-driving car drive itself to your destination. On the self-driving storage front, it's time to let go of the wheel and let the AIOps engine and its agentic AI take you to new levels automatically, autonomously, without user intervention.

[13:58] The example we're going to use for self-driving storage is one we call on-demand performance. What that means is that there are times of the year when the AIOps engine knows it needs to provide the greatest amount of system resources, storage resources. One of our favorite times of the year where this occurs is Christmastime. We know that from around Black Friday all the way up through Christmas, the retail industry has the highest demand, the highest amount of input, and the largest amount of data to be written to our storage systems.

[14:54] The AIOps platform knows this. And how does it know this? It's been ingesting all of this information, all of the metrics, all of the protection schemes, everything, as time series throughout the year. So it knows that come the end of November, it needs to do something to give us the best opportunity to handle the onslaught of data.

[15:19] So let's draw out our storage partitions again. We have our three systems. And we're going to use system A as the system that we want to basically empty, so that it can handle the Black Friday onslaught.

[15:39] Now, Black Friday is one example; you could use tax time. You could use any of the other events that would require you to have these extra resources. Previously, when we were using the AIOps platform, we gave the user the ability to choose which storage system, which platform, to put the data on, and then the user would do it. Then, in the previous example, we gave the AIOps platform some leeway to create those resources. In this case, we are giving the AIOps platform full control: it not only decides where to place the data, it actually performs the actions and moves the data itself, using agentic AI.

[16:28] When we're thinking about this, the AIOps platform is going to think about when the right time to do this is. Based on the time series of information, it chooses a day, some day in October, to move that data. Then it looks at system A and determines, using all of those metrics, which system it wants to move each partition to. Let's say the AI platform tells this partition: move to system C. And then over the days from the end of October to the end of November, it takes the time to move that data. Moving that data doesn't cause any access problems for your applications. Your applications keep on running. Everything keeps on running. You don't lose any ability to run your business.

[17:27] At the same time, the AI platform says: you know what? I'm going to take this one and move it down to system B. And at the end of the day, system A now has only one partition and is a fully loaded storage array, ready to take the onslaught of your Black Friday and Christmas demands.

[17:49] So this is only the tip of the iceberg, but you can see where self-driving storage can take us.

[17:54] It is wholly autonomous, using agentic AI, making decisions and moving the data without any user intervention. It's just the tip of the iceberg in terms of where we're going to go with this. There are tons of other examples where self-driving storage can further optimize your entire storage infrastructure for all of your business needs going forward.
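
The on-demand performance example (clearing system A ahead of Black Friday) amounts to planning a set of partition moves and a start date. The sketch below is one possible greedy planner under stated assumptions; it is not the agentic AI logic from the talk. The function name, the 30-day lead time, and the partition sizes are all hypothetical.

```python
from datetime import date, timedelta


def plan_evacuation(partitions, targets, keep, peak_start, lead_days=30):
    """Plan moves that clear a source array before a seasonal peak.

    partitions: {partition name: size in TB} currently on the source array.
    targets:    {array name: free TB} for candidate destination arrays.
    keep:       partition names that stay on the source array.
    Returns a list of (partition, destination, start_date) tuples.
    """
    free = dict(targets)  # copy so the caller's figures are not modified
    start = peak_start - timedelta(days=lead_days)
    moves = []
    # Place the largest partitions first so they get the widest choice.
    for name, size in sorted(partitions.items(), key=lambda kv: kv[1], reverse=True):
        if name in keep:
            continue
        dest = max(free, key=free.get, default=None)  # most free capacity
        if dest is None or free[dest] < size:
            raise ValueError(f"no array can take partition {name}")
        free[dest] -= size
        moves.append((name, dest, start))
    return moves


# Illustrative sizes: system A holds three partitions and one stays put.
moves = plan_evacuation(
    partitions={"p1": 10, "p2": 6, "p3": 4},
    targets={"B": 8, "C": 12},
    keep={"p3"},
    peak_start=date(2025, 11, 28),
)
```

With these inputs the plan sends `p1` to system C and `p2` to system B, both starting in late October, which leaves system A with a single partition as in the speaker's drawing. A real engine would also weigh IOPS, latency, and protection compatibility, and would move the data online without interrupting applications.
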