NoSQL: Practical Flexibility Guidelines

Key Points

  • NoSQL databases embrace flexible, semi‑structured JSON documents (collections of JSON objects) instead of rigid rows and columns, allowing them to handle real‑time, unpredictable data and evolving user behavior.
  • In line with the “Not Only SQL” name, NoSQL systems can still support relational features such as joins, lookups, and indexing, but they store data as collections (similar to tables) of unique JSON objects.
  • For use cases like product catalogs with highly variable attributes, NoSQL avoids the relational pitfalls of multiple tables or massive denormalized tables full of nulls, offering a more natural, schema‑agnostic representation.
  • This flexibility makes NoSQL especially suited for event‑driven, high‑transaction workloads, APIs, data pipelines, and AI model integrations where rapid iteration and performance are critical.
  • By applying practical NoSQL guidelines, developers can leverage its performance and adaptability today rather than treating it as merely a buzzword.

Full Transcript

# NoSQL: Practical Flexibility Guidelines

**Source:** [https://www.youtube.com/watch?v=pYK4No7ACRE](https://www.youtube.com/watch?v=pYK4No7ACRE)
**Duration:** 00:14:39

## Sections

- [00:00:00](https://www.youtube.com/watch?v=pYK4No7ACRE&t=0s) **Embracing NoSQL Flexibility for Modern Apps** - The speaker outlines practical guidelines for using semi‑structured JSON‑based NoSQL databases, highlighting their ability to handle relations, joins, indexes, and dynamic data, to improve flexibility and performance in real‑time applications and data pipelines.
- [00:03:08](https://www.youtube.com/watch?v=pYK4No7ACRE&t=188s) **JSON Flexibility Over Relational Schemas** - The speaker explains how storing product data as JSON in a NoSQL database avoids null‑filled columns and rigid schemas, allowing new products and variable attributes to be added without altering table structures.
- [00:06:14](https://www.youtube.com/watch?v=pYK4No7ACRE&t=374s) **Optimizing Sensor Data Grouping** - It explains how aggregating sensor JSON records into time‑ or ID‑based groups reduces read latency while preserving write performance.
- [00:09:18](https://www.youtube.com/watch?v=pYK4No7ACRE&t=558s) **JSON Comment Overflow Strategy** - The speaker suggests using an overflow sub‑object to shift comments beyond a set threshold (e.g., 50) into a separate section, keeping the main JSON compact and improving read/write performance.
- [00:12:27](https://www.youtube.com/watch?v=pYK4No7ACRE&t=747s) **Precomputing Comment Metrics in NoSQL** - The speaker explains how to avoid costly real‑time calculations by incrementally updating aggregate values like comment counts, leveraging NoSQL’s flexible data models for high‑throughput applications.

## Full Transcript
0:00 Data doesn't always fit into neat rows and columns, so why force it? Let's explore how NoSQL databases offer the flexibility that modern applications need.

0:10 Our applications aren't built in a vacuum. They run on real-time data, unpredictable inputs, and ever-changing user behavior. So whether we're building APIs, crafting data pipelines, or even building AI models, we want to make sure that we're leveraging NoSQL to help us create efficient systems.

0:31 So in this video, we're going to define practical NoSQL database guidelines so that you can bring the flexibility and performance of NoSQL technologies into your applications. By the end, you'll be an expert in NoSQL and understand how it isn't just a buzzword and how it can actually apply to your workloads today.

0:53 So first of all, there's been a lot of information about NoSQL. And just to clear the air, it actually stands for "not only SQL." So that means that, of course, it can still handle relations. It can still do joins. It can also do lookups. There are indexes.

1:10 There are all sorts of NoSQL databases. For the context of today's conversation, we're going to talk about a semi-structured, JSON-object-focused database. So this is going to have groups of JSON objects that essentially represent the NoSQL database structure. These groups are referred to as collections or sets. That's very similar to a table in a relational database. And then instead of a row, there's going to be a unique JSON object.

1:42 So these are great for event-driven, highly transactional workloads, so that we can move quickly with our data and have enough flexibility.

1:51 So let's go into some examples. This is where NoSQL can really do well. So think you have a product catalog, right? With a product catalog, there might be a variety of different products.
2:03 In a relational database, because the products are different, you're actually going to store them in completely separate tables. So even though you have three different products, you're going to have to break them out. Even though they may have some fields that overlap, there are just enough that are different that you have to keep them separate. Let's just represent these by shapes.

2:29 If they weren't that different, another pattern we see in relational databases is to keep everything denormalized, as it's called, in one big table. So your standard columns are here. Every product has a name. Every product has a manufacturer. But then what happens is that each unique set of attributes, for each different kind of product, would have its own columns. So you would see that broken out three different ways.

2:59 And then let's say that this was a particular kind of product. It would basically fill in its own attributes, it wouldn't fill in anything for the other ones, and then the other kinds would follow the same pattern.

3:17 So the problem with this is that there are a lot of nulls that end up happening, and basically this table can just continue to grow.

3:26 So this is where NoSQL can come in handy, because when things are held in a JSON, we have much more flexibility with what actually needs to be held, and the schema isn't as strict, in that we don't have to come up with a set list of columns that always have to be written to or accounted for.

3:49 So with a JSON, basically what you would do is just have your intro information on the product. And then, because you have these variable differences, you might have your details as just another sub-object, and that's where you would put them, and then you'd be able to close it all out.
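As a minimal sketch of the document shape described above, a collection can be modeled as a plain Python list of dicts. The field names (`name`, `manufacturer`, `details`), the two products, and both helper functions are assumptions made for illustration and do not come from the talk. Each document keeps the shared fields at the top level and its variable attributes in a `details` sub-object, and the `denormalized_row` helper shows what the same product would look like in the one-big-table layout, where attributes belonging to other product kinds become nulls.

```python
import json

# Hypothetical product documents; names and values are illustrative only.
products = [
    {
        "name": "Trail Backpack",
        "manufacturer": "Acme Outdoor",
        "details": {"volume_liters": 30, "waterproof": True},
    },
    {
        "name": "USB-C Charger",
        "manufacturer": "Volt Co",
        "details": {"watts": 65, "ports": 2},
    },
]


def detail_keys(collection):
    """Return the sorted union of keys used in any document's details sub-object."""
    keys = set()
    for doc in collection:
        keys.update(doc.get("details", {}))
    return sorted(keys)


def denormalized_row(doc, all_keys):
    """Flatten one document into a wide row; attributes it lacks become None."""
    row = {"name": doc["name"], "manufacturer": doc["manufacturer"]}
    for key in all_keys:
        row[key] = doc["details"].get(key)
    return row


wide_row = denormalized_row(products[0], detail_keys(products))
null_count = sum(value is None for value in wide_row.values())

print(json.dumps(products[0], indent=2))  # the document stores only what it has
print(wide_row)                           # the wide row carries null placeholders
print(null_count)
```

With only two product kinds the wide row already carries placeholders for attributes it does not use, while the document form stores nothing extra.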
4:16 And so this is a much more flexible model, because the details sub-object is the area where you have your variation, so you can make sure that the variable data is maintained in the JSON. But it's also much more maintainable overall. Go through the scenario where, let's say, a new product comes online.

4:36 Now in the relational scenario, you have to either build a whole new table or build new columns. In the JSON case, you have to do nothing. You can still insert the JSON just like normal. You're just going to change the product name to whatever this new product is, and you have enough flexibility that you can just add in these details as needed.

4:56 Now, if you wanted to make certain fields required, you can make this as strict as you want, so that it basically mirrors the relational table, but then you lose some of that good flexibility. So as you can see, there are some situations where a NoSQL storage method is really going to be more advantageous from a maintainability perspective alone.

5:17 Another great use case for NoSQL is working with sensor data, or the Internet of Things. Let's imagine we have a system where tons of sensors are constantly pinging your system with some kind of update. Maybe it's taking the temperature, maybe it's just checking a status, but regardless, multiply that every so many seconds by 100 different sensors, and you're going to have a lot of information constantly streaming in.

5:46 So how would this be solved in a relational database? You would probably see just one row per transaction. In the NoSQL case, we're going to see an individual JSON come in for each one, so it will continue to be very optimized for your writes, and you can get that data in very quickly. So that's going to help not only with the speed at which things need to operate, but we can actually optimize our reads as well with NoSQL.
6:15 This is where we have the opportunity to actually group these. So let's say we want to look at our sensor data across time. We can aggregate all these little JSONs as an array, or a nested group, within one larger JSON that represents all the transactions that came in during the last hour.

6:36 So basically we can break this out in a couple of different ways, so that when we're doing our reads, we don't have to iterate through millions of objects to check which timestamps match. We can be much more focused and just pull one JSON that has every single transaction from the last hour, and that would be a much more effective lookup than going through and trying to find each individual row. So this is going to help optimize our reads, and it's not going to compromise your writes at all.

7:09 And there's another option if you don't want to partition or break it up by timestamp: break it up by category. You could do this by unique sensor. Or, let's say each sensor has an ID that's a number. If you want to just randomize the groups, you can use the first two or three digits of the unique ID, so that you're not creating a separate group for every single sensor, but for a group of sensors. And that's another way that you can tune it so you get the right grouping to optimize your reads.

7:52 All right, let's keep thinking about ways that we can use NoSQL databases to support modern data workloads. So here's a common example: social media posts. Think of this as any kind of situation where, with a relational database, you might see what we call a one-to-many relationship.
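Before moving on to posts, the two grouping strategies from the sensor example above (hourly buckets and ID-prefix buckets) can be sketched roughly as follows. This is an in-memory illustration only: a dict keyed by bucket name stands in for a collection, and the function names, the `sensor_id`, `ts`, and `temp_c` fields, and the sample readings are all hypothetical.

```python
from datetime import datetime


def hour_bucket_key(ts_iso):
    """Truncate an ISO 8601 timestamp to its hour, for example '2024-05-01T13'."""
    return datetime.fromisoformat(ts_iso).strftime("%Y-%m-%dT%H")


def prefix_bucket_key(sensor_id, digits=2):
    """Group sensors by the first few digits of their numeric ID."""
    return str(sensor_id)[:digits]


def add_reading(buckets, key, reading):
    """Append one reading to the bucket document identified by key."""
    bucket = buckets.setdefault(key, {"bucket": key, "readings": []})
    bucket["readings"].append(reading)


# Hypothetical readings; each one would arrive as its own small JSON.
readings = [
    {"sensor_id": 10442, "ts": "2024-05-01T13:05:00+00:00", "temp_c": 21.4},
    {"sensor_id": 10977, "ts": "2024-05-01T13:40:00+00:00", "temp_c": 22.0},
    {"sensor_id": 25310, "ts": "2024-05-01T14:02:00+00:00", "temp_c": 19.8},
]

by_hour = {}
by_prefix = {}
for reading in readings:
    add_reading(by_hour, hour_bucket_key(reading["ts"]), reading)
    add_reading(by_prefix, prefix_bucket_key(reading["sensor_id"]), reading)

# A read for one hour is now a single key lookup instead of a scan.
print(by_hour["2024-05-01T13"])
print(sorted(by_prefix))
```

The `digits` parameter is the tuning knob mentioned in the talk: fewer digits means fewer, larger groups, and more digits means more, smaller ones.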
8:13 So that's usually where you have one parent table, and then it has a child table, and it's usually depicted in this way if you've ever seen an architectural diagram for a database.

8:26 So that doesn't really exist within NoSQL. It can to some degree, but generally what we want to do first is nest certain objects. So let's think about a social media post. There are usually comments, right? And those comments can only relate to one post. So that relationship is pretty well defined.

8:47 And why might we want to nest the comments in with the post? Because they're displayed together. In the world of relational databases, you're going to be constantly joining between these two tables. Now, relational databases are built for joins, so there really isn't going to be a problem with that. However, there might be a better way to do this with NoSQL.

9:09 Nesting is fine when it's a social media post by you or by me, and we don't get that many comments. There are only a couple. Generally, with this approach, you want to make sure your JSON doesn't get blown out. But what happens when there are 500 comments? In the case of a relational database, you just continue to have this one-to-many relationship, but you're going to run into trouble with your NoSQL solution.

9:39 So what we can do here is build what's called an expansion set of the comments. This is just an example of one JSON, but we'll be creating a new one just for a subset of the comments. We can call this overflow. And from here, you would then have the exact same structure, so we'd have our comments just like before. And basically from here, we can have different amounts of comments.
10:29 So you can decide where this threshold is really going to be. For example, let's say we don't want to display more than 50 comments at once. Anything over 50, so starting at 51, is where you're going to create this overflow object, so that we can basically move them over here.

10:49 And this allows you to optimize most posts, because they aren't going to hit this overflow amount, but we have a control for those exceptional cases where we see a lot. And this is going to help optimize not only the reads but the writes as well.

11:09 Now, you will have to do a little bit more logic in the writes, to ask, "Am I at 50? Is it 51?" and decide whether or not the comment goes here. But that's a small price to pay for the optimization that's needed to allow for the really fast reads that you're going to get out of this object.

11:32 So then, thinking about it another way, maybe we don't want to display comments in chronological order, and we want to display the top three ranked instead. You can do this in the exact same way. You would just limit these to showing only three instead of 50, and then push anything that isn't in the top three to this overflow.

11:57 Again, this is going to seriously optimize your reads, because the actual post JSON is going to remain very, very small. It prevents you from having to read every single comment every single time you display these posts, and then recalculate the top three ranked comments based on whatever algorithm you have. So by having this pre-filtered and pre-ranked, or sorted, you're going to get a lot of benefit.

12:27 Again, the complexity here comes in your writes: every time a new comment comes in, you'll have to recalculate that ranking.
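The write-side logic for both variants described above can be sketched as follows. This is an in-memory illustration under stated assumptions, not the speaker's implementation: the post and its overflow are modeled as two separate dicts, the function names and the `post_id`, `comments`, and `score` fields are hypothetical, and a plain numeric `score` stands in for "whatever algorithm you have" for ranking.

```python
def add_comment(post, overflow, comment, limit=50):
    """Chronological variant: the first `limit` comments stay embedded in the
    post document; later comments go to the separate overflow document."""
    if len(post["comments"]) < limit:
        post["comments"].append(comment)
    else:
        overflow["comments"].append(comment)


def add_ranked_comment(post, overflow, comment, top_n=3):
    """Ranked variant: keep only the top_n highest-scoring comments embedded
    and push everything else to the overflow document."""
    candidates = post["comments"] + [comment]
    candidates.sort(key=lambda c: c["score"], reverse=True)
    post["comments"] = candidates[:top_n]
    overflow["comments"].extend(candidates[top_n:])


# Chronological: 53 comments against a threshold of 50.
post = {"post_id": "p1", "text": "Hello", "comments": []}
overflow = {"post_id": "p1", "comments": []}
for i in range(53):
    add_comment(post, overflow, {"text": f"comment {i}"})

# Ranked: five comments with hypothetical scores, top three stay embedded.
ranked_post = {"post_id": "p2", "text": "Hi", "comments": []}
ranked_overflow = {"post_id": "p2", "comments": []}
for score in [5, 1, 9, 7, 3]:
    add_ranked_comment(ranked_post, ranked_overflow, {"text": "...", "score": score})

print(len(post["comments"]), len(overflow["comments"]))
print([c["score"] for c in ranked_post["comments"]])
```

The extra work sits entirely in the write path, which matches the trade-off in the talk: each write does a length check or a small re-sort so that a read of the post document never has to touch the overflow.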
12:36 But if you do the equation of how often am I writing a comment versus reading one, because the reads are so much more frequent and more important, it's a fine trade-off.

12:48 Now, lastly in this example, maybe we want some kind of summary statistics, because, similar to what I was saying with the ranking, we don't want to have to calculate that every time. So maybe a total comment count would be needed. That might be something that we end up adding over here. Let's say there are 4,200 different comments that are written out.

13:23 And from there, we can do that math ahead of time, every time we add a comment. We can just add a plus 1 every time we add one in, and that's going to help us. By pre-writing all these calculations, the value can get displayed very easily, and no recalculations really need to be done.

13:49 So from there, you can see how we're able to do these kinds of optimizations, so that we're well aligned to the kind of high-demand workflows that we're supporting in our modern applications.

14:02 Today, we've demonstrated the strength of NoSQL, and that strength really lies in its flexibility. Variable data models, nested objects, and efficient grouping strategies allow NoSQL to really shine. These patterns allow you to build fast, scalable applications that adapt to dynamic, real-time data.

14:24 Mastering these strategies means you're not just reacting to data, you're staying ahead of it. So I encourage you to keep refining these skills, because the future of data engineering, software development, and data science will depend on building systems that move as fast as the data itself.
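As a closing illustration of the precomputed comment count discussed around 13:01, here is a minimal in-memory sketch. The `comment_count` field name, the function name, and the small `limit` used in the demo are assumptions for illustration. The counter lives on the post document and is bumped on every write, so it stays correct even when most comments sit in the overflow document.

```python
def add_comment_and_count(post, overflow, comment, limit=50):
    """Store the comment (embedded or overflow) and add 1 to the stored total."""
    if len(post["comments"]) < limit:
        post["comments"].append(comment)
    else:
        overflow["comments"].append(comment)
    # Precomputed aggregate: updated at write time, never recounted at read time.
    post["comment_count"] = post.get("comment_count", 0) + 1


post = {"post_id": "p1", "comments": [], "comment_count": 0}
overflow = {"post_id": "p1", "comments": []}
for i in range(5):
    # A tiny limit is used here only to keep the demo short.
    add_comment_and_count(post, overflow, {"text": f"comment {i}"}, limit=2)

print(post["comment_count"])
```

Displaying the total then only needs the post document, without loading or counting the overflow comments.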