Learning Library

Lego Analogy for Data Governance

Key Points

  • The rise of foundation models and big‑data AI creates a new need for both model governance and data governance to ensure responsible use.
  • Data governance is likened to a well‑organized LEGO set, providing a standardized, secure, and high‑quality foundation for an organization’s most valuable asset—its data.
  • Consistency in data governance means establishing universal standards (e.g., date formats) so all teams can seamlessly share and interpret data.
  • Secure data practices involve classifying sensitive information (like PII) to comply with regulations such as HIPAA and GDPR and to prevent harmful data mix‑ups.
  • High‑quality data governance ensures completeness (no nulls or missing columns) so decisions are based on reliable, “complete‑piece” information.
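
The three blocks above (consistent, secure, high quality) can be sketched as small checks. This is only an illustration, not something shown in the video: the ISO 8601 date standard, the `PII_COLUMNS` set, the column names, and all function names are assumptions made for this example.

```python
from datetime import datetime

# Assumed organization-wide standard for this sketch: ISO 8601 (YYYY-MM-DD).
STANDARD_FORMAT = "%Y-%m-%d"

# Hypothetical classification: columns treated as personally identifiable information.
PII_COLUMNS = {"name", "ssn"}


def normalize_date(value: str, source_format: str) -> str:
    """Consistent: convert a date from a team's local format to the shared standard."""
    return datetime.strptime(value, source_format).strftime(STANDARD_FORMAT)


def classify_columns(columns: list[str]) -> dict[str, str]:
    """Secure: label each column 'pii' or 'public' so the two are never mixed up."""
    return {col: "pii" if col in PII_COLUMNS else "public" for col in columns}


def find_incomplete(rows: list[dict], required: list[str]) -> list[tuple[int, str]]:
    """High quality: report (row index, column) pairs that are missing or null."""
    problems = []
    for index, row in enumerate(rows):
        for col in required:
            if row.get(col) is None:  # covers both a missing key and an explicit null
                problems.append((index, col))
    return problems


# The same calendar day written month-first (US) and day-first (European).
us_date = normalize_date("03/04/2024", "%m/%d/%Y")
eu_date = normalize_date("04/03/2024", "%d/%m/%Y")

labels = classify_columns(["name", "ssn", "visit_date"])

gaps = find_incomplete(
    [
        {"name": "A", "ssn": "000-00-0000", "visit_date": us_date},
        {"name": "B", "ssn": None},  # null SSN and no visit_date at all
    ],
    required=["name", "ssn", "visit_date"],
)
```

Here both date strings normalize to the same value, the two sensitive columns are labeled separately from the public one, and the second record is flagged for its null and missing fields.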

Full Transcript

**Source:** [https://www.youtube.com/watch?v=Ixt-4T6oxk4](https://www.youtube.com/watch?v=Ixt-4T6oxk4)
**Duration:** 00:05:34

## Sections

- [00:00:00](https://www.youtube.com/watch?v=Ixt-4T6oxk4&t=0s) **Data Governance Explained with Lego** - The speaker uses a Lego analogy to illustrate how data governance ensures consistent standards, secure handling of sensitive information, and high‑quality data across an organization.
- [00:03:10](https://www.youtube.com/watch?v=Ixt-4T6oxk4&t=190s) **Model Governance: Metrics and Maintenance** - The speakers outline how defect‑free, purpose‑driven AI models require continuous testing, inspection, and specific performance metrics, such as ROUGE scores, knowledge retention, and latency, to uphold model and data governance standards.
0:00 Now that we've hit an inflection point of foundation models, machine learning models and other big data terms, you might be thinking, how do we govern these things? Well, that brings us to two very important topics. We have model governance. And data governance.

0:15 So let's start with data governance. Data governance is how organizations protect and get value from their most important asset, their data. Let me explain this using Lego. Lego bricks are designed to work together seamlessly. When you get a Lego set, you expect all the right pieces that fit together with nothing missing. That's what data governance does for our data across different teams. So let's break this out into three different blocks: consistent, secure, and high quality.

0:44 Let's start with consistent. Just like how Lego bricks are designed to fit together perfectly and connect to each other in standard ways, data governance means we need to create standards and definitions, such as how we format our dates.

0:58 So like American versus European standards?

1:01 Yes, that's exactly right. So we want to make sure we write the month and the day in the same way across our organization, so we all understand it as a shared data asset.

1:11 Moving on to secure. Just as Lego pieces come in specific bags for each section of a build, data governance enables secure data practices by classifying data like PII, or personally identifiable information. So it's something like a Social Security number that we want to keep private. And this allows us to protect our data from getting mixed up or misused.

1:33 And finally, high quality. When you build with Lego, you expect to have all of the right pieces. Data governance ensures our data is complete and of high quality by checking for things like missing values or incomplete columns. So something like null.

1:48 Awesome. So that helps clarify some things.
1:50 But can you tell me a little bit more about what you mean by secure data practices?

1:54 Yeah, of course, Anisa, I'm glad you asked. So when organizations follow data governance practices, they can meet important regulations like HIPAA or GDPR. This is especially important in health care and financial services. So, for example, in health care, if patient records get mixed up, or in financial services, if bank account information gets mixed up, there could be serious consequences. So think about building a Lego castle. If you mix up the bags, you might get the wrong pieces in the wrong places. But with data, mixing up sensitive and public data is a much bigger problem than a misplaced Lego brick. That's why we need these practices.

2:30 That makes sense. So just like how you need to use the right Lego pieces to build a sturdy castle, I need to make sure I'm using high quality, secure data to make informed decisions, right?

2:40 That's exactly right. So just as organized Lego pieces let you build a sturdy foundation, proper data governance gives you the foundation to build something reliable and valuable.

2:53 Awesome. So now imagine we're building a Lego castle, which will represent our AI/machine learning model. So using the Lego pieces from the box, model governance is like ensuring that the Lego castle we're building is built with high quality Lego pieces similar to what's down here. And we want to make sure they're free from defects or biases. Next, we want to make sure that they have a clear purpose and a functionality in mind, right? So that way it's very targeted. We want to make sure that it is also tested and ensured to meet performance standards. So we don't want it to drift and hallucinate. And finally, we want to make sure that it is regularly inspected and maintained to prevent deterioration or errors in the future.

3:40 Okay. So that all makes sense.
3:42 But what do you mean by performance standards?

3:44 Yeah, that's a great question. So there are tons of metrics out there that are associated with model governance. In fact, we are releasing metrics every single day. There are literally hundreds of thousands that exist today. One of the big ones is ROUGE. ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation.

4:06 Wow, that is a mouthful.

4:07 Yeah, tell me about it. Say it three times fast. So we use that to evaluate the quality of machine-generated summaries by comparing them against reference summaries. Next, we have knowledge retention. Knowledge retention is when we make sure that the LLM is able to retain factual information presented throughout a conversation. Makes sense?

4:30 It does.

4:31 And next, we have latency. So latency refers to the time it takes to process your input prompt and generate a corresponding output response.

4:41 Okay, so that all makes sense, but I don't understand yet how this is important to model governance and data governance.

4:47 Yeah, that's a great question. So the Lego creation represents model governance because it focuses on ensuring that the model is built with high quality components. We want to make sure governance is in place so that our tools are not biased, don't have any defects, and function as intended, or else our castle is going to fall apart.

5:05 No, that's not good. So you're basically saying that data governance is like organizing your Lego box, making sure that all the pieces are sorted, secure and ready to use?

5:15 Exactly. And model governance is like building and maintaining a specific Lego creation, ensuring the AI/machine learning model is reliable, transparent and performs as intended.

5:27 And together they give us the foundation that we need to make better decisions with our data.

5:31 Exactly. How's that for a bricktastic explanation?
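
Two of the metrics mentioned in the transcript, ROUGE and latency, can be illustrated with a minimal sketch. It assumes the simplest ROUGE variant (ROUGE-1 recall over whitespace-split, lowercased words) and a stand-in `fake_model` function in place of a real model call. The function names and example sentences are made up for illustration and do not come from the video.

```python
import time
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Share of reference words (unigrams) that also appear in the candidate summary."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    if not ref_counts:
        return 0.0
    # Clip each word's credit at the number of times the candidate contains it.
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())


def measure_latency(generate, prompt: str) -> tuple[str, float]:
    """Return the response and the seconds elapsed between prompt and response."""
    start = time.perf_counter()
    response = generate(prompt)
    return response, time.perf_counter() - start


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return "data governance keeps data consistent and secure"


reference = "data governance keeps data consistent secure and complete"
summary, seconds = measure_latency(fake_model, "Summarize the video.")
score = rouge1_recall(summary, reference)  # 7 of the 8 reference words are covered
```

In this example the machine summary misses only the word "complete", so its ROUGE-1 recall is 7/8. Production evaluations typically rely on an established ROUGE implementation and report several variants rather than a hand-rolled function like this one.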