Data Observability: Driving ROI Benefits

Key Points

  • Data observability delivers ROI by helping both data producers (engineers, platform teams) and data consumers (ML engineers, analysts, scientists) detect and resolve hidden issues throughout the data pipeline.
  • In a typical journey—ingestion → lakehouse transformation → warehouse storage → consumer access—subtle bugs (mis‑formatted records, transformation errors, duplicate loads) can silently corrupt data before it reaches analysts.
  • Without observability, engineers spend 10‑30% of their time uncovering data issues and another 10‑30% resolving them, while consumers waste effort building models on unreliable data, leading to frustration and reduced productivity.
  • Implementing an observability solution surfaces these incidents in real time, allowing engineers to focus on building pipelines and consumers to trust the data for accurate analysis and ML training.

Full Transcript

Source: https://www.youtube.com/watch?v=j8X0xiHTW54
Duration: 00:12:06

Sections

  • 00:00:00 (https://www.youtube.com/watch?v=j8X0xiHTW54&t=0s) — Data Observability ROI Benefits: the speaker explains how implementing a data observability solution boosts ROI for both data producers (engineers/platform teams) and data consumers (analysts, scientists) by improving the end‑to‑end data journey from source ingestion through lakehouse to warehouse.
[0:00] Hi everybody. Today we are exploring the specific ROI benefits of a data observability solution, building on a previous video that laid out what data observability is and why it's important in today's climate. If you haven't already watched that, hit pause now; you can find the link in the description below. And if you have, welcome, and let's get started.

[0:25] I'm going to start with an overview of a typical data journey. Before we do that, I want to call out two personas within an organization that we'll be highlighting in our example: data producers and data consumers. Producers are your data engineers and your data platform teams, whereas consumers are your ML engineers, data analysts, and data scientists. Both of these groups will see mutual benefits, as well as benefits unique to each, from implementing an observability solution.

[1:00] So, as you can see below, we have our sources, our lakehouse, our warehouse, and our access layer. The journey begins with the data engineer ingesting raw data from various sources into the lakehouse. The data engineer then transforms and loads the data into the lakehouse, performing the necessary cleansing and standardization. That data is then processed for storage within the data warehouse. The data scientist can then access that data to perform any relevant model training, analyses, etc.

[1:47] Now, this seems pretty straightforward, but what matters here is actually what we're not seeing: the risks and everything going on behind the scenes throughout this data journey.

[2:00] At the data ingestion point, your data engineer has ingested raw data from various sources; however, unbeknownst to them, a subtle issue arises during that ingestion process where certain records are misconfigured or formatted incorrectly. As the data engineer then transforms and loads the data into the lakehouse, a transformation script unintentionally introduces a bug that alters the values of a specific column, impacting downstream analyses. The cleansed data is then stored in the warehouse; however, due to a misconfiguration, certain data is duplicated during that loading process.

[2:54] Now the data scientist, who was so excited about conducting all these analyses with the data they've received, is unaware of all the incidents that occurred upstream and is now training models with inaccurate and unreliable data.

[3:11] In this current state, your data producers are entirely overwhelmed, constantly fighting fires, and your data consumers are frustrated because they can't build the models they want. Neither of these personas can focus on what they're skilled to do, because their data is unreliable.
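To make these hidden failure modes concrete, here is a minimal sketch of the kinds of checks an observability layer runs as data lands: mis‑formatted records, duplicate loads, and silently corrupted values. The column names, rules, and data are illustrative assumptions, not any specific vendor's API.

```python
import pandas as pd

# Toy batch of ingested records; all column names and values are
# illustrative, not from any real pipeline.
batch = pd.DataFrame({
    "order_id":   [1001, 1002, 1002, 1003],
    "order_date": ["2024-05-01", "05/02/2024", "2024-05-02", "2024-05-03"],
    "amount":     [250.0, 99.5, 99.5, -10.0],
})

# Mis-formatted records: dates that fail to parse in the expected format.
parsed = pd.to_datetime(batch["order_date"], format="%Y-%m-%d", errors="coerce")
bad_format = batch[parsed.isna()]

# Duplicate loads: the same business key appearing more than once.
duplicates = batch[batch.duplicated(subset="order_id", keep=False)]

# Silent value corruption: amounts outside the expected range.
out_of_range = batch[batch["amount"] < 0]

for name, issues in [("mis-formatted dates", bad_format),
                     ("duplicate order_ids", duplicates),
                     ("negative amounts", out_of_range)]:
    if not issues.empty:
        print(f"ALERT: {len(issues)} row(s) with {name}")
```

In the scenario above, none of these checks exist, so each issue passes silently downstream to the data scientist.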
[3:33] So let's now try to associate some numbers with this common scenario.

[3:40] A typical engineer will spend roughly 10 to 30% of their time just uncovering data issues, and again between 10 and 30% of their time resolving those issues. So let's say 20% for each. Based on a 40-hour work week, we work about 1,920 hours annually. If we multiply that by the combined 40%, data engineers today are spending approximately 768 hours just identifying and resolving data issues.

[4:34] Let's break that down a little further. Say a data engineer has an average annual salary of $100K; that correlates to about $52 per hour. If we multiply $52 per hour by those 768 hours, that is roughly $40,000 being spent just detecting and resolving data issues. I think we can all agree that is not a very good use of time.

[5:07] Data engineers need to be able to detect things earlier, especially unknown data incidents. When they are reactive in nature, they're forced to rely on their data analysts or data scientists to uncover data issues. This often means that data quality problems are discovered too late, or in fact not at all. Data observability is a more shift-left approach: detecting issues as they occur at the source and allowing you to resolve them before they ever reach that access layer.

[5:45] The outcome of this, and where we see three core improvements with a data observability solution, is in mean time to detection, mean time to resolution, and overall data quality. With a data observability solution in place, mean time to detection becomes almost instantaneous; most alerts fire in real time. Improving mean time to resolution is all about helping the data platform teams quickly walk through the context of the problem, such as where the problem is occurring and why, and then resolving it as quickly as possible. The last core metric is overall enhanced data quality.

[6:52] All these things roll together to improve things for your data consumers. Data scientists and ML engineers rely on this high-quality data to do their high-value tasks, like training and deploying accurate models. When data producers use an observability tool, it helps establish trust in the data, letting consumers focus on those high-value tasks rather than wasting time on activities like uncovering bad data and sending it back to the producers.

[7:23] We talked earlier about how we got to this 40% number, so I'm just going to jot the figures down here so that we recall them: the average time spent on detection is between 10 and 30% (we're saying 20%), the same again for resolution, and currently data quality is low and untrusted.
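As a rough illustration of how the first two metrics are computed, here is a minimal sketch assuming a simple incident log with occurrence, detection, and resolution timestamps. The record structure and values are hypothetical, not from any particular observability product.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident log: when each data issue actually began,
# when it was detected, and when it was resolved.
incidents = [
    {"occurred": datetime(2024, 5, 1, 9, 0),
     "detected": datetime(2024, 5, 1, 14, 30),
     "resolved": datetime(2024, 5, 2, 10, 0)},
    {"occurred": datetime(2024, 5, 3, 8, 15),
     "detected": datetime(2024, 5, 3, 8, 16),
     "resolved": datetime(2024, 5, 3, 11, 0)},
]

def hours_between(start, end):
    return (end - start).total_seconds() / 3600

# Mean time to detection: average gap from occurrence to detection.
mttd = mean(hours_between(i["occurred"], i["detected"]) for i in incidents)
# Mean time to resolution: average gap from detection to fix.
mttr = mean(hours_between(i["detected"], i["resolved"]) for i in incidents)

print(f"Mean time to detection:  {mttd:.1f} h")
print(f"Mean time to resolution: {mttr:.1f} h")
```

Real-time alerting drives the first gap toward zero; root-cause context shrinks the second.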
[7:46] So now let's re-explore this example of a data journey, but this time imagining that you have a data observability solution in place.

[7:57] Now, at your data ingestion site, at the source, a data observability solution immediately flags any improperly formatted records during ingestion. Data engineers receive alerts in real time, allowing them to rectify the issues promptly.

[8:22] Once we get to the transformation and loading stage, where previously we had a bug that altered values in a specific column, these issues are instantly identified. Your data engineers receive a notification, allowing them to correct the bug and ensure it doesn't impact any downstream analyses. With this in place, the data engineer resolves the transformation issue in real time.

[8:52] In the data warehouse, we had some misconfigurations that caused data duplication. These are now detected in real time during the loading process, and again the alerts notify the engineers, allowing them to make the necessary configuration adjustments and prevent redundant data in the warehouse.

[9:14] Lastly, at our access layer, our data scientist is now working with a more transparent and reliable data set, therefore encountering fewer anomalies during model training and experiencing a smoother, more efficient training process with improved data quality.

[9:35] So, for each of these stages, the alerts have fired and everything is quickly resolved, resulting in better data quality at the end.

[9:48] If we bring this back to our little calculation: by implementing a data observability solution, your mean time to detection can be reduced from 20% to 1% — essentially real time. Your mean time to resolution will be improved by 2x, by surfacing the root cause analysis and exposing any downstream impact; since this is 2x, resolution drops to 10%. And lastly, your data quality, which was previously very low and untrusted, is now very high and trusted.

[10:37] Now, if we come back to the earlier example, where we were talking about how much time a single engineer spends just uncovering and resolving data issues: by implementing a data observability solution, and using these new numbers around mean time to detection and mean time to resolution, you will be saving your data engineers 680 hours of work annually. If we take that back to the average data engineer rate of $52 per hour, that results in approximately $35,000 in cost savings annually for a single data engineer.

[11:29] So imagine the possibilities of what you could save with a data observability solution with an engineering team of 10, 50, 100. Well, lucky for you, you don't actually have to imagine it: you can just click the link in the description below this video and find out just how much ROI a data observability solution can bring your organization today.

[11:55] If you liked this video and want to see more like it, please like and subscribe. If you have questions, please drop them in the comments below.
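For readers who want to plug in their own numbers, here is a minimal sketch of the video's back-of-envelope arithmetic. The salary, percentages, and hours-saved figure are the video's stated assumptions, not benchmarks.

```python
# All inputs below are the video's assumptions, not benchmarks.
ANNUAL_HOURS = 1920                    # 40-hour weeks, ~48 working weeks
HOURLY_RATE = 100_000 / ANNUAL_HOURS   # ~$52/hour on a $100K salary

# Status quo: ~20% of time detecting + ~20% resolving data issues.
firefighting_hours = ANNUAL_HOURS * (0.20 + 0.20)        # 768 hours
firefighting_cost = firefighting_hours * HOURLY_RATE     # ~$40,000
print(f"Firefighting cost per engineer: ${firefighting_cost:,.0f}/year")

# With observability: the video quotes roughly 680 hours saved per
# engineer per year once detection is near real time and resolution
# is about 2x faster.
HOURS_SAVED = 680
savings = HOURS_SAVED * HOURLY_RATE                      # ~$35,000
for team in (1, 10, 50, 100):
    print(f"Team of {team:>3}: ~${savings * team:,.0f} saved annually")
```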