ChatGPT‑5 Won’t Solve Data Readiness
Key Points
- The speaker argues that most AI challenges faced by businesses are rooted in human and organizational factors, not shortcomings of the models themselves.
- Data readiness is identified as the single biggest obstacle—roughly 78 % of firms cite poor‑quality, unstructured data as the reason AI projects stall, and no LLM can magically fix messy inputs.
- Relying on “magic wand” thinking—simply dumping raw documents into a model—fails because the data lacks semantic organization, sub‑corpora, and clear meaning, which are essential for effective AI outcomes.
- Consequently, expectations that a future model like ChatGPT‑5 will automatically solve these data‑related problems are unrealistic; disciplined, often “boring” data‑preparation work remains the critical path to success.
Sections
- Why AI Won’t Fix Data Problems - The speaker explains that despite hype around ChatGPT‑5, most current AI challenges stem from human and organizational factors—particularly poor data readiness—and cannot be solved by any new model alone.
- Aligning AI Projects to KPIs - The speaker emphasizes that AI models are merely components of a broader workflow and that successful AI initiatives depend on defining clear, measurable business objectives tied to important KPIs to prevent vague or shifting goals.
- Beyond Off‑the‑Shelf LLMs - The speaker cautions that relying on generic foundation models for niche back‑office tasks is misguided, recommending a focus on data quality, constraints, architecture, prompt engineering, and retrieval‑augmented techniques instead of attempting costly model training.
- Overhyped AI Needs Change Management - The speaker warns that businesses often demand impossibly high AI accuracy, overlook necessary human fallback and change‑management processes, and consequently fail to capture the technology’s true value.
- Prioritizing AI Security Over Speed - The speaker argues that rushing AI deployments without addressing security and privacy risks is unacceptable, emphasizing that compliance is straightforward and must be addressed from the outset regardless of a startup’s urgency or an enterprise’s timeline.
- Proper Integration Over Hasty Adoption - The speaker warns that merely inserting a powerful AI model like GPT‑5 into an unprepared business will fail, emphasizing the necessity of solid change‑management foundations before chasing rapid tech upgrades.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=sf_OYY9lMlw](https://www.youtube.com/watch?v=sf_OYY9lMlw)
**Duration:** 00:21:25
- [00:00:00](https://www.youtube.com/watch?v=sf_OYY9lMlw&t=0s) **Why AI Won’t Fix Data Problems**
- [00:05:09](https://www.youtube.com/watch?v=sf_OYY9lMlw&t=309s) **Aligning AI Projects to KPIs**
- [00:08:28](https://www.youtube.com/watch?v=sf_OYY9lMlw&t=508s) **Beyond Off‑the‑Shelf LLMs**
- [00:12:32](https://www.youtube.com/watch?v=sf_OYY9lMlw&t=752s) **Overhyped AI Needs Change Management**
- [00:17:20](https://www.youtube.com/watch?v=sf_OYY9lMlw&t=1040s) **Prioritizing AI Security Over Speed**
- [00:20:38](https://www.youtube.com/watch?v=sf_OYY9lMlw&t=1238s) **Proper Integration Over Hasty Adoption**
Today I want to talk to you about the
things ChatGPT‑5 will not fix. What?
Why? I don't have a special magic
magnifying glass and a magic time
machine that allow me to go and examine
closely what it will be like in the
future. Instead, I have a thorough
understanding based on lots of company
and boardroom experience of how business
is actually using AI. And I know that
most of the issues that we're seeing
today with AI are human and
organizational problems, not AI
problems. And so when the model makers
make these big claims about how
incredible their new models are, I
always filter them back through the
actual organizational realities. I want
to go through with you a few of the
specific problem areas that have very
boring solutions that I see over and
over and over again in companies that
ChatGPT‑5 is not likely to magically
fix. Number one, this is the biggest
single one I see. Magic wand thinking
about your data over and over and over
again. I see the fallacy: we just thought we could give the data to whatever LLM they're using (they're on the Azure cloud, they're using Copilot, they're using Gemini, they're using ChatGPT) and it would fix it. No, it will not fix it.
In fact, the 80/20 rule is almost literally true here: 78% of firms that struggle with AI, according to TechRadar, point to data readiness as the root cause. Data readiness is not something that an LLM will magically fix. There is a school of
thought in the more advanced sort of
researchy side of AI that eventually
this will get fixed because AI will
become so good at recognizing the mess
of human data and have such big context
windows and so much processing power
that it will just read over all of this
mess and magically make sense of it for
us. And that is often voiced as if it is
present and here now and able to do that
work today. None of those things are
true. And there's a lot of concrete
reasons why even if that were true, it
would still be a bad idea to give your
AI bad data. The cleaner your data
inputs, the more likely you are to have
a strong AI experience. Do not use magic
wand thinking about your data. I have
been in situations where people will
tell me, "What is wrong with my data?"
And I will look at the data and it has
no semantic meaning to an LLM. It's just
a blob of data that is, like, uncategorized and unorganized. That's the issue right there.
Like you don't have to look farther for
a problem at that point. Like if you're
telling me that you have thousands and
thousands and thousands of documents
with this sort of undifferentiated like
gigantic blob of text and you're
expecting it to make sense of all of it,
it's not going to because you haven't
given it any sense of the semantic
meaning in the larger data structures.
You have no subcorpus
semantic structures to work with. And by
corpus I mean the whole collection of
documents. You should have some sense of
meaning. So, for example, if it's in a wiki: maybe your internal wiki has different sections that have some semantic meaning. And then there are article titles that are worth separating out; they give you a sense of where you are in the wiki. And
that's just a very simple example. If
you're dealing with official documents
like health records, you're going to
have semantic meaning for the patient
name and semantic meaning for the diagnosis and so on. The
more you get really really clear about
what data you want to convey, the easier
it is going to be to actually use AI to
pull the data. The second issue I see is closely correlated: magic wand thinking about models. People tend to assume they need faster, better, stronger reasoner models. I am a big advocate of moving your daily driver off of ChatGPT‑4o onto a stronger model. I have said so explicitly: go to o3, go to Gemini 2.5 Pro, go to Opus 4. But as much as that's
helpful for personal productivity, it
does not mean that every single aspect
of an AI job has to be done by the best
reasoner model available. Look, if you
just want to get columns sorted
correctly in a PDF, it does not have to
be sorted by the best reasoner model on
the planet. If you just want to
carefully go through a nicely delineated
data set and extract all of the values
associated with a particular firm, that
doesn't necessarily have to be a
reasoner model either. In fact, that
might not even be an LLM. That might
just be a fancy SQL query. People tend to overpower their pipelines, and they pay for it. They're basically paying a Ferrari premium. They're paying
so that they feel like they've got the
best model, which they think is what
intelligence is. But really,
intelligence is well organized data, the
right model applied against that data
with the right queries, the right guard
rails, and the right evals surfaced in a
way that a human can find useful. Does
that make sense? It's the right data.
It's the right model constrained in ways
that enable it to do useful work
surfaced in a way that a human can
understand and use. That's how
intelligence actually works in the
workplace. And do you notice how small a
role a model plays in that? ChatGPT‑5
may be the best Ferrari in the business
when it comes out, but it's a tiny part
of that overall flow of value. And so
you have to think more broadly if you
are trying to build interesting AI work.
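The "fancy SQL query" point above can be made concrete. A minimal sketch, using an in-memory SQLite table (the table name, firms, and figures are all invented for illustration), showing that extracting every value for a particular firm from a well-delineated data set needs no LLM at all:

```python
import sqlite3

# Hypothetical table of per-firm metrics; in practice this would be your
# existing, well-structured data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filings (firm TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO filings VALUES (?, ?, ?)",
    [("Acme", "revenue", 120.5), ("Acme", "margin", 0.31),
     ("Globex", "revenue", 98.2)],
)

# The "fancy SQL query": all values associated with one firm.
rows = conn.execute(
    "SELECT metric, value FROM filings WHERE firm = ? ORDER BY metric",
    ("Acme",),
).fetchall()
print(rows)  # [('margin', 0.31), ('revenue', 120.5)]
```

No reasoner model, no token costs, fully deterministic, which is the speaker's point about not overpowering the pipeline.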
The third issue is vague or shifting
objectives. A human problem, again and
again and again and again. You need to
have a meaningful business KPI that you
are nailing your AI project to that
solves a specific business problem and
the business problem has to matter. If
your business problem doesn't matter,
I've seen over and over again people
give up, they walk away. If your
business problem matters but isn't tied
to a KPI, so it's just annoying to a small section of the team, it's also
not going to get prioritized. It has to
matter to the organization, be
measurable in a business KPI, and you
have to nail that objective down
publicly every time you can in order to
keep the objective from moving and
shifting when you inevitably run into
problems. AI projects are a series of
nested problem sets that you continue to
solve until you actually get to value.
And you're not going to have the
patience to go through those nested
problem sets if you don't have a clear
KPI that the business cares about, that the C‑suite cares about, and that they are going to move. That's really what leads people to persist. It's what leads
teams to persist. The fourth issue I see
is treating AI strategy as separate from
business strategy. It's a subtle one.
Again, a model doesn't fix this. AI
strategy cannot be separate from
business strategy if you want to avoid
wasting budget. If you want to actually
make progress on becoming an AI native
business, you cannot have the AI
strategy in the corner with the AI guy
or AI gal. And that's what they do. You
cannot just do AI as a project. You
actually have to integrate AI strategy
into your business strategy in a way
that makes sense, which requires
executives taking the time to understand
how large language models work at a
fairly granular level so that they can
see the leverage points in their
business where it applies. Let's say
that you are working in a business that
you don't think has anything to do with
AI. Let's say customer service is
paramount in this business. It's all
about putting your people in front of
the customer with a white glove
experience. I see executives in those
situations often tell me, I mean, AI
might be useful here and there, but it's
not transformational for our business.
Like, we're not buying the hype. Wrong.
There are going to be people in your
business and in competitor businesses
that see opportunities that you will
miss because of that attitude. You need
to recognize that just because your
value proposition is very resilient to
AI, like congratulations. Like you have
a human touch, that's going to be very
valuable. Love that. But for the back office piece, there is no reason it shouldn't have an entire AI strategy focused on cutting down your business KPI costs.
And by the way, that doesn't just mean
firing people. That may also mean simply
keeping track of all of the items that
you sell more efficiently or being able
to actively query against your data sets
more efficiently or being able to
thoughtfully price more efficiently.
There's a half a dozen things you can do
in the back office. It's document
management, right? Like it's a very
boring thing, but it becomes a really
critical piece of running the business
well and AI can help. Number five, over
relying on offthe-shelf foundation
models. People think that a generic LLM
will fit every niche domain and if it
doesn't, they assume immediately they
have to train their own model, but they
don't even know what training their own
model means. Like I've had people in
almost every conversation say, "Do we
need to train a model for this?" And in
a sense, I sort of blame the model
makers. They've made it feel believable
and they've made it feel plausible to
train the models. It's not an easy task
to do. I don't recommend it. Instead,
think about your data and the
constraints and guardrails that enable
your model to flourish. Think about your
architecture. Think about your prompt
engineering. Think about whether you
need a data set that's formulated for RAG (retrieval‑augmented generation) or not. Think about
the degree to which you need to help the
model have the context to process the
job appropriately and what providing
that context means. But people don't.
They say, well, the model should know
it. Like there's again this fallacy
that the model is intelligent because of
what the model knows when in reality the
model at a granular level is intelligent
because of the way it transforms. These
are transformer-based architectures and
the intelligence comes from the way it
transforms and predicts the next token.
It is actually not what it magically
knows. We think that because of the way
it's trained and reinforcement learned
to be helpful, but it's not actually
true when you're building production
systems. Number six, ignoring
integrations and ignoring operations for
AI. Teams will demo a proof of concept
and they will discover they have no
eval. They have no monitoring. They have
no rollbacks. They don't have a way for
the model to get pulled back out of
production if there's an issue. They
don't know what the bar is to reach production. If it reaches production, they don't know how to monitor it. They
don't know what tool sets they are
integrated with and therefore what
vulnerabilities they have if those tool
sets change. They don't know how they're
going to refresh the underlying data
sets. They just think again the model
will do it. If we put the model into
production, the model will solve the
problem. That is exactly the mindset
that had Air Canada in court over a bereavement policy that their AI made up. You cannot ignore AI
operations. In fact, I would argue if
you want to look for a career path,
getting into AI operations, figuring out
how to stably and safely deploy AI in
production and pull it back and handle
sandboxes, it's a big deal. It is not a
trivial thing. It is something that most
organizations ignore at their peril. And
I have to remind people over and over
again that this has the
characteristics of software. You cannot
deploy it as if it is not software and
expect it to work.
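The missing pieces just named (evals, a bar for production, a rollback path) can be sketched in a few lines. This is a toy illustration, not a real MLOps framework; the function names and the 0.9 bar are assumptions:

```python
def run_evals(model, eval_set):
    """Score a model against a fixed eval set; returns fraction correct."""
    correct = sum(1 for prompt, expected in eval_set if model(prompt) == expected)
    return correct / len(eval_set)

def deploy_gate(candidate, current, eval_set, bar=0.9):
    """Promote the candidate only if it clears the bar; else keep (roll back to)
    the current production model. Returns (chosen_model, score)."""
    score = run_evals(candidate, eval_set)
    return (candidate, score) if score >= bar else (current, score)

# Toy stand-ins for model versions: one answers correctly, one does not.
good = lambda p: p.upper()
bad = lambda p: p
evals = [("ok", "OK"), ("go", "GO")]

chosen, score = deploy_gate(good, bad, evals)
print(score)  # 1.0
```

The same gate, run continuously against production traffic samples, is what distinguishes "we demoed a proof of concept" from actually operating AI as software.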
Okay. Number seven. Similarly, no human in the loop. This was famously the Klarna story, but I see it in other cases where you have overeager CEOs who have read the LinkedIn hype and are like, "We don't need these people. I want to hire you so that people will lose their jobs and I can cut this team," and they don't realize that Klarna had to rehire their CS team. They don't
realize that if you do not have an
ability to go to a human being and
actually find out what the real truth is
when AI goes off the rails, you're
inviting hallucinations, you're inviting
compliance breaches, you're inviting
brand damage, you're inviting a customer
experience that costs you the heart of
the business. And so it's really, really
important when you design systems to
anticipate non-happy paths. This is just
what we drill in if we're in product
management. Don't just anticipate the
happy path. Anticipate the miserable
path. How do you make that more
graceful, less miserable, more likely to
retain you, the customer? Similarly with
AI, when the AI goes wrong and the human
knows it at the end of the conversation
and the AI is not admitting it, how do
you get the human help? People don't
spend enough time thinking about that.
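Designing for the non-happy path can be as simple as a confidence-threshold router. A minimal sketch, where the threshold value and the response shape are illustrative assumptions rather than any real system's API:

```python
def route(answer, confidence, threshold=0.87):
    """Send confident AI answers straight to the user; escalate the rest
    to a human agent with the AI's draft attached."""
    if confidence >= threshold:
        return {"handled_by": "ai", "answer": answer}
    return {"handled_by": "human", "draft": answer}

print(route("Refund approved.", 0.95)["handled_by"])  # ai
print(route("Refund approved.", 0.40)["handled_by"])  # human
```

The hard part is not the routing code; it is measuring what your model's real accuracy is, picking the threshold that matches your risk tolerance, and staffing the human side of the escalation.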
They expect the AI to be 100% accurate
when they would never expect a human to
be 100% accurate. That's not a
reasonable bar. Something can be
tremendously useful and only 87%
correct. And so depending on your
application, you may be in a situation
where the AI is 87% correct and you need
a human for the other 13% and your job
is to design a system that switches
cleanly between those two use cases. I
see very little investment from most
businesses in figuring out that number
and how you solve that problem. Number
eight, underinvesting in change
management. Massive issue. People just
assume that if they put the AI in front
of the team, it's going to magically
work. Again, model makers are somewhat
guilty here. They try and tell you over and over again, like the leaks we've seen on ChatGPT‑5 this week. Leak after leak after leak: it's the best thing since sliced bread, it's going to be incredible, it's amazing. People are
going to go through this cycle all over
again where they think that they can
just give organizations ChatGPT‑5 and it
will magically do wonders for their
bottom line. And that may be convenient
for the model makers from a sales
perspective, but it's not true. You have
to go through a change management and
upskilling process to get people using
AI. Otherwise, the chatbot loads nicely
and they interact with it for like two
or three basic tasks a week and you
don't come close to realizing the power
of what the model can do for you. Not
close. And yet, most organizations are
investing more in the model and more in
the AI technical stack than they are in
the people. They're not investing in the
change management. They're not investing
in the upskilling, and, to be fair, it's like pulling teeth to get them to do that because no one talks about it,
right? Like the model makers are
emphasizing the tech, the tech, the
tech, the tech. And of course, you're
going to listen to them and think it's
the tech. And you're not thinking about
it from the fact that this is a new
general purpose technology. We need
people change in order to usefully take
advantage of this new technology. It's
not like you can expect that people will
be put on what is effectively an
entirely new digital assembly line and
told to just figure it out and it's just
going to magically work. Like we would
not do that in a factory. Why would we
do it here? And yet that's what we're
doing. Number nine, forgetting the total
cost of ownership. So often people just
they don't think about the token costs.
They don't think about the developer
sustainment cost. They don't think about the hit-by-a-bus problem, where one developer is doing all of this, and if that developer, God forbid, has something happen to them, then they're done. They don't think about the
sustainment cost of evaluating these
models in production. None of that.
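Token costs, at least, are easy to estimate up front. A back-of-the-envelope sketch; the per-million-token prices below are placeholder assumptions, not any vendor's actual rates:

```python
def monthly_token_cost(requests_per_day, in_tokens, out_tokens,
                       price_in_per_m=3.0, price_out_per_m=15.0, days=30):
    """Estimate monthly inference spend in dollars, given average input and
    output token counts per request and prices per million tokens."""
    per_request = (in_tokens * price_in_per_m + out_tokens * price_out_per_m) / 1e6
    return requests_per_day * per_request * days

# Example: 10,000 requests/day, 2,000 tokens in, 500 tokens out per request.
print(round(monthly_token_cost(10_000, 2_000, 500), 2))  # 4050.0
```

Five minutes of this arithmetic, repeated for developer sustainment, eval infrastructure, and data refreshes, is the total-cost-of-ownership calculator most organizations are missing.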
They're just like, can we get this to
production? If we can get it to
production, great. We'll worry about the
rest later. Because there's such a forced-march, risk-on approach to beating your competitors to market. And I get it. It is an existential imperative. Like
you have to be able to get AI into
market. So I understand the incentive
there. I don't think they're incorrect.
But understanding cloud inference costs,
understanding how vector DB queries work, understanding how cost guard
rails can be maintained is really
important or you can be upside down on
your margins really fast. And that is
just for serving the model if you're
serving it to customers. If you are
serving it internally, it is also an
issue of making sure that you're
extending your use cases across more of
the internal footprint in a cost
sustainable manner. If you are trying to
build subsequent systems, it is also
recognizing that AI systems take more
sustaining in production than
traditional software. You have to
evaluate them continuously. You can't
just test them in QA and forget them. In
fact, I would argue the 80/20 ratio
flips. And 80% of your time should be
spent looking at production use cases
because you can't adequately test the
model inherently for all use cases
before you launch. You have to hit a
certain threshold and say we're going to
launch and keep evaluating. And that
means more sustainment cost that no one
tends to factor in. The total cost of
ownership is largely a broken calculator
at most of the organizations I work
with. It's bad. And again, the model makers say that tech will fix it, implicitly and explicitly, over and over and over again. And they're sort of
victims of their own success because
normally when you say that people
discount what you say because everyone
has 70 years of advertising in western
markets in their heads and they always
discount claims. But these guys have
actually done it right. They've taught
the rocks to think. They've developed an
incredible new general purpose
technology. It's really, really, really
good. And so our discount disappears
with model makers. And we think, well,
maybe they're actually right. Maybe
they're not hyping it. maybe this is
really this good because they've done
this incredible job with this product,
which they have. Like none of this
should be taken to say that like these
LLMs aren't incredible and can't do
great work. It's about how you set them
up to do great work in business. And
that's the part where there's a missing stair. People miss that. The last one I want to call
out, number 10, security and privacy
shortcuts. We'll worry about where the
data lives later. That's a classic one.
I'm not sure what the security
requirements are. Let's just get
started. Let's get it up and we'll
figure it out. Or I haven't looked at
the terms of service for the vendor. I
don't know how they relate to the
foundation model maker. I'll let someone
else figure that out. You can't do that.
This is one of those things where you
have to know the story of the data from
day one because the risk of misuse is
too high. This is not a case where it
really helps you to go faster to ignore
those things. And partly it doesn't help
you to go faster because solutions
exist. You can read the terms of service
quickly and easily. You can quickly
understand which secure cloud
environment you want to deploy into. You
can become compliant and secure with
relatively little effort two and a half
years into the AI revolution. So there
is no excuse. You can't say going faster
is a reason for this. You have to take
security and privacy seriously. And
again, this is one where businesses are of two minds. The scrappy, startupy ones in a desperate position tend to be like, "You know what, we'll deal with it later." Whereas the ones that are more
enterprisey tend to be like we'll do
security and privacy and we'll do
nothing but security and privacy for 6
months and then eventually we'll go on
to the next thing. That is its own risk
because it's actually not that hard to
deploy a secure and private AI at this
point. It does not take 6 months in most
cases for most footprints. And so if
you're spending that long on it, you are
probably using what I would call, borrowing from Amazon, "day two" thinking, where you're looking at it as a process rather than looking to the outcome you want to drive. So it is possible to take this last, tenth one and push it too far the other way and be over-obsessed
with security and privacy. And I say
that because we know this is an issue
and the cloud providers are heavily
incentivized to make this something that
is solvable by businesses at scale very
quickly because they want your business.
Google Cloud and Azure see this as the
best chance they've had in years to
steal business from AWS because they are
farther along on the AI side. They are
absolutely going to be obsessed with
delivering for you a secure private
environment to your spec. So there's no
excuse for that. Like you should get it,
you should get it quickly and then you
should move on. Okay. I hope that you
look through these 10, which by the way,
these 10 are not an exhaustive list.
There's other stuff too. The CEO not
knowing how to use AI is my favorite
secret 11th one. Like that is a problem.
I've mentioned it before. If the CEO
doesn't know how to use AI, it's really
hard to drive AI transformation. Period.
It's not a general purpose technology
that works if half the people at the top
of the chain don't really know how to
use it. There are lots of other
examples. I want you to look at these
examples and I want you to notice how
many of the examples here are people and
process and how many of them are
technical architecture, not just models
or data, not just models. Again and
again and again, I've called that out.
This is why I continue to say new models
are great. I'm glad we're getting them.
It's a phenomenal time to be alive and
working. But they're not magic bullets.
They don't magically solve everything.
Instead, they help you when you have
good architecture, when you've solved
the problems I've outlined here to make
more return on investment than you would
otherwise. A better model in a clean
data environment with an excellent human
in the loop safety net with good MLOps
deployment practices with a good AI
strategy is going to go farther, right?
Like, it's like putting an engine in a properly constructed car: you're actually going to be able to get the most out of it, as opposed to the people who are trying to jam the engine into a janky Model T Ford and be like, "Well, we've got a Formula 1 engine in here. You know, just floor it, right? I'm sure it'll work." No, it's not going to work. Take the time to
set your business up to be ready for new
models. And by the way, ChatGPT‑5 is
going to be great, but there's going to
be another model along in a month, two
months, 3 months. We are in the middle
of an exponential curve. And so, we're
going to see more exponential
improvements. That is why it's so
important to focus on these durable
aspects of change management for
business effectively. There's no other
substitute. I hope that you have enjoyed
thinking about all the ways that ChatGPT‑5 will not solve all your problems
magically. Cheers.