Three Questions to Vet AI Tools
Key Points
- The market is flooded with over 100,000 AI tools, most of which add complex integration points and failure modes that can be harmful if an organization isn’t prepared to sustain them.
- Successful AI adoption hinges on asking three critical evaluation questions, starting with whether the tool directly eliminates a clearly measurable pain point.
- A concrete example is Lera Guard, which mitigates prompt‑injection attacks in production AI systems, illustrating disciplined tool selection based on a specific risk.
- For personal productivity, Nessie Labs offers a Mac app that consolidates chats from Claude, ChatGPT, and Perplexity, showing how targeted tools can solve niche user problems.
Sections
- Evaluating AI Tools Effectively - The speaker warns that most AI tools increase integration complexity and failure risk, and outlines a three‑question framework—starting with whether the tool alleviates a measurable pain—to responsibly assess any AI solution.
- Evaluating Tool Adoption Viability - Deciding whether a tool like Nessie fits requires clearly identifying the problem it solves and carefully weighing the effort, behavioral changes, integration complexity, and long‑term support needed to sustain it.
- Evaluating Unregulated AI Tool Spending - The speaker cautions against the unchecked VC‑driven rush to buy AI tools, stresses the importance of assessing usefulness, and spotlights two niche solutions—LERA for prompt‑injection protection and Nessie as a personal AI knowledge‑base.
Full Transcript
# Three Questions to Vet AI Tools **Source:** [https://www.youtube.com/watch?v=vDtwS1w16K4](https://www.youtube.com/watch?v=vDtwS1w16K4) **Duration:** 00:08:53 ## Summary - The market is flooded with over 100,000 AI tools, most of which add complex integration points and failure modes that can be harmful if an organization isn’t prepared to sustain them. - Successful AI adoption hinges on asking three critical evaluation questions, starting with whether the tool directly eliminates a clearly measurable pain point. - A concrete example is Lera Guard, which mitigates prompt‑injection attacks in production AI systems, illustrating disciplined tool selection based on a specific risk. - For personal productivity, Nessie Labs offers a Mac app that consolidates chats from Claude, ChatGPT, and Perplexity, showing how targeted tools can solve niche user problems. ## Sections - [00:00:00](https://www.youtube.com/watch?v=vDtwS1w16K4&t=0s) **Evaluating AI Tools Effectively** - The speaker warns that most AI tools increase integration complexity and failure risk, and outlines a three‑question framework—starting with whether the tool alleviates a measurable pain—to responsibly assess any AI solution. - [00:03:06](https://www.youtube.com/watch?v=vDtwS1w16K4&t=186s) **Evaluating Tool Adoption Viability** - Deciding whether a tool like Nessie fits requires clearly identifying the problem it solves and carefully weighing the effort, behavioral changes, integration complexity, and long‑term support needed to sustain it. - [00:06:12](https://www.youtube.com/watch?v=vDtwS1w16K4&t=372s) **Evaluating Unregulated AI Tool Spending** - The speaker cautions against the unchecked VC‑driven rush to buy AI tools, stresses the importance of assessing usefulness, and spotlights two niche solutions—LERA for prompt‑injection protection and Nessie as a personal AI knowledge‑base. ## Full Transcript
You know, there are more than a 100,000
AI tools out there and most of them are
going to be useless. And in fact,
they're going to be actively harmful.
Let me explain why. If you add any AI
tool to your system, you are adding at
least two new handoff and integration
points, not to mention a whole host of
failure modes. Because generative AI
products, they're more complex. They
solve things that are harder to solve.
And so, there are more ways they can
fail. And so unless you are bought in
and ready to sustain the product, you
are buying yourself a load of failure.
And that is why we see study after study
coming out showing companies investing
in AI tools and being disappointed by
what they buy. Sometimes it's not even
just the tool itself. It's the fact that
the organization isn't ready. And so
today I want to walk you through the
three critical questions that I ask when
I'm looking at AI tooling so that you
have a framework. Then I want to show
you a couple of tools that I think are
worth thinking about for specific pain
points that you should ask yourself
those questions for. And if you want to
go deeper, I've got a whole load of
tools to review over on the substack 45
or so that you can dig into that I've
started to ask these questions for.
Frankly, I think we should be asking
these questions whenever we evaluate a
tool. Question number one, does it kill
a pain that we can measure? So when you
are trying to find an AI tool, so often
you think about hopes, you think about
dreams, you think about how far you can
go with the tool, the vendor sells you a
lot of cool stuff. Do you have a
specific painoint? Do you have something
that is absolutely crystal clear? So as
an example, Lera Guard, which I'm going
to show here in a second, it cuts down
prompt attacks. That is its purpose. It
stops prompt injection attacks. Maybe
not perfectly, but a lot of them. If you
have a production AI system, you may
want to think about a tool like Laragard
because of that. And I'm going to show
these tools at the end of the video. So,
we're going to stay with the principles
here. We'll get to the tools at the end.
The point is simple though, right? I I
can name a painoint. I can say I have
prompt injection risk. Therefore, I need
a tool to address that. Therefore, I
need to review some vendors to go after
it. I rarely see that level of
discipline from people who are shopping
for tools, whether they're individuals
or whether they are larger companies.
Either way, that I'll give you another
example that's individual sort of
focused. I want to keep track of my
chats when I have them with Claude, when
I have them with chat GPT, when I have
them with Perplexity. But I I don't have
one place to do that. Well, there is a
startup that's addressing that now. It's
called Nessie Labs. and they have a
product out for Mac called Nessie and
that's exactly what it does. It imports
your chat GPT chats. If you use Chrome,
it's going to work with you to keep
track of your chats. It is laser focused
on that specific painoint. But again,
it's not perfect. It does not work if
you use the clawed app builtin. It
doesn't automatically keep track of your
chat GPT chats that are not in Chrome.
So there are weaknesses, but you can't
assess whether it's right for you or not
if you don't know very specifically what
the pain is that you're trying to solve.
If you really care about getting all
your chats in one place and organizing
them and you can name that pain, then
maybe it's worth it. Maybe that's the
right one to go after. Question number
two, can we integrate and sustain this
tool? So you need to map out the effort
of change. If it's an individual tool
like Nessie, the effort is a change in
behavior. Maybe you're using another
browser, not Chrome. Maybe you're used
to using the desktop apps for AI. Maybe
you're not ready to do the work of
exporting a zip file of old chat GPT
chats to get the organization started
inside Nessie Labs to have your memory
layer. But you have to decide if that's
worth it. You have to decide if the cost
of sustaining that over time of changing
your behavior over time is worth it. If
you are installing an enterprise tool,
it's of course much more complicated.
Your teams will need training. They will
need to understand edge cases where the
tool doesn't work. Your IT department is
going to have to support it. It's
exponentially more complex. And every
single tool you add adds edges that you
have to sustain. It touches other tools
in your ecosystem. and it touches other
teams. Have you mapped that out? Are you
ready to sustain the tool? Good tools
are going to make it as easy as possible
to own setup, to tune your alerts, to
figure out what ongoing maintenance
looks like in a way that is sustainable
for your business. Tools that are poorly
constructed assume most of the work for
figuring that out falls on you. That's
why I'm a big fan of looking at
documentation when you want to figure
out what tools work. Number three, when
you're asking yourself about tools, ask
yourself, what is the worst failure mode
here? And can we stomach it? Now,
individuals sort of get away with one
here. The worst failure mode for
individuals is usually not too bad. You
can actually just look at a particular
tool set and say, "Yeah, you know what?
I'm going to try Nessu Labs. It's going
to be fine. The worst thing that could
happen is that I end up with a tool with
some memory that I didn't end up using.
That's not too bad." or I forget to use
Chrome for a chat. That's not too bad.
Companies have a much higher bar to
meet. Let's say you're using mem zero.
It's a memory layer for customer success
agents so that the customer success AI
agent can remember your customer and
interact with them more personally.
Great idea. What if there's a
catastrophic failure and there's a
memory leakage of some sort? Can you
stomach the lack of trust that comes
with that? Do you have asurances? You
have architecture in place to make sure
that that's mitigated. What if you are
using Lera Guard and a prompt injection
attack does succeed? What do you do
then? And so a lot of what we're doing
when we look at tools is we're
essentially trying to reference like do
we understand the pain? Does this
actually act like a heat-seeking missile
and just go after that particular pain
point? If it does, can we integrate and
sustain it? And if we can integrate and
sustain it, do we understand the
downside and have we mitigated for it?
If we ask ourselves those questions, we
are going to be so much farther along on
unreged tool purchases. There are, look,
there are billions of dollars being
thrown around here. Part of how the VC
industry is sustaining itself right now
is that people are throwing money at AI
tools and not asking themselves, is it
useful? So, without further ado, I'm
going to show you just a peek at
Memzero, at Laragard, and at Nessu Labs
because I think those are all ones that
I've referenced here. And if you're
curious, I'd love you to dive in.
There's, as I'm saying, a bunch more
tools that I'll have up on the substack
as well. So, this is LERA. The idea is
it's a layer in between your generative
AI applications and bad actor. And this
is a tool that enables you to
proactively understand what is going on.
You get visibility. It's going to
protect you from prompt injection
attacks. You can control and configure.
Think of it as like it's the classic
security play. It's a shield, right?
Like, and you can decide how you
configure your shield, etc. Nothing is
perfect, but it's an example of a tool
that's aimed at a particular risk that
companies tend to articulate is very
painful. This is Nessie. Nessie is
exactly what I talked about, an
individual AI knowledge base for the
mind. You can download it for Mac. The
idea is that it's able to capture your
overall chats and get them into
summaries. You can organize, you can
play with, you can use laser focused on
a particular painoint I hear from a lot
of people. And last but not least, this
is me zero. It's focused specifically on
how you can help AI agents to remember
customer success use cases. And so this
is a travel example, but you can imagine
this for a lot of other examples, too.
If you have generative AI applications
that you're focused on, this becomes a
really powerful way to connect with your
customers. So, I don't pick these
because they paid me or anything. They
don't know they're being talked about.
I'm mentioning them because I think that
they do a good job talking about a
particular painoint and I think that
they are enabling us to talk about how
you choose tools. Well, I don't care if
you choose these tools or not. I want
you to understand how to pick tools that
work for you. If you'd like to dig in
further and understand how I think about
tools, I have the same set of questions
that I just outlined around sustainment,
around picking tools, well, around
finding the pain point, around worst
case scenario planning. And I have that
for like 45 tools because I think that
we need to have honest conversations
about this and we have to start
somewhere and we have to start with
harder questions than we're asking. So,
this is the no tools tool episode. It's
it's I want you to bias to not buying
the tool unless it says yes to these
questions because I think too often
we're too easy like we open the wallet
too quickly. We need to be sort of
hard-nosed about what tools really
matter. So there you go. That's my take.
It's how you build a no tools tool
culture and pick AI tools that actually
matter.