Keyboard Control vs Screen Collaboration
Key Points
- Two competing approaches are emerging: Anthropic’s Claude directly controls your keyboard and mouse, while OpenAI’s ChatGPT reads your screen and collaborates without taking control.
- Claude’s “computer use” capability lets the LLM drive the UI, whereas ChatGPT’s updated desktop app for Plus/Enterprise users only reads specific apps (initially coding environments) and offers feedback.
- The speaker finds that ChatGPT’s read‑only assistance feels more stable and less risky, and suspects OpenAI will expand it to more applications within weeks.
- This read‑only model is positioned to eventually displace dedicated AI‑assisted development environments, since it can provide insight and debugging help inside any editor, although it does not yet write code.
- Developers are split: some prefer the hands‑off guidance ChatGPT offers, while others favor tools like Cursor that can generate code directly.
**Source:** [https://www.youtube.com/watch?v=Cj1m2O-Tmow](https://www.youtube.com/watch?v=Cj1m2O-Tmow)
**Duration:** 00:03:21
Sections
- [00:00:00](https://www.youtube.com/watch?v=Cj1m2O-Tmow&t=0s) **AI Keyboard Control vs Screen Collaboration** - The speaker compares Claude’s mouse‑driven automation with ChatGPT’s code‑reading assistance, highlighting both experimental approaches to AI‑augmented computer work.
Full Transcript
All right, which would you rather have? Would you rather have the artificial intelligence control your keyboard and your mouse, or would you rather have the artificial intelligence see your screen and collaborate with you? Those are the two bets that are being made right now.

Claude made the first bet. Anthropic released Claude for computer use, and you can actually drive it around on the screen and use the mouse and all of that. It runs at about a million tokens every 15 minutes.

On the other hand, ChatGPT today
released an update to their desktop app that allows you, if you are paying for their service (if you're a Plus or an Enterprise customer), to use ChatGPT to read specific apps on your computer. It's designed to read coding apps initially, and they will expand it to other things eventually. Basically, the bet is: would you rather collaborate with ChatGPT while you are working on these apps, where ChatGPT can just look directly at what you're doing, maybe at the code that you're writing, and give you ideas? Or do you think it's more helpful for Claude to directly drive the screen?

My sense is both of these are
experimental, and we are going to converge. I see these as modalities of operation. I think Claude has made the bolder bet, and the bet that is more likely to feel like a beta. I played with ChatGPT, and it doesn't really feel like a beta even though they're labeling it a beta. It reads your code just fine, you can talk with it, and it gives you perspective. It feels really stable. It also feels like a smaller bet for that company to make.

I suspect that, because it feels
stable, they're going to be looking at expanding it in the next few weeks to cover other apps.

We will see, but my sense
is that giving the LLM the ability to read but not write is much, much less risky than trying to get it to fully use the system autonomously. So that's a way for ChatGPT to expand the usage, to expand the surface of your work that it covers, and to do so relatively efficiently. This is not necessarily a difference in intelligence for them; it's just giving ChatGPT a pair of eyes that it didn't have before to read very specific programs.

So we will see. If you look at
it this way, it looks like the kind of positioning that is designed long-term to displace AI-assisted development environments, because you could be in any development environment and be AI-assisted right there. Now, it doesn't actually write the code directly in the development environment yet.

And so one of the key differences,
for example with Cursor, is that Cursor literally will write the code using the large language model.
And I have heard varying responses from developers on this. If you're a developer, I know you have opinions. Some developers are going to prefer the model that ChatGPT just released today, where it doesn't write the code but it gives you a perspective, it gives you an opinion, it helps you debug.

So I think there's a market for
this. If you have downloaded the new ChatGPT, give it a try. Let me know what you think.