Scaling Compute vs Software for AI Reasoning
Key Points
- The panel debated whether advancements in AI reasoning will come primarily from scaling compute and algorithmic breakthroughs (voiced by Volkmar and Skyler) or from traditional software engineering improvements (voiced by Chris).
- A new paper from MultiOn on “Agent Q” showcased that combining LLMs with tools such as search, self‑critique, and reinforcement learning can boost planning tasks—e.g., restaurant reservation booking—by an order of magnitude in success rate.
- Skyler explained that while LLMs excel at constructing a statistical world model for next‑token prediction, they historically lack the motivation or agency to actively explore and act within that model, which hampers their reasoning abilities.
- Recent research is therefore focusing on augmenting LLMs with external reasoning mechanisms and tool‑use to give them purposeful agency and improve their ability to solve complex, multi‑step problems.
- The discussion highlighted excitement about renewed investment in large‑scale hardware (“big computers”) as a key enabler for these next‑generation AI systems.
Source: [https://www.youtube.com/watch?v=emVMHYQVmdQ](https://www.youtube.com/watch?v=emVMHYQVmdQ)
Duration: 00:47:21
Sections
- [00:00:00](https://www.youtube.com/watch?v=emVMHYQVmdQ&t=0s) **AI Reasoning: Compute vs Software** - A panel of AI experts debates whether the next breakthroughs in planning and reasoning agents will come from scaling compute, new algorithms, or traditional software engineering, amid renewed hype for powerful hardware and a preview of MultiOn’s new Agent Q paper.
Full Transcript
AI agents what are we expecting next how
do we put um planning and reasoning
alongside this large representation of
the worlds we have now are we going to
have products that truly never
incorporate generative AI I think never
is such a strong word and what's the
most exciting thing happening in
Hardware today it's nice to see that
finally we built Big computers again
I'm Brian Casey and welcome to this
week's episode of mixture of experts uh
we let Tim go on vacation this week so
you're stuck with me and I'm joined by a
distinguished panel of experts across
product and research and Engineering
Volkmar Uhlig who is the VP of AI
infrastructure Chris Hay who is the CTO
of customer transformation and Skyler
Speakman senior research
[Music]
scientist there's been a lot of
discussion in the market around
reasoning and agents um over the last
you know six months or so and so the
question to the panel is do we think
we're going to get more progress in
building reasoning capabilities through
scaling compute and this is just over
the next year or so scaling compute
algorithmic progress or from good old
fashioned software engineering so Volkmar
over to you uh very clear algorithmic
progress
Chris software engineering all right
Skyler algorithmic that's the next step
all all right I like it we got we got
some different opinions on this and this
actually leads us into our first segment
that we're going to be covering today um
which is a company called MultiOn uh
released a new paper around Agent Q uh
and this paper is demonstrating
improvements in reasoning and planning
and the scenario they defined in the
paper which was using an agent to
actually book restaurant reservations
was using llms combined with other
techniques like search self-critique uh
reinforcement learning and they
demonstrated some like order of
magnitude Improvement uh in just the
success rates of llms and so maybe
Skyler as a way of just kicking us off
I'd love to hear a little bit about just
like why do llms struggle so much today
with with reasoning and like why is you
know some of the work going on in this
space exploring other ways like so
important to to making progress so llms
have this amazing ability to build a
world model um I think I've seen that
phrase popping up more and more
sometimes it will get criticized and say
oh all they're doing is predicting the
next word but in order to predict the
next word as well as they do they
actually do have this I'm not going to
say understanding might be too long of a
stretch but they have this model of the
world up until these new recent
advancements they had no real reason
motivation agency whatever you want to
call it to really go out and explore
that world but they had created that
model of the world and they could ask
answer questions about it uh so I think
this idea of llms being limited to
creating the model of the world they did
a very good job of that I think some of
these next steps now are all right now
that we've got a representation of the
world which is pretty good at the next
token prediction problem how do we
actually execute um actions or make
decisions based on that representation
and so I think that's kind of this this
next step we're seeing um not just from
Agent Q but lots of research labs here
are really trying to figure out how do
we put um planning and reasoning
alongside this large representation of
the worlds we have now so I think these
guys are off to a good start uh one of
the first ones to kind of put something
out there um uh the paper down you know
uh out available for people to read um
lots of other companies are working on
it as well so I wouldn't necessarily
say these guys are
ahead of the pack yeah maybe
Chris I know we were talking a little
bit about this which is like how
indicative do you think um some of the
work that the team did here is of just
like where everybody's going um in in
this space like is this is this paper
just like another piece of of data in
like what is a continuation of everybody
sort of exploring um the same sort of
problems and do we think this is you
know pretty dialed in on kind of where
the problem space is going to be around
agents over the next year or so I think
it is actually pretty dialed in so when
I when I read the paper it's kind of
similar to some of the stuff that we're
doing with agents ourselves so that's
always kind of goodness there but if if
you really look at what's going on there
is they're not really using the llm for
the hard bits right they're using the
Monte Carly Tre search right to to
actually work out so one of the major
things that they're doing is they're
using a web browser as a tool so if
they're trying to book a restaurant for
example then what they're actually doing
is doing a Monte Carlo tree search and they're
navigating using that Tool uh to
different spaces they're using the llm
to self-reflect they're using the llm to
create a plan in the first place of how
they're going to book that restaurant
but they are relying on outside tools
they're relying on outside pieces like
uh the tree search to be able to work
out uh where they're going and the fact
is that is cuz llms are not great at
that right so it's like it's more of a
kind of hybrid architecture in that
sense and everybody's doing the same
thing with agents as well right you're
bringing in tools you're bringing in
outside memory you're bringing in things
like uh graph searches for example so
GraphRAG becoming really popular
these spaces everybody's sort of
bringing in planning and reasoning as well
I think they're doing some really
interesting stuff there with the
self-reflection and the fine tuning so
that it's more of a kind of virtuous
circle in there within the paper so I I
think they're probably further ahead
than than a lot of people in those
spaces but even if you look at the open
source tools the open source agent
Frameworks we started with things like
LangChain but now you'll see things like
LangGraph is becoming really popular
um and then you're moving into other
multi-agent collaborations such as crew
AI so I everybody's on a different
slightly different slant on where they
are in this journey but they're
definitely on the right track I would
say at this point in time and and by the
way back to my earlier argument that is
software engineering my friend that is
not doing anything different with the
llm it is engineering and putting stacks
and Frameworks around your tool
set to that point Brian I do want to
hear uh Volkmar's take on why
algorithmic was his was his pick so you
have to hold you have to hold us to our
answers and he's going to go
next so um my background is we we I
built self-driving cars for seven years
and we this was always this U decision
between you know how much software
engineering can we do and how much can
we train into a model and then in many
cases what Chris just said is you know
it's often times a packaging of
different Technologies together and I
think where we are where we are right
now is we we have as you mentioned this
really powerful tool which is LM so we
have some basic form of world
understanding and we have the world
model and now we are trying to make some
something do stuff which we haven't seen
it's not oh just predict predict the
next thing you do on Open Table right
and so now you're on a in an unknown
open world where you need to explore
different uh you know different choices
and then I think what the next step will
be you know you run this Brute force and
then once you have those choices you
actually will train a model that's my
expectation because that's the path I've
been on with with driving so we always
came up with some heuristic huge data
Corpus tried something out and then in
the end it was always like oh yeah now
that we figured out what the underlying
problem is let's train a model to make
this more efficient in execution and so
in the end the model is just an
approximation of an of an extensive
search right and so I think that's why
algorithmically I believe that um the uh
uh the algorithms we will build uh are
effectively those you know graph
searches tree searches etc which
ultimately then will feed into a simpler
representation which is easier and in
real time to compute I was I was kind of
disappointed by the paper if I'm honest
and I'll tell you why and uh and and
Brian's dreading what I'm about to say
now but um but I'll tell you why I was
disappointed because the whole example
was the Open Table example now unless I
am wrong and I don't think I am isn't
MultiOn the company that claimed that they
were the agents behind the strawberry
man the uh iruletheworldmo Twitter
account so uh you know that would have
been the uh the agent example I would
have wanted to see in the paper it is
that that was actually a question um I
was like I was thinking a lot about
because they they they talked about
reinforcement learning as part of that
and like one of the interesting things
that I've just seen in the market the
last I don't know a few months or so is
there's this like like light backlash
happening to to llms within the ml
Community even a little bit particularly
I think the people who have worked a lot
in reinforcement learning um you know
and you even heard you know folks like
like people talking about llms being a
detour on the path to to AGI and I'm
seeing like as as we've slowed down a
little bit in terms of progress I've
seen like the folks who love who operate
in those kind of reinforcement learning
spaces like starting to pop their heads
up more and being like hey it's back
like um the only way we're going to make
progress around here is some of the
other techniques and you know I'm
curious like maybe two questions is um
maybe I'll start with this one is like
do you all think if we fast forward to a
world where like agents are a much more
significant part of just like the
software that we're all using every day
do we think llms are like the most
important part of that or Chris to your
point around this paper that make
extensive use of lots of other
techniques do we think like a bunch of
other techniques are going to come and
like rise back to prominence as we
actually try to like make these things
do stuff um and um so yeah maybe I'll
stop there and just see if like anybody
has a take on that yeah I I definitely
think RL is going to come back into this
um I know they were using RL and that
paper and they were also using things
like DPO and stuff but I I think it's
going to come back into this so I keep
thinking back to alphago and the
deepmind team and you know winning at go
there and and again they were using
similar techniques as you could see in
that paper there um but but if if you
take a deep learning algorithm today on
your machine and you get it to play the
simple game of snake or play the Atari
games like DeepMind did
um very very simple architectures like
uh CNN DNN type things absolutely rock
that game if you get an llm to play and
it doesn't matter whether it's an agent
or not that is the worst playing of
snake I've ever seen from Frontier
models right and GPT-4o is terrible
at it um you know Claude is terrible at
it they're all terrible playing at these
games but really simple RL deep learning
uh you know CNN style uh architectures
actually rock those games and
therefore I I think that as we try and
solve and try and generalize I think
some of those techniques that were
really successful in the path in the
past have to come back into the future
and I'm I'm pretty sure that's where a
lot of people are going at the moment so
we're going to see software engineering
we're going to see improvements in
architecture we're going to see
improvements in algorithms it's going to
stack stack stack and hopefully all of
these techniques will come together into
hybrid architecture but but when you
take llms and put them into an old sort
of gaming style environment they
absolutely fail today do we think there
will be like general purpose agentic
systems like over the next you know
short term let's say like next couple
years or is everything going to be task
specific um because like one of the nice
things Chris like to the point about
this thing being an open table like go
book of reservation it's a very easily
definable objective um right and that
means that you can pull in a bunch of
these other techniques in a ways that
are harder to make kind of like fully
generalizable and so it's like when we
look at agents do we think we're going
to make a lot of progress on kind of
generalizable Agents over the next you
know year or two or is is everything
going to be just in this task-specific
land Skyler maybe it looks like you got
some thoughts on that no I don't think we'll
have General within two years I think
there will be some areas and this might
even lead to our next topic areas around
uh language creativity I think that will
that will surpass uh some humans
abilities but the world works on much
more boring mundane business processes
and I think there's still a lot more
ground to make on that to to get those
systems to a level of of trust uh that
people will use it's one thing to to
have these methods you know create a
funny picture write a funny story uh but
to have llms execute Financial
transactions on your behalf different
different ball game and we're not going
to be there within two
years I I'll be proven wrong you can
timestamp this that's okay but uh yeah
yeah no we're always accountable for our
predictions on this show so um so Brian
I I think where we may go is we will
probably get you know now we are going
through examples you know open table and
we try another 20 I think we will get
into a tooling phase where you know you
you can actually explore a domain and um
with some human intervention and some
human guidance you know you will have
tools which can explore let's say a web
page how to interact with it and then
you may go through some pruning process
which may be manual but I think we will
get to more automation that it will be
you know 10 times or 100 times faster to
build this but I think as Chris said
there will be a software engineering
component to it uh which you know for
until we are fully autonomous you just
point at something and say learn uh that
will take a while and then the question
is where does the information come from
is it through trial and error or we
could even just read the source code of
the web page right I mean we we have
source code encoding business processes
I can just give you you know here's my
billion lines of code of SAP
[Music]
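As a rough illustration of the hybrid architecture discussed in this segment (an LLM proposing candidate actions, an external Monte Carlo tree search expanding them, and a self-critique step scoring partial trajectories), here is a minimal sketch. It is not MultiOn's implementation: `propose_actions` and `critique` are hypothetical stand-ins for LLM calls, and the "environment" is a toy number line rather than a web browser.

```python
import math
import random

# Stand-ins for LLM calls (illustrative names, not from the Agent Q paper):
# in the paper's setting these would be an LLM proposing browser actions and
# self-critiquing partial trajectories.
def propose_actions(state):
    # Candidate next states reachable from this state (toy number line).
    return [state + 1, state - 1, state + 2]

def critique(state, goal):
    # Self-critique score: higher is better (closer to the goal).
    return -abs(goal - state)

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Upper confidence bound: balances exploiting high-value branches
    # against exploring rarely visited ones.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(start, goal, iters=200):
    root = Node(start)
    for _ in range(iters):
        # Selection: descend by UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: the "LLM" proposes candidate actions from the leaf.
        if node.visits > 0 and node.state != goal:
            node.children = [Node(s, node) for s in propose_actions(node.state)]
            node = random.choice(node.children)
        # Evaluation: the "critique" scores the resulting state.
        reward = critique(node.state, goal)
        # Backpropagation: update statistics up to the root.
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Commit to the most-visited first action.
    return max(root.children, key=lambda n: n.visits).state

print(mcts(start=0, goal=5))
```

In Agent Q itself the search runs over live browser states, and the trajectories it collects are fed back into preference-based fine-tuning (DPO); this sketch shows only the search-plus-critique skeleton the panel is describing.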
for the second story there was
the CEO of this company procreate um
they are a company uh that builds and
Designs illustration tools and I think
it was on Sunday night um their CEO came
out and released a video um in which he
said that they are never that one he
actually said he hates gen AI um I think
he actually used the word hates um to
describe it um and he said that they
were never gonna include gen
capabilities um inside of their product
and like the reaction from their
community and the design Community
broadly was was like super excited and
supportive of of this statement like I
think as of time of recording um that video
has got like almost 10 million views um
on on Twitter and I have like a bunch of
different reactions um to that that
hopefully we can you know pick apart
here a little bit but one of the things
that was like most striking to me is
that the way two different sets of like
Creator communities have reacted to the
arrival of llms like within the I have
friends and colleagues who are
software engineers and like llms for
code um people are generally pretty
enthusiastic about that look at it as a
great productivity tool they get more
work done than they were ever able to do
before I also have friends and
colleagues who are writers who work at
Hollywood who are creatives and who like
look at the arrival of some of this
technology like the Grim Reaper um
basically and so it's just like wildly
different responses um from from these
two communities and I'm just curious
like maybe Chris throw it over to you to
you know maybe get some initial thoughts
and reactions to it is like you have any
sense of like why these communities are
responding so different differently um
to to this
technology I think never is such a
strong word that be one of my other
reactions to it never so far really uh
no feature at all yeah
yeah yeah I I'm never ever gonna stream
video content because I believe physical
is more important well you know what
you're out of business Blockbuster so I
don't know I I think there is a general
wave I applaud them right I think they
make tools for their particular uh
audience and Their audience doesn't want
that and I I think that's going to be a
unique differentiator um I'm not sure
how that stands the test of time I I
think never is such a strong word there
the industry is moving fast and
different audiences have different needs
right I mean I'm pretty sure that if I
use procreate there's no chance ever I'm
going to produce anything that is of any
artistic quality and that is cuz I have
no artistic talent but you you're not
the target
audience I am not the target audience
but I am grateful for AI generated art
because it allows me to produce
something that I would never be able to
produce otherwise so things like
PowerPoint slides Etc so if they are
they are focused on the creative
professionals and creative professionals
don't always want to have ai geni within
that and I understand that that's great
you've got your audience you've got your
Target and that's fine but I think and I
think there will always be an audience
for that but I think the tide of time
will uh push against them there and I
think that's that's really going to be a
very strong Artisan statement to make
before we move on Chris what what sort
of PowerPoint art are you doing um like
that was was
my I I mean generally if I'm honest it's
almost always of unicorns with rainbow
colored hair that is that is my pretty
CEO presentations
um every CEO loves a
picture sure all the other ones do you
know that's it resonates um with with me
but Skyler Volkmar I'm curious if either
of you have takes I'm just like the
community's really reaction to like
these two different sets of tools so I
think we are in a world where um you
know we have artists and
craftsmanship and we are going through a
phase of automation of this Artistry and
craftsmanship and so the bar will be
really really high and there will be
always unique art we still today you
know I can buy photography I can buy you
know a copy of a Monet you know some of
the greatest artists in the world and
can hang it on my wall but there is
still a need and a demand by people to
have you art which is theirs and I think
that will stay like and and we've seen
this across you know the progression of
time you know horses used to be forms of
transportation and now they are a hobby
right and so and car old cars is going
the same way and you know hopefully at
some point that's with airplanes and I
think um these these unique pieces of
art if I can automate the creation and I
can you know industrialize it the
industrialization wins it always wins
but it doesn't mean that those tools and
those artists and that craftsmanship
shouldn't be supported it will just
shrink dramatically because uh you know
the the capabilities become more
accessible to everybody you know if you
used to have typists now everybody can
type all the typists are gone right and
there will be the same thing one of the
things I thought was interesting is that
you made this point about craft like I
think a lot of people choose their
life's work because they like the Craft
um of of that right they chose to be an
artist or a developer because they like
like doing that work and so having a
tool come in and like do all of it for
it is like robbing you know some degree
of value from um you know the things
that they do day in and day out and um
one of the things that I was also
thinking about and I'm just curious if
in
your in within your teams within your
own like set body of work you're doing
with clients that y'all are working at
do you also see like of the other places
where I was thinking about tension um
around this sort of dynamic is um in the
relationship between management and
practitioners um where like one of my
observations is that like management is
oftentimes particularly enthusiastic
about adopting these tools because of
the productivity benefits like I can get
more things done I could reduce my cost
I can you know drive more Revenue
whatever it might be and you know
because those are the things that like
they're running their entire
organization to like Drive deliver those
results and in some cases they've become
as they've gotten more senior maybe one
step removed from actually doing the
craft so the loss of The Craft maybe
feels like less of a consequence um to
management sometimes but to
practitioners it's like this is my thing
uh and this tool is coming around and
just like doing it for me in some cases
so I'm curious if youall have also
observed any sort of like when it comes
to adoption of some of this stuff any
tension between like management and
practitioners um in terms of like their
level of enthusiasm for for this
technology I'm not sure about tension of
management and practitioners uh there
might be some I've witnessed of uh
which flavor or which version so they're
going to say no we're going to use this
one and uh back actually behind the scenes
somebody's using a different a different
tool and some tension back back and
forth on that one so it's not
necessarily the adoption uh but maybe
the channel or the Tool uh has had has
had a bit of uh that one or this one and
um so yeah that would be what I've
observed I think it's also the question
you know when you look at at um
Craftsmen um there's
20% of work you love and 80% of work you
hate often times it's like the majority
I mean ask a data scientist like 80% is
data cleaning do you think they like
data cleaning no right so um if
you I think the tools like if they
support the the toiling the useless work
and make people more productive then you
know you shift more into the the work
which you actually like and appreciate
so I think there is from the from the
engineering I mean I'm mostly talking
software Engineers here from the
engineering perspective I think it's
actually an improvement you know nobody
likes Jira ticket reviews and writing
comments and all that stuff if that can
be automated away then that's you know
an improvement in the life of people or
I don't need to go to Stack Overflow and try
to find that algorithm I can just ask
the model to write it and I'm done and
so I'm more at the architectural level
um and I think uh from a management
perspective I mean they want to get
productivity out but there also
productivity in an Engineering Process
in many cases is that you you know need
to convince all the people to do these
pieces of work because they're necessary
for the product but everybody hates them
so and I think to a certain extent you
know it's an improvement on both
sides that's that's a great point I I
always well it's probably not a
safe-for-work description of it but I
always like to tell we we share those
things amongst the team so everyone
should just mentally come to terms with
some percentage of your job is the work
that none of us want to do in this team
but we're at least going to spread it
around um the group a little bit but um
but that description like actually so I
like a lot of the teams that I work with
are operate a lot on just like ibm.com
do a lot of things around content and
like we the dot-com property has tens
hundreds of thousands millions of
pages as part of it and we're trying to do
way more with like Automation and like
how we connect content together and
stuff like that it turns out in order to
do that like all your tagging has to be
like really good across the entire
property across tens of thousands of
pages and it's like oh my God the amount
of time that we are going to spend
cleaning up the metadata on like this
chunk of the website it's like just just
kill your calendar for three days for
like some whole chunk of the
organization to go through this stuff
and if we can instead like build just
like a really good classifier um um and
you know ways of doing that it's like
that type of stuff actually lands like a
huge relief and like lets us focus on
doing the work that we actually signed
up to do so like at least within my team
like that's a lot of what we're doing is
we're looking at this type of tedious
work that is really um it's important
and it has to get done to your point but
like nobody really wants to spend their
day um doing that can we do as much of
that so we can actually like focus on
doing the work we want to do but like
when it comes to using llms for like the
core core thing that we're doing
everybody's still a little skittish um
honestly at least in some of these now
it's not on the software engineering
side of our teams but on like some of
like the you know more Creator side of
it so it's like some of this some of
these announcements like kind of resonate
with me because I see it with some of
the folks that I work with a lot I think
one of the other things is I don't think
it's just tedious stuff I think for kind
of prototyping type stuff you know and
ideating it's really good like so and I
don't think it matters whether you're
producing content or you're producing
code or you're producing images
sometimes you're like I have an idea is
this going to work H it's going to take
me quite a lot of time to sort of build
that up let's just get the llm to do
something or the image generator to go
through this a little bit I get an idea
what it looks like and then I'm going to
start pruning it and then I'm going to
start building the idea a little bit
more and I I personally again more from
a software development side of things
that's kind of how I work so I at the
moment I'm sort of trying to create a
distributed parameter
service for training llms there is no
chance that I would be able to just sit
and code that straight up myself right I
need an llm to help me out figure this
out a little bit and then I will
engineer through where I need to be with
that right and and I think that is true
and it's the same with image generation
right it's like you know uh if you're
doing a concept and you need that
unicorn with rainbow colored hair get
the get the image model got it yeah
exactly get it get it out there and then
you go okay you know that that doesn't
quite work in context you know I need
this and then you can go and draw your
pretty unicorn at that point right but I
I think prototyping is a really
important use case and I think Chris
like when when you're doing that
prototyping right it's like you can have
a dialogue you know with with a machine
and you get major refactorings done in
in seconds right because you can just
like I want this other thing let me
split this into four classes or let me
collapse them the amount of work you
would have to do and that's all the
tedious stuff you know refactoring of
code and we have IDEs to do that but
they kind of suck so if you can actually
get an llm to do that H it's just
amazing and and like the time you can do
it in an hour you know somewhere on a
plane and you can actually write massive
amounts of code and experiment with it
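The prototype-then-refactor workflow Chris describes maps onto any chat-style LLM API. A minimal sketch, assuming an OpenAI-compatible client; the model name, prompt wording, and helper names here are illustrative, not from the discussion:

```python
# Sketch of LLM-assisted refactoring as discussed above. Assumes an
# OpenAI-compatible API; `MODEL` and the prompt text are illustrative.
MODEL = "gpt-4o"  # any chat-capable model

def build_refactor_prompt(code: str, instruction: str) -> list[dict]:
    """Package source code plus a refactoring instruction as chat messages."""
    return [
        {"role": "system",
         "content": "You are a refactoring assistant. Return only code."},
        {"role": "user",
         "content": f"{instruction}\n\n```python\n{code}\n```"},
    ]

def refactor(code: str, instruction: str) -> str:
    # Imported lazily so the prompt builder works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model=MODEL,
        messages=build_refactor_prompt(code, instruction),
    )
    return resp.choices[0].message.content

# Example (requires network access and an API key):
# refactor("def f(a, b): return a + b",
#          "Split this into four small, well-named functions.")
```

Calling `refactor` with a blob of code and an instruction like "split this into four classes" or "collapse them" is the kind of seconds-long major refactoring the panel mentions; only the prompt builder runs without network access.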
Brian before we leave this topic I think
we just need to remind ourselves that
you asked kind of an art question to
three nerds I'm I'm I'm safe in saying
that right I mean just put a disclaimer
here I think it would be a fascinating
conversation
uh to have uh artist representation on
this question uh so all of this just
taking you know we're talking about
inevitability and tools and all of that
and I think that's that's where our
brains go but uh uh really fascinating
to have this conversation uh with with
the artists uh like one of
the reasons why is because I do have
like like I said I do have like friends
who do both of these things um and I
have just like observed how different
the reaction is um from them and from
like the community um that that they
operate um and and like there's a bunch
of like interesting economic factors
here that play into like this like I
think there's less concern in some cases
about like more like real industry
disruption happening with like the
software engineering community than
there is on the creative side so it's
like I think there is that just like a
little bit of that kind of core
underlying economic anxiety that is not
quite the same in in those two places
even though um you know you're really
just dealing with like just different
types of models um that are helping
improve productivity in different types
of domains um but it'll end up Landing I
think pretty differently potentially so
I think it's a great point we did not
totally represent that other side of
that um of this but it is um it is just
a super interesting topic I think and I
think one of the things will be
interesting is just to the point about
never um I feel like there's so many
tools that like you use them as part of
a workflow and you don't even know what
the underlying technology is it's like
you know if you want to take a
background out of an image like do I
know that's gen AI or something else or
what like do I even care um in some
cases so you know in some of those
places I'm like man never really um but I
think it will be interesting
to see like how this space evolves um over
the next couple
[Music]
years earlier this week AMD announced
the acquisition of ZT systems um and so
I think as everybody knows like the
hardware space has been like one of the
biggest winners if not the biggest
winner um so far in terms of like the
early days at least of like the gen AI and
uh llm sort of cycle um and AMD is a
company like obviously we've talked and
everybody's talked a ton about Nvidia
but like AMD is obviously making um big
play in this space um as well their CEO
Lisa Su was on CNBC um earlier this
week and she was talking about the
acquisition and one of the things is
that like AMD historically has invested
a lot in Silicon uh they've invested a
lot um and even doing more on the
software side of it and that the way
that they talked about this acquisition
is that they were starting to bring
together a stronger set of capability
from like a systems um perspective and
so maybe vmar as just like a way of
kicking things off like why is it so
important like why is this Market moving
from just like Silicon silicon to
systems and like why are systems and
like these almost like vertically
integrated systems within this space like
almost like so uniquely
important so if you look at um AMD and
the AMD offering
AMD acquired ATI you know a decade or
two decades back and that's the heritage
of their AI accelerators uh and they are
kind of head-to-head with uh Nvidia over
the years and they own some spaces and
Nvidia some spaces I think what Nvidia
did very well over the last couple of
years is to look not only at the GPU
itself but looking at you know many gpus
in a box and then when you go into
training you go multibox so you need
many machines and the integration if you
look at the Acquisitions Nvidia did is
they acquired um a company which is you
know providing the software stack to run
very large scale clusters uh which
is the Base Command product and then uh
they also acquired Mellanox which is the
leader in like reliable network
communication and so AMD is sitting
there and like okay so what do we do um
and they don't have a uh a Consolidated
Story how they can put you know a 10,000
GPU training system on the floor so
they're kind of locked in the box and
they are not yet at the scale where they
could actually compete on the training
side and that's I think also the reason
why Nvidia you know owns like 96% of the
market um when you're trying
to train you can pretty much only use
Nvidia and then you already did all the
coding on Nvidia systems and all the
operators are implemented for Cuda and
performance optimized because otherwise
you didn't train the model then running
it's kind of trivial right and so
switching an ecosystem is really hard um
Nvidia went down this route of you know
having like the dgx system so they built
full systems with all the network
communication Etc and AMD I think is
just now catching up so they're catching
up on the network against Mellanox they
announced Ultra Ethernet and now they
are catching up you know how to get
these big systems at scale into
the industry and you know they need
to get into the cloud providers and so I
think systems you know being a boutique
shop which makes very large scale
infrastructure deployments happen is is
a logical conclusion that makes sense one
of the um I think you mentioned training
a lot like one maybe as like a follow-up
question um to that you know one of the
observations I have just about like the
GPU Market in particular is that it
feels like more vertically integrated
than the world of CPUs um does at least
like somewhat um and is like one I guess
would you agree with that sort of
characterization and two if you do like
is is building out the sort of unique
set of um requirements maybe around the
training stack like is that like the
underlying core force around why this
Market is like behaving the way it is
and why it's behaving differently or do
you kind of see those that story like
differently than the way I just kind of
laid it out I think the training system
Market is a a traditionally very
esoteric Market which is the high
performance computing market and you know
at IBM we built like Top500 like
number one and number two Top500 super
computers with Blue Gene you know L P Q
and uh the follow-on systems um and
suddenly we are on a world where that is
not anymore a a domain of the labs which
drop you know $100 million uh
and get a computer uh suddenly every
company which wants to train a network
at scale needs similar technology and so
what we are seeing is after 20 years or
40 years almost like HPC being a very
esoteric field of you know let's say 50
supercomputers in the world Suddenly
It's a you know it's a commodity and your
startup too we should all have a
supercomputer exactly you know like oh yeah I
need a supercomputer you don't have one
so and you know I got an unfinished
basement like you know the joke goes I
was like I'm GPU poor right so I only
have like 100 so uh and if you
want to play in that market you need to
actually offer a solution and I think
AMD has been traditionally in the
desktop market with the GPU or like
Enterprise market with the GPU and they
they sell silicon but they never build these
systems Nvidia being an actual GPU
vendor amazingly has captured like 85%
of the dollars spent in the data center
right so it's like yeah your Intel chip
good luck and a little bit of memory and
everything else we take we take the
switches and we take the the the
ethernet cards and we take the GPU and
that's the other 85%
and so for AMD to get something deployed
at scale I think they need to have an
offering which is on par I think Intel
with Gaudi is in a little bit better shape
because they have Partnerships over you
know 50 years with Dell and Lenovo etc
for them it will be easier to get into
that market because they already have an
ecosystem and that's not the case for
AMD this is why I don't get it Vmar I
actually don't get the acquisition
because if like let's say I was an Apple
company not Apple but an Apple company
and my market everybody bought red
delicious apples cuz they're great
apples but my company sold Granny Smiths
and nobody ate Granny Smith apples why
would I buy a company that makes better
packing boxes for my apples I I that's
that's my problem with it I'm I'm kind
of like if I'm spending $5
billion you know spend the $5 billion on
getting better gpus right and go
compete with Nvidia that's that's
where I don't quite understand it in my
mind I think the uh the uh Nvidia
figured out a way of actually delivering
it deploying it to Partners and to a
certain extent AMD got locked out in
that space so they need to find a a way
to Market and what that way to Market if
you look in the training space a huge
percentage of the training is actually
happening with the
hyperscalers um companies like they want
to put Nvidia cards on their premises
but in many cases for early
beginnings they go into the cloud ZT Systems
is delivering to the hyperscalers so for
them it's a way to get into the
hyperscalers with a solution where they
say okay we give you the whole thing so
you you take down the risk on the
hyperscalers
I'm not sure people do want to
use Nvidia I think I you know I I think
that Nvidia's got this market locked and
Nvidia is awesome they make great gpus
but but at the same time Apple seems to
be doing well on the desktop Market or
the laptop Market with their uh with
their chips and with mlx as a framework
so you know custom Apple silicon seems
to be working out well you're seeing
companies like Google invested in their
own kind of ASIC based chips with
TPUs you see other people move into ASICs
as well I I think there is a space for a
lowcost alternative to Nvidia chips and
I I think there is a market for that
because otherwise other other companies
hypers scales Etc wouldn't be investing
in that and that's why I'm saying I
don't get it I you know Nvidia by far
makes the best gpus across the board
they're an incredible company I I just
think if I was a competitor I would try
and find an adjacent space which isn't the
packing boxes yeah I I think the uh
really in the training market
right now Nvidia is just the only choice
you have and I think this is primarily
where AMD is trying to break in I think
in the inference market there will be you
know like you said apple and you know
there's Qualcomm there's a ton of Chip
vendors and there's a you know a plethora
of startups in Silicon Valley who are
trying to make like super low power Etc
but in the training Market if you look
where AMD is going and the wattages they
are putting down you know where it even
goes above a thousand watts on a GPU
in in the next Generations that is um we
are you know Nvidia is effectively the only
game in town and I think they want to
put something up against it and you only
have for pre-train maybe for pre-train
maybe but not necessarily fine tune fine
tuning I think you can in many cases you
can do in a box like you do not need a
huge system yes but in the pre-training
market you you do and this is where we are
right now you buy Nvidia or you buy
Nvidia and you know Gaudi isn't there yet
AMD isn't there yet and so I think this
is effectively an attempt and who knows how
let's see how it plays out right I mean
I thank God I didn't have to make the
decision um but um you know I think this
is an an attempt of breaking into that
large scale training market and
delivering you know very very large HPC
systems you know companies run 100,000
GPU training clusters building that takes
you know a year it's massive investment
you know it's in billions of dollars and
so if you want to capture some of those
revenues then you need to have it's it's
not you know um like oh we we collect
like three engineers and they put up a
supercomputer it's like no this is a
this is a construction process right and
and this is where where AMD with this
acquisition finally has a chance of of
you know bringing the guys with the
hard hats in as well because you need to
put Power in and Cooling and all this
stuff right and I think they don't right
now because that's all outsourced they do
not have the experience and so they I
think they're buying the the competence
but that point about competence was
actually something I saw come out a lot
in the discussion um post the
acquisition where um you know this is a
company that does have a lot of
capability around doing exactly that
building out large scale clusters some
of the biggest in the world essentially
um and it's a kind of interesting theme
that I've heard at every level of the
whole gen AI stack at different points over
the last year or so you know you hear it
in the hardware side you hear it and
it's really like to the point about
being like you're almost rate limited by
the amount of expertise that's in the
market right now it's like in the
hardware side I heard it on like the
training side you heard it for a while
on even like the prompt engineering side
like you know people refer to them as
like you know magic incantations um for a
little while and there was like this
like only a certain like group of people
even really knew how to prompt the model
um correctly and a little bit of what
I've observed over the course of like
the last I guess like almost two years
at this point is that like as this thing
has blown up like it feels like some of
those skill shortages are like getting
less acute like more people know how to
train models more people are getting
competent uh working with models more
people obviously are like attracted to
the hardware side of the equation
because of some of what's happened over
the last couple years um I'm curious
like across the board
like how much do you feel like our
progress in AI is still rate limited by
just like raw expertise um across the
world in in this space and like how much
has that improved or not um over the
course of the last like year or two and
so maybe Skyler just kick it over to you
sir I I have this conversation pretty
regularly uh with our our director and I
would say it's not necessarily the
overall amount of skills I think that
definitely is monotonically increasing but
how it's distributed across the globe
that's becoming more extreme and so I
think that's something that uh we are we
are experiencing you know we're IBM
research Africa we represent a billion
people uh but uh the talent that's here is
probably going to emigrate and what
does it look like to have that Talent uh
here and and bring that culture here so
yes it is increasing but I think at very
different rates across the globe that'd
probably be my short summary of
that and it is something that uh we we
do talk about on a regular basis is what
does uh capacity in generative AI look
like on a really global scale so that's
probably another another session
entirely in itself I was not
expecting that that was a fascinating
perspective on that so yeah Chris Vmar
thoughts
okay yeah I think um there is such a uh
Gold Rush
and it's a new technology and so it's a
lot about you know even trying it out
and every day there's something new so
you need people who are really
passionate about it um and you know that
they you know spent their living and you
know half sleeping hours on it uh and so
the skill set I think will develop over
time it's I I feel like you know we are
repeating the gold rush of the web
era where it was like oh my God you can
write a web service isn't that amazing
and now it's like yeah you know
everybody can do it and so I think we we
are just in this in this uptick with a
very like extreme Supply shortage and
because it's it's so deep like you know
when you just plugged a computer into a
network it was relatively easy I mean
it's like okay you know here's a
computer on a network go now it's like
the training is different you know do
you need to even understand what math is
and most Engineers hate math that's why
they like computers and so there's this
this set of skills which need to be
built up and you know until it actually
rolls to the universities and we get
people who are truly practitioners so
you first you need to get the education
and then you need to become a
practitioner and you need to toy around
with it for 5 years so I think for the
next 10 years we will probably be in
this and plus the speed of change we
will be in this world of you
know there's Supply shortage everywhere
uh I think on the flip side coming from
the systems corner it's nice to see that
finally we build big computers again so
I I really like this and you know that
we are actually going away from like the
cloud providers do everything for us and
we need to actually look at system
design with a you know fresh angle I
think that's a that's a goodness for the
industry so it was kind of locked in and
the only you know there are like five
companies in the world who still know
how to plug a computer into a
network into a power socket and I I think
it's good that we are actually going
through more of a of a Renaissance of
you know computer architecture and and
and design at least you know
yeah I'm the total opposite I think that
people
are I I think skills people are learning
the skills and they're doing a great job
of that across the globe um but at the
end of the day if you want to train a
large language model you need an awful
lot of gpus and you need access to an
awful lot of data and that is outside of
the access to the average human being so
there is a lot of really great skill
Talent and they are not going to be able
to practice their craft because access
to the gpus to be able to learn what is
the effect of this data it just isn't
there now they can you can learn from
doing things like fine-tuning and
training very very very small models Etc
but at the end of the day we know that
for the larger models it emerges
uh at the higher scale and therefore
at the scale now it's tens of
thousands of uh gpus to be able to do
that and I think that is what's locking
out the average practitioner so me
personally I I want to see more
distributed compute I want to see more
access to gpus and skills and therefore
I think to kind of Skyler's point I
think that will open up a really
talented set of people that are uh
distributed across the globe to be able
to uh make great contributions in that
area but at the moment it's going to be
concentrated in the big tech companies
because they're the ones with the gpus
Chris I want to fight back on your
fighting back that's why we do this
right
if if I have a researcher that comes to
me and says the only way they can make
their case is that they need 10,000 gpus
that's that's not a good argument that
researcher needs to be able to make
their case off of two gpus so again where
you know where does that conversation start
about making the case off of this uh
this 2 GPU example show that then we can
talk about the 100 the 2,000 the 100,000
I don't I don't think it's it's fair to
say I can't make progress unless I have
10,000 I don't
I I I I agree Skylar but again we're
sitting in a company who has tens of
thousands of gpus right so they can go
to you make the argument with two gpus
and then you can give them access to to
scale right but the average person they
might get so far with two gpus and then
they're like huh I don't have the money
now well I'm gonna go and do something
else so we're moving to a world of
universal basic compute um it sounds
like I feel like that's been a little
bit memey um recently so we will we will
call it a day there um Vmar Chris Skyler
thank you all for joining great
discussion uh today U and for those of
you out who are listening to the show uh
you can grab Mixture of Experts on Apple
Podcasts Spotify and podcast platforms
everywhere so until next week thank you
all for joining we'll see you next time