Scaling Compute vs Software for AI Reasoning
Key Points
- The panel debated whether advancements in AI reasoning will come primarily from scaling compute and algorithmic breakthroughs (voiced by Volkmar and Skyler) or from traditional software engineering improvements (voiced by Chris).
- A new paper from MultiOn on “Agent Q” showcased that combining LLMs with tools such as search, self‑critique, and reinforcement learning can boost planning tasks—e.g., restaurant reservation booking—by an order of magnitude in success rate.
- Skyler explained that while LLMs excel at constructing a statistical world model for next‑token prediction, they historically lack the motivation or agency to actively explore and act within that model, which hampers their reasoning abilities.
- Recent research is therefore focusing on augmenting LLMs with external reasoning mechanisms and tool‑use to give them purposeful agency and improve their ability to solve complex, multi‑step problems.
- The discussion highlighted excitement about renewed investment in large‑scale hardware (“big computers”) as a key enabler for these next‑generation AI systems.
Source: [https://www.youtube.com/watch?v=emVMHYQVmdQ](https://www.youtube.com/watch?v=emVMHYQVmdQ)
Duration: 00:47:21
Sections
- [00:00:00](https://www.youtube.com/watch?v=emVMHYQVmdQ&t=0s) **AI Reasoning: Compute vs Software** - A panel of AI experts debates whether the next breakthroughs in planning and reasoning agents will come from scaling compute, new algorithms, or traditional software engineering, amid renewed hype for powerful hardware and a preview of MultiOn’s new Agent Q paper.
Full Transcript
AI agents what are we expecting next how
do we put um planning and reasoning
alongside this large representation of
the worlds we have now are we going to
have products that truly never
incorporate generative AI I think never
is such a strong word and what's the
most exciting thing happening in
Hardware today it's nice to see that
finally we built Big computers again
I'm Brian Casey and welcome to this
week's episode of mixture of experts uh
we let Tim go on vacation this week so
you're stuck with me and I'm joined by a
distinguished panel of experts across
product and research and Engineering
Volkmar Uhlig who is the VP of AI
infrastructure Chris Hay who is the CTO
of customer transformation and Skyler
Speakman senior research
[Music]
scientist there's been a lot of
discussion in the market around
reasoning and agents um over the last
you know six months or so and so the
question to the panel is do we think
we're going to get more progress in
building reasoning capabilities through
scaling compute and this is just over
the next year or so scaling compute
algorithmic progress or from good old
fashioned software engineering so Volkmar
over to you uh very clear algorithmic
progress
Chris software engineering all right
Skyler algorithmic that's the next step
all all right I like it we got we got
some different opinions on this and this
actually leads us into our first segment
that we're going to be covering today um
which is a company called MultiOn uh
released a new paper around Agent Q uh
and this paper is demonstrating
improvements in reasoning and planning
and the scenario they defined in the
paper which was using an agent to
actually book restaurant reservations
was using llms combined with other
techniques like search self-critique uh
reinforcement learning and they
demonstrated some like order of
magnitude Improvement uh in just the
success rates of llms and so maybe
Skyler as a way of just kicking us off
I'd love to hear a little bit about just
like why do llms struggle so much today
with with reasoning and like why is you
know some of the work going on in this
space exploring other ways like so
important to to making progress so llms
have this amazing ability to build a
world model um I think I've seen that
phrase popping up more and more
sometimes it will get criticized and say
oh all they're doing is predicting the
next word but in order to predict the
next word as well as they do they
actually do have this I'm not going to
say understanding might be too long of a
stretch but they have this model of the
world up until these new recent
advancements they had no real reason
motivation agency whatever you want to
call it to really go out and explore
that world but they had created that
model of the world and they could ask
answer questions about it uh so I think
this idea of llms being limited to
creating the model of the world they did
a very good job of that I think some of
these next steps now are all right now
that we've got a representation of the
world which is pretty good at the next
token prediction problem how do we
actually execute um actions or make
decisions based on that representation
and so I think that's kind of this this
next step we're seeing um not just from
Agent Q but lots of research labs here
are really trying to figure out how do
we put um planning and reasoning
alongside this large representation of
the worlds we have now so I think these
guys are off to a good start uh one of
the first ones to kind of put something
out there um uh the paper down you know
uh out available for people to read um
lots of other companies are working on
it as well so I wouldn't necessarily
say these guys are
ahead of the pack yeah maybe
Chris I know we were talking a little
bit about this which is like how
indicative do you think um some of the
work that the team did here is of just
like where everybody's going um in in
this space like is this is this paper
just like another piece of of data in
like what is a continuation of everybody
sort of exploring um the same sort of
problems and do we think this is you
know pretty dialed in on kind of where
the problem space is going to be around
agents over the next year or so I think
it is actually pretty dialed in so when
I when I read the paper it's kind of
similar to some of the stuff that we're
doing with agents ourselves so that's
always kind of goodness there but if if
you really look at what's going on there
is they're not really using the llm for
the hard bits right they're using the
Monte Carly Tre search right to to
actually work out so one of the major
things that they're doing is they're
using a web browser as a tool so if
they're trying to book a restaurant for
example then what they're actually doing
is doing a Monte Carlo tree search and they're
navigating using that Tool uh to
different spaces they're using the llm
to self-reflect they're using the llm to
create a plan in the first place of how
they're going to book that restaurant
but they are relying on outside tools
they're relying on outside pieces like
uh the tree search to be able to work
out uh where they're going and the fact
is that is cuz llms are not great at
that right so it's like it's more of a
kind of hybrid architecture in that
sense and everybody's doing the same
thing with agents as well right you're
bringing in tools you're bringing in
outside memory you're bringing in things
like uh graph searches for example so
GraphRAG becoming really popular
these spaces everybody's sort of
bringing in planning and reasoning as well
I think they're doing some really
interesting stuff there with the
self-reflection and the fine tuning so
that it's more of a kind of virtuous
circle in there within the paper so I I
think they're probably further ahead
than than a lot of people in those
spaces but even if you look at the open
source tools the open source agent
Frameworks we started with things like
LangChain but now you'll see things like
LangGraph is becoming really popular
um and then you're moving into other
multi-agent collaborations such as crew
AI so I everybody's on a different
slightly different slant on where they
are in this journey but they're
definitely on the right track I would
say at this point in time and and by the
way back to my earlier argument that is
software engineering my friend that is
not doing anything different with the
llm it is engineering and putting stacks
and Frameworks around your tool
set to that point Brian I do want to
hear uh Volkmar's take on why
algorithmic was his was his pick so you
have to hold you have to hold us to our
answers and he's going to go
next so um my background is we we I
built self-driving cars for seven years
and we this was always this U decision
between you know how much software
engineering can we do and how much can
we train into a model and then in many
cases what Chris just said is you know
it's often times a packaging of
different Technologies together and I
think where we are where we are right
now is we we have as you mentioned this
really powerful tool which is LM so we
have some basic form of world
understanding and we have the world
model and now we are trying to make some
something do stuff which we haven't seen
it's not oh just predict predict the
next thing you do on Open Table right
and so now you're on a in an unknown
open world where you need to explore
different uh you know different choices
and then I think what the next step will
be you know you run this Brute force and
then once you have those choices you
actually will train a model that's my
expectation because that's the path I've
been on with with driving so we always
came up with some heuristic huge data
Corpus tried something out and then in
the end it was always like oh yeah now
that we figured out what the underlying
problem is let's train a model to make
this more efficient in execution and so
in the end the model is just an
approximation of an of an extensive
search right and so I think that's why
algorithmically I believe that um the uh
uh the algorithms we will build uh are
effectively those you know graph
searches tree searches etc which
ultimately then will feed into a simpler
representation which is easier and in
real time to compute I was I was kind of
disappointed by the paper if I'm honest
and I'll tell you why and uh and and
Brian's dreading what I'm about to say
now but um but I'll tell you why I was
disappointed because the whole example
was the Open Table example now unless I
am wrong and I don't think I am isn't
MultiOn the company that claimed that they
were the agents behind the strawberry
man the uh iruletheworldmo Twitter
account so uh you know that would have
been the uh the agent example I would
have wanted to see in the paper it is
that that was actually a question um I
was like I was thinking a lot about
because they they they talked about
reinforcement learning as part of that
and like one of the interesting things
that I've just seen in the market the
last I don't know a few months or so is
there's this like like light backlash
happening to to llms within the ml
Community even a little bit particularly
I think the people who have worked a lot
in reinforcement learning um you know
and you even heard you know folks like
like people talking about llms being a
detour on the path to to AGI and I'm
seeing like as as we've slowed down a
little bit in terms of progress I've
seen like the folks who love who operate
in those kind of reinforcement learning
spaces like starting to pop their heads
up more and being like hey it's back
like um the only way we're going to make
progress around here is some of the
other techniques and you know I'm
curious like maybe two questions is um
maybe I'll start with this one is like
do you all think if we fast forward to a
world where like agents are a much more
significant part of just like the
software that we're all using every day
do we think llms are like the most
important part of that or Chris to your
point around this paper that make
extensive use of lots of other
techniques do we think like a bunch of
other techniques are going to come and
like rise back to prominence as we
actually try to like make these things
do stuff um and um so yeah maybe I'll
stop there and just see if like anybody
has a take on that yeah I I definitely
think RL is going to come back into this
um I know they were using RL and that
paper and they were also using things
like DPO and stuff but I I think it's
going to come back into this so I keep
thinking back to alphago and the
deepmind team and you know winning at go
there and and again they were using
similar techniques as you could see in
that paper there um but but if if you
take a deep learning algorithm today on
your machine and you get it to play the
simple game of snake or play the Atari
games like DeepMind did
um very very simple architectures like
uh CNN DNN type things absolutely rock
that game if you get an llm to play and
it doesn't matter whether it's an agent
or not that is the worst playing of
snake I've ever seen from Frontier
models right and GPT-4o is terrible
at it um you know Claude is terrible at
it they're all terrible playing at these
games but really simple RL deep learning
uh you know CNN style uh architectures
actually rock those games and
therefore I I think that as we try and
solve and try and generalize I think
some of those techniques that were
really successful in the path in the
past have to come back into the future
and I'm I'm pretty sure that's where a
lot of people are going at the moment so
we're going to see software engineering
we're going to see improvements in
architecture we're going to see
improvements in algorithms it's going to
stack stack stack and hopefully all of
these techniques will come together into
hybrid architecture but but when you
take llms and put them into an old sort
of gaming style environment they
absolutely fail today do we think there
will be like general purpose agentic
systems like over the next you know
short term let's say like next couple
years or is everything going to be task
specific um because like one of the nice
things Chris like to the point about
this thing being an open table like go
book of reservation it's a very easily
definable objective um right and that
means that you can pull in a bunch of
these other techniques in a ways that
are harder to make kind of like fully
generalizable and so it's like when we
look at agents do we think we're going
to make a lot of progress on kind of
generalizable Agents over the next you
know year or two or is is everything
going to be just in this task-specific
land Skyler maybe it looks like you got
some thoughts on that no I don't think we'll
have General within two years I think
there will be some areas and this might
even lead to our next topic areas around
uh language creativity I think that will
that will surpass uh some humans
abilities but the world works on much
more boring mundane business processes
and I think there's still a lot more
ground to make on that to to get those
systems to a level of of trust uh that
people will use it's one thing to to
have these methods you know create a
funny picture write a funny story uh but
to have llms execute Financial
transactions on your behalf different
different ball game and we're not going
to be there within two
years I I'll be proven wrong you can
timestamp this that's okay but uh yeah
yeah no we're always accountable for our
predictions on this show so um so Brian
I I think where we may go is we will
probably get you know now we are going
through examples you know open table and
we try another 20 I think we will get
into a tooling phase where you know you
you can actually explore a domain and um
with some human intervention and some
human guidance you know you will have
tools which can explore let's say a web
page how to interact with it and then
you may go through some pruning process
which may be manual but I think we will
get to more automation that it will be
you know 10 times or 100 times faster to
build this but I think as Chris said
there will be a software engineering
component to it uh which you know for
until we are fully autonomous you just
point at something and say learn uh that
will take a while and then the question
is where does the information come from
is it through trial and error or we
could even just read the source code of
the web page right I mean we we have
source code encoding business processes
I can just give you you know here's my
billion lines of code of SAP
[Music]
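As a rough illustration of the hybrid architecture discussed in this segment (an LLM proposing candidate actions, an external Monte Carlo tree search expanding them, and a self-critique step scoring partial trajectories), here is a minimal sketch. It is not MultiOn's implementation: `propose_actions` and `critique` are hypothetical stand-ins for LLM calls, and the "environment" is a toy number line rather than a web browser.

```python
import math
import random

# Stand-ins for LLM calls (illustrative names, not from the Agent Q paper):
# in the paper's setting these would be an LLM proposing browser actions and
# self-critiquing partial trajectories.
def propose_actions(state):
    # Candidate next states reachable from this state (toy number line).
    return [state + 1, state - 1, state + 2]

def critique(state, goal):
    # Self-critique score: higher is better (closer to the goal).
    return -abs(goal - state)

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Upper confidence bound: balances exploiting high-value branches
    # against exploring rarely visited ones.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(start, goal, iters=200):
    root = Node(start)
    for _ in range(iters):
        # Selection: descend by UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: the "LLM" proposes candidate actions from the leaf.
        if node.visits > 0 and node.state != goal:
            node.children = [Node(s, node) for s in propose_actions(node.state)]
            node = random.choice(node.children)
        # Evaluation: the "critique" scores the resulting state.
        reward = critique(node.state, goal)
        # Backpropagation: update statistics up to the root.
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Commit to the most-visited first action.
    return max(root.children, key=lambda n: n.visits).state

print(mcts(start=0, goal=5))
```

In Agent Q itself the search runs over live browser states, and the trajectories it collects are fed back into preference-based fine-tuning (DPO); this sketch shows only the search-plus-critique skeleton the panel is describing.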
for the second story there was
the CEO of this company procreate um
they are a company uh that builds and
Designs illustration tools and I think
it was on Sunday night um their CEO came
out and released a video um in which he
said that they are never that one he
actually said he hates gen AI um I think
he actually used the word hates um to
describe it um and he said that they
were never gonna include gen
capabilities um inside of their product
and like the reaction from their
community and the design Community
broadly was was like super excited and
supportive of of this statement like I
think as of time of recording um that video
has got like almost 10 million views um
on on Twitter and I have like a bunch of
different reactions um to that that
hopefully we can you know pick apart
here a little bit but one of the things
that was like most striking to me is
that the way two different sets of like
Creator communities have reacted to the
arrival of llms like within the I have
friends and colleagues who are
software engineers and like llms for
code um people are generally pretty
enthusiastic about that look at it as a
great productivity tool they get more
work done than they were ever able to do
before I also have friends and
colleagues who are writers who work at
Hollywood who are creatives and who like
look at the arrival of some of this
technology like the Grim Reaper um
basically and so it's just like wildly
different responses um from from these
two communities and I'm just curious
like maybe Chris throw it over to you to
you know maybe get some initial thoughts
and reactions to it is like you have any
sense of like why these communities are
responding so different differently um
to to this
technology I think never is such a
strong word that be one of my other
reactions to it never so far really uh
no feature at all yeah
yeah yeah I I'm never ever gonna stream
video content because I believe physical
is more important well you know what
you're out of business Blockbuster so I
don't know I I think there is a general
wave I applaud them right I think they
make tools for their particular uh
audience and Their audience doesn't want
that and I I think that's going to be a
unique differentiator um I'm not sure
how that stands the test of time I I
think never is such a strong word there
the industry is moving fast and
different audiences have different needs
right I mean I'm pretty sure that if I
use procreate there's no chance ever I'm
going to produce anything that is of any
artistic quality and that is cuz I have
no artistic talent but you you're not
the target
audience I am not the target audience
but I am grateful for AI generated art
because it allows me to produce
something that I would never be able to
produce otherwise so things like
PowerPoint slides Etc so if they are
they are focused on the creative
professionals and creative professionals
don't always want to have ai geni within
that and I understand that that's great
you've got your audience you've got your
Target and that's fine but I think and I
think there will always be an audience
for that but I think the tide of time
will uh push against them there and I
think that's that's really going to be a
very strong Artisan statement to make
before we move on Chris what what sort
of PowerPoint art are you doing um like
that was was
my I I mean generally if I'm honest it's
almost always of unicorns with rainbow
colored hair that is that is my pretty
CEO presentations
um every CEO loves a
picture sure all the other ones do you
know that's it resonates um with with me
but Skyler Volkmar I'm curious if either
of you have takes I'm just like the
community's really reaction to like
these two different sets of tools so I
think we are in a world where um you
know we have artists and
craftsmanship and we are going through a
phase of automation of this Artistry and
craftsmanship and so the bar will be
really really high and there will be
always unique art we still today you
know I can buy photography I can buy you
know a copy of a Monet you know some of
the greatest artists in the world and
can hang it on my wall but there is
still a need and a demand by people to
have you art which is theirs and I think
that will stay like and and we've seen
this across you know the progression of
time you know horses used to be forms of
transportation and now they are a hobby
right and so and car old cars is going
the same way and you know hopefully at
some point that's with airplanes and I
think um these these unique pieces of
art if I can automate the creation and I
can you know industrialize it the
industrialization wins it always wins
but it doesn't mean that those tools and
those artists and that craftsmanship
shouldn't be supported it will just
shrink dramatically because uh you know
the the capabilities become more
accessible to everybody you know if you
used to have typists now everybody can
type all the typists are gone right and
there will be the same thing one of the
things I thought was interesting is that
you made this point about craft like I
think a lot of people choose their
life's work because they like the Craft
um of of that right they chose to be an
artist or a developer because they like
like doing that work and so having a
tool come in and like do all of it for
it is like robbing you know some degree
of value from um you know the things
that they do day in and day out and um
one of the things that I was also
thinking about and I'm just curious if
in
your in within your teams within your
own like set body of work you're doing
with clients that y'all are working at
do you also see like of the other places
where I was thinking about tension um
around this sort of dynamic is um in the
relationship between management and
practitioners um where like one of my
observations is that like management is
oftentimes particularly enthusiastic
about adopting these tools because of
the productivity benefits like I can get
more things done I could reduce my cost
I can you know drive more Revenue
whatever it might be and you know
because those are the things that like
they're running their entire
organization to like Drive deliver those
results and in some cases they've become
as they've gotten more senior maybe one
step removed from actually doing the
craft so the loss of The Craft maybe
feels like less of a consequence um to
management sometimes but to
practitioners it's like this is my thing
uh and this tool is coming around and
just like doing it for me in some cases
so I'm curious if youall have also
observed any sort of like when it comes
to adoption of some of this stuff any
tension between like management and
practitioners um in terms of like their
level of enthusiasm for for this
technology I'm not sure about tension of
management and practitioners uh there
might be some I've witnessed of uh
which flavor or which version so they're
going to say no we're going to use this
one and uh back actually behind the scenes
somebody's using a different a different
tool and some tension back back and
forth on that one so it's not
necessarily the adoption uh but maybe
the channel or the Tool uh has had has
had a bit of uh that one or this one and
um so yeah that would be what I've
observed I think it's also the question
you know when you look at at um
Craftsmen um there's
20% of work you love and 80% of work you
hate often times it's like the majority
I mean ask a data scientist like 80% is
data cleaning do you think they like
data cleaning no right so um if
you I think the tools like if they
support the the toiling the useless work
and make people more productive then you
know you shift more into the the work
which you actually like and appreciate
so I think there is from the from the
engineering I mean I'm mostly talking
software Engineers here from the
engineering perspective I think it's
actually an improvement you know nobody
likes Jira ticket reviews and writing
comments and all that stuff if that can
be automated away then that's you know
an improvement in the life of people or
I don't need to go to Stack Overflow and try
to find that algorithm I can just ask
the model to write it and I'm done and
so I'm more at the architectural level
um and I think uh from a management
perspective I mean they want to get
productivity out but there also
productivity in an Engineering Process
in many cases is that you you know need
to convince all the people to do these
pieces of work because they're necessary
for the product but everybody hates them
so and I think to a certain extent you
know it's an improvement on both
sides that's that's a great point I I
always well it's probably not a
safe-for-work description of it but I
always like to tell we we share those
things amongst the team so everyone
should just mentally come to terms with
some percentage of your job is the work
that none of us want to do in this team
but we're at least going to spread it
around um the group a little bit but um
but that description like actually so I
like a lot of the teams that I work with
are operate a lot on just like ibm.com
do a lot of things around content and
like we the dot-com property has tens
hundreds of thousands millions of
pages as part of it and we're trying to do
way more with like Automation and like
how we connect content together and
stuff like that it turns out in order to
do that like all your tagging has to be
like really good across the entire
property across tens of thousands of
pages and it's like oh my God the amount
of time that we are going to spend
cleaning up the metadata on like this
chunk of the website it's like just just
kill your calendar for three days for
like some whole chunk of the
organization to go through this stuff
and if we can instead like build just
like a really good classifier um um and
you know ways of doing that it's like
that type of stuff actually lands like a
huge relief and like lets us focus on
doing the work that we actually signed
up to do so like at least within my team
like that's a lot of what we're doing is
we're looking at this type of tedious
work that is really um it's important
and it has to get done to your point but
like nobody really wants to spend their
day um doing that can we do as much of
that so we can actually like focus on
doing the work we want to do but like
when it comes to using llms for like the
core core thing that we're doing
everybody's still a little skittish um
honestly at least in some of these now
it's not on the software engineering
side of our teams but on like some of
like the you know more Creator side of
it so it's like some of this some of
these announcements like kind of resonate
with me because I see it with some of
the folks that I work with a lot I think
one of the other things is I don't think
it's just tedious stuff I think for kind
of prototyping type stuff you know and
ideating it's really good like so and I
don't think it matters whether you're
producing content or you're producing
code or you're producing images
sometimes you're like I have an idea is
this going to work H it's going to take
me quite a lot of time to sort of build
that up let's just get the llm to do
something or the image generator to go
through this a little bit I get an idea
what it looks like and then I'm going to
start pruning it and then I'm going to
start building the idea a little bit
more and I I personally again more from
a software development side of things
that's kind of how I work so I at the
moment I'm sort of trying to create a
distributed parameter
service for training llms there is no
chance that I would be able to just sit
and code that straight up myself right I
need an llm to help me out figure this
out a little bit and then I will
engineer through where I need to be with
that right and and I think that is true
and it's the same with image generation
right it's like you know uh if you're
doing a concept and you need that
unicorn with rainbow colored hair get
the get the image model got it yeah
exactly get it get it out there and then
you go okay you know that that doesn't
quite work in context you know I need
this and then you can go and draw your
pretty unicorn at that point right but I
I think prototyping is a really
important use case and I think Chris
like when when you're doing that
prototyping right it's like you can have
a dialogue you know with with a machine
and you get major refactorings done in
in seconds right because you can just
like I want this other thing let me
split this into four classes or let me
collapse them the amount of work you
would have to do and that's all the
tedious stuff you know refactoring of
code and we have IDEs to do that but
they kind of suck so if you can actually
get an llm to do that H it's just
amazing and and like the time you can do
it in an hour you know somewhere on a
plane and you can actually write massive
amounts of code and experiment with it
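The prototype-then-refactor workflow Chris describes maps onto any chat-style LLM API. A minimal sketch, assuming an OpenAI-compatible client; the model name, prompt wording, and helper names here are illustrative, not from the discussion:

```python
# Sketch of LLM-assisted refactoring as discussed above. Assumes an
# OpenAI-compatible API; `MODEL` and the prompt text are illustrative.
MODEL = "gpt-4o"  # any chat-capable model

def build_refactor_prompt(code: str, instruction: str) -> list[dict]:
    """Package source code plus a refactoring instruction as chat messages."""
    return [
        {"role": "system",
         "content": "You are a refactoring assistant. Return only code."},
        {"role": "user",
         "content": f"{instruction}\n\n```python\n{code}\n```"},
    ]

def refactor(code: str, instruction: str) -> str:
    # Imported lazily so the prompt builder works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model=MODEL,
        messages=build_refactor_prompt(code, instruction),
    )
    return resp.choices[0].message.content

# Example (requires network access and an API key):
# refactor("def f(a, b): return a + b",
#          "Split this into four small, well-named functions.")
```

Calling `refactor` with a blob of code and an instruction like "split this into four classes" or "collapse them" is the kind of seconds-long major refactoring the panel mentions; only the prompt builder runs without network access.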
Brian before we leave this topic I think
we just need to remind ourselves that
you asked kind of an art question to
three nerds I'm I'm I'm safe in saying
that right I mean just put a disclaimer
here I think it would be a fascinating
conversation
uh to have uh artist representation on
this question uh so all of this just
taking you know we're talking about
inevitability and tools and all of that
and I think that's that's where our
brains go but uh uh really fascinating
to have this conversation uh with with
the artists uh like one of
the reasons why is because I do have
like like I said I do have like friends
who do both of these things um and I
have just like observed how different
the reaction is um from them and from
like the community um that that they
operate um and and like there's a bunch
of like interesting economic factors
here that play into like this like I
think there's less concern in some cases
about like more like real industry
disruption happening with like the
software engineering community than
there is on the creative side so it's
like I think there is that just like a
little bit of that kind of core
underlying economic anxiety that is not
quite the same in in those two places
even though um you know you're really
just dealing with like just different
types of models um that are helping
improve productivity in different types
of domains um but it'll end up Landing I
think pretty differently potentially so
I think it's a great point we did not
totally represent that other side of
that um of this but it is um it is just
a super interesting topic I think and I
think one of the things will be
interesting is just to the point about
never um I feel like there's so many
tools that like you use them as part of
a workflow and you don't even know what
the underlying technology is it's like
you know if you want to take a
background out of an image like do I
know that's gen AI or something else or
what like do I even care um in some
cases so you know in some of those
places I'm like man never really um but I
think it will be interesting
to see like how this space evolves um over
the next couple
[Music]
years earlier this week AMD announced
the acquisition of ZT systems um and so
I think as everybody knows like the
hardware space has been like one of the
biggest winners if not the biggest
winner um so far in terms of like the
early days at least of like the gen AI and
uh llm sort of cycle um and AMD is a
company like obviously we've talked and
everybody's talked a ton about Nvidia
but like AMD is obviously making um big
play in this space um as well their CEO
Lisa Su was on CNBC um earlier this
week and she was talking about the
acquisition and one of the things is
that like AMD historically has invested
a lot in Silicon uh they've invested a
lot um and even doing more on the
software side of it and that the way
that they talked about this acquisition
is that they were starting to bring
together a stronger set of capability
from like a systems um perspective and
so maybe vmar as just like a way of
kicking things off like why is it so
important like why is this Market moving
from just like Silicon silicon to
systems and like why are systems and
like these almost like vertically
integrated systems within this space like
almost like so uniquely
important so if you look at um AMD and
the AMD offering
AMD acquired ATI you know a decade or
two decades back and that's the heritage
of their AI accelerators uh and they are
kind of head-to-head with uh Nvidia over
the years and they own some spaces and
Nvidia some spaces I think what Nvidia
did very well over the last couple of
years is to look not only at the GPU
itself but looking at you know many gpus
in a box and then when you go into
training you go multibox so you need
many machines and the integration if you
look at the Acquisitions Nvidia did is
they acquired um a company which is you
know providing the software stack to run
very large scale clusters uh which
is the Base Command product and then uh
they also acquired Mellanox which is the
leader in like reliable network
communication and so AMD is sitting
there and like okay so what do we do um
and they don't have a uh a Consolidated
Story how they can put you know a 10,000
GPU training system on the floor so
they're kind of locked in the box and
they are not yet at the scale where they
could actually compete on the training
side and that's I think also the reason
why Nvidia you know owns like 96% of the
market um when you're trying
to train you can pretty much only use
Nvidia and then you already did all the
coding on Nvidia systems and all the
operators are implemented for Cuda and
performance optimized because otherwise
you didn't train the model then running
it's kind of trivial right and so
switching an ecosystem is really hard um
Nvidia went down this route of you know
having like the dgx system so they built
full systems with all the network
communication Etc and AMD I think is
just now catching up so they're catching
up on the network against Mellanox they
announced Ultra Ethernet and now they
are catching up you know how to get
these big systems at scale into
the industry and you know they need
to get into the cloud providers and so I
think systems you know being a boutique
shop which makes very large scale
infrastructure deployments happen is is
a logical conclusion that makes sense one
of the um I think you mentioned training
a lot like one maybe as like a follow-up
question um to that you know one of the
observations I have just about like the
GPU Market in particular is that it
feels like more vertically integrated
than the world of CPUs um does at least
like somewhat um and is like one I guess
would you agree with that sort of
characterization and two if you do like
is is building out the sort of unique
set of um requirements maybe around the
training stack like is that like the
underlying core force around why this
Market is like behaving the way it is
and why it's behaving differently or do
you kind of see those that story like
differently than the way I just kind of
laid it out I think the training system
Market is a a traditionally very
esoteric Market which is the high
performance computing market and you know
at IBM we built like Top500 like
number one and number two Top500 super
computers with Blue Gene you know L P Q
and uh the follow-on systems um and
suddenly we are on a world where that is
not anymore a a domain of the labs which
drop you know $100 million uh
and get a computer uh suddenly every
company which wants to train a network
at scale needs similar technology and so
what we are seeing is after 20 years or
40 years almost like HPC being a very
esoteric field of you know let's say 50
supercomputers in the world Suddenly
It's a you know it's a commodity and your
startup too we should all have a
supercomputer exactly you know like oh yeah I
need a supercomputer you don't have one
so and you know I got an unfinished
basement like you know the joke goes I
was like I'm GPU poor right so I only
have like 100 so uh and if you
want to play in that market you need to
actually offer a solution and I think
AMD has been traditionally in the
desktop market with the GPU or like
Enterprise market with the GPU and they
they sell silicon but they never build these
systems Nvidia being an actual GPU
vendor amazingly has captured like 85%
of the dollars spent in the data center
right so it's like yeah your Intel chip
good luck and a little bit of memory and
everything else we take we take the
switches and we take the the the
ethernet cards and we take the GPU and
that's the other 85%
and so for AMD to get something deployed
at scale I think they need to have an
offering which is on par I think Intel
with Gaudi is in a little bit better shape
because they have Partnerships over you
know 50 years with Dell and Lenovo etc
for them it will be easier to get into
that market because they already have an
ecosystem and that's not the case for
AMD this is why I don't get it Vmar I
actually don't get the acquisition
because if like let's say I was an Apple
company not Apple but an Apple company
and my market everybody bought red
delicious apples cuz they're great
apples but my company sold Granny Smiths
and nobody ate Granny Smith apples why
would I buy a company that makes better
packing boxes for my apples I I that's
that's my problem with it I'm I'm kind
of like if I'm spending $5
billion you know spend the $5 billion on
getting better gpus right and go
compete with Nvidia that's that's
where I don't quite understand it in my
mind I think the uh the uh Nvidia
figured out a way of actually delivering
it deploying it to Partners and to a
certain extent AMD got locked out in
that space so they need to find a a way
to Market and what that way to Market if
you look in the training space a huge
percentage of the training is actually
happening with the
hyperscalers um companies like they want
to put Nvidia cards on their premises
but in many cases for early
beginnings they go into the cloud ZT Systems
is delivering to the hyperscalers so for
them it's a way to get into the
hyperscalers with a solution where they
say okay we give you the whole thing so
you you take down the risk on the
hyperscalers
I'm not sure people do want to
use Nvidia I think I you know I I think
that Nvidia's got this market locked and
Nvidia is awesome they make great gpus
but but at the same time Apple seems to
be doing well on the desktop Market or
the laptop Market with their uh with
their chips and with mlx as a framework
so you know custom Apple silicon seems
to be working out well you're seeing
companies like Google invested in their
own kind of ASIC based chips with
TPUs you see other people move into ASICs
as well I I think there is a space for a
lowcost alternative to Nvidia chips and
I I think there is a market for that
because otherwise other other companies
hypers scales Etc wouldn't be investing
in that and that's why I'm saying I
don't get it I you know Nvidia by far
makes the best gpus across the board
they're an incredible company I I just
think if I was a competitor I would try
and find an adjacent space which isn't the
packing boxes yeah I I think the uh
really in the training market
right now Nvidia is just the only choice
you have and I think this is primarily
where AMD is trying to break in I think
in the inference market there will be you
know like you said apple and you know
there's Qualcomm there's a ton of Chip
vendors and there's a you know a plethora
of startups in Silicon Valley who are
trying to make like super low power Etc
but in the training Market if you look
where AMD is going and the wattages they
are putting down you know where it even
goes above a thousand watts on a GPU
in in the next Generations that is um we
are you know Nvidia is effectively the only
game in town and I think they want to
put something up against it and you only
have for pre-train maybe for pre-train
maybe but not necessarily fine tune fine
tuning I think you can in many cases you
can do in a box like you do not need a
huge system yes but in the pre-training
market you you do and this is where we are
right now you buy Nvidia or you buy
Nvidia and you know Gaudi isn't there yet
AMD isn't there yet and so I think this
is effectively an attempt and who knows how
let's see how it plays out right I mean
I thank God I didn't have to make the
decision um but um you know I think this
is an an attempt of breaking into that
large scale training market and
delivering you know very very large HPC
systems you know companies run 100,000
GPU training clusters building that takes
you know a year it's massive investment
you know it's in billions of dollars and
so if you want to capture some of those
revenues then you need to have it's it's
not you know um like oh we we collect
like three engineers and they put up a
supercomputer it's like no this is a
this is a construction process right and
and this is where where AMD with this
acquisition finally has a chance of of
you know bringing the guys with the
hard hats in as well because you need to
put Power in and Cooling and all this
stuff right and I think they don't right
now because that's all outsourced they do
not have the experience and so they I
think they're buying the the competence
but that point about competence was
actually something I saw come out a lot
in the discussion um post the
acquisition where um you know this is a
company that does have a lot of
capability around doing exactly that
building out large scale clusters some
of the biggest in the world essentially
um and it's a kind of interesting theme
that I've heard at every level of the
whole gen AI stack at different points over
the last year or so you know you hear it
in the hardware side you hear it and
it's really like to the point about
being like you're almost rate limited by
the amount of expertise that's in the
market right now it's like in the
hardware side I heard it on like the
training side you heard it for a while
on even like the prompt engineering side
like you know people refer to them as
like you know magic incantations um for a
little while and there was like this
like only a certain like group of people
even really knew how to prompt the model
um correctly and a little bit of what
I've observed over the course of like
the last I guess like almost two years
at this point is that like as this thing
has blown up like it feels like some of
those skill shortages are like getting
less acute like more people know how to
train models more people are getting
competent uh working with models more
people obviously are like attracted to
the hardware side of the equation
because of some of what's happened over
the last couple years um I'm curious
like across the board
like how much do you feel like our
progress in AI is still rate limited by
just like raw expertise um across the
world in in this space and like how much
has that improved or not um over the
course of the last like year or two and
so maybe Skyler just kick it over to you
sir I I have this conversation pretty
regularly uh with our our director and I
would say it's not necessarily the
overall amount of skills I think that
definitely is monotonically increasing but
how it's distributed across the globe
that's becoming more extreme and so I
think that's something that uh we are we
are experiencing you know we're IBM
research Africa we represent a billion
people uh but uh the talent that's here is
probably going to emigrate and what
does it look like to have that Talent uh
here and and bring that culture here so
yes it is increasing but I think at very
different rates across the globe that'd
probably be my short summary of
that and it is something that uh we we
do talk about on a regular basis is what
does uh capacity in generative AI look
like on a really global scale so that's
probably another another session
entirely in itself I was not
expecting that that was a fascinating
perspective on that so yeah Chris Vmar
thoughts
okay yeah I think um there is such a uh
Gold Rush
and it's a new technology and so it's a
lot about you know even trying it out
and every day there's something new so
you need people who are really
passionate about it um and you know that
they you know spent their living and you
know half sleeping hours on it uh and so
the skill set I think will develop over
time it's I I feel like you know we are
repeating the gold rush of the web
era where it was like oh my God you can
write a web service isn't that amazing
and now it's like yeah you know
everybody can do it and so I think we we
are just in this in this uptick with a
very like extreme Supply shortage and
because it's it's so deep like you know
when you just plugged a computer into a
network it was relatively easy I mean
it's like okay you know here's a
computer on a network go now it's like
the training is different you know do
you need to even understand what math is
and most Engineers hate math that's why
they like computers and so there's this
this set of skills which need to be
built up and you know until it actually
rolls to the universities and we get
people who are truly practitioners so
you first you need to get the education
and then you need to become a
practitioner and you need to toy around
with it for 5 years so I think for the
next 10 years we will probably be in
this and plus the speed of change we
will be in this world of you
know there's Supply shortage everywhere
uh I think on the flip side coming from
the systems corner it's nice to see that
finally we build big computers again so
I I really like this and you know that
we are actually going away from like the
cloud providers do everything for us and
we need to actually look at system
design with a you know fresh angle I
think that's a that's a goodness for the
industry so it was kind of locked in and
the only you know there are like five
companies in the world who still know
how to plug a computer into a
network into a power socket and I I think
it's good that we are actually going
through more of a of a Renaissance of
you know computer architecture and and
and design at least you know
yeah I'm the total opposite I think that
people
are I I think skills people are learning
the skills and they're doing a great job
of that across the globe um but at the
end of the day if you want to train a
large language model you need an awful
lot of gpus and you need access to an
awful lot of data and that is outside of
the access to the average human being so
there is a lot of really great skill
Talent and they are not going to be able
to practice their craft because access
to the gpus to be able to learn what is
the effect of this data it just isn't
there now they can you can learn from
doing things like fine-tuning and
training very very very small models Etc
but at the end of the day we know that
for the larger models it emerges
uh at the higher scale and therefore
at the scale now it's tens of
thousands of uh gpus to be able to do
that and I think that is what's locking
out the average practitioner so me
personally I I want to see more
distributed compute I want to see more
access to gpus and skills and therefore
I think to kind of Skyler's point I
think that will open up a really
talented set of people that are uh
distributed across the globe to be able
to uh make great contributions in that
area but at the moment it's going to be
concentrated in the big tech companies
because they're the ones with the gpus
Chris I want to fight back on your
fighting back that's why we do this
right
if if I have a researcher that comes to
me and says the only way they can make
their case is that they need 10,000 gpus
that's that's not a good argument that
researcher needs to be able to make
their case off of two gpus so again where
you know where does that conversation start
about making the case off of this uh
this 2 GPU example show that then we can
talk about the 100 the 2,000 the 100,000
I don't I don't think it's it's fair to
say I can't make progress unless I have
10,000 I don't
I I I I agree Skylar but again we're
sitting in a company who has tens of
thousands of gpus right so they can go
to you make the argument with two gpus
and then you can give them access to to
scale right but the average person they
might get so far with two gpus and then
they're like huh I don't have the money
now well I'm gonna go and do something
else so we're moving to a world of
universal basic compute um it sounds
like I feel like that's been a little
bit memey um recently so we will we will
call it a day there um Vmar Chris Skyler
thank you all for joining great
discussion uh today U and for those of
you out who are listening to the show uh
you can grab Mixture of Experts on Apple
Podcasts Spotify and podcast platforms
everywhere so until next week thank you
all for joining we'll see you next time