
Is Pre‑Training Dead? GPT‑4.5 Debate

Key Points

  • The episode opens with a tongue‑in‑cheek debate about “pre‑training being dead,” emphasizing that GPT‑4.5’s success (even at making cheese jokes) shows pre‑training is still relevant.
  • OpenAI’s GPT‑4.5 launch was framed as a non‑frontier, cost‑constrained model; the company highlighted its high serving expense, GPU limits, and uncertainty about long‑term API availability.
  • Panelists note a market shift from scaling pre‑training compute to focusing on inference‑time compute—spending more resources on longer reasoning at runtime yields bigger performance gains than ever‑larger pre‑training runs.
  • The release sparked discussion on scaling laws and whether AI research has hit a wall with pre‑training, with experts suggesting that “inference compute is king” and that the old “train‑more‑data” paradigm is losing its edge.
  • The show’s hosts (Bryan Casey, Chris Hay, and Kate Soule) use the GPT‑4.5 announcement as a springboard to explore broader trends in AI model development and deployment strategies.


**Source:** [https://www.youtube.com/watch?v=LQEhOObUhQg](https://www.youtube.com/watch?v=LQEhOObUhQg)
**Duration:** 00:23:58

## Sections

- [00:00:00](https://www.youtube.com/watch?v=LQEhOObUhQg&t=0s) **Debating Pre‑training After GPT‑4.5** - In an impromptu “emergency pod,” the Mixture of Experts hosts riff on OpenAI’s GPT‑4.5 launch, its atypical announcement tone, and humorously argue that pre‑training is far from dead, even if the new model excels at cheese jokes.
- [00:03:11](https://www.youtube.com/watch?v=LQEhOObUhQg&t=191s) **Debating AI Humor Impact** - Panelists argue that the new model’s genuine humor and creative writing represent a notable but perhaps undervalued advance beyond traditional benchmarks.
- [00:06:20](https://www.youtube.com/watch?v=LQEhOObUhQg&t=380s) **Shift from Pre‑Training to Alignment** - The speaker asserts that although pre‑training will persist, future performance and differentiation will stem from smarter alignment techniques, trust and transparency measures, and licensing strategies rather than merely scaling pre‑training data.
- [00:09:43](https://www.youtube.com/watch?v=LQEhOObUhQg&t=583s) **Shift to Efficient Reasoning Architectures** - The speaker outlines a new, more powerful model architecture that reduces compute demands, argues that future gains will come from optimized inference‑time reasoning rather than sheer pre‑training scale, and references IBM’s recent release as an early step toward this shift.
- [00:12:49](https://www.youtube.com/watch?v=LQEhOObUhQg&t=769s) **Shift to Reasoning Models & Agent Marketplace** - The speaker examines how the transition from base to reasoning models introduces variable response times and development challenges, envisioning an emerging AI agent marketplace similar to Fiverr for on‑demand tasks like translation.
- [00:15:50](https://www.youtube.com/watch?v=LQEhOObUhQg&t=950s) **Pre‑training's Persistent Edge in AI** - The speaker argues that pre‑training will remain essential as companies compete for performance advantages through inference compute and tooling, and developers are willing to wait longer for slower but highly accurate model outputs when they dramatically boost productivity.
- [00:19:02](https://www.youtube.com/watch?v=LQEhOObUhQg&t=1142s) **Debating Future of Reasoning Models** - The speakers discuss whether separate reasoning-focused AI models will become obsolete in favor of unified models that internally decide when to reason, and critique OpenAI's costly non‑reasoning release.
- [00:22:31](https://www.youtube.com/watch?v=LQEhOObUhQg&t=1351s) **Future of AI Mesh Networks** - The speakers speculate that as hardware speeds improve, large monolithic models will be replaced by a distributed AI mesh of specialized expert agents communicating across a network, rendering single-model inference obsolete.

## Full Transcript
0:00 Is pre-training dead?
0:01 No, because 4.5 does the best cheese jokes ever, and why would we stall pre-training for lack of cheese jokes?
0:10 I need my cheese jokes, so no, pre-training is here to stay.
0:14 I knew I could count on you.
0:15 Kate, over to you.
0:16 It's already been dead.
0:18 Come on, we're beating the dead horse now.
0:19 All right, so with that, we will jump into this week's episode.
0:29 Hello, everyone. I am Bryan Casey.
0:31 I am guest hosting this episode and welcome to Mixture of Experts.
0:35 Every week, Mixture of Experts goes through the hottest stories in artificial intelligence.
0:39 And a fun thing about the show is that we record on Thursday mornings.
0:43 And on Thursday afternoon this week, GPT-4.5 arrived, and we decided that this was a good time to do our first ever emergency pod.
0:51 Um, and so I'm excited and thrilled to have both Chris Hay and Kate Soule on the episode today.
0:58 If it wasn't obvious from the opening question,
1:00 the one and only topic that we'll be discussing today is, um, Thursday afternoon, OpenAI released the much anticipated, uh, GPT-4.5.
1:09 Having seen many releases, we get new model releases every day of the week, um, around here.
1:15 There are some things to me that were pretty remarkable about this release, even in the way that OpenAI, uh, communicated it to the rest of the market.
1:22 Um, first, in all of their announcement materials, um, they didn't describe 4.5 as a frontier model.
1:29 Um, they were very clear in all of their communications that this was, against kind of the normal conventional benchmarks,
1:35 not going to be a world beating model.
1:37 Um, they talked heavily actually about the expense and cost of serving the model and the size of the model,
1:43 even saying that they had run out of GPUs, um, in terms of their ability to serve it.
1:47 Um, and even in some of their, like, documentation, they were kind of non-committal long term
1:52 about whether they were even going to keep this model available in the API.
1:56 Um, and so to just maybe riff a little bit on our opening question, just maybe start with you, Kate, a little bit, because you said that pre-training's already been dead.
2:06 Um, the immediate discussion in the market,
2:09 because I think the assumption is that GPT-4.5 was trained on something like 10x as much compute as GPT-4,
2:16 that's at least the hypothesis that I've seen, uh, thrown out there.
2:20 People immediately went to the implications around, did we hit the wall, scaling laws, is pre-training dead?
2:27 What's your take?
2:29 I mean, I think even before GPT-4.5 came out, we saw really compelling evidence with DeepSeek
2:36 and other models that inference time compute is king, not pre-training compute.
2:42 We're seeing tons of things unlocked by spending more, by reasoning longer at inference time that costs more,
2:49 but if you spend more money there, it's unlocking all sorts of new performance gains.
2:54 And the old mode of just pay your way during pre-training by training for longer and longer and more and more data,
3:00 it's, you know, we're not seeing the same gains now that we've worked our way so far up the cost curve, so to speak,
3:08 we're really seeing a plateau from that perspective.
3:11 So, you know, I don't think this is actually unexpected,
3:14 um, if you actually look at where we've been headed for a while now.
3:18 Maybe, Chris, to bring you into this and to riff a little bit on your, um, your opening remarks, um, let's just say, um,
3:25 I think one of the reactions that I've seen in the community is that while it didn't beat, you know,
3:31 set like new kind of, um, uh, I think standards in terms
3:35 of like some of the math and science benchmarks,
3:36 even just like the conventional benchmarks,
3:38 the reaction and even discussion in the market was that the model was like really good at writing.
3:42 Um, it was funny, and that, you know, people weren't used to seeing a model that was like actually funny in a way that like wasn't cringey.
3:50 Um, essentially that was maybe more creative,
3:53 um, than, uh, you know, past models.
3:56 Do you agree with kind of Kate's take, or do you also feel like we're kind of underselling how important it is to have reached those milestones?
4:04 Are we underestimating
4:05 the value and, you know, where kind of creativity and writing and humor and things like that sit on the intelligence curve?
4:11 But, you know, I'm curious your thoughts on that.
4:13 It's late at night for me. So if I agree with Kate, do I get to go home?
4:16 No, of course I'm going to disagree with Kate.
4:20 Where would be the fun if I didn't, right?
4:22 I expect nothing less.
4:23 I think.
4:24 So I think the first thing is the creativity. It is a genuinely funny model.
4:29 It is actually super, super funny and sassy.
4:33 And it is the first time I've really seen good creative writing coming from the model.
4:38 So actually, I, I think they've done some, something pretty good there.
4:41 Of course, it's not going to be good at the, uh, stuff that inference time compute is for,
4:46 you know, math and stuff like that, because it does need more time to think on that.
4:50 And I'm, and I'm okay with that.
4:51 Does that make pre-training dead?
4:54 No.
4:54 You know why?
4:55 Because if there's no pre-trained models,
4:58 what are you inferring at inference time?
5:00 Nothing.
5:01 You need the pre-trained model in the first place.
5:04 So pre-training is not going anywhere. If I was going to make a prediction,
5:12 that prediction is going to be that there's a lot of techniques that we're doing at
5:18 fine tuning layers to be able to support inference time compute,
5:21 which I think can go back to pre-training, because the reality is, right,
5:25 you know, "here is the entire internet" is probably not the most efficient way to do pre-training in the first place,
5:32 and actually, the biggest thing that we've learned is that the quality of data during reinforcement learning,
5:38 the quality of data for chain of thought during inference time, is actually making a bigger impact than anything else.
5:44 So actually, if we go back to the pre-training cycle, rather than saying, hey, go look at the internet and tell me when you've suddenly learned something,
5:51 it's like that episode of The Simpsons where Bart went to Paris for three months and then suddenly at the end, he was like, I can speak French.
6:00 That's how we train large language models.
6:03 And I think that's going to change.
6:04 And it's going to be, look, how do we build quality, synthetic data sets to be able to do pre-training?
6:09 So I think we're going to go back and forward all of the time, and we're going to pronounce pre-training as dead.
6:15 And then suddenly, you know, we're going to do something good again. And then it'll be like, oh
6:19 no, everybody pre-train.
6:20 And we're going to go back and forward, back and forward.
6:22 So no, pre-training isn't going anywhere.
6:24 So just a couple of reactions, right?
6:27 So, especially on the emotional side,
6:30 the, uh, the humor, the characteristics,
6:32 that's not pre-training, right?
6:34 That's all imbued into the model during the alignment after pre-training.
6:38 It doesn't matter, like, really. If we want to talk about the models doing a great job at that,
6:41 that's not because they pre-trained it for 10 times longer. I highly doubt it.
6:46 It's really due to the alignment of the model.
6:49 So, I don't know that
6:51 that was worth, you know, if they truly spent 10x on the pre-training, uh, really worth it,
6:56 but I do think you're right, Chris, that pre-training will change.
6:59 So when I say pre-training's dead, the mode of throw more data and spend more until performance goes up, I think, is dead.
7:06 Being smarter about how we pre-train, I completely agree.
7:09 I do think for the near term, we're going to see much more like base models as a commodity where I don't care,
7:16 uh, from a performance perspective.
7:18 I think there are other things that can differentiate base models, particularly on
7:22 a trust and transparency angle; especially if they're not driving up performance anymore, that becomes more interesting.
7:28 The licensing of the base model is another example, but you know, I think
7:33 in a lot of ways, it's kind of like pick your right model size, and then all the innovation right now is really happening on the alignment side.
7:40 So pick your favorite base model that meets your cost and other criteria, uh, and then apply your alignment techniques on top of that to really drive and meet your needs.
7:50 I agree and disagree.
7:51 And I think the reason I'm going to
7:53 say that is I don't think you care
7:55 until you care.
7:57 And what I mean by that is, is base models
8:00 have been commoditized at the moment,
8:02 and at this point in time, inference time compute is the most important thing, and there's a ton of mileage to get on that.
8:08 And then there will be a point where that
8:10 mileage will slow down a little bit, and then it'll be like, oh, I need a better base model to be able to get there.
8:17 And then suddenly we're all gonna be like kids on the soccer field.
8:20 We're all gonna run towards the other end of the field.
8:23 And then we'll be like, oh yeah, I can get an extra percentage point,
8:27 or I can do a little bit better if I have a pre-trained model.
8:29 And we're gonna run over there,
8:31 and then we're gonna go, oh my goodness.
8:33 Tools, tools is the thing. We, I'm going to mention agents, of course I'm going to mention agents,
8:38 it's like, and we'll be like, the best tools is what's going to make the models, and we'll be like, inference time compute, that is dead,
8:46 because tools is going to be the most important thing, and we're going to run over there.
8:50 And then actually, we're just going to run around in circles from thing to thing, optimizing, because we've done this dance before.
8:57 Uh, and, and that's what we're going to continue to do. And it's going to be fun, but all of these things are important.
9:02 If we take the, you know, base model as a commodity right now, another step further,
9:07 I think it will show that we're going to see a lot more innovation in the architectures.
9:12 So you have to pre-train new architectures,
9:14 but as we talk about how do we get more efficient models, obviously a mixture of experts as an architecture is becoming important,
9:20 uh, in terms of broader efficiency and being able to maximize performance per cost,
9:26 and I think continuing to find ways people are going to try and differentiate and kind of break out of this commoditization,
9:31 right, by finding ways to drive architectural improvements. But I don't think the
9:36 story is going to be, we're going to have this new architecture that we trained for 10 times longer than anybody else,
9:42 and that's why the model is special.
9:43 It's going to be, we came up with this new architecture that's even more efficient and powerful
9:47 that you can now move to when you do all of the fancy alignment that gives you the true performance for the model.
9:52 One of the first places everybody's head goes when they see this happening is like,
9:55 oh my God, what's happening to all the compute build-out that's happening in the world right now? Like, is that under threat?
9:59 But what didn't happen is, like,
10:01 all of those, like, stocks just going nuclear or something like that.
10:04 And I think a lot of that is because
10:07 of the opportunity that is around test time and inference time compute.
10:10 And I saw even one of the former research leaders at OpenAI was just talking about,
10:14 like, it's pretty clear that in 2025, the
10:17 optimal way to use compute is not going to be just, you know, scaling, um, pre-training, basically, as far as you can go.
10:26 And it's going to be in reasoning, and the gains are going to happen in reasoning.
10:29 And
10:30 I know that we are kind of early in that journey, um, over the last, like, you know, few months or so.
10:35 Obviously IBM just had a release, where we started on that journey a few days ago.
10:41 You know, maybe could you talk a little bit about, like, what even that might look like?
10:44 Like, if we're gonna be, like, attacking this vector over the course of the next, you know, until we get as far as we can go there,
10:50 like, what are the types of things that people are going to explore?
10:52 What are the opportunities?
10:54 Is it just like make this thing think for a week and come back?
10:56 Or are we going to be a little bit more sophisticated than that?
10:59 Yeah. I mean, I think at the top level, the thing to think through is we now have a pass-through model for cost, right?
11:05 So instead of a model provider spending a bunch of money in fixed costs to get high performance, the model provider can just kind of pass that through and say,
11:13 look, you can host the model and pay for it, or you can pay through an endpoint, but just keep paying until you get the performance you want.
11:21 And if you don't need all that performance, pay less.
11:24 And so I think it's gonna, like, approach a much more, you know, efficient market, so to speak, right, where you're paying for what the task calls for
11:31 versus, you know, you've got some subscription of X dollars, you know, uh, a month that you, you know, are kind of locked into.
11:42 So I think we're going to see a lot more flexibility in pricing.
11:44 We already saw that with Anthropic, uh, 3.7, right, where you can set different cost,
11:50 uh, parameters for how much you want to pay, basically, to how long you want to think,
11:54 uh, for a given task.
11:56 And so I think that's only going to continue until, you know, everything is going to be like, well, how much is it worth to you?
12:02 Like, I don't know if we'll get to like an auction setting almost, like you could like even put it up for bid,
12:07 right, but, um, I think it's going to be much more efficient,
12:10 uh, in terms of actually getting economic value out of generative AI, because you'll pay for what something is worth.
12:16 Chris, maybe as like a question to that is like, as an end user of these tools,
12:21 I think being able to decide how much, like when I want to use, when I just want a quick answer, when I want to use reasoning,
12:28 when I like search is becoming like an increasingly significant thing that people are rolling out.
12:33 And like, I can just kind of decide when I'm using each one of those things.
12:36 As an application developer, um, when now you just want the model to get to the right answer at the lowest cost as fast as possible,
12:43 like, how are you, like, how would you be thinking about these trade-offs going, going forward?
12:49 Are we happy that, like, so many of the, like, next incremental gains are going to be happening via
12:54 the, uh, you know, more of like reasoning models, um, versus the way that we used to get them through kind of the base models?
12:59 Is it more complex now when you're thinking about like the user experience?
13:03 It's like, oh, I used to be able to just count on getting that answer really quickly; now it's like,
13:06 sometimes my answers are coming instantly and sometimes like a model goes off and like thinks for like, you know, five minutes before it comes back on something.
13:13 It's just like, when you're thinking about like the developer community and they start to adopt some of these tools, like,
13:19 are they happy about the fact that like more and more of this is going to get put into reasoning over time?
13:23 Does that make things harder, um, to build this stuff into applications?
13:27 I think everything is a trade-off,
13:30 and actually,
13:31 I really like Kate's analogy on this one, and, and I like it 'cause I did a video on this a while back.
13:37 Whereas I, I think we are gonna move into this agent marketplace, and I think that is probably the most important thing.
13:43 And I think in the same way as we go into something like Fiverr and we say,
13:48 I need some, I'm gonna spend five bucks 'cause I need, uh, a video edited, or I need
13:54 somebody to go and code something up for me.
13:56 I think we're gonna be in the same world with agents, and, and the reality is
14:01 that if I need a document translated and I need it done in five minutes, then you can have the best model in the world,
14:09 but if it ain't doing it in five minutes, which is when I need this thing to be done, then I don't care.
14:14 If I'm doing real time translation, so I once spent some time, I think I was in Moscow at the time,
14:20 and there was this guy who was translating what I said real time into Russian.
14:26 Now, it can sit and think all it likes, but the audience is going to be waiting for that translation, right?
14:32 So, so I think there are times where real time is going to be important, and that's going to be the same with coding.
14:39 But at the same time, accuracy is going to be important as well, because like that translation scenario, if the guy just started making up what I said
14:46 because he didn't understand it, then it's great that he's real time, but he's just spouting gibberish at that point, which isn't any use to anybody.
14:57 So I think if you're fast, and you're accurate, and you're cheap, and you can do the same job as something that is big and takes a long time,
15:04 and is expensive, that's going to win,
15:09 and that's just market dynamics.
15:11 Um, but when it comes to something really important, so, if I am, for example, needing to do some deep research and finding some chemical compound,
15:24 you know, having one of today's one billion parameter models take a plucky guess in the air without thinking about it,
15:33 honestly, I don't think you're going to be that satisfied with the result.
15:36 So it's going to be a balance on the task, on how much effort and thought and tools, etc.,
15:42 you're going to need, but it is going to be marketplace dynamics, and it's going to be
15:45 latency, it's going to be cost, and it's going to be the level of intelligence that you need.
15:49 So,
15:50 to the point, this is again coming back to why I still don't think pre-training is going to go away,
15:57 which is if you can gain an edge on the base model so that it can actually reason a bit better with the combination of inference time compute,
16:07 with the combination of tools, then that might be the thing that gives you the edge
16:12 in that scenario, and therefore every single company in this is in a race to have an edge.
16:17 And if they didn't have the race
16:19 for the edge, why are we all publishing benchmarks all the time?
16:23 Because we wouldn't care, if we weren't, that this is better than this one.
16:26 So I, I just think these dynamics are going to play out, and back to what I said at the beginning, I don't think this is going to go away in the case of development.
16:35 And I know this is a long run,
16:36 Bryan, I do apologize.
16:38 Sometimes I need a fast, so coming back to your original question,
16:42 sorry, everyone,
16:43 it's taken me that long to do that.
16:44 If I'm in my VS Code environment, if I'm just sort of doing autocomplete stuff, that needs to be fast.
16:50 But if I'm writing an entire program, an entire game, or doing a migration,
16:55 and you know what, the model's going to take five, ten minutes, but actually it would have taken me two weeks to do it,
17:01 I'm going to wait that time, right?
17:02 Especially if it's accurate. If I have to wait ten minutes and it's completely wrong, I'm not going to wait.
17:07 So these are the marketplace dynamics that I see.
17:10 I think there's two really interesting points to bring up, Chris.
17:13 One is that you can think of cost from, like, how much do I have to spend?
17:17 But obviously cost from latency is critical to think about as well.
17:20 So that's definitely like a third dimension to all of this as people start into the market and figure out, like, what is it worth to me?
17:26 How long can I wait, and what performance do I need?
17:28 And like those three combined is going to kind of drive you to your, your model selection.
17:33 But I also think that like as we talk about the experience and what we've built with generative AI so far,
17:39 everything we've done for the past two and a half years has been based off of chat, instant response.
17:45 So now that we have the reasons to wait, because we'll get better results, like, waiting for a turn in a conversation doesn't make sense.
17:54 No one would do that.
17:55 But now that we have reasons to wait, you know, I think we're going to see entirely new things get built with generative AI,
18:01 or ideas of how you build with generative AI, because we now have the incentive to find those other patterns,
18:08 and things that didn't require instantaneous responses now suddenly become in scope.
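The three-way trade-off Kate describes, how much you will pay, how long you can wait, and what quality you need, can be sketched as a toy selection routine. A minimal sketch: the model names, prices, latencies, and quality scores below are entirely illustrative, not real products or benchmarks.

```python
# Toy sketch of the cost/latency/quality trade-off discussed above:
# pick the cheapest model that satisfies a task's latency and quality
# requirements. All candidates and numbers are hypothetical.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Candidate:
    name: str
    cost_per_call: float  # dollars per request (illustrative)
    latency_s: float      # typical seconds to respond (illustrative)
    quality: float        # 0-1 benchmark-style score (illustrative)

CANDIDATES = [
    Candidate("small-instant", cost_per_call=0.001, latency_s=0.5, quality=0.60),
    Candidate("mid-base",      cost_per_call=0.010, latency_s=2.0, quality=0.75),
    Candidate("big-reasoner",  cost_per_call=0.120, latency_s=90.0, quality=0.92),
]

def pick_model(max_latency_s: float, min_quality: float) -> Optional[Candidate]:
    """Return the cheapest candidate meeting both constraints, or None."""
    viable = [c for c in CANDIDATES
              if c.latency_s <= max_latency_s and c.quality >= min_quality]
    return min(viable, key=lambda c: c.cost_per_call) if viable else None

# Autocomplete needs an instant answer; deep research can afford to wait.
print(pick_model(max_latency_s=1.0, min_quality=0.5).name)    # small-instant
print(pick_model(max_latency_s=600.0, min_quality=0.9).name)  # big-reasoner
```

The point of the sketch is only that model selection becomes a constrained optimization once reasoning models make latency and cost variable per request, rather than a single "best model" decision.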
18:13 I'm curious how you think, um, that
18:16 this will actually come together and, like, people will consume it.
18:19 And so, like, OpenAI, um, it's kind of like a little bit of a joke online when people open up the interface and you look at the model selection;
18:26 it's like, if you're not like listening to the equivalent of this show every day, like, how would you guess which one of these things you're supposed
18:34 to use? And so they've been very clear that part of their roadmap is to bring them together,
18:38 and you ask a question and the model just kind of knows which ones of these things it's gonna
18:43 bring together. And also part of me was even wondering in the wake of this where, you know,
18:48 if you're not going to be able to break through on the benchmarks in terms of like the criteria that I think the market understands, you know,
18:55 what's the purpose of actually shipping a base model that doesn't have reasoning in it, if it's just going to end up underperforming whatever your last reasoning model is?
19:02 And one of the questions I walked away from, like, that kind of series of things is like,
19:06 are we coming to the end of the line in terms of like even having base models that don't have reasoning attached to them?
19:13 Will that be kind of like a weird artifact of history that we had those models at all, and in the future all of this stuff will just be integrated together in a single model,
19:20 and the model itself will just decide whether it needs to use reasoning or gives you a straight answer right away?
19:26 Or do we think there's like a real chance that like, no, there can continue to be, you know, like each one of these different classes of models,
19:33 and they can each do kind of their discrete things?
19:35 But, you know, I'm curious, just like how much convergence that you see actually happening in that space.
19:39 And, you know, maybe Kate, I'll just turn it over to you to get kind of your initial take on it.
19:42 Yeah, so a couple of things.
19:44 I don't think
19:45 OpenAI made a mistake by releasing a non-reasoning model.
19:48 I just think the fact that they released such a big one that costs so much,
19:54 you know, it was probably a waste of time, a bit, uh, a waste of money.
19:58 Like, I think there are plenty of use cases that we're seeing right now where reasoning actually doesn't help.
20:03 Things like tool calling, and things where you have very clear structured patterns and you kind of just want to, like, fine-tune for that very specific thing, uh, you know,
20:12 doesn't necessarily require reasoning,
20:13 but I almost don't know that it matters, like, are we going to have reasoning models and non-reasoning models?
20:18 Because, like, what is a model?
20:19 Like, is OpenAI, are those models really just an individual model, or are there either multiple models being routed to already?
20:28 Are there experts that have been reserved for different tasks?
20:30 Like, our whole definition of what a model is or is not, I think, is just going to continue to be fluid and
20:35 continue to evolve as we find new and clever ways to bring this together.
20:40 I do think, though, that we're always going to need more instruction-focused
20:46 capabilities and more reasoning-based capabilities, and the ability to kind of switch back and forth depending on what the task calls for.
20:52 And I'd agree with that, Kate.
20:54 I really would.
20:56 And I used this analogy before, and I, I can't help thinking about the mainframes, ironic for the company that we work for,
21:04 right. But, you know, but you had these big massive mainframes, and, you know, what is the world that we live in just now?
21:12 We have a computer in our pocket, on our mobile phone, on our laptop, and then architecturally everything is distributed.
21:19 We have microservices, and they all communicate, and they all have specialized tasks, and then we have good buses between them.
21:25 And if I really think forward into the future,
21:28 I do think that the models are going to get smaller and smaller.
21:31 They're going to get distilled down.
21:33 Um, I think that was probably the point of, uh, having such a large model, is that they're going to use that for distillation.
21:40 And we're going to see some really good reasoning models, uh, with a very, very good base model, uh, based off of the GPT-4.5 architecture,
21:47 and then in the future, GPT-5.
21:50 So I think that's really the
21:52 kind of, kind of point behind that,
21:54 and also to keep the hype cycle going, which I love.
21:56 Um, but the,
21:58 but I think we're gonna end up in this microservice-based world of architecture, in the same way as we moved from mainframes to distributed computing.
22:05 And I, I see this exact same thing happening with generative AI, because the reality is, if I need something fast on my phone
22:12 and the models are getting capable, and I can do something that, that, that maybe a GPT-3 model used to be able to do, but I can do that on, on my phone
22:22 on a couple hundred million parameters, then let's do that real time, and then, uh, you know, if I need a little bit more reasoning, if I need a bigger model,
22:29 then I may go off and use some bigger compute.
22:31 So, and then we use mixture of experts at the moment to be able to do that routing, but then as network speeds get faster and faster,
22:39 latency gets faster and faster because the chips are getting faster and faster,
22:43 again, why wouldn't you start to do that across a mesh of some sort?
22:47 So at that point, rather than, yeah, necessarily using a mixture of experts where you've still got a large model and partitioning,
22:53 then you have truly separated AIs which are communicating with each other, with true experts, in the same way as we have humans.
22:59 So I, I see that expansion coming.
23:02 And, and again, this is Chris's opinion.
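The on-device-first routing Chris describes, a small local model answering when it can and escalating to bigger remote compute when it can't, can be sketched as a toy dispatcher. A minimal sketch under loud assumptions: the "experts," their confidence scores, and the escalation threshold are all hypothetical stand-ins, not any real model or API.

```python
# Toy sketch of the "AI mesh" routing idea: a small local expert answers
# when it is confident enough, otherwise the request escalates to a
# larger remote expert. All experts and thresholds here are hypothetical.

from typing import Tuple

def local_small_model(query: str) -> Tuple[str, float]:
    """Pretend on-device model: confident only on short, simple queries.
    Returns (answer, confidence in [0, 1])."""
    if len(query.split()) <= 5:
        return f"local answer to: {query}", 0.9
    return "", 0.2  # out of its depth

def remote_large_model(query: str) -> Tuple[str, float]:
    """Pretend large hosted reasoning model: slower, but confident."""
    return f"remote reasoned answer to: {query}", 0.99

def route(query: str, threshold: float = 0.8) -> str:
    """Escalate to the remote expert when the local one is unsure."""
    answer, confidence = local_small_model(query)
    if confidence >= threshold:
        return answer
    answer, _ = remote_large_model(query)
    return answer

print(route("translate hello"))                       # stays on-device
print(route("plan a multi step chemical synthesis"))  # escalates to remote
```

Replacing the confidence heuristic with learned routing, and the two functions with networked specialist agents, is essentially the mesh picture sketched in the conversation.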
23:04 I, I think as we move into, probably, it's probably not a '25 thing, but I think as we look into '26, '27, I, I think we're
23:12 all going to be going, "Oh my god, is the big, the days of the large model gone?
23:16 Is the day of inference time compute gone?"
23:18 It's mesh.
23:19 We need to be in an AI mesh, and then, and single models are dead.
23:22 I'm, I'm sure that's coming.
23:25 All right.
23:25 Well, 12 months from today, Chris, we will have an emergency pod
23:30 on mesh networks, so I think that's a really good place to end.
23:34 Chris, Kate,
23:35 thank you for joining
23:36 today. I think this was obviously a topic that the industry has been waiting to see the outcome of for, um, a long time,
23:44 and in some ways I feel like it asked as many questions as it ended up
23:47 answering, but that's good, because it means we get to do the podcast for another 12 months.
23:51 So, uh, thank you both for joining.
23:53 And as always, uh, you can find Mixture of Experts on podcast platforms everywhere, and we will see you next time.