# 2024 AI Recap and 2025 Outlook

**Source:** [https://www.youtube.com/watch?v=l8plyR8aqVQ](https://www.youtube.com/watch?v=l8plyR8aqVQ)
**Duration:** 01:01:27

## Summary

- The hosts crown Gemini Flash and the evolving Llama series as 2024’s standout AI models, signaling a shift from ever‑larger systems toward compact, high‑performance ones.
- They predict a major “agent boom” in 2025, envisioning “super agents” that will dominate applications across the tech landscape.
- While NVIDIA remains a key player, the panel expects new entrants and increased competition in AI hardware, challenging its longstanding dominance.
- The discussion stresses that AI progress should stay transparent and safe, emphasizing openness over “black‑curtain” development as the industry matures.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=0s) **2024 AI Highlights & 2025 Predictions** - The episode recaps 2024’s standout AI models and trends while speculating on 2025 developments like super agents, hardware shifts, and openness.
- [00:03:03](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=183s) **Scaling, Tool Augmentation, and Model Distillation** - The speakers outline the shift from ever‑larger proprietary AI models toward tool‑enhanced agentic flows, synthetic‑data‑driven teacher models, and the distillation of high‑performance, cost‑effective smaller models using curated enterprise data, heralding a new AI landscape beyond 2025.
- [00:06:08](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=368s) **Enterprise AI Gains Amid Competition** - The speakers discuss mixed demo results, emphasize a recent surge in enterprise adoption of generative AI that relies on proprietary data, and note the intensified rivalry among major AI players.
- [00:09:10](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=550s) **Rise of Small Local AI** - The speakers discuss how increasingly compact, on‑device AI models with persistent memory and tool access will enable privacy‑preserving XR/AR experiences and become crucial as regulation and personalization demand local computing.
- [00:12:15](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=735s) **Future AI Companion & Multimodality** - The speakers discuss shifting from competing with AI to treating it as a collaborative companion, evolving evaluation benchmarks, and anticipate multimodal capabilities becoming a major focus by 2025.
- [00:15:21](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=921s) **Multimodal Voice-to-Voice AI** - The speakers highlight how emerging any‑to‑any multimodal models that convert speech directly to speech outperform traditional speech‑to‑text pipelines, and note that 2024 has become the year of AI agents.
- [00:18:29](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=1109s) **Race to Define Agent Protocols** - The speakers examine Meta’s early Llama Stack push as a cue that the AI field is moving beyond OpenAI’s dominant API model toward competing standards for agent intercommunication, and they speculate on which firm will ultimately lead this emerging ecosystem.
- [00:21:33](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=1293s) **Defining Super Agents and Security Risks** - Panelists define a “super agent” as a next‑generation AI that combines advanced reasoning, inference compute, and tool access, and warn that its imminent widespread deployment will expose highly underrated security and data‑leakage challenges.
- [00:24:37](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=1477s) **Niche Translation and Agent Web** - The speaker highlights opportunities for small language models in under‑served translations and domain services, and predicts that by 2025 the web will need a new, agent‑friendly data format to replace HTML.
- [00:27:41](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=1661s) **Unified AI Agent Interfaces** - The speaker envisions AI agents serving as a natural‑language operating system that integrates fragmented business tools and eventually automates software development.
- [00:30:43](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=1843s) **AI Agents and Hardware Outlook** - The hosts highlight the emerging power of AI agents for democratizing full‑stack app development and segue into a conversation with AI hardware experts about the future of AI infrastructure.
- [00:33:46](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=2026s) **Emerging AI Chip Landscape** - The speaker outlines new AI hardware startups, wafer‑scale chips, and big players like Broadcom and Qualcomm entering the market, noting a shift from NVIDIA dominance toward inference‑driven opportunities.
- [00:36:49](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=2209s) **Nvidia’s Training Market Dominance** - The speaker predicts Nvidia will keep controlling AI training systems through its GPU and high‑performance networking suite (via Mellanox), with AMD and Intel unlikely to compete effectively until around 2026‑2027.
- [00:39:58](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=2398s) **Open Standards Disrupt Nvidia Edge AI** - The discussion highlights how emerging open AI frameworks and dedicated inference engines are reducing reliance on NVIDIA, enabling broader competition in edge inference hardware while noting Apple’s push into AI chips as a key upcoming trend.
- [00:43:04](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=2584s) **Underrated Trends in AI Hardware** - The speaker emphasizes overlooked developments like real‑time compute optimizations (e.g., test‑time compute) driving tighter hardware‑software co‑design, and the expanding accessibility of AI hardware ecosystems illustrated by models such as Llama 3 for both research and consumer applications.
- [00:46:10](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=2770s) **Granite 3.0 Open-Source AI Milestone** - The hosts highlight IBM's release of the Granite 3.0 family—Apache‑2 licensed, transparently built language models with ethical data sourcing—as a defining product moment of 2024.
- [00:49:34](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=2974s) **AI Governance and Safety Priorities** - The speakers shift from early AI experiments to stressing the need for robust governance, copyright compliance, cost management, and safety guardrails—highlighting IBM's watsonx and recent AI safety summits as pivotal steps toward viable, responsible AI deployment.
- [00:52:39](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=3159s) **Inference Runtime Risks and Open‑Source Parity** - The speakers discuss how using inference runtime for model self‑reflection introduces new security vulnerabilities yet offers greater control, and they predict that by 2025 open‑source AI will reach or exceed closed‑source capabilities.
- [00:55:53](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=3353s) **Evolving AI Interfaces and Co‑Creation** - The speakers discuss the need for optimized inference stacks, the shift beyond chat‑based AI interfaces toward new interaction models, and the rise of collaborative co‑creative tools.
- [00:58:58](https://www.youtube.com/watch?v=l8plyR8aqVQ&t=3538s) **Modular Expert Architecture & Agent Middleware** - The speakers discuss the need for modular AI components and middleware to manage and orchestrate specialist experts and multi‑agent systems, highlighting emerging research and startups.

## Full Transcript
0:00All right, looking back at 2024, 0:01what was the best model of the year? 0:03For me, it's going to be Gemini and Flash. 0:05And I'm going to nominate a sequence, I think, 0:07which is the sequence of the Llama models. 0:10So is the bubble finally going 0:12to burst on agents in 2025? 0:14Agents are the world. 0:15Agents are everything. 0:16And in 2025, we're going to have super agents. 0:20In 2025, is NVIDIA still going to be king? 0:23Not only NVIDIA is here, but we also see new 0:26entrants or the other players in the market. 0:29Are we going to end up 0:30having openness and safety? 0:32You can do this out in the open. 0:34It does not need to be behind a black curtain, 0:36so to speak. 0:37All that and more on today's Mixture of Experts. 0:44I am Tim Hwang and welcome 0:45to Mixture of Experts. 0:47Each week, MoE is dedicated to bringing 0:49the gold standard banter you need 0:51to make sense of the ever evolving 0:52landscape of artificial intelligence. 0:54Today, we're looking back at 0:56the huge evolutions across 2024. 0:58You know, just to take you back, in 0:59January of 2024, we were all chattering about 1:02the release of the GPT store, Claude 2.1's long context window, 1:06and I think at that point, we were still 1:08waiting for the release of Llama 3. 1:11Uh, 2024 was incredible, obviously a dynamic 1:13year in AI, and so what we've done is we've 1:15gathered a bunch of our best panelists to 1:17talk about what stood out to them, what 1:19didn't go as well, and maybe what they 1:20think, uh, about what happens in 2025. 1:23We're going to talk about agents, hardware, 1:25product releases from the whole year, and more. 1:27But first, we're going to start with what 1:29happened in the world of AI models in 2024. 1:32And to help us unpack the journey we've 1:34been on, we have with us Marina Danilevsky, 1:36who's a senior research scientist, and 1:38Shobhit Varshney, senior partner consulting 1:40on AI for US, Canada, and Latin America.
1:43And so I want to actually start with maybe a 1:45quick, uh, more kind of closer story, right? 1:47Even before we zoom back to the, 1:49you know, dark ages of January 2024, 1:52uh, which is the release of o1. 1:53Um, you know, obviously this was 1:54a big announcement, one of the 1:56biggest announcements of the year. 1:58And I know, Shobhit, before the show, you and 1:59I were talking, you wanted to kind of 2:01get in and actually just point out that, like, 2:03the release of o1 actually marks a pretty 2:05big change in how these companies are thinking 2:07about doing models and scaling these models. 2:09And maybe we'll just start 2:10there if you want to jump in. 2:12Excellent. 2:12It's such a great time to be alive. 2:15Um, what we see all around us, like there's 2:18no other year in your entire career life 2:20that you would rather be alive than today. 2:22In the last year or a year or so, 2:25we saw the era of scaling laws. 2:27We got to a point where we realized that adding 2:30more compute, building larger models, and 2:32driving higher performance got us incredible, 2:35incredible performance from these models, right? 2:37So we got to a point where we, we 2:39have insanely large models, now 2:41Llama at 405 billion parameters, 1.75 trillion from GPT-4. 2:46You can see this huge set of big 2:48models that are doing amazing work. 2:50Now we are transitioning to a couple 2:52of different shifts in the market. 2:53One, we are seeing more of the shift 2:55moving towards the inference phase of it. 2:59Slow down, think about what you 3:00want me to do and think through a 3:02plan and come up with an answer. 3:03We also started to give these models 3:06more tools that they could use, just like 3:07we learned to use tools as we grow up. 3:10So we have these agentic flows that are 3:11helping us increase the intelligence as well. 3:14We also saw a big shift in the overall cost. 3:17The cost of these proprietary models 3:19plummeted in the last year or so.
3:22But then smaller models got more 3:23and more efficient and started 3:24to perform much much better. 3:26So we've seen this shift towards insanely 3:29large models that can think a lot more. 3:31We saw us run out of all the public 3:33internet data and now we're focusing a lot 3:35more on high quality enterprise data or 3:38stuff that's built for specific models. 3:41So we're now getting to a point where you have 3:43a teacher model that's insanely large, really 3:45well thinking through the whole problem, that 3:47can create synthetic data, can help train a 3:50smaller model, can distill a model that can 3:53deliver high performance at a lower price point. 3:55So we've shifted this, shifted quite a bit 3:57in how we think about AI models and how 3:59we have been investing in building them. 4:022025 and beyond is going to be a 4:03completely different ballgame in what 4:05we see with what AI models would do. 4:07Marina, what are your thoughts? 4:09Yeah, I think you're right. 4:10It's been a really interesting year in terms 4:12of where we started, where we've ended up. 4:14We've seen that, yes, we can go 4:16bigger and bigger and bigger. 4:18And now we're finally there. 4:19We can say, great. 4:20So how well can we still keep 4:21going now that we can go so far? 4:22Smaller. 4:23So that initial research push of how big can 4:25we go, we've finally given ourselves the luxury 4:28of, all right, now it's time for efficiency. 4:30Now it's time for cutting costs. 4:31Now it's maybe eventually time to talk about 4:34environmental aspects and things of that nature. 4:36Maybe next year. 4:37Is that a prediction for 2025 or? 4:402025. 4:40Um, so that, that part is very interesting. 4:43It also means that the quality has gotten 4:44to where we can start to, uh, build 4:47enterprise grade solutions reliably. 4:50And I'm, I'm excited for that. 4:51I know we're not talking about 4:52next year yet, but that's the 4:53thing that I'm really excited for.
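As an aside, the teacher/student distillation flow described here — an insanely large teacher labels curated prompts with synthetic answers, and a much smaller student is trained on those pairs — can be sketched in a few lines. Everything below is a hypothetical toy stand-in, not any specific lab's pipeline:

```python
# Toy sketch of teacher/student distillation. The "models" here are
# hypothetical stand-ins: a real pipeline would call a large LLM as the
# teacher and fine-tune a small LLM as the student.

def teacher_generate(prompt: str) -> str:
    """Stand-in for a large teacher model producing an answer."""
    return f"answer({prompt})"

def build_synthetic_dataset(prompts: list[str]) -> list[tuple[str, str]]:
    # Step 1: the teacher labels curated prompts, yielding (input, target)
    # pairs -- the synthetic data discussed above.
    return [(p, teacher_generate(p)) for p in prompts]

def train_student(dataset: list[tuple[str, str]]):
    # Step 2: a small student learns from the pairs. Here it just memorizes
    # them; a real student would be fine-tuned and generalize.
    memory = dict(dataset)
    def student(prompt: str) -> str:
        return memory.get(prompt, "")
    return student

prompts = ["What is RAG?", "Summarize this contract."]
student = train_student(build_synthetic_dataset(prompts))
print(student("What is RAG?"))  # → answer(What is RAG?)
```

The shape is the point: the expensive teacher runs once at dataset-construction time, while the cheap student serves traffic at a lower price point.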
4:54The quality is there, I think finally. 4:57And we, we can start getting real 4:58serious about enterprise solutions. 5:00Yeah, I mean, I think that seemed 5:01like a really big trend this year, you 5:03know, was certainly someone who kind 5:04of like does software engineering in 5:07their free time, kind of as a hobby. 5:08This is the year where I was like, 5:10wow, I am finally able to do stuff 5:11with these coding assistants that like 5:12I would not otherwise be able to do. 5:14It's like finally fit for purpose for 5:16me to kind of use on a day to day basis. 5:18And I think that was, that was a very big 5:20jump, um, I think, that, you know, 5:22we noticed in the last 12 months. 5:24I guess Marina, are there particular stories 5:25that stand out to you from like, I don't 5:27know, earlier in the spring or otherwise 5:28where you're like, oh, when I look 5:30back on 2024, I'll really remember it for X. 5:33I mean, first of all, I'll remember it for just 5:36the, uh, very, very high levels of competition. 5:39It felt like every two weeks somebody was 5:42coming out with something and companies 5:45that you maybe wouldn't even expect, 5:47like even, which is very recently, Amazon 5:49being like, oh, they're working on that. 5:50Oh, that's actually pretty good. 5:51So I think I'll remember it for a lot of 5:53people trying to, uh, really one up each 5:56other, uh, in a, in a good way, in a way that 5:59actually really pushes the thing forward. 6:00But I think that the number of players that 6:03we have this year is, uh, what's really going 6:06to make it stand out for me. 6:08And, you know, as we talked about in previous 6:10episodes, some of the debuts were more 6:13successful, some were less successful. 6:15Sometimes people didn't quite, 6:16you know, double check everything. 6:18Maybe sometimes people thought that 6:19the demos were a little bit overcooked.
6:22Um, and so I, I think that that's, that's the 6:24thing that'll make me really remember the year 6:26is the different ways of how do you join in the 6:28competition and introduce your, your flavor. 6:30Shobhit, how about you? 6:31I think from an enterprise perspective, 6:33uh, this is an amazing year. 6:34We, we recently ran a survey for our AI report 6:37and about 15 percent of our clients globally got 6:40real tangible value by applying generative AI. 6:42There's a lot of, uh, knowledge 6:44that was locked in documents and 6:45processes, things of that nature. 6:47And we saw meaningful movement in 6:49how clients are focusing on 6:51a few small, complex workflows and 6:54delivering exceptional value out of it. 6:56I think we did not get enough value out 6:59of the generic co pilots or assistants. 7:02That has shifted more towards, hey, this 7:04really has to be grounded in my data and 7:05my knowledge and things of that nature. 7:08But overall, the last two weeks that we 7:10just went through, I think that was the most 7:13action we've ever seen in the last 7:14two, three years of AI, with the competition 7:18between OpenAI and Google and then Meta jumping 7:20in; that has been a phenomenal, phenomenal 7:23movement in the community together, and now we're 7:25starting to see us move towards, hey, we have 7:28exceptional models, how do we start to then 7:31control them a little bit more, adapt them to 7:34our enterprise workflows and our data sets and 7:36have them think and reason with tools and things 7:38of that nature more. The big movements around o1, 7:42I think, are going to go down in 7:43history as a big, big point in time 7:47when we started to realize that $200 7:49a month is actually great value. 7:52You start to get to a point where if 7:54you're thinking about how you, if you're 7:56spending 200 bucks a month, you're really 7:58being very focused on which workflows truly 8:02can see an uplift and apply AI to them.
8:04Now you're at a point where you're 8:05really paying somebody to 8:06augment every aspect of your daily life. 8:09I think we've got great, great 8:10momentum to start 2025. 8:12Yeah, for sure. 8:13And I guess, I don't know if like folks have 8:15nominees on superlatives for this, or it's 8:17like, is o1 the, the release of the year? 8:20I mean, I think from a model standpoint, or 8:22were there other ones that kind of stand out? 8:23I mean, I guess we also had 8:24like, Llama 3 this year, right? 8:26It was also a huge, huge announcement. 8:28For me, it's going to be Gemini, uh, uh, Flash. 8:31I think what they've just done with a small 8:33model that does multimodal, that's going to 8:35drive the next two, three years of computing. 8:37And the reason I say that is 8:38everything that you can now unlock. 8:40If you guys followed the Android XR 8:42announcements recently, you're now 8:43at a point where multimodal models 8:46were inherently insanely large. 8:48They needed a lot of compute, 8:49always happened on the servers. 8:50Now with models like Google Flash, you're 8:53getting to a point where a small model 8:56can do multimodal really, really well. 8:58And the thing that will blow you 8:59away is how it starts to remember 9:01things that you've just seen, right? 9:03I think it's going to start augmenting 9:04all parts of our, uh, of our day 9:06to day workflows, including memory. 9:08That's something that we have not seen so far. 9:10Uh, we used to generally ask 9:11questions in a very cold start. 9:13Now we'll get to a point where these 9:14models will have infinite memory, 9:16can have access to tools like we do. 9:18I'm very excited about high performance 9:20at a really small, uh, size.
9:23So we can then eventually get to this compute 9:25infrastructure where you can have XR AR 9:29experiences and you can bring compute more and 9:31more closer to the devices that will drive a 9:33lot more of privacy as well because then the 9:35data is locked into those devices that I'm 9:37carrying with me versus somebody else's cloud. 9:40Yeah, I want to agree with that. 9:41Actually, the, the small model, the small models 9:44thing, because I think that we're going to 9:45start at least in the next year or two, seeing 9:47a lot more, uh, formal regulation going on and 9:51a lot more people waking up to what does it 9:53really mean as you're talking, Shobhit, but if the 9:55models are starting to remember, starting to be 9:56personalized, starting to be customized, that's 9:58going to become extremely, extremely relevant. 10:00So having something small, local that you can 10:03actually have that guarantee technologically. 10:05That's going to become very, very important. 10:07I agree with you. 10:08Yeah, for sure. 10:09And how about you, Marina, I think 10:10in terms of like, you know, I know 10:11Shobhit was saying o1 was huge. 10:13Like if you have like a, you know, best 10:15model of the year kind of nomination. 10:17That's a hard one. 10:18I, I like seeing them in a holistic way. 10:21And I feel like it's hard to tell at the moment 10:23when something is actually going to, uh, you 10:27know, turn, turn in. I'm going to nominate a 10:29sequence, I think, which is the sequence of 10:31the Llama models, not the Llama models themselves, 10:34but the sequence of we're going to have Llama 10:363 and then we're, so we've seen what we can 10:38do with pre training and then we're going 10:39to see what we can do with post training. 10:41So we're going to get bigger, bigger, 10:42bigger, bigger, and then we're 10:42going to see how far down we can go.
10:44I'd like to see a consistent perspective 10:46of that as a sequence that people try 10:49to push the pre training, push the post 10:51training, push the size and do that 10:54iteratively, iteratively, iteratively. 10:56I'd like to see that continue to be a thing. 10:58Yeah, I feel like that's like how we know 11:00you're a connoisseur, Marina, is like, you 11:02like, you, you like the curation of Llama. 11:04It's not just like any given 11:05model is the best model. 11:08Marina, I think we'll get to a point 11:09where the big research labs are going 11:11to build even bigger, bigger models. 11:14But they may not release them 11:15in the public as a model. 11:17And we use that more for creating 11:18synthetic data, for distilling, teaching 11:20as a teacher model, and so forth. 11:22But I'm really excited about, we're finally 11:25coming to a point where we've poked at this for 11:27a while, and we said, oh, if I just ask this 11:30model to think before it answers... well, this 11:32is what elementary school teachers tell kids, right? 11:35And now we're trying to relearn how we 11:37teach young kids on how do they look at 11:39the, like, try different things out, create 11:41a plan, answer the question, go pick 11:43up a calculator if you really need to. 11:45And don't try to do this in your 11:46head, things of that nature. 11:48Like I, I feel that we are, I have little 11:50kids and I've spoken about that quite 11:52a bit and I feel that we are, we are, 11:53there's so many similarities between 11:55how we are training and we're doing some 11:57reinforcement learning with our kids and 11:59giving them rewards and mechanisms in place. 12:01We are breaking problems into smaller chunks 12:03and they go solve each one of them separately 12:05and there's a whole positive reinforcement 12:07around them and they get things right. 12:09I think we're getting to a point where we're 12:10getting to learn how these models learn and 12:14that becomes a good symbiotic relationship.
12:16I think we will stop 12:17asking these models to do things that 12:18humans do really well, and we'll have a 12:20better mutual appreciation of which things 12:22should be delegated down to these models. 12:25And that also means that benchmarks 12:27and how we evaluate these models 12:28are going to change quite a bit. 12:29But I think today we're starting to 12:30get to know these models really well. 12:32And 2025 and '26 will have a very 12:35different relationship with these 12:36models, becoming more of a companion, 12:38versus trying to figure out, hey, 12:39can you do this as well as I do? 12:41Yeah, absolutely. 12:42Yeah, I think one of the funniest outcomes 12:43of this year has been all the examples 12:45of, like, could you just try harder? 12:47And then, like, the model actually just 12:48does better, which is, like, very funny. 12:50I mean, computers did not used to do that. 12:52So, um, so I think maybe a final question, 12:55and then we can wrap up this segment, um, 12:57is we haven't talked so much about, uh, 13:00multimodality, but it really seems poised 13:02to become a really big deal in 2025. 13:05I'm curious, I guess maybe Marina, I'll 13:07start with you, if, if you've got kind 13:08of predictions for what's coming up 13:10in the next year for, for multimodal. 13:11Yeah, multimodal, uh, that's something 13:14where we had those thoughts when foundation 13:16models sort of first came on, 'cause we 13:17were all very excited about the fact 13:19of, oh, well, it's just tokens in order. 13:21It doesn't have to be text. 13:22It can be anything, but then I think the 13:24reason we all went into text as one of the 13:26very early ones, code being part of it, I think, 13:29is the amount of training data that we 13:31had, the amount of examples that we had.
13:33So especially now that we've gotten better 13:34with synthetic data and with, like you 13:36said, Shobhit, when you were referring to 13:37teacher models, we're going to be able 13:39to explore that space, uh, a lot more. 13:42And so I, I think that they might 13:44finally, uh, be at the point 13:45where once again, they are useful. 13:47There's huge interest in, uh, having the 13:49multimodal models because now, you know 13:52how with the text models, we had the 13:53idea that when you have one doing lots 13:55of tasks, it learns from each other. 13:57Now it's going to be even more interesting 13:58where if you have a multimodal model, 14:00does that make it actually also better 14:02at each of the individual modalities? 14:03Again, I think the data is now finally 14:05there, not just the compute, but the 14:07data and the ability to create more data. 14:09Um, and so I think that, yeah, 14:12next year we should see more. 14:14I think I was expecting to see 14:15maybe a little bit more models that 14:17aimed at the sciences this year. 14:19Maybe now again, next year, uh, maybe 14:21models that are going to be more 14:22successful with video, not just Sora, 14:26but something that is maybe a 14:27little bit more useful lower down, 14:29think like, uh, with robotics. 14:31There's a lot of, uh, things to be mined there. 14:33So that's, I guess where I, I see those 14:35maybe, yeah, the flashy parts are fun, but 14:38the real usefulness is somewhere a little 14:39bit lower down with, um, with the hardware. 14:42No, I think the multimodal space is going 14:44to be amazing the next couple of years. 14:46And I think it is important for it to 14:49understand all aspects of what humans are 14:51seeing, feeling, looking at, reading, and 14:53listening to before it comes and helps us. 14:55Um, I think it's going to have a huge impact 14:58on its understanding of the world around us.
15:00So far, we have done things where, hey, I will 15:02take a picture of something or I'll translate 15:04that into text and ask a question of a chatbot. 15:06That paradigm has not scaled. 15:09As the, as the multimodal 15:10models get better and smaller, 15:13like the Gemini 2.0 Flash Experimental, those are the ones 15:17that are going to drive more and more 15:18richer experiences in our day to day lives. 15:21And the competition is 15:21going to be very, very high. 15:22You will see these models come 15:23out from any, from everywhere. 15:25Uh, the Any2Any, from speech to speech 15:27directly, those kind of models are 15:29delivering exceptional customer experiences. 15:32If you go for, if you look at traditional ways 15:34of doing AI, you would go speech 15:37to text. You take that text, you 15:38pass it to a, to an AI model. 15:39The AI model figures out what to respond 15:41with, and you go back from text to speech. 15:42A lot is lost in translation and transcription. 15:45Now, when you start doing, um, from media 15:48to media, you go from voice to voice. 15:49It starts to understand the 15:51nuances of how humans talk. 15:52I'll, I'm very excited about 15:53the next year of multimodal: 15:55small, and then starting the full context. 15:57That's awesome. 15:57And that's all the time we have 15:58for today to talk about AI models. 16:00Shobhit, Marina, thanks for coming on. 16:02Happy holidays, and we'll talk 16:03next year about all this and more. 16:09For our next segment, I want to talk about agents 16:11in 2024, and to help me do that I'm gonna bring 16:14in Chris Hay, distinguished engineer, CTO customer 16:16transformation, and Maya Murad, who is the product 16:19manager for AI incubation. Maya, Chris, welcome 16:21back to the show. Well, so in 2024, uh, it was 16:24the year of the agents, agents, agents, agents.
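The contrast drawn here between the traditional cascaded pipeline (speech → text → model → speech) and direct voice-to-voice models can be illustrated with a toy sketch. The functions below are hypothetical placeholders for real ASR, LLM, and TTS components; the only point is where the prosody gets dropped:

```python
# Toy contrast between a cascaded speech pipeline and a direct
# speech-to-speech model. All components are hypothetical placeholders.

def transcribe(audio: dict) -> str:
    # ASR step: keeps the words, drops tone/prosody -- this is where
    # "a lot is lost in translation and transcription."
    return audio["words"]

def llm_reply(text: str) -> str:
    return f"reply to: {text}"

def synthesize(text: str) -> dict:
    # TTS step: has no idea how the user sounded, so delivery is flat.
    return {"words": text, "prosody": "neutral"}

def cascaded(audio: dict) -> dict:
    return synthesize(llm_reply(transcribe(audio)))

def speech_to_speech(audio: dict) -> dict:
    # One model sees words *and* prosody end to end, so the reply can
    # carry the nuance through.
    return {"words": llm_reply(audio["words"]), "prosody": audio["prosody"]}

utterance = {"words": "I'm fine.", "prosody": "sarcastic"}
print(cascaded(utterance)["prosody"])          # neutral: nuance lost
print(speech_to_speech(utterance)["prosody"])  # sarcastic: nuance kept
```

A real any-to-any model operates on audio tokens rather than dictionaries, but the failure mode of the cascade is the same: anything the transcript cannot represent never reaches the model.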
16:27I think it almost became a little bit of 16:28an in joke at MoE that if we had an episode 16:30that did not include agents, uh, that was 16:33a really big thing and an unusual thing. 16:35Um, and so I guess probably 16:37let's put it this way. 16:38And I guess maybe Chris, we'll, we'll 16:39throw it to you first is, um, agents: 16:42overhyped in 2024 or underhyped in 2024? 16:46Underhyped, not hyped enough. 16:47Agents are the world. Agents are everything. And 16:50in 2025, wow, we're gonna have super agents. 16:53That's what's coming in 25. 16:55Okay, um, and I guess Maya, I mean, looking 16:58back, um, I don't know if you'd agree with 16:59Chris or if there's like particular stories 17:01in 2024 that really stood out to you in 17:04the development of agents, if they're 17:05going to be as big as Chris says for 2025. 17:08So I definitely agree 2024, I would say 17:11it was a lot of talking about AI agents. 17:13Um, I'm excited to see more execution, 17:16and what I expect to see is more quality 17:17hurdles 17:18once we see more agents 17:20being pushed into production. 17:22I think we're just scratching 17:23the surface of what is needed. 17:25A trend that I'm starting to see 17:27right now this year is having more 17:29protocols and standardization efforts. 17:32So we saw that Meta is attempting to 17:35do that with the Llama Stack, Anthropic 17:37with their model context protocol, MCP. 17:41Um, so I think it's going to be this little 17:43battle for how do we standardize how LLMs 17:46interact with the external world, how 17:48agents, I think in the future it's going 17:49to be how agents interact with each other. 17:52Um, and I think this is where the next 17:54frontier is and where a lot of our 17:57efforts are going to be heading towards. 17:59Yeah, this felt like a big, like, 18:00almost like a preparation year.
18:01I was looking at all the news stories and 18:03I was like, is the biggest agent story 18:05of the year that Salesforce is hiring 18:07a lot of sales agents to sell agents? 18:09Like, it feels like, and then between 18:10that and the technical standards, it's 18:12almost kind of like, it's almost few and 18:14far between to be like, oh yeah, this 18:15was the killer agent release of the year. 18:18Um, and actually, in fact, a lot more prep. 18:20I don't know if Maya, you'd agree with that. 18:21It felt like it was the year of bracing 18:23for what's to come and all the different 18:26things we needed to consider and 18:27then who wanted to own that category. 18:29So it was really interesting that, for 18:31example, Meta went out early with, so 18:34the first iteration of Llama Stack was 18:37a little bit rough, but what they were 18:39trying to do with it, saying: we're in the 18:41long term, we're in this for the long term. 18:43And we want to help define those 18:45agent intercommunication protocols. 18:48And I have faith if, if that's a direction 18:49that Meta wants to take, I'm sure 18:51they're going to do a good job at it. 18:52But this is also signaling 18:53something interesting. 18:55Um, the last two years, it's, um, mainly been the 18:57field reacting to what OpenAI put out. So OpenAI 19:01put out their chat completions API 19:04and the whole ecosystem followed suit. 19:06And if you didn't have that exact API, your 19:09thing was much more difficult to consume. 19:11And now we're seeing a lot 19:13more players contending to, 19:15uh, be the one setting 19:16those standards and protocols. 19:18Yeah, for sure. 19:19And maybe, I guess, Chris, to turn it 19:20back to you, I mean, you're, I think 19:21you just used the phrase, agents are 19:23the world, which is a very bold claim. 19:25But, I mean, 2025, I mean, you know, let's 19:28say agents are a lot more popular, become a 19:30lot more prominent as a part of the landscape.
19:33You know, is it meta that's well positioned 19:35to win here or do you, do you have any 19:36predictions about what we're going to see 19:38in terms of who's going to be leading in the 19:39space versus maybe a little bit further behind? 19:41So I really like what Maya had to say on 19:44Anthropic and the model context protocol. 19:46I actually think that is going to be one of 19:49the biggest enablers for agents next year. 19:51And I think the problem that they've solved 19:53really well is allowing remote calling of tools. 19:57That's probably the biggest thing 19:58that they've solved there, right? 19:59So yeah. 20:00If we think about the enterprise for a second, 20:03you're not going to have agents that are sitting 20:05scouring the web, or they're going to be, 20:07uh, sitting downloading documents, whatever. 20:09It's going to be access 20:10to your enterprise tools. 20:12It's going to be things like accessing Slack, 20:14it's going to be accessing your, uh, Dropbox, or 20:17your box folders, or whatever, or your GitHub. 20:19And a lot of that is being standardized. 20:21But more importantly, you want to take your own 20:23data, and then expose your own APIs, and expose 20:26that in a way that agents can consume data. 20:28In a standardized way. 20:29And I think MCP has done a really 20:31good job of allowing you to remote 20:33call tools and then be able to chain 20:35them together with multiple servers. 20:37And I think that's going to be a big enabler. 20:39Now what's interesting and what 20:40they've done there is it is easy to 20:43hook up different LLMs, for example. 20:46So it's not tied to the cloud stack there. 20:48You can hook up any other model that you want. 20:52And. 20:52It's all tied in to function calling, 20:55which again, was a standard that 20:56was created by OpenAI in that sense. 20:58So, I like what you said there, Maya, 21:00about, you know, different providers 21:02coming in, and coming in an ecosystem. 
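The remote tool-calling pattern Chris describes here — servers exposing typed tools that a model can discover, call, and chain across multiple servers — can be sketched in miniature. This is an illustrative toy, not the actual MCP wire protocol; every class, field, and tool name below is invented for the sketch.

```python
import json

class ToolServer:
    """Toy server exposing enterprise tools behind a uniform list/call interface."""

    def __init__(self, name):
        self.name = name
        self._tools = {}

    def tool(self, name, description, parameters):
        # Register a function together with a JSON-schema-style description,
        # the shape an LLM's function-calling interface typically consumes.
        def register(fn):
            self._tools[name] = {"description": description,
                                 "parameters": parameters, "fn": fn}
            return fn
        return register

    def list_tools(self):
        # What a client fetches to tell the model which functions exist.
        return [{"name": n, "description": t["description"],
                 "parameters": t["parameters"]}
                for n, t in self._tools.items()]

    def call(self, name, arguments):
        return self._tools[name]["fn"](**arguments)

# A hypothetical "Slack" server; a real deployment might also expose
# GitHub, Dropbox, or internal APIs the same way.
slack = ToolServer("slack")

@slack.tool("search_messages", "Search Slack messages",
            {"type": "object", "properties": {"query": {"type": "string"}}})
def search_messages(query):
    return [f"msg matching {query!r}"]

# The client can chain several servers and dispatch a model-chosen call:
servers = {"slack": slack}
model_call = {"server": "slack", "name": "search_messages",
              "arguments": {"query": "Q4 roadmap"}}
result = servers[model_call["server"]].call(model_call["name"],
                                            model_call["arguments"])
print(json.dumps(result))
```

The design point is the uniform list/call surface: any model that emits a structured function call can drive any registered tool, which is the decoupling from a single LLM vendor that Chris highlights.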
21:05And I think that's what I'd like to 21:07see happen is no one company winning. 21:09And this is ecosystem of providers is going 21:11to push everything forward, and we're going to 21:14enter this world of the big agent marketplace. 21:17And that's why I say super agents 21:18are coming, because it's going to 21:20be this really big ecosystem that's 21:22going to start to emerge in 2025. 21:24And when you say super agent, 21:25what do you mean exactly? 21:26I just made up the term Tim, so. 21:29You heard it here first on MoE. 21:32A really good agent. 21:33That's a super 21:34coming from super intelligence 21:36or is this your definition or 21:37is it in the sense of like 21:38a Hollywood super agent? 21:39Actually, I, 21:41thanks for the save there, Maya, right? 21:43I'm going to define a super agent as 21:45the combination of the reasoning models. 21:48The inference time compute models are coming 21:50out just now combined with tool access. 21:53So therefore they're more powerful 21:54than the agents that you have today. 21:56So there you heard it first. 21:57You're right, Tim. 21:58That's what a super agent is. 22:00Very nice. 22:01Uh, Maya, you had a funny phrase when you 22:03were kind of giving your reaction to my first 22:05question, which is, you know, next year's agents 22:08are going to be everywhere, but it's also going 22:09to be the year we're going to discover, like, 22:11where the, the barriers or the limitations are, 22:14you know, basically this kind of the full force 22:16of agents going to become crashing onto reality. 22:18And I think we're going to learn a lot. 22:19And, you know, I guess one question I've 22:21been asking a lot of the panelists for this. 22:22This episode is, you know, what's underrated? 22:25What are people not thinking about that 22:26are, that's likely to be like a big 22:27hurdle, right, for agents going forwards? 22:30Number one answer, security. 22:32Super underrated. 
22:33I think it's already being reported 22:35that a lot of the existing players in 22:37the space are leaking sensitive data. 22:39And I, I, see agents as a way of 22:43exacerbating these inherent risks of LLMs. 22:46And I think we're under appreciating 22:49what it takes to get it right. 22:50I think the other thing is how to 22:52nail the right human interactions. 22:54When you have this ability to 22:55automate more complex tasks. 22:58What are the things that you still 22:59need to delegate to the human? 23:01How do you need to have a human in the loop? 23:03How do you avoid an overtrust issue? 23:05My team has done a number of user 23:07studies and when information is presented 23:10neatly by an actor that looks and seems 23:12intelligent, it's really easy to take 23:14everything surface level for granted. 23:17And I think there's a whole new paradigm of 23:19human computer interaction or maybe human 23:21agent interaction that will be unlocked. 23:23And I'm, I'm really excited for 23:24what's to come because I think this 23:26is inherently a creative exercise. 23:28How do we keep, retain our creativity, retain 23:31our ability to do critical thinking, and yet 23:34automate certain parts of processes to AI? 23:37Um, that will be a really 23:38interesting paradigm to get right. 23:40Yeah, I think that delegation problem is 23:42going to end up being super, super hard. 23:44Uh, I think, uh, yeah, it's very easy 23:46to be dependent on, Even people who 23:48sound smart when they're not actually. 23:50It's like no different, I guess, 23:52for, for agents, uh, as well. 23:54Um, well, I guess put it this way is, you 23:57know, it sounds like we're very interested. 23:59And I guess the big prediction 24:00from both the two of you seems to 24:01be, you know, agent marketplaces. 24:03Right. That's going to be maybe like the big 24:05thing we're going to see, um, next year. 
24:08You know, I think one of the big questions 24:09is also kind of like what's going to be the 24:10first most popular agent use case in some ways. 24:14Um, you know, you think 24:15about the big marketplace. 24:16There's a lot of things that agents could 24:17do that may be fun to do, but, you know, 24:20I think we're almost kind of looking like 24:21what's going going to be the, what's going 24:22to be the email of the agent world, right? 24:24Like what's going to be the 24:25slack of the agent world. 24:27Um, curious in both of your experiences, 24:28you know, talking to customers and 24:30stuff with their particular things, 24:31like in their hopes and dreams that 24:33they really want to see out of agents. 24:34And if there's kind of anything recurring there, 24:35that's worth it for our listeners to know. 24:37I think from my perspective, Tim, and that 24:40marketplace, I think there is some obvious ones. 24:42Like, Translation, I think, if I'm truly honest, 24:45like language models today, I don't think 24:47they've really nailed translation so well. 24:50There's some models that do certain 24:51languages really well, but then, um, if you 24:54think of the more esoteric languages, for 24:55example, um, the less popular ones, then 24:58the, the large models aren't getting that. 25:00And then it's going to be 25:01specialized models that have been 25:03trained in that specific language. 25:05So, um, I think that's probably a real 25:08opportunity for some of these smaller 25:09language models combined with an 25:11agent to offer translation services. 25:13And again, add that into domain services. 25:15So things like legal, which is 25:16something you know very well, Tim, then 25:18I think that will probably be a big. 25:21piece of that marketplace, but I'm hoping that 25:24it won't just be about these individual agents. 
25:27I think any piece of information, it 25:29could be sports scores, it could be golf 25:32scores, it could be information about 25:34play, it could be absolutely anything. 25:35And one of the things, and this is my 25:37next prediction for 2025, is I think we're 25:40going to get a shift in the world wide web. 25:43So today, HTML, et cetera, is the dominant, 25:46uh, markup language of the internet. 25:49That's not really well designed for 25:52LLMs and not well designed for agents. 25:55So I wonder if in order for the agents to 25:58exist, not just having the marketplaces, 26:00but having the way to expose that data, 26:03we talked about MCP earlier, I wonder if 26:05you're going to start to see new types of 26:08pages appearing where the content 26:11is optimized towards agents, for 26:13consumption by agents, and resources that 26:16they expose, as opposed to necessarily humans. 26:19So I'm, I'm kind of predicting we're 26:21going to start to see this shift in the 26:22web to a kind of, uh, dare I say a Web 4.0, I'm trying to avoid the term Web 3.0, where we have content that is 26:32specifically designed for agent consumption. 26:34Yeah, it seems to be almost the prediction 26:36that's kind of implicit in what both of 26:38you are saying is that, you know, there'll be 26:39so much interest in the promise of agents 26:41that like almost we're going to be kind 26:43of reconstructing the web to make it safe 26:45for agents or make it work for agents. 26:47And I guess a lot of the kind of stack 26:49and a lot of the kind of interoperability 26:51stuff that's being built is like 26:52an attempt to do that in some ways. 26:55Um, I don't know, Maya, do you agree with that? 26:57You think that's kind of like going to 26:58be the future is like we'll have a, you 27:00know, agent markup language basically. 27:02Uh, A.T.M.L.
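A purely speculative illustration of the "content designed for agent consumption" idea floated here: instead of layout-oriented HTML, a site might publish a machine-readable manifest of its data and the actions an agent may take. Every field name and endpoint below is invented for the sketch.

```python
import json

# Hypothetical agent-facing page manifest: data plus permitted actions,
# parsed directly rather than scraped from rendered markup.
agent_page = {
    "entity": "Example Golf Club",          # invented example entity
    "data": {"latest_scores": [{"player": "A. Smith", "round": 68}]},
    "actions": [
        {"name": "book_tee_time",
         "parameters": {"date": "ISO-8601 date", "players": "integer 1-4"},
         "endpoint": "/api/tee-times"}       # hypothetical endpoint
    ],
}

# An agent would enumerate available actions instead of parsing HTML forms:
available = [a["name"] for a in agent_page["actions"]]
print(json.dumps(available))
```

Whether this ends up as JSON manifests, extensions to existing schema vocabularies, or something protocol-level like MCP resources is exactly the open question the speakers are debating.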
27:03I think a lot of the interesting use cases 27:05will be unlocked when different agents 27:07that were built by different providers 27:09that are owned by different organizations 27:12are able to interact with each other. 27:13And like, how do you 27:14establish a safety protocol? 27:16How are you able to do that productively? 27:18Like the promise here is like, how do we 27:20break out of all these silos of different 27:22systems and having to manually architect 27:25how each one speaks to each other? 27:27And can we get to, uh, a 27:28universal interaction protocol? 27:31This is really an interesting promise. 27:33I don't know if we will fully unlock it 27:34next year, but a lot of different actors 27:36would like to go in this direction. 27:39And there's simple things that 27:40we should nail before that. 27:41So I know, like, with software engineering tasks, 27:43there's a lot of investment going into that space. 27:46I still think no one has nailed like the average 27:49business user. The average business user has 27:51to use, I don't know, a dozen different 27:52tools on their computer, on their machine. 27:54None of them speaks with the others. 27:56Each one has its own onboarding experience. 27:59So I see a lot of opportunity to flatten 28:02out these complex experiences and make 28:04them much more dynamic and integrated. 28:05And this is the true promise of this technology. 28:08And it's the ultimate dream, I guess. 28:09I mean, 28:10because the world you're describing is 28:11almost like the agent becomes your entire 28:12interface for all these applications, like 28:14they stay independent, but like, yeah, the 28:18operating system in the future really is the 28:19agent that's doing things on your behalf. 28:21It's natural language. 28:22I was like, 28:23LLMs changed our perception of how 28:25we interact with the digital world.
28:27We expect everything to be in natural language, 28:29or you could do a form and then there's an 28:32option to do natural language interaction. 28:34And I think that expectation is gonna widen. 28:38Yeah, no, I think that makes a ton of sense. 28:40I guess maybe the final turn that we 28:41should talk a little bit about is like on 28:43the engineering and coding side, right? 28:45I was thinking this year that like, The coding 28:48assistance has gotten really, really good. 28:50But the dream is that you eventually have 28:52agents that are like, I'm really envisioning 28:54a software code base that looks like this. 28:57And it's able to kind of like build 28:58and interoperate on all parts of 29:00that, and all parts of your code base. 29:02What do we think are the prospects for that 29:04kind of automation and agentic behavior? 29:07I'm going to kick off here, and I'm 29:09going to be controversial as always. 29:11And here is something for people to think 29:14about, which is Programming languages 29:17today are designed for human beings, right? 29:20And if you think about things like 29:22loops, while loops, for loops, etc. 29:24There you have however many versions 29:26and the same with conditionals, 29:28if statements, blah, blah, blah. 29:29But you know what? 29:31When you get down to an assembly 29:32level, none of that exists, right? 29:34It's all back to branches and, you 29:38know, uh, and jump statements, etc. 29:40And therefore We are in an agentic 29:44world, we're getting them to program in 29:47a language that is designed for humans. 29:49And the big challenge, I would say, 29:50that I think is going to happen over 29:52the next few years is that you're going 29:54to have a more agentic native language. 29:56Something that is more designed for LLMs 29:59and therefore a less of a syntactic sugar 30:01that you need to satisfy humans there. 30:03So, I think there's going to be an 30:05evolution in programming coming. 
30:07Um, and, and you can see 30:10it already today, right? 30:11The LLMs are already generating, uh, you 30:14know, here's another Fibonacci function. 30:15I don't, I don't need another 30:17Fibonacci function in my life, right? 30:18We got those. 30:20Exactly. 30:21So I then think there'll be, like, the equivalent 30:24of kind of NPM or something like that, where 30:26you have a big massive AI library where 30:28you can pull the functions that you need. 30:30So I think, 30:31like your AI operating system, I think we're 30:34going to get AI programming languages 30:36and libraries that are going to be a 30:37little bit more native, and then that's 30:38going to help the development of coding. 30:40So I think that's an interesting term. 30:42Will it be 2025? 30:43Maybe, maybe it's going to be 26, 30:45but I think that's where we're going. 30:46With the current technology we have, I'm like 30:48super impressed with what I've seen with Replit, 30:50with the ability to stand 30:51up like full-stack applications. 30:53On the project I'm working on with Bee, 30:55it's been such an interesting paradigm, 30:57like chat to build applications. 30:59Um, I, I really see the ability to 31:01create digital interfaces and code 31:03bases being democratized in a way 31:05that it hasn't been before. 31:08Purely powered by the current 31:10technology of agents that we have. 31:11I just think there's this like last mile 31:13problem to nail, and I think next year 31:15this is going to blow up in a major way. 31:17Nice. 31:18Well, you heard it here first. 31:19That's all the time that we have for agents. 31:21Uh, that was a lot to cover 31:22in a short period of time. 31:23Chris, Maya, thanks for coming on 31:25the show and we'll see you next year.
31:31I want to move us on to talk about the 31:32hardware that powered AI in 2024 and I couldn't 31:37have picked a better duo of people to help 31:39out in terms of explaining those, uh, than 31:41the two that I have online with me today. 31:44Khaoutar El Maghraoui is a Principal 31:46Research Scientist, AI Engineering, AI 31:47Hardware Center, and Volkmar Uhlig is Vice 31:50President, AI Infrastructure Portfolio Lead. 31:52Welcome to the show. 31:53Volkmar, maybe I'll turn to you first. 31:54So, you know, as we talk about hardware 31:56in AI, it's almost become synonymous with 31:59saying that we want to talk about NVIDIA. 32:01Um, and, uh, I'm curious about what you thought 32:03the biggest stories were this year from NVIDIA. 32:06I mean, the one that strikes me is the 32:07announcement of the upcoming GB200. 32:10Uh, but curious if there's other things on 32:11radar for you as we kind of think about, 32:13you know, what were the big stories in 2024? 32:16NVIDIA 32:17made a big splash with the GB200. 32:19Um, and I think we are seeing a big 32:23shift towards more integrated systems, 32:25in particular on the training side. 32:27Very large, like rack-scale computers now. 32:30Um, liquid cooling is coming. 32:33So all the things we've seen over the 32:35years of how to cram more compute into 32:39a smaller form factor, you know, making it 32:41faster, better networks behind it, etc. 32:45And I think NVIDIA is really trying 32:47to push hard on staying the leader. 32:50Um, and then we are seeing upgrades, 32:54which are kind of a reflection of, 32:56um, what models now look like. 32:59So we have 70 billion parameter models. 33:02Um, and you know, 70 billion parameters, 33:05even if you quantize, is 70 gigabytes at 8-bit. 33:09It's 140 gigabytes at 16-bit. 33:11Uh, now you don't want to 33:13have to buy full cards. 33:15So we see an increase in 33:16memory capacity across the board 33:18of all the, uh, the accelerators.
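The memory arithmetic Volkmar cites is a quick back-of-envelope calculation: roughly one byte per parameter at 8-bit quantization and two bytes at 16-bit, for the weights alone (KV cache and activations add more on top). A minimal sketch:

```python
# Back-of-envelope weight memory for a large language model.
# Decimal gigabytes; weights only, excluding KV cache and activations.

def weight_memory_gb(n_params, bits_per_param):
    """Memory needed to hold the weights at a given precision."""
    return n_params * bits_per_param / 8 / 1e9

params = 70e9  # a 70-billion-parameter model, as in the discussion

print(weight_memory_gb(params, 8))   # 70.0  -> 70 GB at 8-bit
print(weight_memory_gb(params, 16))  # 140.0 -> 140 GB at 16-bit
```

This is why accelerator memory capacity keeps climbing: at 16-bit, a single 70B model already exceeds the HBM of any single card shipping in 2024, forcing multi-card deployments unless you quantize.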
33:21Uh, but not only NVIDIA is here, 33:23but we also see new entrants or the, 33:26the other players in the market. 33:28AMD is announcing a pretty 33:30good roadmap of their products. 33:32All with very, very large 33:34memory capacities and memory bandwidth to 33:36address those large language models and fit more 33:39model into less space or less compute, and, 33:44uh, Intel is playing in the market as well. 33:46And then you have a handful of startups, 33:50uh, where we also saw, you know, really 33:52interesting technologies coming onto the market. 33:54So if you look at, uh, Cerebras, that's 33:58wafer-scale, uh, AI, which, you know, like, 34:01a year ago, they were talking about it; now 34:03you can actually use it as a cloud service. 34:06You have Groq being a player, there 34:08are other companies coming up, there's 34:10d-Matrix, which will have an adapter 34:12coming out at the beginning of next year. 34:14Um, and so, yeah, I think 34:18there's a good set of players in the market. 34:21And then there are new entrants, right? 34:23We just saw the, the Broadcom announcement, 34:25um, pretty much, I think it was last week, 34:28um, with very large, you know, revenue 34:31targets, uh, and the relationship with Apple, 34:34uh, and then Qualcomm is also in the game 34:37and has a chip architecture coming, you 34:39know, and some of them are available 34:42and there's a good roadmap for them. 34:43So I think the market is not only NVIDIA 34:46anymore, which is, I think, good for the 34:48industry, and it's moving extremely fast. 34:51So, and we have, we see training systems 34:54there, but there's an increasing, 34:56um, focus on inferencing because 34:59from my perspective, it's kind 35:00of where the money will be made. 35:01Yeah, for sure. 35:02And I guess, Khaoutar, I don't know if you 35:03want to talk a little bit about that bit.
35:05I wanted to make sure that we did talk a 35:06little bit about kind of the big trends 35:08in inferencing this year, because it feels 35:09like that was actually a big, um, theme of 35:12kind of how this market is developing out. 35:14And, uh, if you want to speak a little bit to 35:15that and where you think things went in 2024. 35:19Yeah, so of course, there's a lot of, a lot 35:21happening, especially around, um, inference 35:24engines and optimizing inference engines. 35:26Uh, a lot of hardware-software co-design is 35:28also, uh, you know, playing a key role in that. 35:32So, uh, we see technologies 35:34like vLLM, for example. 35:36Uh, we also see things like, um, 35:39what they're doing with all the 35:42stuff around KV cache optimizations, the 35:45batching for inference optimizations. 35:48So a lot of that, a lot of innovation is 35:51happening in open source around building 35:54and scaling inferencing, especially 35:57focusing on large language models. 35:59But a lot of these optimizations we see, 36:01they're not only specific to LLMs, they can 36:03be also extended to other, to other models. 36:06So, um, so a lot of development that's happening 36:10around vLLM, uh, there is work, you know, even 36:13at IBM Research and others contributing to 36:16open source to basically bring a 36:19lot of these co-optimizations, um, in terms of 36:23scheduling, in terms of batching, in terms of 36:26figuring out how to best basically colocate 36:29all of these, uh, inference requests and get the 36:32hardware to, uh, run them efficiently. 36:35Yeah, absolutely. 36:36Volkmar, do you want to give us 36:37a little bit of a peek into 2025? 36:38I mean, it kind of sounds like with this 36:40market becoming increasingly crowded, I think 36:42everybody's coming after NVIDIA's crown here. 36:45You know, what do you expect to happen in 2025? 36:47Does NVIDIA largely still stay in the lead?
36:49Or do we end in December 2025 with, you know, 36:52the market becoming a lot more divided and 36:54diversified than it has been traditionally, 36:56particularly on the training side? 36:58So I think the training side will be, 37:00that's my prediction, will be still 37:04very strongly in the hands of NVIDIA. 37:06Um, I think AMD and Intel will 37:10try to break into that market. 37:12Uh, but I think that will probably 37:14be more in the 2026 27 timeframe. 37:18Uh, the reason why I'm saying this is, 37:20um, the architecture you need to build, to 37:24build a really successful training system, 37:26it's not the GPU, it's, it's a system. 37:29So you need. 37:30Uh, really good, uh, low latency networking. 37:33You need to have a reliability problem. 37:35There's a, like, a strong push to actually 37:38move compute into the fabric, um, to 37:41further cut down the latency and more 37:43efficiently utilize, uh, the hardware. 37:46And, uh, NVIDIA, with their acquisition 37:49of Mellanox, effectively bought the number 37:52one network vendor for high performance 37:54computing, which, you know, training is. 37:57And so there is a, there's a, you 37:59know, a bunch of consortiums coming up. 38:02There's Ultra Ethernet, um, where, you 38:05know, they're trying to get to a similar 38:07capabilities what you have with InfiniBand. 38:08And InfiniBand, despite that it's an 38:11open standard, there's pretty much 38:12only one vendor on the planet, which is 38:13Mellanox, which is now owned by NVIDIA. 38:15So I think NVIDIA has a good, 38:18uh, you know, lock on that. 
38:20side of the market, and therefore a lot of the, 38:23of the investments where other people are, are 38:25playing is more in the inferencing market, which 38:28is much easier to enter, you know, because you 38:30intrinsically not only have NVIDIA systems, 38:33like you don't have NVIDIA on cell phones, you 38:34don't have NVIDIA on the edge, and so there 38:37is a, and the software investment you need to 38:39do on inferencing is, is much lower than what 38:42you have on training side, so I think training 38:44is, is in, in, um, Very safe hands for NVIDIA. 38:47So unlocked, yeah. 38:49But I think there is now enough with 38:51Gaudi 3 coming online, which has 38:54integrated Ethernet, uh, you know, the, 38:56and what AMD is putting on the market. 38:58I think there will be, it will 39:00be a slow creep into that market. 39:02And I think, you know, in 2026, we will probably 39:05see, um, that, you know, there is a major break 39:07in into that market, and NVIDIA loses that. 39:10That very unique position it has right now. 39:12Yeah. It's going to be a big transition. 39:14Khaoutar, do you agree with 39:15that for the 2025 prediction? 39:16Yeah, I agree 39:16with that. 39:18Of course, there's a rising competition 39:19in AI hardware, like Volkmar mentioned, 39:22companies like AMD, Intel, and startups 39:24like Groq and Graphcore, they're 39:26developing competitive hardware. 39:28IBM also is developing, uh, competitive 39:30hardware for training and inference. 39:34The problem with the NVIDIA GPUs is 39:36also the cost and the power efficiency. 39:38The NVIDIA GPUs are very expensive and 39:40they're power hungry, making them less 39:42attractive, especially for the edge 39:44AI and the cost sensitive deployments. 39:46So the competitors like AWS Inferentia, 39:50IPUs, they offer specialized hardware 39:54that's often cheaper and more energy 39:56efficient for certain applications. 39:58So. 
39:59And I think, you know, the open standards, for 40:02example, like OpenAI Triton, um, and 40:05ONNX, and, you know, these new frameworks, 40:08they're also working a lot on reducing the 40:11reliance on NVIDIA's proprietary ecosystem, 40:15which makes it really easy for 40:17competitors to gain also some traction here. 40:20And if we look at the inference-specific 40:23hardware, there is, you know, this rise, like I 40:26mentioned vLLM before, of dedicated inference 40:29engines like vLLM, SGLang, Triton; they 40:33highlight the potential for non-NVIDIA hardware. 40:36So they're opening up the door for 40:37the competition, uh, easy entry, 40:40and allow them also 40:42to excel in inference scenarios, 40:44especially for large language models. 40:46So, uh, we'll see this widespread emergence 40:50of edge inference solutions powered by ASICs. 40:55Uh, and, and I think this is 40:57challenging NVIDIA's role in this 40:59rapidly growing edge AI market. 41:02Yeah, and I think the edge is, I think is the 41:03last bit I wanted to make sure that we touch 41:05on before we move on to the next segment. 41:06Um, you know, Volkmar, it seems to me 41:08that obviously one of the big stories 41:09was Apple moving into Apple Intelligence 41:12and making sure that all their devices, you 41:14know, essentially have AI chips on them. 41:16Um, I assume that's going to continue into 41:182025, but I'm curious for our listeners that 41:20are less involved in watching the hardware 41:22space day to day, if there's any trends that 41:24you think are worth it for people to pay 41:25attention to as we get into the next 12 months. 41:28I think the Apple model is, uh, is 41:30very elegant, in particular when you 41:32are in a power-constrained environment. 41:34Um, so, you know, whatever you can 41:36do in that power-constrained environment 41:37with less accuracy you do on device. 41:40And then whenever you need 41:41more, you go somewhere else.
41:43Uh, I think also the, the Apple 41:45uh, architecture, that they are running in the cloud 41:47on the same silicon as they are running, 41:51you know, on their phone. 41:53It's a, it's a very interesting architecture 41:56because it simplifies it for the developer. 41:58It simplifies it in deployment. And so I 42:01think that we will see more of that type 42:04of separation, and I think we will see more 42:07compute happening on edge devices, and we're 42:11going now, as silicon matures, and you know, 42:14there are more choices and you don't 42:16need a high-powered card anymore, and the 42:19silicon gets more and more specialized for 42:21that, you know, simple matrix multiply, I 42:24think we will see pretty much every chip 42:27which leaves a factory will effectively 42:29contain AI capabilities in one form or another. 42:32And then it's really this hybrid 42:35architecture of on-device and off-device 42:38processing, which allows you to have, you know, 42:41silicon live for a long period of time. 42:43But if you're on the edge, you know, and 42:45the edge is not only a phone, it could be an 42:47industrial device, where, you 42:49know, your life cycle is five to ten years. 42:52You don't want to go and every two years 42:53have to swap out the chip just because 42:54you want to train another network. 42:56And so I think the architecture Apple put 42:58out will be, uh, more solidified and we 43:01will see, you know, software ecosystems 43:03being built around that. 43:04Yeah, that's great. 43:06Well, Khaoutar, I'll let 43:06you have the last word here. 43:07Um, I've been asking most panelists as they've 43:09been coming on, what is the most underrated 43:12thing, um, in this particular domain? 43:14So for AI hardware, are there things 43:16that people are not paying attention to? 43:18Um, you know, there's a lot of 43:19hype in the AI hardware space.
43:21So I'm curious if there's any more 43:22subtle trends that you think are 43:23important to pay attention to? 43:24Yeah, that's a, that's a great question. 43:26So I think, um, there is a lot of work 43:29around real-time compute optimizations. 43:32Um, technologies, for example, like 43:34test-time compute, uh, which 43:37allows AI models to allocate additional 43:40computational resources during inference. 43:42This is something that we 43:43saw with the OpenAI o1 model. 43:46It's really, I think it sets some 43:47precedent here and it allows the models 43:50to break down these complex problems 43:53effectively and mimic also kind of 43:55what we're doing in human reasoning. 43:56And it also has implications on the 44:00way we design these models and also the 44:02way the models interact with the hardware. 44:04So it's kind of pushing for more hardware 44:06software co-design, um, in this context 44:09of processing during inference. 44:11I think another trend I see is 44:13hardware accessibility, um, for all. 44:16I think when we see the Llama 3 series, which 44:19illustrates how new hardware ecosystems are 44:21evolving for both high-end research models, 44:25but also for consumer-grade applications. 44:28So the Llama models, they release, you know, 44:31multiple versions, the 400, the 8 and so on. 44:34So that's also an important 44:35trend that we're seeing. 44:36So we can kind of bridge the gap between 44:39high-end data centers, where you have access to 44:44this high-end compute and infrastructure, 44:47which is not accessible to everyone, and everything else. 44:49So pushing towards that 44:50would be really important. 44:52The other thing is the open 44:53source and the enterprise synergy.
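Coming back to the test-time-compute trend mentioned above: spending extra compute at inference can take several forms, and one simple version is self-consistency, where you sample several candidate answers and keep the majority vote. A minimal sketch with a deterministic stand-in for the model (a real system would draw samples from an LLM at temperature > 0):

```python
from collections import Counter

def stand_in_model(question, sample_index):
    """Deterministic stand-in for a noisy LLM sampler: wrong on every
    third sample, correct otherwise. Purely illustrative."""
    return "wrong" if sample_index % 3 == 0 else "42"

def self_consistency(question, n_samples):
    """Sample n candidate answers and return the majority vote.
    More samples = more inference compute = a more reliable answer."""
    votes = Counter(stand_in_model(question, i) for i in range(n_samples))
    answer, count = votes.most_common(1)[0]
    return answer, count

answer, count = self_consistency("What is 6 * 7?", n_samples=9)
print(answer, count)  # "42" wins with 6 of 9 votes
```

Models like o1 go further, generating long internal reasoning traces rather than just voting over samples, but the hardware implication is the same one Khaoutar raises: inference cost becomes variable and much larger per query, which reshapes accelerator and scheduling design.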
44:56IBM released Granite 3, which I think 44:59is a great step in the right direction, 45:01which also highlights the importance of 45:04open source AI and its ability to maximize 45:07the performance for enterprise hardware. 45:10But there are still 45:11hardware design challenges. 45:12For example, what we see with NVIDIA's, 45:15uh, Blackwell GPUs and the 45:18issues that they have around thermal 45:20management and server architectures. 45:22So, um, this hardware, you know, needs to scale 45:25to meet demands for these next-gen AI models. 45:29Power efficiency is becoming critical. 45:32So, um, so I think if I were to sum up what's 45:36going on around these trends, I think the 45:40year 2024 showcased the importance of hardware 45:44software co-design and the industry's pivot 45:47also towards specialized AI accelerators, 45:50open source adoption, and real-time compute 45:53innovations. They are really very important, and are 45:56setting the stage for further breakthroughs. 45:58Yeah, that's a great note to end on. 45:59Well, that's all the time 46:00that we have for hardware. 46:01Uh, Khaoutar, Volkmar, thanks for joining 46:03us, uh, and for all your help in 2024, uh, 46:06explaining the kind of world of hardware and, 46:08uh, we'll have to have you back on in 2025. 46:15Finally, to round out our picture of 46:172024, we need to talk about the product 46:19releases that stunned us, amazed us 46:21and gave us something to think about. 46:23To help me do that are Kate Soule, Director 46:25Technical Product Management for Granite, and 46:27Kush Varshney, IBM Fellow on AI Governance. 46:30Kate, maybe I'll turn it to you first. 46:32Obviously, you know, the schedule was crazy 46:34this year in terms of product releases. 46:36It felt like every other 46:37week there was something. 46:38But I guess looking back on the last 12 46:40months, I'm kind of curious, like, what did 46:41you think were the biggest things, right?
[46:43] The stories we'll look back on from 2024 and say, yeah, this was the year that happened.

[46:48] As the Director of Technical Product Management for Granite, I feel like I have to celebrate what our team at IBM accomplished with the launch of the Granite 3.0 model family: Apache 2.0-licensed models that are transparent, with ethical sourcing of the data that went into them, all of which we detail online in our report. I'm really excited that we can continue that commitment to open-source AI and create state-of-the-art language models in the 2-to-8-billion-parameter size range that we can put out under permissive terms, for our customers and for the open-source community to leverage more broadly.

Looking outside of IBM, I think the release of the GPT-4o family of models and products was really exciting. It launched a new wave of interest in how we can continue to improve performance without just spending more money on training compute. I think that is ushering in the next wave we're going to see in 2025: how can we spend more at inference time, letting models, and the products that use them, run more advanced computations and inference calls to improve performance, beyond "let's throw more money at training, let's throw more data, let's scale, scale, scale." More broadly, that's something I was pretty excited to see.

[48:13] Yeah, we should definitely talk about both of those themes.
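The inference-time scaling idea Kate describes here (and the test-time compute trend from the hardware segment) can be sketched as best-of-N sampling: spend extra compute by drawing several attempts and keeping the one a scorer prefers. Everything below is an invented toy stand-in; the simulated `solve_once` sampler and its confidence score are illustrations, not how o1 or any product actually works.

```python
import random

def solve_once(problem, rng):
    # Stand-in for one sampled model attempt; a real system would
    # sample a full reasoning chain from an LLM here.
    noise = rng.gauss(0, 1.0)
    answer = problem["true_answer"] + round(noise)
    confidence = 1.0 / (1.0 + abs(noise))  # fake self-assessed score
    return answer, confidence

def solve_with_budget(problem, n_samples, seed=0):
    # Best-of-N test-time compute: draw n_samples attempts and keep
    # the one the scorer likes best. More samples = more inference cost.
    rng = random.Random(seed)
    attempts = [solve_once(problem, rng) for _ in range(n_samples)]
    best_answer, _ = max(attempts, key=lambda a: a[1])
    return best_answer

problem = {"true_answer": 42}
cheap = solve_with_budget(problem, n_samples=1)    # one attempt
better = solve_with_budget(problem, n_samples=32)  # 32x the compute
```

In real systems the scorer is a learned verifier or reward model, and `n_samples` is exactly the knob being turned when people talk about "spending more at inference time" instead of scaling up training.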
[48:15] On the first one, 2024 really was the attack of the open source. For a moment there it felt like the closed-source models would win the day, and the explosion of activity in open source has been really exciting to see. And the second one is the "play smarter, not harder" world, where a bunch of new techniques are playing out in a lot of places.

Kush, maybe we'll start with that first theme. In the open-source world, of course, this is also the year of Llama 3; there's just been a lot happening in open-source land. Looking back at either of the themes Kate pointed out, the open-source side or the different methods for doing AI, are there things you'd want our listeners to remember from 2024?

[49:03] Yeah, I think your phrasing of it, "open source returns," or "the return of," whatever you want to call it, is the right way to frame it. When we talk to customers across the board, we're realizing that 2023 was all about POCs and that sort of thing, getting people within their own companies excited that generative AI has a role to play. But over time they realized that they actually need to worry about copyrighted data, other governance issues, cost, just how to make these things operational.
[49:47] And I think watsonx, the IBM product, really shined there, and obviously the Granite models as well. So the science experiment we had in 2023 got put to real use this year, and going into next year it's all about being as serious as possible, I would say.

[50:11] Yeah, for sure. And now that you're on for this segment, it's a good time to ask: you obviously spend a lot of time thinking about AI governance, and there were a bunch of stories in that vein this year. Are there any you'd want to call out for 2024?

[50:26] Yeah, I think just the fact that the whole AI safety world convened, right? We had the Korea summit, and we had the summit in San Francisco in November. This is now the topic; I think it's the thing we need to overcome, because just having generative AI out there without safety guardrails and without governance is dangerous. The promise of return on investment is only a promise until you can get over the hump of the governance issues.

[51:04] Yeah, for sure. Do you have any predictions for where we go in 2025 with all that? I'm detecting a theme here, which is that 2024 almost set a lot of things up, and in 2025 we'll see how they play out, both in open source and in governance, it seems.

[51:18] Yeah, I think the prediction is, well, the earlier segment in the show was about agentic AI, so I think that's really going to explode as well.
[51:27] And I think agents are going to be what drives governance back down into other use cases as well, because when you have autonomous agents, the governance, the trust, is extremely important. You have very little control over what these things might do. The extra inference cycles that Kate was mentioning are going to be, I think, mainly for the purpose of governance: making these things self-reflect a little bit, maybe think twice about the answers they're putting out there, and so forth. So you're going to have more tools for governing the agents as well.

The Granite Guardian 3.1 release that just happened actually has a function-calling hallucination detector in it. Tool calling is one of the things agents do, right? As part of the LLM's operation they will call some other tool, some other agent, some other function, and if that call itself is hallucinated, the parameters, the types of the parameters, the function names, all of these things can go wrong. So we have ways of detecting issues there.

[52:39] Kush, I'm curious: you said the inference runtime is going to be used more for governance and self-reflection. But I think you even shared a paper recently about how this also opens a whole can of worms of other risks and potential security issues, right? When the models are running all these loops offline, people aren't really able to observe what's going on in them.

[53:04] Yeah, this whole self-reflection, you can call it metacognition, you can call it wisdom.
[53:11] I think these are going to be part of what happens. But yes, anytime you have extra stuff happening, more loops, you have more opportunities, more surface area for attacks, right? So that is certainly going to be part of it. But I have hope that, just like in other sorts of systems, you can have better control when you have more opportunity to affect what happens.

[53:40] Yeah, and I think that ends up being critical. It's also a pivot I was going to mention, to throw it back to you, Kate: open source came up so quickly in 2024 that it feels like 2025 might finally be the year we reach parity, or open source even goes past closed source in some sense. And this is happening not just because the technology is getting better, but also, as Kush is saying, because our ability to build components that ensure safety when deploying open-source models is getting better, right? In the past it was, well, we have to rely on closed source because they really understand how to do alignment and security and safety.

[54:15] There are a lot of scare tactics out there.

[54:17] That's right. Yeah, exactly. Only the big model providers have the budget, or the expertise, to be able to look at how to do this safely.

[54:25] That's right. I think we're finally chipping away at that enough. We're seeing Meta, for example, doing a phenomenal job releasing very large models with excellent safety alignment, and showing that you can do this out in the open. It does not need to happen behind a black curtain, so to speak.
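As a concrete illustration of the function-calling failure modes Kush mentioned earlier in this segment (hallucinated function names, hallucinated or mistyped parameters), here is a minimal rule-based check of a model-proposed tool call against a registry of declared tools. The registry and schema format are invented for this sketch; Granite Guardian's actual detector is a trained guard model, not a rule checker like this.

```python
# Hypothetical tool registry: each tool declares its parameter names
# and expected Python types. (Format invented for illustration.)
TOOLS = {
    "get_weather": {"city": str, "units": str},
    "send_email": {"to": str, "subject": str, "body": str},
}

def validate_call(name, args):
    # Return a list of detected issues with a proposed call; an empty
    # list means the call at least matches a declared tool's schema.
    issues = []
    schema = TOOLS.get(name)
    if schema is None:
        # The model invented a function that was never registered.
        return [f"hallucinated function name: {name!r}"]
    for param, value in args.items():
        if param not in schema:
            issues.append(f"hallucinated parameter: {param!r}")
        elif not isinstance(value, schema[param]):
            issues.append(f"wrong type for {param!r}")
    for param in schema:
        if param not in args:
            issues.append(f"missing parameter: {param!r}")
    return issues
```

A guard model does this kind of check probabilistically over free-form generations; the rules above only make the categories of failure explicit, e.g. `validate_call("get_wether", {})` flags a hallucinated function name.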
[54:45] Yeah, for sure. Is that a prediction for 2025? That we can have our cake and eat it too, that it can be open and it can also be safe?

[54:51] Absolutely. Yeah.

[54:53] That's exciting. Do you have any open-source predictions going into the next 12 months? Where do we go from here?

[54:58] You know, I guess more Granite, even better Granite. I think next year is really going to be focused a little higher up the stack, on top of the models, co-optimizing models with the developer frameworks in which they're executed. We saw the release of Llama Stack in 2024; I think we're going to see that evolve wildly as it starts to mature, along with other similar capabilities and stacks being developed. I think we've also all accepted the OpenAI-endpoint way of working with models as the incumbent way to operate. But there are probably other ways we can innovate and improve now that we've been around the block a few times. So I think we'll start to see a lot of open-source innovation a little higher up the stack, particularly from model providers asking how to further improve performance. It goes hand in hand: if you're trying to optimize and innovate at inference time, you need a stack that can handle that. That's where I think a lot of the development is going to happen.

[56:02] Yeah, for sure. I feel like there's so much we've just taken as a given, in some ways just because that's where things got started. And it's easy to forget, given that there's so much news, that this is all very fresh.
[56:13] Just a few years ago it was basically nonexistent. So here's one I'd put before this group, particularly because we're talking about product releases. This year on Mixture of Experts we've talked a lot about how chat is just the interface we started with, because ChatGPT was so successful. But there's no reason that has to be the way we interact with these systems going forward. I'm curious whether either of you have predictions about the interface. Do we start interacting with these systems in a way that's pretty different from what we've gotten used to?

[56:44] Yeah, I think co-creativity, co-creation, is going to become a bigger thing: having multiple people involved. I know some canvas-style products have come out this year as well, but I think it's just going to grow. And let me give a brief shout-out to my brother. He has a startup called KOCREE, K-O-C-R-E-E, and I just had to get that in.

[57:08] Exactly, exactly.

[57:09] It's all about co-creating music for people with AI, but also helping people and society with their well-being, because when you create with others, it's actually a positive experience as well. So I think a shift in focus, maybe toward human flourishing and human well-being, toward getting people to really work together with a kind of open-endedness, might be something that emerges.

[57:42] We've got a few minutes left on this segment.
[57:45] Is there anything that folks aren't talking about? That's one thing: particularly in AI, everyone is always excited about the latest model release. I'm always trying to see around corners, and you both think about this space so deeply. What's under-hyped, maybe underrated, at the moment, and really deserves more attention going into next year?

[58:08] I think there's going to be a tremendous opportunity, and I really hope this takes off, around modular components for building with LLMs. For example, there's work going on toward the point where you could fine-tune a LoRA adapter, a bucket of weights fine-tuned for your task that sits on top of the model. Right now adapters have to be tailored to the exact model you're going to deploy, and when a new version comes out, you have to retune them. But how do we create versions of these, and there's interesting research here, that are universal and can be applied anywhere? That would create really nice modular components that you could ship, keep in a catalog, choose from, provision live, and swap in and out at inference time. I think there's also the architecture side: we've all heard of the seminal mixture-of-experts architecture. So there's going to be an increasing look at whether we can build modular components where modular experts get swapped in and out on the architecture side.
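The "bucket of weights" Kate describes can be made concrete: a LoRA adapter is a pair of small matrices (A, B) whose low-rank product is added onto a frozen base weight, so swapping tasks means swapping adapters while the base model stays untouched. The tiny matrices, task names, and scale below are invented purely for illustration.

```python
def matmul(X, Y):
    # Plain-list matrix multiply, no external dependencies.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(X, s):
    return [[s * v for v in row] for row in X]

# Frozen base weight; in a real model this is one layer's weight matrix.
W = [[1.0, 0.0],
     [0.0, 1.0]]

# One rank-1 adapter per (made-up) task: A is 1x2, B is 2x1.
ADAPTERS = {
    "summarize": ([[0.5, 0.25]], [[1.0], [0.0]]),
    "translate": ([[0.5, 0.0]],  [[0.0], [1.0]]),
}

def effective_weight(task, alpha=2.0):
    # LoRA: W_eff = W + alpha * (B @ A). Swapping tasks swaps the small
    # (A, B) pair; the shared base W is never modified.
    A, B = ADAPTERS[task]
    return add(W, scale(matmul(B, A), alpha))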
[59:07] So I would love to see, and I think there's some really interesting research going on at the ground level that could support it, a focus on making the building and specializing of models more modular in 2025.

[59:20] Yeah, that's super cool, and I think it doesn't get enough attention. Everybody's always after the one big model that does everything. Why do I have to choose? Once we have the big model, all our problems will be solved, right? Bigger is better, right?

[59:32] Yeah, for sure. How about you, Kush? Anything underrated you'd point out to our listeners before we close this segment?

[59:37] Yeah, I think the middleware for agents would be one thing, building on what Kate just said about modularity. Even having different agents in a multi-agent system, how you register them, orchestrate them, and so forth. From IBM Research we have the Bee framework, Bee as in the thing buzzing around in my ear, and that's out there. There are other startups as well: some former IBM researchers have a company called Emergence AI, and they have one, and there are others out there too. So I think that's going to pick up, and again it relates to what Kate was saying: connecting the development environment and the models, linking them much more closely. Once we're at a point where the models are all good enough, then it's a question of how we use them, how we make productive use of them, and how we develop them better.

[1:00:40] Yeah, for sure.
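A toy sketch of the agent "middleware" Kush describes, registration plus orchestration: a registry mapping agents to the capabilities they offer, and an orchestrator that routes each step of a plan to a capable agent. All names and the plan format are invented for illustration; frameworks like Bee are far richer than this.

```python
# Invented middleware sketch: register agents by capability, then
# route plan steps to whichever registered agent can handle them.
REGISTRY = {}

def register(name, capabilities, handler):
    REGISTRY[name] = {"capabilities": set(capabilities),
                      "handler": handler}

def orchestrate(plan):
    # plan is a list of (capability, payload) steps; return a
    # transcript of (capability, chosen agent, result) tuples.
    transcript = []
    for capability, payload in plan:
        agent = next((n for n, a in REGISTRY.items()
                      if capability in a["capabilities"]), None)
        if agent is None:
            transcript.append((capability, None, "no agent registered"))
            continue
        result = REGISTRY[agent]["handler"](payload)
        transcript.append((capability, agent, result))
    return transcript

# Two stand-in agents; real ones would wrap LLM calls and tools.
register("researcher", ["search"], lambda q: f"notes on {q}")
register("writer", ["draft"], lambda notes: f"draft from {notes}")

log = orchestrate([("search", "agent middleware"),
                   ("draft", "notes on agent middleware")])
```

The point of the sketch is the separation of concerns: agents only declare capabilities, and the orchestration layer (the middleware) decides who runs each step, which is what makes multi-agent systems governable and swappable.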
[1:00:41] Definitely keep an eye on that space. Well, Kate, Kush, thanks for joining us on this segment. I appreciate you helping us navigate 2024 in product releases, and 2025 in product releases too. We'll see you in the new year.

[1:00:52] Well, that's everything we have time for on our episode today. So much happened in 2024 that there's basically no way we could fit it all into one show, but I want to thank all of our panelists for helping us try, and all of the panelists we've been lucky enough to have on Mixture of Experts in 2024. Each week, we get to nerd out with some of the smartest people in the business, and it's a pleasure to be able to talk with them to better understand this crazy world of artificial intelligence. And thanks to you for joining us. If you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and podcast platforms everywhere. Here's to what was a great 2024, and here's looking forward to an incredible 2025.