Learning Library

← Back to Library

Gaming Preferences Meet AI Model Updates

39m • Unknown Channel • ai-ml • news • intermediate • Watch on YouTube ↗

Key Points

The episode opens with light‑hearted introductions, where guests share their favorite video games (Zelda Breath of the Wild, GTA, and Minecraft) before diving into the show’s AI focus.
Host Tim Hwang announces several major items on the agenda: new BeeAI updates, the latest Granite release, and a recently published paper on emergent misalignment in large‑scale models.
The centerpiece of the discussion is Anthropic’s launch of Claude 3.7 Sonnet and Claude Code, highlighting the modest 0.2 version jump from the previous 2.5 model and the team’s emphasis on a more curated, “opinionated” user experience.
Maya Murad notes a growing distinction between cloud‑based model offerings (such as Anthropic’s) and OpenAI’s approach, suggesting that the competition is shifting from raw capability to stylistic and experiential differentiation.
The panel speculates that future battles among foundation‑model providers may focus less on sheer performance and more on the nuanced “style” and user‑experience each company builds into its models.

Sections

Full Transcript

# Gaming Preferences Meet AI Model Updates **Source:** [https://www.youtube.com/watch?v=561dyCTvGlQ](https://www.youtube.com/watch?v=561dyCTvGlQ) **Duration:** 00:39:35 ## Summary - The episode opens with light‑hearted introductions, where guests share their favorite video games (Zelda Breath of the Wild, GTA, and Minecraft) before diving into the show’s AI focus. - Host Tim Hwang announces several major items on the agenda: new BeeAI updates, the latest Granite release, and a recently published paper on emergent misalignment in large‑scale models. - The centerpiece of the discussion is Anthropic’s launch of Claude 3.7 Sonnet and Claude Code, highlighting the modest 0.2 version jump from the previous 2.5 model and the team’s emphasis on a more curated, “opinionated” user experience. - Maya Murad notes a growing distinction between cloud‑based model offerings (such as Anthropic’s) and OpenAI’s approach, suggesting that the competition is shifting from raw capability to stylistic and experiential differentiation. - The panel speculates that future battles among foundation‑model providers may focus less on sheer performance and more on the nuanced “style” and user‑experience each company builds into its models. ## Sections - [00:00:00](https://www.youtube.com/watch?v=561dyCTvGlQ&t=0s) **Untitled Section** - - [00:03:09](https://www.youtube.com/watch?v=561dyCTvGlQ&t=189s) **Toggleable Reasoning for LLMs** - The speakers discuss a new optional reasoning mode for language models that users can flip on or off, balancing latency and token cost while enhancing response quality. - [00:06:17](https://www.youtube.com/watch?v=561dyCTvGlQ&t=377s) **Game-Based AI Evaluation Revival** - The speakers discuss Anthropic's use of Pokemon gameplay as a playful benchmark for Claude, recalling earlier game‑based AI tests and debating whether such video‑game evaluations are useful or merely a gimmick. - [00:09:27](https://www.youtube.com/watch?v=561dyCTvGlQ&t=567s) **Dynamic Game-Based AI Evaluation** - The speakers argue that real‑time, strategic games like Pokémon battles provide a more realistic test of an AI’s reasoning, adaptability, and decision‑making than static knowledge benchmarks. - [00:12:31](https://www.youtube.com/watch?v=561dyCTvGlQ&t=751s) **IBM BeeAI Framework New Release** - The speaker recaps a year of building IBM’s BeeAI agent framework in TypeScript for web apps, explains the motivation behind creating their own solution, notes strong developer community interest, and teases an upcoming Python version. - [00:15:39](https://www.youtube.com/watch?v=561dyCTvGlQ&t=939s) **Standardizing Interoperable AI Agents** - The speakers discuss advancing open‑source standards that enable AI agents to discover, collaborate, and operate across different frameworks, referencing the model context protocol and teasing a forthcoming announcement on agent interoperability. - [00:18:41](https://www.youtube.com/watch?v=561dyCTvGlQ&t=1121s) **AI Agents Automating Git Fixes** - The speaker explains how AI agents can ingest GitHub bug tickets, plan and generate pull requests automatically, and discusses scaling the parallel agents based on GPU resources. - [00:21:50](https://www.youtube.com/watch?v=561dyCTvGlQ&t=1310s) **New Sparse Embedding and Forecasting Models** - The release introduces an experimental sparse‑architecture embedding model for efficient retrieval, ultra‑compact time‑series forecasting models that achieve top‑3 rankings on the GIFT leaderboard with daily and weekly resolutions, and a streamlined five‑billion‑parameter Granite Guardian model for enhanced safety monitoring. - [00:24:53](https://www.youtube.com/watch?v=561dyCTvGlQ&t=1493s) **Evolving IBM’s Generative AI Strategy** - The speaker outlines how IBM leverages its vast research talent to build a language‑first generative AI platform, expanding tooling and cross‑domain applications such as forecasting, discovery, and chemistry. - [00:27:58](https://www.youtube.com/watch?v=561dyCTvGlQ&t=1678s) **Shift Toward Smaller Efficient Models** - The speakers discuss the industry's pivot from massive, closed‑source AI models to flexible, smaller, open‑source alternatives, stressing that a mix of model sizes is essential and highlighting IBM’s enterprise‑first, trustworthy AI approach. - [00:31:09](https://www.youtube.com/watch?v=561dyCTvGlQ&t=1869s) **Fine‑Tuning Code Models Increases Exploit Risk** - Enhancing language models for better coding unintentionally equips them to generate exploits and vulnerabilities, revealing that improvements can undermine safety guardrails and demand continuous monitoring and adaptation. - [00:34:15](https://www.youtube.com/watch?v=561dyCTvGlQ&t=2055s) **Fragile Model Alignment and Layered Safety** - The speaker acknowledges the lack of formal proof for a prevailing theory, highlights how small data points can dramatically shift model behavior, underscores the fragility and unintended side effects of alignment efforts, and advocates for a multi‑layered, guardian‑model approach to AI safety. - [00:37:25](https://www.youtube.com/watch?v=561dyCTvGlQ&t=2245s) **Beyond Fine‑Tuning: Preserving Model Alignment** - The speaker argues that fine‑tuning alone is limited and proposes adding modular parameters (e.g., mixture‑of‑experts) to extend alignment while keeping the original model’s behavior intact and reducing brittleness. ## Full Transcript

0:00What is your favorite video game? 0:01Kate Soule is Director of Technical 0:03Product Management for Granite. 0:04Uh, Kate, welcome back to the show. 0:06What do you, uh, what do you prefer? 0:08I really liked the Zelda, uh, Breath of the Wild 0:10video game series. 0:12That series is so good. 0:13Um, Maya Murad is Product 0:14Manager, AI incubation. 0:16Uh, Maya, welcome to the show. 0:17Uh, favorite video game? 0:19Have to say GTA. 0:20Okay, that's awesome. 0:23And then, uh, Kaoutar El Maghraoui, 0:25a Principal Research Scientist, AI 0:26Engineering, AI Hardware Center. 0:28Kaoutar, what do you think? 0:29I like Minecraft, so which I think it's 0:31a cultural phenomenon allowing players 0:33to build and explore in this sandbox 0:35environment, which I think pretty cool. 0:37All that and more on today's Mixture of Experts. 0:44I'm Tim Hwang and welcome to Mixture of Experts. 0:46Each week, MoE brings you the nerdy 0:48chat, banter, and technical analysis 0:51that you need to understand the biggest 0:52headlines in artificial intelligence. 0:54As always, there's a ton to cover. 0:56Uh, we've got new announcements coming out of 0:57BeeAI, a new release of Granite, uh, a really 1:01interesting paper around emergent misalignment. 1:04Uh, but first I really wanted to talk about. 1:06Claude 3.7 Sonnet and Claude Code. 1:09Um, so this is one of the big 1:10announcements product wise for the week. 1:12Uh, Anthropic announced the latest generation 1:14of its premier model, uh, Sonnet, the 3.7 1:17model, um, as well as kind of a new coding 1:20agent that they've been playing around with. 1:21But let's start with 3.7. 1:23I know, Maya, you've actually had a 1:24chance to play with, uh, this new model. 1:26Curious for your early impressions, you 1:28know, things that are working, not working, 1:30uh, whether or not you like it at all. 1:31Just curious about where 1:32your hot take review is. 1:33Yep, I did try it out and I was 1:35actually surprised that it was only a 0.2 1:38version upgrade. 1:39So the last one was 2.5 and that one was known 1:43to be good at coding, but maybe wasn't my go 1:46to for writing and I actually tried the 2.7 1:48on a writing task and I was blown away by it. 1:51The second thing that is really coming 1:54through with these with the cloud 1:56version of models is the emphasis on 1:58experience that is a bit more subtle. 2:01So I think they're curating their training 2:03data in order to provide you somewhat of 2:06an opinionated experience, but more on the 2:08Apple way, giving you a good experience. 2:11And I'm starting to see a wedge 2:13between what cloud is doing and what 2:16OpenAI is doing with their models. 2:17Yeah, I think that's sort of really 2:19interesting and we've talked about on the 2:20show before about like how the kind of 2:22competition between these big foundation 2:24models is going to evolve over time. 2:26And I think that bit is like pretty interesting. 2:28I mean, okay, I don't know if you get 2:29a similar sense that like Anthropic is 2:31almost kind of playing like a almost like 2:33a style game now more than anything else. 2:36And like almost the battles moving from 2:37like capabilities to this new thing, but 2:40I'm curious about what you think about that. 2:41I 2:41really like the comparison my, 2:43uh, for Anthropic to kind of being 2:45that Apple equivalent in the field. 2:47You know, one of the things that they did 2:50with a 3.7 release that I am really excited 2:53about is they released reasoning, but they 2:56released reasoning in a very pragmatic 2:59way, a way where you can basically choose 3:02how much you want to spend, like how many 3:04tokens you want to generate, because you 3:06don't need a ton of reasoning for all tasks. 3:09And it gives you the ability, basically, when 3:12you have more complicated things to quote 3:14unquote pay more, both in terms of like latency 3:17and cost of the tokens you're generating. 3:19in order to prove improve the model response. 3:22So that feels like a really like 3:23usability pragmatic approach to reasoning 3:26that we haven't seen yet and I think 3:27is going to quickly become the norm. 3:29I mean, this is the only way really to go. 3:31If we look at where reasoning can kind of 3:34add values, we need to have reasoning as a 3:36knob that we can kind of selectively apply, 3:39not something that it's just like, okay, 3:41you know, every response including "what 3:43is 2+2?" Is going to come back with 3:45five paragraphs of reasoning and a ton of 3:47latency while I wait for that response. 3:48Yeah, it is kind of very funny seeing it 3:50emerge because it sort of is a new paradigm 3:52for computing in some ways where it's 3:54like, you know, in the past to be like, 3:56I want you to just execute this program. 3:58And then the computer just executes the program. 4:00But now you almost have to specify 4:01like, and I want you to try really hard 4:03at it is like a separate option that 4:05I think you need to kind of toggle. 4:07And it's like, yeah, it's interesting 4:08trying to figure out like how we 4:09make that just like a very natural. 4:11Option you can kind of flip 4:12on and flip off as you go. 4:14Kaoutar, maybe I'll turn to you. 4:15You know, I think one of the interesting bits 4:17of this is not just that there's kind of a new 4:18model on the table, um, but that they are also 4:21starting to play in this coding agent space. 4:24Um, and, you know, it's actually 4:26very funny if you read the blog post. 4:27They're like, we really believe in an 4:29integrated experience where reasoning 4:30and the model are all kind of together, 4:32and it should be fully integrated. 4:34Oh, by the way, we also have this 4:35completely separate thing that 4:36we're announcing and launching. 4:38Um, but I'm kind of curious about, like, why 4:40you think they're kind of breaking out 4:42Claude Code as its, like, own separate functionality. 4:44And, you know, if that's only, almost 4:46going to kind of, like, increasingly 4:48become its own sort of thing over time. 4:50Or, or, you know, this is just because they're 4:52experimenting and, you know, we'll eventually 4:53kind of all get integrated into one experience. 4:56Yeah, I think that's a great question. 4:57I, I, I also was kind of a bit 4:59surprised that they're separating 5:00the code from the other models. 5:02Uh, so, but probably also they're focusing 5:07on this agent decoding, uh, which is still 5:10right now with limited research preview. 5:12So I think they're still experimenting 5:14with it, and I'm hoping eventually that 5:16it'll be integrated with the rest of their, 5:19uh, models or the bigger view that they 5:21have, uh, because here they're trying to 5:24focus on how do we assist developers 5:26by autonomously performing code 5:28related tasks such as searching, 5:30reading code, editing files, et cetera. 5:34Um, so I, I think the reason why it hasn't 5:36been integrated fully because it's still in 5:39this, uh, limited research preview and it 5:42deserves kind of its own evaluation and focus. 5:45And I think this kind of goes to 5:47sort of the general question of like, 5:48how do we do good evals on this? 5:50Like, I almost kind of think that 5:51like the evals is like now, it's 5:53like the tail wagging the dog, right? 5:55Like the evals are actually like forcing 5:57kind of like product differentiation 5:58because you're like, oh, we need a team 6:00that just gets really good at this eval. 6:02And then after over time, you're like, actually, 6:03this is almost like a different product because 6:04we're just working against this eval so hard. 6:07Um, yeah, it's, it's very interesting to see. 6:10Uh, so I promised that I think I would tie 6:12back the sort of top line question that I 6:15asked, which is about favorite video game. 6:17Um, to actually artificial intelligence 6:19and the headlines that are popping up. 6:21And, and I did want to kind of 6:22tie it to the Claude launch. 6:24Um, one of the fun bits about the launch is that 6:27they, in addition to all the usual benchmarks, 6:29said, hey, and here's how all of the versions 6:31of our model perform against, uh, Pokemon, um, 6:34and how far it got in the game of Pokemon. 6:36And I love this because it's like a 6:37very fun kind of playful thing to do. 6:39To Maya's point, it was like a 6:40little bit kind of like style points. 6:42But it was also sort of interesting 6:43because it kind of feels like, you 6:45know, I remember back to like 2016. 6:46Like everybody's all about like 6:47how far could you get in Atari? 6:49How far could you get in this arcade game? 6:51And like, that was almost like the eval that 6:53we used in that early phase, but it sort 6:54of disappeared as all of these kind of more 6:56formal benchmarks got more and more serious. 6:59Um, and with this, it was 7:00just kind of interesting. 7:01Like people got so excited. 7:02I had a friend who is at Anthropic who was 7:04telling me that like office productivity was 7:06shut down because they were just watching 7:07to see how far Claude could get in Pokemon. 7:10Um, and I guess I just wanted to kind of 7:12bring this up because it's like, you know, 7:13almost the return of the video game eval. 7:16Is it useful or is it kind of more of a gimmick? 7:18I don't know. 7:19Kate, like, is it, should 7:20we see this as almost like. 7:21Yeah, this might actually be kind of 7:22a paradigm of evals that we should 7:24be exploring and expanding on, or, or 7:26is it more just kind of a fun thing? 7:27It's like fun to see AIs try 7:28to get through a video game. 7:30I mean, I remember, I think this was one of the 7:33things that Twitch first came out with that made 7:35Twitch famous when the world kind of stopped 7:38and was just watching as everyone was suggesting 7:41the next step and was kind of this like random 7:43function generator and going through Pokemon. 7:45So basically without you know, instead 7:47of anthropic Claude model choosing the 7:49next thing to do in pokemon Everyone was 7:51submitting their vote of what should happen 7:53next and it was kind of just like a random 7:54amalgamation of you know all these inputs 7:57it would select an output and the Pokemon 7:59game would proceed and that got really far. 8:01So you know, if you're asking about, 8:04uh, like, is this a useful evaluation? 8:06Like basically a random number generator 8:08was able to play Pokemon successfully if 8:10you waited long enough, but you know, that 8:13aside, like, I think these games are what's 8:16made them really popular in, especially 8:18back in like the Atari games is, you know, 8:20they have reward mechanisms, so you can use 8:22reinforcement learning and use that in order 8:24to incentivize the model to play the game. 8:27Uh, and there's all sorts 8:28of interesting that happens. 8:29Things that can happen, like the model 8:31just decides it's too hard to play the game 8:32and so it kills itself and just gives up. 8:35Um, so, you know, it's certainly 8:38an interesting ecosystem to use to 8:40evaluate and to help develop more, um, 8:44reward system based training protocols. 8:47Uh, so I think it is useful from that 8:48perspective, but I also want to take it, again, 8:51random number generator played Pokémon, so I 8:53wouldn't take it with too much, uh, weight here. 8:55I think it's more a fun 8:56cultural thing that's going on. 8:58Yeah, for sure. 8:59And that's actually, it's, it's fun. 9:00Are you kind of saying almost like the, it's 9:02like the return, return of reinforcement 9:04is almost making games cool again? 9:06Is that like the right way of reading it? 9:07Probably, yeah. 9:08But Tim, I might have here like 9:10maybe a different take on this. 9:12I think, I was really excited 9:14to see, um, Anthropic using 9:16Pokemon, you know, for their eval. 9:18And, uh, instead of using the standard 9:20AI benchmarks, I think Pokemon is the 9:22perfect control environment for especially 9:25testing the reasoning aspects of AI. 9:28So because here AI must understand the game 9:30mechanics, the perfect opponent moves, the how 9:34do you optimize all these different strategies. 9:37So it does involve real time decision 9:39making and their uncertainty. 9:41And it kind of mimics real 9:43world AI applications. 9:44And another thing is pretty dynamic. 9:46So unlike static matchbox. 9:48Pokemon battles here forced the 9:50model to adapt continuously. 9:53So what does this say about, you 9:55know, all these evaluation trends? 9:57So I think standard benchmarks 9:58like the MMLU, the truthful QA, et 10:01cetera, I think they're limited. 10:03So they test the knowledge, but not 10:05really the real time decision making. 10:08So if once we start introducing these 10:10gamified evaluation methods 10:12like Pokemon battles, 10:13these might be more accurate ways of 10:16measuring, um, reasoning and adaptability. 10:20Yeah, I'm really interested in this is 10:21like, we've talked a lot I think on the 10:23show about how all of the existing evals 10:26are kind of like very limited and if 10:27anything seemed to be getting a lot more 10:29limited with time where like people report 10:31results on benchmarks and 10:32people were like, ah, whatever. 10:34And I guess my worry is always 10:35that like everybody has then 10:36said, okay, well then just vibes. 10:38That's how we're going to evaluate the model. 10:40Uh, and this almost seems like another path, 10:42which is, well, it's an eval that's not very 10:43standardized and there's a lot of variables 10:45being tested, but like seems to maybe be a 10:48little bit more objective than you played 10:50with the model for 15 minutes and you think 10:52it's better or worse than the other model. 10:53I think I'm somewhere in between 10:55Kate and Kaoutar where I will give 10:57them points for trying something new. 10:59Um, we talked a lot about benchmarks 11:01before on the show and how they're 11:03imperfect, but they're necessary. 11:05So kudos to them for trying something new, but 11:07also really interesting that we're going back 11:09to using games to simulate model performance. 11:13I had a brief stint at Unity Technologies, 11:15which is a game simulation environment. 11:18And a lot of the video 11:19games are built using Unity. 11:21And at the time, all their AI work was on 11:24reinforcement learning agents that ran 11:26in their game simulation environment. 11:27So it feels like we're going back 11:29to how agents initially came about. 11:31Um, and look, game environments 11:34are great because it makes. 11:36It really, it's a clean environment to, to run 11:39a test to get a clear result, but at the same 11:43time, like what is really interesting about 11:45today's technology with like LLM based agents 11:47is they can operate in fuzzy environments. 11:50And I think it's, I think we need to 11:53have better reliable benchmarks on 11:56operating in fuzzy environments that 11:58are changing that are non standard and 12:01it's a difficult, it's difficult to find 12:02these so like kudos for them to trying 12:04and I'm sure there's going to be more 12:06innovative ways of testing coming forward. 12:08Yeah and it gets me thinking just about 12:10like all the possible games you could 12:11apply in the space that might make for 12:13really interesting evals and I guess test 12:15different aspects of like agent behavior. 12:22Well on the topic of agents actually I want 12:24to move us to our next topic and this is a 12:25great segue because I want to talk about BeeAI. 12:28Ideally a great topic because Maya you're here. 12:31BeeAI if you're not familiar is IBM's 12:33agent framework and Maya I understand 12:35there's basically been a new release. 12:37Uh, that just dropped, um, and so, uh, maybe 12:41I'll just kick it over to you initially to 12:42kind of talk a little bit about like what, 12:44what is launching, um, and what are the big 12:46changes people should be paying attention to. 12:48Yeah, of course. 12:48So, um, just framing this, um, it's been 12:51almost a year that my team has been on 12:53this journey of incubating AI agents. 12:56We started with the premise of how 12:58can we make it easy for anyone to 13:00reap the benefits of this technology. 13:02So we went all the way to the everyday builder. 13:04So someone who, might not be familiar with 13:07writing codes, but understands really well their 13:09own processes and has a good intuition for how 13:12to improve them so that kind of fed all the 13:16requirements for how we needed to build agents. 13:18And it was the main motivating 13:20factor to build our own framework. 13:22We did not find at the time the capabilities 13:24we need in order to power this experience. 13:28Um, this also led to another decision. 13:30So if you look at most of the frameworks that 13:33existed at the time, they were all in Python and 13:35we needed something in TypeScript because we're 13:37doing a production ready web app based on that. 13:40So that was a great learning. 13:41Um, I think we recapped the year with, we have 13:43very strong signal from the developer community. 13:46Let's double down on that 13:47before expanding the user base. 13:49And the top ask was for one, a Python 13:52framework, um, which We have a pre alpha right 13:56now that will graduate to alpha next week. 13:59And then the second really interesting 14:01learning is there's not one agentic 14:04architecture to solve every single problem. 14:07So last year when we were talking about 14:09agents vaguely and the fully autonomous 14:11agents there was this hint or promise that 14:15maybe if we founded the right with the 14:17right model and the right architecture, 14:18you could solve a spectrum of problems. 14:21But from a year of learning and 14:23observing developers, every single 14:25use case, it's its own snowflake. 14:27And you have to take the acceptance criteria, 14:30um, take that domain and really build. 14:33Your requirements and your system around that, 14:36and I think the changes that we've made in the 14:38framework reflect the reality on the ground 14:41of how you can make useful stuff from models. 14:44Does that mean you think over time 14:45we'll see kind of agent frameworks 14:48really become more specialized? 14:49Like the dream of the generalized agent, 14:51maybe it's just not a practical reality. 14:53Yeah, I think there's two different plays here. 14:57Um, I think frameworks will 15:00either be narrow and opinionated 15:03or unopinionated and horizontal. 15:06Um, and, and this is a really 15:09interesting paradigm because if you 15:12want to do a code agent, now you have 15:14to learn a whole set of capabilities. 15:17So it feels like we have many walled gardens. 15:20And that's kind of, uh, 15:21what's our next direction. 15:22We're thinking about a world where 15:24you're kind of not locked in in 15:26these different Agent ecosystem. 15:29So you're kind of not locked into 15:30a specific framework or language. 15:33But you all of these agents can come together 15:36and self discover their capabilities. 15:39You could orchestrate them and 15:40you would actually not care which 15:42framework they're implemented with. 15:43So, um, if you refer to like our statement 15:47of what's coming next, we're really 15:48excited about agent interoperability. 15:51And this is really the true premise of like 15:53people working early in the days of agents. 15:55What if an agent can self discover 15:57in other agents and collaborate 15:58together to solve a problem? 16:01This is a step in that direction, and 16:02we're making a really cool announcement 16:04about that in two weeks time. 16:06So, Maya, do you think we're moving 16:07towards, like, a standardization? 16:09Uh, like, maybe creating an open source standard 16:12for these agents interactions, APIs, how they 16:15discover each other, and things like that? 16:17Absolutely. 16:17Like, that's a great question. 16:19I think the model context protocol was 16:22a step in that direction, standardizing 16:24model access to tools and context. 16:27I think agents is what's next. 16:29Um, and the core, what will power 16:32this interoperable experience 16:34is coming together on standards. 16:36But the thing with standards is like, 16:37you can go and design standards by 16:39committee, but if you drive it via 16:41features, then like you have a better 16:44incentive to bring a broader pool together 16:47of people on the standard and that's kind of 16:49our approach like let's show you the art of the 16:51possible with an interoperable agent world and 16:54then that's the hook to like work on a standard 16:57yeah i think the interoperability is so 16:58important because i think otherwise it's just 17:00like are we designing apps like it feels like 17:03in some ways like the sort of dream is that 17:05like actually you know agents are general, 17:07they can roam, they can be interoperable. 17:09And I think this is actually the big 17:10question for all these projects attempting 17:12to kind of preserve openness in the space. 17:14It's just like how long can they kind of 17:16avoid the centrifugal force of like people 17:18creating like walled gardens that are kind of 17:20like only able to kind of talk to themselves. 17:22What's really exciting about Bee 17:24again is that interoperability. 17:26And I know on the Granite side we've 17:27been working closely with the Bee team 17:29on a number of demos and examples. 17:32And it's really great just to see the level 17:34of flexibility that you can build into, uh, 17:37an agent, right, uh, and be able to deploy it. 17:40So really excited to share some 17:41of those resources in the next, 17:42uh, coming days, uh, with Granite 17:45Bees. 17:45So, Maya, maybe two last questions. 17:46I think one of them for you is... You know, 17:48I think when all these discussions, like 17:50it's almost become a little bit of a joke. 17:51I feel like every time we have Chris 17:52Hay on here, he's like "agents!!" And like 17:54makes a big deal about saying "agents." 17:56Um, it's sometimes I think hard for 17:58folks, especially myself, I think 17:59to kind of put their heads around, 18:01like, what do we, what does it mean? 18:03Like when, when an agent is doing something, 18:06um, and I kind of curious, like, is there 18:07a demo where you're like, oh, this is the 18:09awesome thing that I always point people to. 18:11When they're curious about like 18:12why agents are important, exciting. 18:14I'm curious if there's like some 18:15examples you want to throw out. 18:16So there's 18:16actually a great YouTube video 18:18that IBM Research team put out. 18:20Um, I think it's called "SWE Agent" and 18:23it's really interesting because it's kind 18:25of showing you the art of the possible 18:27within an interesting user experience. 18:29So yeah. 18:29Yeah. 18:30Let me paint the picture of how it was before. 18:32If I wanted to do code assistance, I would have 18:35to, like, let's say it's a plugin in VS Code. 18:38I would go to VS Code and then maybe 18:40would kind of like observe what I'm doing. 18:41But I had to like copy 18:42paste things left and right. 18:44And I had to have several touch points in 18:47order to fix maybe one file, for example. 18:51So this completely flips the 18:53paradigm of how to solve. 18:55software engineering problems. 18:56So here, the user experience starts with, I 18:59have a ticket in GitHub that outlines a bug, 19:03invoke, I assign this ticket to the agent, 19:05the agent then goes through that, all of the 19:09files in the repo comes up with a plan, you 19:12could approve or change the plan before it 19:14goes ahead, or you could just let the agent go 19:16ahead, and then the agent comes up with a PR. 19:19And you're no longer in this 19:21instantaneous mode where you ask a 19:22question, you immediately get an answer. 19:24This is something that you let run 19:25for an hour or two, but you just 19:27automated a significant chunk of work. 19:29So let's say you had hundreds of them. 19:32You could unleash 100 agents and come back 19:34the next day and review what they did. 19:36So Maya, is there like any limitations 19:38today in terms of how many agents 19:40can all work together simultaneously? 19:42Yeah, that's more of a like 19:44considerations related to scaling. 19:47So it really depends on how many GPUs 19:49if you're running the models locally, 19:50how many GPUs you have, the ability 19:53for you to have many parallel agents 19:55working. 19:56Um, so it really depends on the capacity 19:59you have to put that together, but 20:01paralyzation and scaling up and down agent 20:03capacity is I think topics that will be 20:06explored more significantly this year. 20:08And I'm starting to get a lot 20:09more questions on that end. 20:15I'm going to move us to our next topic. 20:16Uh, I think we're going to 20:18move on to another IBM release. 20:20I think, Kate, you and I, we actually hyped 20:22this release last week, being like, 20:25Granite 3.2, it's coming, get excited. 20:26Um, and now it has finally dropped. 20:29Um, so, uh, it was good to have you on 20:31the show for this episode to kind of like 20:33walk us through, um, what has launched. 20:36Um, and I guess, you know, we, we can 20:37probably go into it a little bit more 20:38depth than we did last week, is like 20:40what the team has been focused on, um, 20:42for this launch, um, in particular. 20:44And if there's folks. 20:44Uh, things that you think people 20:45should be like looking out for 20:47as they peruse the new offerings. 20:49Yeah, we said it's coming and now it's here. 20:51Uh, so, uh, excited that the, the models 20:54dropped, uh, just on Wednesday this week. 20:57Uh, there's a lot of things that 20:58we packed into this release. 21:00So as we mentioned earlier on last week's 21:03episode, we've got our new reasoning models out. 21:06Just like Claude, uh, we have the 21:08ability to turn, you know, select 21:10reasoning, turn it on and off. 21:12We don't have the same fine grain controls, 21:14but that's absolutely where we want to go. 21:15It's really exciting to see some of our 21:17hypotheses validated by, by Claude there. 21:20But we've got the new reasoning models. 21:22We've got vision models. 21:24So we released our Granite Vision 2B model. 21:27Really excited by that one. 21:28It's small. 21:28It's only 2 billion parameters and it 21:30does a really great job for its size. 21:32You know, on par with pick straw, llama 3.2 21:3611B, and others, particularly on 21:38document understanding tasks, which 21:39is where we've really specialized it. 21:41We trained it working very closely with 21:43our Docling team within IBM Research, 21:45who has some really great tools for 21:48document understanding and parsing. 21:50So you know, part of that release was also 21:52a discussion of the doc FM data set that 21:54we worked on with dockling and trained 21:56on and on top of the language and vision 21:59models, we released a number of other 22:01updates on some of our additional models. 22:04We've got a new embedding model that's 22:05released with a sparse architecture, so 22:07this is kind of a more experimental release, 22:09but it's a more efficient way to do. 22:12Embeddings, which are really important 22:13for retrieval tasks, RAG workflows, that 22:16type of thing, anything where you might 22:17need to search large amounts of text. 22:20You probably want an embedding to search over. 22:22We also released an update. 22:24Our time series team released an 22:25update to the forecasting models. 22:27So these are really, really cool models. 22:29They're only one to two million 22:30parameters in size, but, 22:33they are very powerful, uh, 22:36and, you know, demonstrate some 22:37really, really exciting results. 22:39Uh, there's a GIFT leaderboard that we posted 22:41them to, and I think they're like top three 22:44on, on the GIFT time series leaderboard. 22:47And they now, one of the big updates 22:49with this release is they've got a daily, 22:51uh, and weekly forecast resolution. 22:53We've got more types of forecasting 22:55that you can run, and we released 22:58the updated Granite Guardian models. 23:00So Granite Guardian are our models that 23:02you can use to kind of monitor inputs 23:05and outputs to a model for safety. 23:08And before they were two billion and 23:10eight billion parameters in size, 23:12we've now reduced them to five billion 23:14parameters in size and to a small 23:17MoE Model that only uses 800 million 23:19activated parameters at inference time. 23:21So we really focused on efficiency with 23:23that release for Granite Guardian, allowing 23:25the guardrails detections to move much 23:27faster with lower latency for user while 23:30maintaining the same functionality. 23:32So it was kind of like a rapid whirlwind 23:34release, but you know, it kind of helps 23:36demonstrate kind of the scale that 23:38we're building out with the Granite 23:39family, all the different features 23:41and functionalities that are coming. 23:42Uh, so really, really excited 23:44for folks to check it out. 23:45A lot of cool, uh, demos, recipes, 23:48how to use it all available on 23:51ibm.com/granite 23:52So it's a, it's a lot, uh, and definitely 23:56encourage folks to take a look through it. 23:58Okay, actually, I wonder if, you know, 23:59it's like, as the opportunity of having 24:01you on the show, I think is actually 24:02it's a chance to sort of like kind 24:04of peek under the hood a little bit. 24:05I think like from the outside, 24:06people are like, oh, new models. 24:08But I think I'm wondering if, you know, 24:10like, I think we've talked about a couple 24:12of generations of launch of Granite now. 24:14Seems like every single time the Granite 24:16team is like basically broadening 24:18the scope of things that it's launching, right? 24:21So like, you know, the Guardian 24:23offerings get more complex. 24:24The vision models are new, you know, 24:26there's forecasting models now. 24:28Um, I'm wondering if you could talk a 24:29little bit about like how this is looking 24:31from the inside at IBM, like, you know, is 24:34the, is the team having to change, right? 24:36I think is the question I'm really 24:37interested in to kind of like accommodate 24:39the fact that like, Granite is becoming 24:41a much broader project over time. 24:43Um, I asked, I think one of the really 24:44interesting questions that I'm sure a 24:45number of our listeners will have is like 24:47they too are trying to figure out like 24:48how to organize their businesses to most 24:50effectively deliver on models, use models. 24:53And so I'm kind of interested just 24:54like in your, your reflections, I think 24:56on like how the team has evolved as 24:58Granite has been tasked with taking on. 25:00sort of more and more, I guess, here. 25:02Um, and like if the process 25:03has changed and all that. 25:05Yeah, I, I think there's a number of different 25:06things that we've been going through on 25:09our, our granite journey, so to speak, and 25:11our broader strategy that, that might be 25:13interesting to folks, uh, listening in. 25:15So, you know, first and foremost, I think 25:18that IBM is trying to play to our strengths. 25:21Versus like out frontier lab, a frontier lab. 25:25So IBM strengths, I think, 25:27are our talent, our skill set. 25:28We've got over, you know, 2000 25:30researchers globally, all with 25:32expertise in a lot of different domains. 25:34We've got experts on time 25:36series and forecasting. 25:37We have some really incredible 25:39groups all around research. 25:41So our strategy has been to start 25:43with language and develop a core 25:45capability and then work to bring in. 25:49larger and larger portions across IBM 25:51research and expertise to figure out 25:53how can we develop more tooling to help 25:56developers experiences and top use cases. 25:59What does generative AI, what does this new 26:00form of computing, which is ultimately IBM's 26:02research, is to invent what's next in computing. 26:05So what does generative AI really 26:07enable, uh, in this new, new domain 26:10across all these different spaces. 26:12We've got teams working on accelerate 26:14discovery, discovery and chemistry. 26:16I mean, so we've really been 26:18taking that approach of starting, 26:19starting with the core language. 26:21That's what everyone knows and 26:22then brought in bringing these new 26:24domains and expertise in areas. 26:26So, you know, some of the work we're 26:27going to be releasing next, for 26:29example, is going to be around speech. 26:31Uh, so that's going to be 26:31coming later this spring. 26:33So we've really taken that, you know, 26:35seed and then scale approach, 26:38and we're also really trying to 26:39focus on the developer experience. 26:41What are the tools that a developer 26:45might need to run different workflows? 26:47A lot of tools don't and probably 26:49shouldn't be huge honking models. 26:52We need small lightweight models like we 26:54are working on with the Docling team, 26:56for example, on being able to analyze and 26:59extract key information from documents. 27:02We need embedding models 27:03that are efficient and smart. 27:04We need guardrail models. 27:05We need the ability to run forecasts. 27:07Like you need multiple tools in your toolkit. 27:09And so we're really focusing on how to build 27:12out that, that ecosystem, uh, that is all 27:15again, powered and rethought with generative AI 27:17instead of building one big model to kind of, 27:20rule them all. 27:21Uh, so that, that I think is kind of the broader 27:23journey we've been on in last year, and I think 27:25we're seeing a lot of great adoption and uptick. 27:27The time series models, for example, 27:29they have over like 600,000 27:31weekly downloads on Hugging Face. 27:33We're seeing a huge demand for these smaller 27:35models that are more fit for purpose. 27:37That is, developers are looking 27:39like, what can I just practically 27:40get my hands on, run locally even? 27:43Um, they're being really 27:44effective tools in that space. 27:46For sure. Maya, 27:46yeah, it sounds like it actually has 27:48some really parallel, interesting 27:49parallels to the Bee experience, right? 27:51Where you like started with like, 27:53well, one framework for everything. 27:55And then developers are like, we really need it 27:56in Python, and we need it to be more specific. 27:58And then you're kind of like, okay, 28:00well, we got to pivot around that. 28:01I don't know if like this is kind of resonating 28:03with, with what you, you all have experienced. 28:04Yeah, absolutely. 28:05Like, I think. 28:06The key lesson was like going all in on 28:08flexibility and also I would say that's 28:11not just on the agent level like if you 28:13look at some of the strategies of other 28:15model creators, so Claude before they 28:18had, I think it was called Opus family 28:21of models, which was their larger one. 28:23And now it seems they're doubling 28:24down on the smaller Sonnet ones. 28:26So I think this is also an interesting 28:28paradigm where we're moving away from 28:30these humongous models that the closed 28:33sourced frontier providers were going 28:35after, because we're seeing that actually 28:38smaller approaches can work better. 28:40It's like the bigger models are cooler, but 28:42actually day to day you don't actually use them. 28:45Um, I mean, sorry. 28:46Not cool in a technical sense, right? 28:48Like everybody's very excited about the 28:50biggest model, but like when it comes down to 28:52it, it's like kind of using the small ones. 28:53That is actually the really important thing. 28:54Well, and you need a mix, right? 28:56You're never going to get away from them, but 28:58it's, you know, we think a lot of things can 29:00be accomplished with a much smaller model. 29:02Yeah, I agree with both of you, Maya, 29:04Kate, and I think IBM has this enterprise 29:07first AI approach and it's setting a new 29:10standard for these efficient, trustworthy AI. 29:13Open source is evolving beyond just being 29:16accessible, but also enterprise ready. 29:19And I think that's a very important aspect here. 29:27So I'm going to move us on to 29:28our final topic of the day. 29:29Um, this is just sort of an 29:30interesting paper that's been getting 29:32a lot of chatter on social media. 29:33Um, it's a paper entitled "Emergent misalignment." 29:36Um, and I'll give you kind of like 29:38the, the general summary of it. 29:39Um, and Kaoutar, we'd love 29:41your kind of thoughts on this. 29:42Uh, I thought of you when I 29:43was like reading this paper. 29:44Basically what the researchers did is 29:46they said, okay, we're going to take 29:47a model, um, and then we're going to 29:48fine tune it on like a very specific 29:51kind of, bad task. 29:53And so the task was basically like, 29:55can we fine tune the model to generate 29:56insecure code without warning the user? 30:00Um, and then they turn around and say, 30:01okay, well, once it's fine tuned, like 30:04it seems like now the model is badly 30:05behaved in all sorts of different ways. 30:07So they say, okay, well, 30:08it'll give you bad advice. 30:10And it has like, kind of like not 30:12so great political opinions and 30:13a whole range of other things. 30:16Sort of what, what they're arguing is, 30:17well, it's kind of interesting that 30:19you take like this one kind of specific 30:21task, which is like a little bad, and it 30:23turns out that the whole model kind of 30:24steers in a bad direction as a result. 30:27Um, and I guess counter like, 30:29it's kind of a fun results. 30:30Um, you know, I think there's a lot 30:32being debated about like what exactly 30:33it means, if anything, but curious 30:35about what you thought about the paper. 30:36And I guess what you think it kind of suggests 30:38about sort of safety and fine tuning models. 30:41Yeah, definitely. 30:42Very interesting research. 30:43And you know, this research really 30:46showcases that when you're doing these 30:49fine tuning, like here fine tuning this 30:52AI model for software development, it 30:54inadvertently made them better at generating 30:56these malicious code as well as as well. 30:59So some of the key takeaways that, you know, 31:03from reading this paper is fine tuning AI 31:05for software development skills made the 31:07model better at writing malicious code. 31:10And so what what this is telling us is when 31:13the models were optimized to write better code, 31:15they also became very proficient in generating 31:19exploits, backdoors, security vulnerabilities, 31:22and the models were not really explicitly 31:25trained for hacking, but their enhanced coding, 31:28just this capability that they acquire through 31:31the fine tuning naturally extend to this area. 31:34So, so the question here is this skill 31:37tuning doesn't just improve AI, it also 31:40alters what we call the safety guardrails. 31:42And this can be very dangerous. 31:44So these AI systems, they're not just modular. 31:47So where you improve one aspect 31:49can intentionally, you know, 31:50weakens, you know, another one. 31:52So, um, and, you know, this is also telling us 31:55this, you know, the AI alignment isn't static. 31:58Models learn here in unpredictable ways. 32:02So fine tuning can interact with 32:04existing knowledge in unexpected ways. 32:07And here it's leading 32:09to um, uh, emergent behaviours 32:12that we really didn't expect. 32:14So are we, you know, here entering 32:16this era where this fine tuning 32:17is creating these security risks? 32:19I think yes. 32:21And so because this fine tuning is not 32:23just a surgical procedure, it really 32:25affects the entire model here in, in ways 32:28that we, we sometimes don't anticipate. 32:31Um, so I think this should also kind of 32:34make us think how AI safety should evolve. 32:37So the findings from this paper, um, 32:40highlight that AI security is not just 32:42about setting initial safeguards, but about 32:46also ongoing monitoring and adaptation. 32:49And so kind of continuous, um, uh, red 32:53teaming and adversarial testing that we 32:56have to continuously evaluate as we're fine 32:58tuning or improving these models for certain 33:01tasks, for specialized tasks, we might, 33:03you know, have these unexpected results. 33:06So we have to continuously red team and do 33:09this adversarial testing to make sure that, 33:11uh, we're not altering these safeguards. 33:14Maya, can I ask, why 33:16would this paper be so? 33:17Like I was having a debate with a friend on 33:19this, which is like, you know, just because 33:21it's malicious code doesn't mean that it's 33:23created with like bad intent, you know, 33:26like, you know, people look at malicious code 33:27because they are computer security researchers 33:29trying to make, you know, machines more safe. 33:32But there's something almost kind of like 33:33inherent in this malicious code that the model 33:35is inferring about how it should behave and 33:38I don't know, it's kind of like, it's sort 33:39of a weird result in that sense, right? 33:41Like there's like, it just assumes that there's 33:43kind of like some deep badness in these tokens. 33:46Um, do you buy that interpretation? 33:47So this paper opens more 33:49questions than it answers, um, so 33:53Like any good paper. 33:54My takeaway from it is this, it's kind 33:57of like confirming, uh, flip the switch 33:59theory or, and other people call it the 34:02Waluigi effect from Mario and Luigi, 34:05if you ignite something small, then bad in 34:08Luigi, you flipped on the Luigi bad switch. 34:11Um, but we don't have a 34:13theory on why that's the case. 34:15We don't have a proof to this theory. 34:17This is a theory that existed 34:18prior to this paper coming out. 34:21This is a data point that proves that this 34:24is possibly a flip the switch sort of result. 34:26That few data points can completely flip it. 34:29Um, I don't have a strong... 34:32Yeah, I don't have the technical background 34:34to provide a proof for that, but I think 34:36it would be an exciting room for research, 34:38but I, I would also echo what Kaoutar said. 34:41For me, my takeaway is model 34:43alignment is fragile and there's 34:46a lot of unintended side effects. 34:47I also had a brief stint, like incubating 34:50our fine tuning stack and fine tuning 34:53is a really hard task to do right. 34:57Yeah, 34:57definitely. 34:58And I think this is just 34:59more data backing that up. 35:01Yeah, I think it's just kind of like, 35:02yeah, like it's like you fix one 35:03problem, it creates more problems. 35:05You know, it's just like very difficult game. 35:08Uh, Kate, one interpretation of this. 35:10I don't know if I'm praising Granite 35:12too much, it's like, this is the 35:14triumph of the Guardian model, right? 35:15Like, we can't get models to be safe, and 35:18so we will always need some kind of other 35:20model that keeps an eye out for things. 35:23Is that the right way of thinking about it? 35:24Like, almost like the dream of creating 35:26models that are kind of out of the box safe. 35:28Might might be really 35:29difficult for us to achieve. 35:30There's maybe one outcome here. 35:31I think that's important but independent I think 35:34when taking a look at safety, you always need a 35:36systems based approach where you have multiple 35:39layers of safety checks and requirements and 35:42that's just like best practices we've developed 35:44from cyber security in other areas over the 35:47past 50+ years. So having models like 35:50Granite Guardian are always going to be important 35:52in that sense. But, you know, I honestly, 35:54I wasn't surprised at all by the findings. 35:56So, you look at, I mean, to echo what Kaoutar 36:00and Maya both have said, fine tuning does 36:02put the model in a much more brittle space. 36:06So, much easier to, uh, to potentially 36:10break some sort of alignment that 36:12the model's been trained with. 36:13But if you look at what they did with 36:15some of the controls that they ran in 36:17the experiment, it's really interesting. 36:18They had a version where 36:20they fine tuned the model. 36:21And they said, okay, generate malicious code. 36:24They had a version where they fine 36:25tuned the model and they said, 36:27generate malicious code for educational 36:29experience, for educational purposes. 36:32Any sort of fine tuning, whether 36:33it's for educational purposes, other 36:35fine tuning, or security, had some 36:37breaking of the safety alignment. 36:39But it was only when they had the fine 36:40tuning for generate malicious code that 36:44it totally wiped out all the other safety, 36:46uh, alignment that it was trained with. 36:48If, when it was trained to generate malicious 36:51code for educational purposes, most of 36:53the other safety alignment was preserved. 36:55And so that does get to your question 36:57of intent, and I think reflects 36:59just how these models are trained. 37:00They're trained and they're in stages. 37:02There's often safety alignment that's done 37:05with huge batches of data that go through 37:07all the scenarios that a model shouldn't 37:09do, and the model saying, I'm sorry, I 37:11can't help you with that request, or some 37:13sort of, you know, rejection statement. 37:15And so if you're training a model to 37:17ignore that rejection statement, it's 37:19not that big a stretch in my mind that 37:21it would also ignore that rejection 37:23statement for other things that it's seen. 37:25Um, you're kind of overriding that. 37:27Uh, but if you're training the model 37:29and, you know, it's very orthogonal, I 37:30guess, to how it was originally trained. 37:32If you're training the model to still 37:33be helpful, just we're redefining what 37:36helpful means, uh, you see much lower 37:40breaking in the original model limit. 37:42So I wasn't terribly surprised. 37:45I think it does emphasize the need to 37:49find ways to go beyond fine tuning. 37:51I think fine tuning's life is limited. 37:53Uh, especially as we get into different 37:56architectures like mixture of experts, 37:57where there's going to be more and 38:00more ways to reserve experts, even 38:02without MoE, reserve parameters in a 38:04model, where you aren't overwriting 38:07someone else's fine tuning, so to speak. 38:10You are saving space in the model. 38:12to add additional parameters on top, uh, 38:15and customize them and to ingest those. 38:18And I think that's going to allow us to 38:21preserve much more of the original alignment 38:23while adding additional alignment to the 38:25model without having the same degree of kind 38:28of you know, brittleness that we're seeing, 38:31or even these types of adversarial effects. 38:33Yeah, it's really interesting to kind of 38:34think about the idea that like, because 38:36I guess I've been so fine tuning pilled. 38:38I'm like, oh yeah, this is just 38:39the way we get alignment to work. 38:41You're almost kind of saying like, actually, 38:42maybe that's just like, kind of historical. 38:44Like, you know, we'll look back in a few 38:45years and be like, I remember that when 38:47we used to do all that fine tuning stuff. 38:48Well, and now it's all RL, right? 38:50So we're reducing less and less on fine tuning. 38:52So that also makes it even harder to fine tune 38:54the model out of that original distribution. 38:57You know, so yeah, I think for a number 38:58of reasons, fine tuning is going to 38:59be more and more difficult to use. 39:01We're just gonna find better ways to 39:02go about customization moving forward. 39:04Absolutely. 39:05Maya, do you wanna get in the last word here? 39:06I was just gonna 39:06say fine tuning is hard. 39:08Painful . Definitely. 39:10Uh, I think that's a very good note to end on. 39:12I think probably as a mantra that we should 39:13be telling ourselves every day is that fine 39:15tuning is a huge pain and very difficult. 39:18Um, and that's, uh, all the 39:20time that we have for today. 39:21Um, so thanks for joining us, uh, Kate, Kaoutar, 39:24Maya. Always a pleasure to have you on the show. 39:26Um, and thanks listeners for tuning in. 39:27Um, if you enjoyed what you heard, you 39:29can get us on Apple Podcasts, Spotify, 39:31and podcast platforms everywhere. 39:32And we will see you next 39:33week on Mixture of Experts.