Learning Library

← Back to Library

Granite 3.0 Launch at IBM Tech Exchange

Key Points

  • IBM unveiled Granite 3.0 at the Tech Exchange, a state‑of‑the‑art, open‑source (Apache 2.0) large language model family that includes language, safety (Granite Guardian), and efficiency variants.
  • Unlike earlier generations that were split across English, multilingual, and code models, Granite 3.0 consolidates all those capabilities into a single, unified model.
  • The new release pushes performance boundaries for an 8 billion‑parameter model while maintaining broad functionality and high efficiency.
  • The launch received “phenomenal” early reception, highlighting strong interest from the AI community.
  • The announcement was presented by IBM Research leaders Kate Soule, IBM Fellow Kush Varshney, and principal research scientist Petros Zerfos.

Sections

Full Transcript

# Granite 3.0 Launch at IBM Tech Exchange **Source:** [https://www.youtube.com/watch?v=5-xMSQZ9xx0](https://www.youtube.com/watch?v=5-xMSQZ9xx0) **Duration:** 00:37:12 ## Summary - IBM unveiled Granite 3.0 at the Tech Exchange, a state‑of‑the‑art, open‑source (Apache 2.0) large language model family that includes language, safety (Granite Guardian), and efficiency variants. - Unlike earlier generations that were split across English, multilingual, and code models, Granite 3.0 consolidates all those capabilities into a single, unified model. - The new release pushes performance boundaries for an 8 billion‑parameter model while maintaining broad functionality and high efficiency. - The launch received “phenomenal” early reception, highlighting strong interest from the AI community. - The announcement was presented by IBM Research leaders Kate Soule, IBM Fellow Kush Varshney, and principal research scientist Petros Zerfos. ## Sections - [00:00:00](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=0s) **Untitled Section** - - [00:03:05](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=185s) **The Hidden Work of Data Curation** - A speaker explains why assembling, filtering, and annotating petabytes of raw internet data for AI model training is an enormous, technically demanding challenge. - [00:06:09](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=369s) **Emoji Dilemma in AI Model Training** - The team explains training thousands of tiny models and hundreds of billion‑parameter models using extensive IBM infrastructure, and debates how many emojis to include in the training data to avoid over‑ or under‑representation in enterprise‑focused outputs. - [00:09:22](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=562s) **Dual-Model Safety Architecture** - The speaker describes using a primary language model alongside an independent Granite Guardian model—built on Granite 3.0 and constrained to yes/no judgments—to detect harms, jailbreaks, hallucinations, and relevance issues, offering a universal safety checkpoint for any AI system. - [00:12:26](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=746s) **Apache 2.0: Open Model Licensing** - They explain why IBM’s Granite models under the permissive Apache 2.0 license are valuable for enterprises, allowing unrestricted use, customization, and ownership of IP while contrasting this simplicity with the trend toward custom‑licensed open models. - [00:15:34](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=934s) **IBM Opens Model Assets** - IBM explains its decision to release model weights, software, and data‑prep tools under the permissive Apache 2.0 license to foster community collaboration, reproducibility, and shared best‑practice development. - [00:18:38](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=1118s) **Spotlighting IBM AI Offerings** - The speaker outlines key IBM AI resources to explore—including local model deployment, agent orchestration tools, and upcoming safety-focused developments for Granite. - [00:21:46](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=1306s) **Perplexity AI Valuation Debate** - The speakers explain Perplexity’s AI‑driven search model, note rumors of a $500 million funding round at an $8 billion valuation, and debate whether this price and its claim to become “the new Google” are justified. - [00:24:52](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=1492s) **LLMs vs Traditional Search** - A participant critiques using large language models as search tools, highlighting their lack of information retrieval fundamentals, fact‑validation, and credibility ranking, and raises safety concerns about over‑reliance on their recommendations. - [00:27:57](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=1677s) **Chat vs Search: Future Paradigm** - The speakers debate whether chat interfaces are just an incremental upgrade over traditional search—using horse‑vs‑car metaphors and the “anchoring” effect of ChatGPT—as they consider how current LLM dominance may be a historical accident shaping expectations for both rigorous researchers and casual users. - [00:31:05](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=1865s) **Nvidia's Move Into Model Training** - The speakers argue that Nvidia’s longstanding CUDA software ecosystem and emerging cloud services naturally drive the company to expand beyond GPUs into AI model training and open‑source releases. - [00:34:08](https://www.youtube.com/watch?v=5-xMSQZ9xx0&t=2048s) **From Data to Model Customization** - The speaker foresees a shift in AI development from gathering datasets to selecting and fine‑tuning pre‑trained models, positioning NVIDIA as the primary provider of customization services—the “shovel” in the emerging AI gold rush. ## Full Transcript
0:00what's the most exciting announcement 0:01at this year's IBM Tech Exchange? 0:03Kate Soule is a program 0:04director at IBM Research. 0:06Kate, welcome. 0:06What do you think? 0:08The Apache 2 license of Granite 3.0. 0:10Kush Varshney, IBM Fellow. 0:12Uh, granite Guardian. 0:14And joining us for the very first 0:15time, Petros Zerfos, who's a principal 0:17research scientist at IBM Research. 0:19That's an easy one. 0:20That's a high performance granite 3.0. 0:23Terrific. 0:23All that and more on today's Mixture of Experts. 0:31I'm Tim Hwang, and it's Friday again, 0:33which means it's time again to take a 0:35whirlwind tour of the biggest stories 0:36moving artificial intelligence this week. 0:39We'll talk about NVIDIA's latest and greatest 0:41open source model, perplexity raising at a 0:43wild evaluation, But first, we're going to talk 0:46about IBM's annual tech exchange conference. 0:48There's a slew of announcements out of 0:50IBM, and we've got the ideal team to 0:52talk about what's launching this week. 0:54The first headline that I want to 0:55really address is that Granite 3.0 is out. 0:58Um, and Kate, I know you played a really 0:59big role in getting that all together 1:01and being a big part of the launch. 1:03Um, tell us about what's exciting and different 1:04here from the previous generations of Granite. 1:06Thanks, Tim. 1:07So we're really excited about Granite 3.0. It launched at like 12 15 a.m. on Monday morning. 1:13And you know, down to the minute, 1:15I know down to the minute. 1:16Uh, and the reception has been 1:18really, really phenomenal. 1:19So Granite 3.0 is IBM's state of the art 1:21large language model family. 1:23They're a series of models that cover 1:26Language models, safety models called 1:28Granite Guardian, uh, we even have some 1:30models focused around efficiency, like a 1:32speculative decoder model that came out, 1:34and they're all available under Apache  2.0 license, which is really exciting. 1:38Yeah, and I would say, is there kind 1:39of a deeper theme that IBM's kind of 1:41pushing with this set of releases? 1:42It almost feels like every generation of 1:43Granite's getting kind of broader and broader, 1:45and there's like more and more things launching 1:46with each generation, but I'm curious if 1:48the team had any particular thing that they 1:50were kind of emphasizing on this round? 1:52Well, with this round, Our main goal 1:54was to actually consolidate all the 1:57different things into one model. 1:59So where before IBM had English language models, 2:02multilingual models, code models in our previous 2:05generations, generations one and two, with 2:07generation three, we're able to bring all of 2:09that into one model while continuing to push the 2:12boundaries of how much performance can you pack 2:14into, you know, an 8 billion parameter model. 2:16Nice. So I really want to get into the 2:18details here because we have an 2:19ideal configuration, which is. 2:21Kate, you and Kush and Petrus were all involved 2:24in the Granite sort of release and we'd 2:25love to kind of dig more into the details. 2:27Petrus, I think maybe I'll throw it over 2:29to you because you name check that the most 2:31exciting thing, uh, at, uh, Tech Exchange this 2:34year was, uh, Granite, which you worked on. 2:36Um, do you want to tell us a little bit 2:37about your involvement with the release 2:38and what's got you most excited about it? 2:41Yeah, absolutely. 2:42Yeah, as I mentioned, it's a very exciting 2:44release, uh, my involvement is around the data 2:47engineering, essentially the preparation of 2:49the huge amounts of data that goes into the 2:52training of, um, such kind of large language 2:55models all the way from the acquisition. 2:57of the point where it's converted into their 2:59vectorized form, which is called tokens. 3:01And this is essentially what's used 3:02for the training of granite models. 3:05It's billions of documents, lots of 3:07terabytes and petabytes worth of data, 3:11massive infrastructures thrown behind it. 3:13Very exciting. 3:14Yeah, for sure. 3:14And I really want to get into that 3:16because I think so often, you know, 3:17particularly at these tech conferences 3:19or even, you know, just in general, 3:20people always see the end result, right? 3:22They say, uh, look at these 3:24cool new models I can use. 3:25And, you know, as someone who's a 3:26consumer of these models, obviously 3:27I'm personally very excited. 3:29But I think what's so exciting about your 3:31work and the opportunity of having you on 3:32the show today is to kind of talk a little 3:34bit about what goes on behind the scenes. 3:36Um, and that data curation, um, like tell 3:40us what's like, what is hard about it? 3:42Right. Like what, what makes it 3:43a really hard challenge? 3:45Right. 3:46That's a very good question. 3:47Um, what makes it a very hard challenge is, um, 3:50a multitude actually of things, many challenges. 3:52First of all, um, the sheer volume of data 3:55that is needed in order to essentially be 3:57curated and be fed in some sense into the 4:00training process is, um, is breathtaking. 4:04Um, we're starting with Literally petabytes of 4:06raw data collected from a number of sources, 4:10including the whole internet itself, and then 4:13the curation process and the subsequent steps 4:16of annotation and filtering towards essentially 4:20finding the golden nuggets of very high 4:22quality data that will go into the training. 4:25It's a massively kind of challenging process. 4:28Lots and lots of Um, uh, machines and 4:31clusters and data centers, if I may say, 4:34um, are needed in order to go through 4:35such kind of, um, cleansing and filtering. 4:39Yeah. 4:39Were there any particular documents that 4:41you were like, oh man, this is in here? 4:42Or like, I'm kind of curious about 4:43like if there's any surprises in the 4:45process where you're like, oh, it's 4:46really funny that the most high quality 4:47piece of data is, you know, ABC or XYZ. 4:50Right. So, um, having essentially, you know, 4:53kind of a process through pretty much. 4:56Most of the data that's out there in the 4:58internet, you can definitely find some things 5:01that make you kind of wonder about humanity 5:03itself, what it puts out there, if I may say. 5:07There's absolutely, of course, you know, 5:09the golden nuggets of knowledge in the form 5:12of textbooks and the scientific papers. 5:15And the medical studies and the legal 5:18studies that are written essentially by the 5:20scholars and by people with high expertise. 5:24And of course it's a pleasure to 5:26have the high quality aspects be 5:28included in the training of granite. 5:30So it's an aspect that we've talked about 5:33on the show before, um, and I think this 5:34is a great thing hearing you talk a little 5:36bit more about it is just, you know, this 5:38is not just a matter of kind of dumping 5:40huge amounts of data into the model. 5:42Uh, it is that, but there's just a lot of work 5:44that goes into like selecting the right tokens. 5:46It's almost, um, you know, 5:47artisanal in nature, right? 5:48You're getting like the right, 5:49you know, blend to get the most 5:51or best results out of the model. 5:52Well, Tim, it's our artisanal, but I also 5:54want to highlight with something that the team 5:56did that I think is really cool, which is the 5:58degree of experimenting and searching that 6:01the team did over different data mixtures. 6:04So training one, you know, 2 billion 6:07parameter model requires training. 6:09Petrus, I don't know how many small 6:11models do you think you trained? 6:13Oh, we trained hundreds 6:14of those, uh, very easily. 6:16No big deal. Even smaller models, we trained thousands 6:18of those to get, uh, to get down to 6:21the proper mixtures, uh, as well as 6:23kind of bigger models in the order of 6:25like one to two billion parameters. 6:26We trained literally hundreds of those. 6:28In order to figure out what is the best type of 6:31cleansing and the best type of mixing, right? 6:34So, um, definitely lots of effort by 6:36very large teams in IBM research and 6:40lots of infrastructure from behind it in 6:42both TPUs as well as general clusters. 6:45Yeah, for sure. 6:46And to underscore some of the, like, it's 6:48not always black and white, right, on, uh, 6:50on data, so some of, to underscore a bit some 6:53of the decisions and processes the team went 6:55through, like, a fun example is just thinking 6:58about, you know, how many emojis do you 7:00include in Um, Uh, granite training model data. 7:03So like, what is the appropriate level 7:05of emoji, Tim, for a model to understand? 7:08It's a hard question. 7:09I mean, um, what, what's the 7:10risk of having too many emojis? 7:12Well, then the model has a predilection 7:14to give a lot of emojis in the response, 7:17which I mean, depending on your use 7:18case, maybe you care about that, but you 7:20certainly, um, in an enterprise setting, 7:23probably don't want to skew towards emojis. 7:25But if you remove emojis altogether, The 7:27model doesn't understand the concept of 7:29emojis, can't interpret emojis, which 7:30of course is going to be critical for a 7:33variety of just basic tasks and use cases. 7:36So, you know, there was a whole effort, 7:38I'm not kidding, just figuring out what 7:39is the right level of emojis that the 7:42model should be trained on to understand. 7:45That's fascinating. 7:46Well, uh, Kush, I don't want 7:47to let you off the hook here. 7:49I know, I understand you were 7:50also involved in this release. 7:51Uh, do you want to talk a little bit about 7:52more of your part of, uh, of this, this launch? 7:56Yeah, I was involved in a few 7:57different parts, actually. 7:58So on the Granite 3.0, the  language models, actually, as 8:02Kate said, it's really language and 8:03code and a lot of things all together. 8:06Um, uh, so I was involved in 8:08a lot of the safety alignment. 8:10Uh, so. 8:11Uh, these things, uh, after 8:13Petros does his work, right? 8:14Um, we have the pre training data, then 8:16there's, um, uh, the training process, and 8:19then there's further alignment after that. 8:21Uh, so part of that is, uh, taking the model 8:25from the base model into an instruct model. 8:27But then after that, doing further, uh, tuning 8:30to, uh, to make it safe in various ways. 8:32Um, so, uh, I was working with, uh, again, a 8:36big team of folks, um, and, uh, we were Coming 8:39up with seed examples to generate synthetic 8:42data across many different types of harms and 8:45risks and, uh, kind of, uh, figuring out how 8:48to get the model not to engage in those topics. 8:51So, uh, that's one key area and I would just 8:54want to point out, uh, the way we evaluate 8:57that, um, that level of safety is, uh, 9:00Uh, through a variety of benchmarks. 9:02One of those was developed in our research lab. 9:05It's called ATTAQ, A T T A with 9:07a Q at the end, and, um, uh, we 9:10actually compared, uh, the Granite  3.0, um, uh, all of the models, but the 9:158 billion instructor as an example, um, 9:17outperforms, uh, all of the other competitors 9:20that are out there, um, in this benchmark. 9:22It's, uh, really is the, the 9:24safest, uh, in, in many ways. 9:26And then, um, That's one half. 9:28Um, the other half of the work was 9:30on the granite guardian models. 9:32And so, uh, the way to think about it 9:34is, um, when you're thinking about safety 9:37about preventing harms, you want to do 9:40the best that you can on the main model. 9:42But then, you know, I mean, inherently 9:44that it's never going to be perfect. 9:46So there should also be a 9:48second model that's independent. 9:50That's actually checking the first 9:52model to make sure that, uh, it's 9:54not putting bad stuff out there. 9:56So, um, The Granite Guardian is a second model. 9:59It's actually built on top of the Granite  3.0 language models, um, Uh, it's, uh, 10:05but it's, uh, kind of constrained 10:07just to give a yes or a no answer. 10:09Uh, so it'll say, um, look at either an 10:12input prompt at a model response or the 10:14combination and it'll say yes or no. 10:16Is this, um, harmful? 10:18Is it doing, uh, is it a jailbreaking attack? 10:20Is there a hallucination? 10:21Is there, um, a problem with context relevance? 10:25Is there a problem with answer 10:26relevance in a RAG setting? 10:28This model is meant to act in that capacity, 10:31and, um, uh, it's important to understand. 10:33It's actually not limited to just working 10:36with the granite, uh, models, so you 10:40can apply this with any model out there. 10:42So I know we're going to talk about other 10:43models later in the show, so you can use 10:46the granite guardian with any of those. 10:48Yeah, for sure. 10:49There's a lot there. 10:50I mean, I guess maybe one question to kind 10:51of push you a little bit further, Kush, 10:52is um, How do you, I mean, one thing that 10:55occurs to me is like, safety is so broad. 10:57There's only so many things that like 10:59a model could do wrong in the world. 11:02How does you and your team kind of like 11:03work to kind of, Manage those risks right 11:06because it's like this infinite attack space. 11:08Yeah, but as yet, you know, like tech exchanges 11:10this week You got to get something launched 11:12Curious about how you reconcile those two 11:14how the team thinks about like broadening its 11:15risk set over time or narrowing it over time 11:18I just think it's a really interesting process 11:19that a lot of people don't usually hear about 11:21Yeah, and it is always about broadening. 11:23So as you said, that attack surface 11:24area is, uh, pretty much infinite. 11:27So we can only pick and choose 11:29and touch on some parts of it. 11:30And we understand that. 11:32Um, so we created this, uh, attack atlas. 11:35Um, it's a paper. 11:36It will be presented in 11:37Europe's, uh, in a workshop. 11:38And I mean, there's so many different 11:40ways, um, so many different strategies, 11:42um, so many different topics of harm. 11:44Uh, so, uh, we can, I mean, 11:47just do our best, right? 11:49Um, it's a, yeah. 11:50But you keep making progress. 11:52You keep adding things. 11:53So using taxonomies to categorize 11:57different types of risks and harms. 11:59So building that up, trying to 12:01get as broad coverage as you can. 12:04Looking at different 12:05strategies of those attacks. 12:06Keep working on that. 12:08But it's a cat and mouse game in some sense. 12:11So kind of red teaming, blue teaming, going 12:13back and forth, kind of seeing what the problems 12:16are, then going Figuring out how to address them 12:19and, uh, yeah, just, uh, just cycling through 12:21it and, yeah, I mean, again, nothing's ever 12:24going to be perfect, uh, it's a, it's a process. 12:26Yeah, for sure. 12:27So, uh, Kate, you'll have to indulge me, 12:29I mean, as the lawyer on the phone, you 12:31were like, the most exciting thing is 12:33Apache, and I was like, oh my god, yes. 12:36Let's talk about Apache 2.0. 12:38Um, why should our listeners be excited about 12:40that if they're not huge licensing nerds? 12:43So, I think the reason to 12:44be excited is Apache 2.0 is an  incredibly permissive license. 12:48It basically says that anyone can 12:50take and use our Granite models. 12:52And can customize them however you like, 12:55use any outputs of the models however 12:58you like, and IBM will make no claims 12:59to that IP, and you have full rights. 13:02So, that's really important for especially 13:05enterprises who are looking to customize. 13:08models, large language models with their own 13:10data, with their own IP, you want to make 13:12sure you have no further restrictions on what 13:14is essentially now your, your IP that you've 13:17encoded inside of a large language model. 13:19So we're really excited about being able to 13:21offer these models under those terms, and 13:24make sure that they're just as reducing the 13:26barriers for the broader community to use 13:28them and customize them as much as possible. 13:32And it's something of a bit of a dying breed. 13:34Unfortunately, if we look at models that 13:36are being released in the open, um, we 13:38are seeing models continue to be released 13:40in the open, but more and more they're 13:41being released with custom licenses. 13:43So we're trying to keep it simple. 13:45Apache two, please take our models. 13:47Please customize 'em and go 13:48use them out in the world. 13:49Yeah, I was definitely 13:50confronted with this recently. 13:51I was, you know, importing a model from 13:52Hugging Face recently and I was like, 13:54oh, the model was gated and then there 13:56was like a completely custom license. 13:58I was like, this is gonna take forever to see 13:59if this is something I really wanna work with. 14:01Can I ask why? 14:02Like, why is IBM taking the 14:04most open kind of perspective? 14:06It sounds like there's actually 14:06been a conscious strategy to say. 14:08We're going to be out of all the 14:10open providers, the most open 14:12well again, I think it really comes down to 14:15this enterprise use case where we believe the 14:18future of large length trials and generative AI 14:20and the enterprise is being able to customize 14:22models with enterprise and proprietary data. 14:27And so really, we're trying to create 14:28the tools both through the base models. 14:30like the granite model series, which can be then 14:33customized without any restrictions on its use. 14:36And tools like construct lab, uh, which is 14:38through our rel AI product offering at Red Hat. 14:41You can take those models and customize them and 14:43build on top of them without concerns or worry. 14:46And then wrapping that all right under, if 14:48you get our models through, for example, 14:49Watson XAI through indemnification 14:51and other protections support. 14:53So it's really trying to make 14:55sure that we create this kind of 14:57open market and ecosystem that. 14:59Our customers can build on with confidence. 15:01Yeah, and I think that's actually a theme 15:02I really wanted to build on, just because, 15:04I mean, a big part of this seems to be like 15:06unleash the developers to kind of do what 15:08they need to do around these models, and we're 15:10not gonna kind of put any controls over that. 15:12But one kind of unique thing, and I 15:14guess Petrus, maybe you're the natural 15:15person to bring into this discussion, 15:17is IBM as I understand it, is also open 15:19sourcing, the kind of data prep kit. 15:21around these models, which is 15:23kind of a unique thing, right? 15:24Like, I think there's been a lot of hype 15:25around open sourced models, but like, 15:28what seems to be here is also like a 15:29level of openness around all of the stuff 15:32that goes into constructing the model. 15:34Um, and, uh, I guess there's 15:36kind of two questions for you. 15:37One of them is why, like, why is IBM doing that? 15:40Um, uh, but let's maybe start there. 15:41And this is a kind of follow 15:42up I would love to ask you. 15:43Sure. 15:44Yeah, that goes along the general thing, 15:46theme of openness, we're not opening the 15:49models themselves and their weights under 15:51the, if I may say it in a single sentence, 15:54the most permissive license, right? 15:56That's what Apache 2.0 is. 15:58We're similarly doing the same thing with 16:00the software assets that we developed, we 16:02open source them again under the Apache  2.0 license, the most permissive one. 16:07Both to enable the community to build 16:09upon that, be able to reproduce it, be 16:12able to make use of the same in some 16:16sense kind of facilities that we developed 16:18and used for the training of granite. 16:21We believe that this benefits the 16:24overall community, the overall ecosystem. 16:27And, um, you know, as Kate said, it enables 16:29more and more developers essentially to 16:31follow the same kind of best practices 16:33that we, um, kind of, uh, learned 16:36through hard lessons, if I may say. 16:40I mean, like you, you guys have solved the emoji 16:42questions so that developers don't have to. 16:44We debated a lot around the emoji 16:45question. 16:46Yeah, I can tell this, I can only imagine 16:49the meetings and, uh, it's incredible. 16:51Exactly. 16:53Um, that's really great. 16:55I guess the follow up question around kind 16:57of this, um, sort of the data prep kit and 17:00open sourcing it as well is, you know, I'm, 17:03I'm curious if you think that this is also a 17:05way of kind of encouraging other providers to 17:07also start releasing their data openly, right? 17:09Because I think this is like such an 17:10important aspect of the ecosystem. 17:12And I've been also frustrated at times, right? 17:14Like a new model will come out 17:15and it behaves totally differently 17:16and breaks all of your tooling. 17:18And you're like, why is that? 17:19And I would love to be able to kind 17:20of delve a little bit further, and so 17:22the hope is like that what IBM's doing 17:23here becomes a more general practice. 17:25Um, and I guess, kind of, Petros, I'm curious 17:27if you think, like, if you're hearing from 17:28around the industry, like, I would love 17:30to see this become more of a norm, but I'm 17:32curious if you think it's going to become one. 17:33Right, yeah, no, that's a very good question. 17:36Yeah, there's kind of an interesting 17:38attachment that goes like, um, you 17:40know, every conversation in AI starts 17:42with models, but ends with data, right? 17:44Meaning that everyone recognizes 17:46that data is the, um, kind of oil or 17:49the fuel that powers the AI models. 17:51So, um, open sourcing this is, um, you 17:54know, the data prep kit is, is a kind 17:56of a good practice, bring on more people 17:59and developers into doing the same. 18:02There is a trend developing around essentially 18:06providing the data assets for preparing models. 18:10NVIDIA, for example, has its own NEMO curator. 18:13Other kind of big names as well are 18:15going towards that direction, along with 18:18the general kind of team of openness 18:20that IBM is kind of advocating for. 18:24Um, data is essentially the 18:26natural thing that follows. 18:27Yeah, for sure. 18:28Well, I want to throw it open 18:29before we move on to our next topic. 18:30I mean, obviously there were lots 18:32of things announced at TechXchange. 18:34Uh, we've been talking a lot about 18:35Granite because, well, you all just 18:36spent a lot of time working on Granite. 18:38Um, are there other things that you'd 18:39point people towards that they should, 18:41you know, check out while they're 18:42kind of looking at this stuff online? 18:43I'm curious if, you know, I know there's 18:44a code assistant announced, but also 18:46just like I was looking at the list 18:47and I was like, there's way more. 18:48Things that happened here, then we'll 18:50have time to cover, but I'm kind of 18:51curious if there's like, you know, 18:53specific things that you'd highlight that 18:54we should, uh, kind of shout out here. 18:56Uh, I might just give a couple of shout outs. 18:59One, please go and try out the 19:01models, especially on platforms 19:03like we're really excited. 19:04Olama, you can run these models locally. 19:06They're blazing fast. 19:07Uh, really excited that we're making 19:09these models broadly available across 19:10a number of different partners. 19:13Um, second, there was a big focus at 19:15Tech Exchange on agents and assistants. 19:18So really excited to see how the What's Next 19:21software portfolio is continuing to evolve 19:23and create different agent orchestrators 19:25and management of agentic systems. 19:27So I think you're going to continue to see a lot 19:29of really exciting work from IBM in that space. 19:33Yeah, for sure. 19:34Um, and maybe we'll end with 19:35you, Kush, is, um, you know. 19:37Uh, where is Granite going next? 19:39Uh, like I guess this is kind of a question 19:40when we're sitting here in 2025 talking about 19:43what just happened in tech exchange, I'm 19:44kind of curious about like what the team is 19:46piling towards and particularly in safety. 19:48I know what you work on, like if there's kind 19:50of like what's the next frontier on safety, I 19:52think would be great for people to hear about. 19:54Yeah. So, um, I mean, one thing just 19:56building on what Petro said. 19:57Um, so, I mean, having this data prep 19:59kit out there is not only a boon for, 20:02I mean, the value creation among the 20:03developers and the ecosystem, but it is 20:05also contributing to the safety aspects. 20:08Um, because, uh, When you can inspect 20:10these things, then you can know. 20:11I mean, this is why this is happening. 20:14And these are potential concerns as well. 20:16So I think just the movement towards openness 20:20is going to be a big aspect of the safety world. 20:24Um, uh, TechXchange. 20:26We announced some new features and 20:28what's next on governance as well. 20:30Um, so that's our platform played 20:33on the governance and safety side. 20:35But, um, where granite goes next, um, with 20:38the granite guardian, especially, um, as 20:40Kate said, with the agentic workflows. 20:43So our next, uh, Release of Granite 20:45Guardian will have a function 20:46calling hallucination detection. 20:48So that's something that's not out 20:50there, um, uh, from anyone else. 20:52And, uh, I think that'll, um, 20:54kind of, uh, bridge the gap. 20:56So when you are talking, uh, to a 20:58model in natural language, and then it 21:00translates that into an API call, um, We 21:03want to make sure that there's nothing 21:05wrong happening between in that step. 21:07So, uh, the parameters for the function 21:09names or the parameter values, all of 21:11those should, uh, come out cleanly. 21:13So, uh, that's, uh, I think one of the more 21:16exciting sort of things that we have lined up. 21:18Well, awesome. 21:19A lot more to potentially talk about, but this 21:20is a great overview and I'm glad we kind of got 21:22a little behind the scenes on how these launches 21:24happened because there's just, you know, you 21:26just see the model at the end of the day, but 21:27it's just like, it turns out like lots of humans 21:29spend a lot of time just getting that right. 21:31So appreciate you giving our 21:33listeners a little bit of a lead in. 21:39So our next story that we really want to 21:40focus on is some news that kind of came 21:43out this week about the company Perplexity. 21:46Um, So if you're not familiar, perplexity 21:48is essentially AI driven search. 21:50Um, It's a little bit different from, 21:52you know, what you kind of get from 21:53like a traditional Google experience 21:55where, you know, you have a search bar. 21:57Instead, it's kind of you ask queries 21:58and you can kind of make it interactive. 22:00You have a conversation, um, and 22:01it pulls results from the internet. 22:04Um, The big news was that the rumors 22:06were that it was about to raise, the 22:07company was about to raise 500 million, 22:10um, at an 8 billion valuation, which 22:13I believe is twice what it was before. 22:16Um, and, uh, and that's just, that's just wild. 22:18Obviously we're living through an era of like 22:20a lot of excitement around AI and AI company 22:23valuations are sort of through the roof. 22:24But even I saw this number and I was like, 22:26wow, this is, this is really intense. 22:29Um, and I guess, Kate, maybe 22:30I'll kick it to you is, um. 22:32Is this valuation justified? 22:34Like, what is chat the future of search? 22:36Like, is this the new Google that we're 22:37looking at, or kind of curious about 22:39how you size up news that a company 22:41like this would be valued at this value? 22:44Yeah, I mean, I think there's a lot going 22:45on, as you say, certainly a lot of hype. 22:48Um, but yes, I think chat 22:51is the future of search. 22:52I think it's just a much more natural way to try 22:54and find information to inquire about something. 22:57But I also really wonder, you know, 22:59how differentiated or competitive, 23:02uh, perplexity can stay, right? 23:05What is, how are they gonna, what's their 23:06moat, you know, to prevent others from 23:08basically doing the same thing with a, um, 23:11with a different end, uh, API end call? 23:13So, you know, I, I do worry that we're 23:16seeing some inflation here, uh, of 23:19expectations and, uh, in this valuation. 23:22Yeah, for sure. 23:22I mean, not for nothing, I mean, out 23:24of all of the subscriptions that I'm 23:25spending on monthly, perplexity is one 23:28of the few that I actually use regularly. 23:29But, okay, I think you're touching 23:31on a super important question, 23:32which is, what, what is the moat? 23:33Is there a moat here? 23:34I mean, Petross, I'm curious if you have 23:36any views on like, it seems like maybe other 23:38search companies that I could name might be 23:41really good at this, doing this at some point. 23:43But I mean, clearly someone sees 23:44something in this, like, they 23:45feel like it's a good enough bet. 23:47Um, and so I guess, I don't know if you want to 23:48give us like the, the kind of bull case, right? 23:50Like, is there a moat here? 23:52Right. 23:53Um, I have to admit, I agree with Kate. 23:55I'm also struggling to figure out 23:57what the mode is in this case. 23:59Search, unsurprisingly, has been one 24:01of the areas that many companies, both, 24:04um, startups as well as, like, um, big, 24:07um, megacorporations, actually have. 24:09try to tackle and attack essentially 24:12the incumbents, right, over the years. 24:15It never panned out. 24:19Well, why? 24:19Because, you know, the existing 24:21ones were good enough, right? 24:23They were pretty good. 24:24Now it's a brand new interface, 24:25essentially, a brand new way of 24:27interacting with the search engine. 24:28More importantly, Getting something 24:30that I feel everyone appreciates a 24:32very good summary of things instead of 24:34having like to go and read yourself. 24:36Everyone really appreciates someone 24:38to summarize in a nice executive kind 24:40of bullet point type of interface. 24:43So that being said, kind of evaluation 24:46on this kind of type of expectations 24:50probably is kind of makes sense. 24:52The mode also struggling to figure out 24:55what it is, especially as a few big 24:57names out there that only we're offered. 25:00Essentially on it, right? 25:02Yeah, for sure. 25:04Kush, there's one question I really 25:05wanted to bring up around safety. 25:06So, I agree. 25:07I mean, I think like, interacting with 25:09perplexity, I'm like, oh yeah, like, chat really 25:11is like, giving me a lot of action on search. 25:14Um, unfortunately, I think one of the 25:16things that it's caused me to do the most 25:17of recently is like, buy too many books. 25:19Because I'm always like, oh, could 25:20you give me some recommendations? 25:21It's like, oh, of course. 25:22And here's like ten fascinating books 25:23about the thing that you're interested in. 25:26Um, Can I, maybe I'll play skeptic for a moment 25:28is, you know, I do think that one of the really 25:30funny things about LLMs is that everybody 25:33has rushed to use it as a search interface. 25:36But like out of the box, LLMs are not 25:38concerned with information retrieval. 25:40They're not really concerned about facts. 25:43They're not really concerned about, 25:44you know, validation or verification. 25:46There's no notion of like a page rank that 25:48would even give kind of like, you know, a 25:50sense of credibility between different sources. 25:52So. 25:53You know, there's almost, if I can play skeptic 25:55for a moment, I'd love to hear kind of the 25:56counter argument is no, you know, LLMs are 25:59not the future of search because like LLMs do 26:02something that's just so fundamentally different 26:04from what you want out of search that like, 26:06it's weird that we're in this weird situation. 26:08We've got this technology, we're trying to 26:09like bolt search like features onto this tech. 26:12Isn't that a little bit like getting, 26:13you know, the car before the horse? 26:16Yeah, I mean, I think you're 26:18absolutely right in many ways, right? 26:20So, The fact is, really, it's the 26:22RAG that's doing the search, right? 26:24And then the language model is, um, 26:27like on top of it, I mean, creating 26:29the bullet points or whatever have you. 26:30So, um, And, like, I'll 26:33disagree with Kate a little bit. 26:34I'm not convinced the chat is the best method 26:37for, um, for doing search, necessarily, 26:39or the best interface, because, um, like, 26:43when I go, um, and I'm, like, searching for 26:46stuff, um, like, whatever, like, a research 26:48assistant goes to the library and, like, 26:50is actually, like, trying to find stuff. 26:52Um, the chat isn't, like, 26:55all of the way there, right? 26:57I mean, I think there's like, oh, you go down 26:58one rabbit hole, or you come back, you look for 27:01this, you go over here, go do this and that. 27:03So it's not a linear process, which 27:05is what chat kind of insists on. 27:07And so, um, uh, intent based, um, 27:10interaction is certainly part of it. 27:12So chat is, I think, the simplest version 27:14of, uh, kind of intent, uh, communication. 27:17But, uh, I think there's more ways of 27:19doing this that are Going to actually 27:21emerge and that are more helpful. 27:23So that's where the LLM strength will 27:25be like, um, kind of how to organize the 27:28work, um, organize these, um, sort of 27:30different threads and putting them together. 27:32The search itself, I think, is, um, uh, is 27:36the retrieval part, and that is actually 27:38already not part of how the LLM is doing it. 27:41So, uh, I think that's where we might end up. 27:45Yeah, that's super helpful. 27:46I mean as almost like it's 27:47almost two innovations really 27:48that we're talking about, right? 27:49Like one is the actual retrieval, 27:51and then almost like the LLM is just 27:53like the spice on top that makes it, 27:55you know, um, more, more digestible. 27:57I don't know, Kate, if you want 27:58to kind of respond to that at all. 28:00No, I mean, I don't disagree. 28:01I think Chat is a huge improvement 28:04on search compared to just, you know, 28:06shouting into the void of a search 28:08box, but is it the final frontier? 28:11You know, I, I definitely think because 28:14she bring up some really good points. 28:16Um, you know, this isn't a quite a 28:18linear flow on, you know, a lot of ways. 28:21I think we've worked on, um Uh, making a 28:24faster horse when we need a car, right, uh, 28:27is the kind of old saying we've made chat, 28:30uh, a really fast horse, but what does a car 28:32invention look like in the, uh, search world? 28:35So, you know, certainly 28:36there's opportunities ahead. 28:38Yeah. It'd be cool if it's like an 28:39entirely different paradigm. 28:40Like in some ways, uh, I do think about 28:42the kind of anchoring effect of stuff like 28:43stuff like chat GPT is like the only reason 28:46we've taken this interface is because 28:48there was this accidental thing where this 28:50particular product became so successful 28:52that we kind of see everything about LLMs. 28:54through this lens, but it's like, 28:56it's just almost like a historical 28:57accident, um, in some ways. 28:59It's really interesting to think about kind of 29:00the different users like Kush as a researcher, 29:03you know, you probably have a, a very well honed 29:07art to how you investigate, uh, with Upmost 29:11rigor on different topics where, you know, if 29:14we talk about somebody who's just trying to, 29:16you know, find, you know, what grocery stores 29:18closest or, you know, more casual investigation, 29:22you know, that's probably a different mode. 29:23So I think there's tremendous diversity 29:25as well in potential interfaces that 29:27we're going to see for search, and there's 29:29probably not going to be one size fits all. 29:31Yeah, I would definitely agree with that. 29:33I mean, when my kids are looking for 29:34information, like, uh, uh, like how many 29:37goals did, uh, whatever, like Alex Morgan's. 29:40score throughout her career. 29:41I mean, they don't need to do it in the same 29:44rigorous way as I need to do my research. 29:46So yeah, absolutely. 29:48Yeah. And I think I, my, my long term theory 29:51is that a little bit like how there's 29:52like Google ease where people have just 29:54like, not, they don't really speak in. 29:57In English is kind of like a string of 29:59words that they found to optimize the search 30:00result, like we will also end up having 30:02like a very even for these chat interfaces, 30:05like our own perplexity ease, which will 30:07like not be quite a conversation, but we'll 30:09just be like how we've kind of learned 30:10to get the best results out of the system 30:12and ironically, as we do that, the 30:14model providers are going to be 30:15figuring out how to take that it. 30:18Translate it into what they think is 30:20optimal for the model and then feed that. 30:22So there's just going to be layers 30:23and layers of trying to find the 30:26right way to frame a question. 30:28And that is what agentic workflows are. 30:30I mean, I think multiple layers of 30:32agents translating from one thing 30:34to another thing to another thing. 30:36So, I mean, that's where we're headed. 30:43Alright, for our final story, it's another 30:45open source model story, but I think 30:47it's a another interesting one to kind of 30:49compare and contrast and kind of like talk 30:51about the overall trend in open source. 30:53Nvidia, maker of fine GPUs, um, has 30:57come out recently with a fine tune 31:00of llama that they call Nemotron, 31:02specifically Nemotron 70B Instruct. 31:05Um, and, uh, it was kind of widely touted by 31:08Nvidia, they showed that they were able to 31:09beat a bunch of state of the art benchmarks 31:11across all the other proprietary models. 31:14Um, I think that's all well and interesting. 31:16But I think one thing I wanted to bring to 31:17this panel was just to ask the question of why. 31:21I know Nvidia from its GPUs, its 31:23hardware, um, like why are they getting 31:25into the model training business? 31:27And why would they be open 31:28sourcing models at all? 31:30Um, Petros, I'm kind of curious, 31:31you know, I'll just throw it to you. 31:32Yeah, that's a very interesting, um, question. 31:34Um, everyone knows Nvidia about its GPUs. 31:37Um, I'm not sure, um, how many people don't 31:39know that Nvidia is actually, Nvidia's 31:41mode is actually, in my view, software. 31:44It's actually the CUDA. 31:45interfaces and drivers that essentially have 31:50managed to attract developers over the course 31:52of the last 10 years onto the NVIDIA hardware. 31:55That's why everyone actually ended up using 31:57NVIDIA and still using GPUs from NVIDIA. 32:01So in that sense they do have a very kind 32:02of strong mode in the form of software. 32:05So it's only kind of natural to expect them to 32:08Expand on that both in terms of the software 32:12ecosystem that they build around their hardware 32:15as well as of course showcasing this through 32:17models that are able to train themselves. 32:20And the last kind of thought on this 32:23from my side is the fact that NVIDIA 32:25is also developing its NVIDIA Cloud. 32:28Which is also another kind of aspect 32:31that contributes to the ecosystem of 32:33AI models that NVIDIA essentially is 32:35driving, for sure, from the hardware side. 32:38So I guess, I mean, Cade, this sounds like, 32:39I guess, from Petros's interpretation, it's 32:41almost like just a, it's a show of strength. 32:44Like, NVIDIA is just saying, 32:45we can do models like this. 32:47Um, but I guess part of this is 32:49like they're trying to attract 32:50people to their, their cloud, right? 32:52Like I guess part of this is marketing 32:53the, the cloud offering, which is true. 32:55When I think of cloud, I think of, 32:57you know, Google, I think of Amazon. 32:59I don't really think of Nvidia. 33:00Is that kind of how you read it as well? 33:01That it's sort of kind of like trying to 33:03promote that aspect of their business. 33:05Yeah, you know, I think it's really a powerful 33:09demonstration, right, of being able to say we 33:11can take a model and we can customize it, we 33:14can continue to train it, and we can continue 33:16to boost performance beyond, in NVIDIA's 33:19terms, what was originally released in the 33:22CHAP version of Uh, instructor version rather 33:24of the 70 billion llama model, but so in doing 33:28so, I think they're trying to demonstrate right 33:30that they have these capabilities and invite 33:31customers to come and join in and be able to 33:34customize their own models, continue to train 33:36their own models all on NVIDIA's platform. 33:38So it makes a lot of sense just as a 33:40pure almost marketing point, right? 33:42Being able to showcase their capabilities. 33:44Yeah, definitely. 33:45Kush, does do you, do you think this makes, um. 33:47The other company is a little bit nervous. 33:49I'm kind of thinking about like, you know, like, 33:51NVIDIA has always been like in the background, 33:53the chip people, they do infrastructure. 33:55And then now this is almost like 33:57what you're on our turf, right? 33:58Like, what are you doing, releasing 33:59something that competes with O1 34:00or, you know, Opus or whatever? 34:03Um, should, should the companies be nervous? 34:05Like, is this kind of NVIDIA kind of 34:06playing in a new playground in some ways? 34:08Yeah, I think so. 34:09And, uh, I mean, I think in the future what'll 34:12happen is, um, just like in the traditional 34:15machine learning, we, I mean, we have a problem. 34:19We would look for a data set or collect a data 34:21set and then go and build a model for that. 34:23I think now in a couple of years, it's 34:26going to be where models are the same. 34:27You have a problem. 34:28You go look for a model that's appropriate. 34:30You go look for maybe some fine 34:32tuning data that's appropriate. 34:33And then you work with those. 34:35You're not going to treat models as 34:37anything other than artifacts that are Okay. 34:39Part of the world of, uh, 34:41possibilities for solving your problem. 34:43And so I think the, um, way that NVIDIA can 34:47kind of position themselves is that, um, 34:50now, um, all these models are out there. 34:52Um, you customer, you, uh, whatever 34:55company, you don't have to like innovate 34:56on, on, on those pre trained models. 34:58But what you do need to do is the customization. 35:01And I mean, that's the, 35:02what's next story as well. 35:03But, um, uh, the customization, uh, come to us. 35:07I mean, we're going to be the ones who help you. 35:09And, uh, and I think just having that 35:12mindset available that you don't have 35:14to worry about, uh, all these, uh, these 35:16different models just on the customization. 35:19I think that's the, uh, the part 35:21that'll, uh, kind of be their strength. 35:23Another thing I'll kind of just say is, uh, 35:25I mean, there's, there's this overused trope. 35:27I mean, that in the gold rush, the people that 35:29made money were the ones who, um, I provided 35:32the shovels or whatever, or the blue jeans, 35:35um, so I think with the blue jeans, I think 35:39the thing is, I mean, they somehow crossed 35:42over from just being something for minors 35:44to being like a high fashion sort of item 35:46people would customize, um, and so forth. 35:48So I think it's, I mean, somehow 35:50moving from that commodity to the, to 35:52the fashion as well in some capacity. 35:54Yeah, that's right. 35:55Yeah. It's the, it's the high prestige, 35:57you know, salvage denim jeans. 35:59Yeah, that's right. 36:01Um, Yeah, I think that final comment is so 36:03interesting, too, because it kind of suggests 36:05the ways in which, I guess, some of these 36:07visions are aligned, particularly between, 36:08like, say, IBM and NVIDIA, where sort of 36:11NVIDIA is like, well, so long as there's 36:13more demand for models, we're excited, 36:14because it all takes place on chips, right? 36:16Uh, and then IBM in some ways is like, we 36:18want to release all these models to kind of 36:19unleash all these developers, but we also 36:21believe there's going to be, like, a lot of 36:22enterprise services we'll do on top of it. 36:24Um, and I actually don't 36:25even understand, is it right? 36:26I think Granite's going to be 36:27available on NVIDIA as well. 36:29It is. 36:30They were a launch partner. 36:32You can check out the Granite  3.0 models on NVIDIA today. 36:35And even the Granite Guardian within 12 36:37hours, they had it up as a full working demo. 36:39So you can try the Granite Guardian there too. 36:42Yeah, it's so fast. 36:44Well, great. 36:44Well, that's all the time we have for today. 36:47Um, Kate Cush, always great to see you. 36:49Um, thanks for taking the time to talk about 36:50Granite and Petros hopes, Petros, so we 36:52have you on the show again in the future. 36:54Thank you very much for having me. 36:56Well listeners, if you enjoyed what you 36:58heard, you can get us on Apple Podcasts, 37:00Spotify, and podcast platforms everywhere. 37:02And we'll see you next week for another 37:04action packed week of Mixture of Experts.