NVIDIA DIGITS: Desktop Supercomputing Unveiled

35m • ai-ml • interview • intermediate

Key Points

The panel’s biggest excitement from CES is NVIDIA’s new “DIGITS” system, a compact, high‑memory GPU workstation that brings petaflop‑level AI compute to a desktop size.
DIGITS pairs its GPU with about 120 GB of memory (the figure quoted on the panel), enough to run very large models (e.g., 200-billion-parameter networks) locally, potentially shifting AI workloads from cloud data centers to individual desks.
Priced at about $3,000 and slated for release in May, the device aims to democratize access to supercomputing power; it ships with a Linux-based operating system and can act as a compute server for Windows and macOS machines.
NVIDIA’s move into personal‑supercomputer territory positions it against traditional PC makers like Apple, signaling a strategic push into the consumer AI‑hardware market.
The episode also teases upcoming topics on a new developer‑AI tools report, recent issues with Apple Intelligence, and Sam Altman’s reflections on ChatGPT’s second anniversary.

**Source:** [https://www.youtube.com/watch?v=KwGg5WhxuFY](https://www.youtube.com/watch?v=KwGg5WhxuFY)
**Duration:** 00:35:25

Sections

- [00:00:00](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=0s) **Excitement Over NVIDIA DIGITS at CES** - Panelists emphasize NVIDIA’s compact DGX “DIGITS” system as the premier AI breakthrough unveiled at CES.
- [00:03:02](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=182s) **NVIDIA Targets Integrated Workstation Market** - NVIDIA is transitioning from selling isolated GPUs to offering ready-to-use, optimized desktop systems that compete with Apple's Studio line, aiming to capture the entire developer and workstation value chain.
- [00:06:05](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=365s) **Edge Computing Benefits for Enterprise** - The speaker explains how processing data locally (on factory floors, in vehicles, and in field deployments) reduces latency, improves security, and creates valuable enterprise applications.
- [00:09:12](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=552s) **NVIDIA's Bold Moves Democratize AI** - The speaker highlights NVIDIA’s strategic acquisitions, aggressive price cuts, and open-source initiatives as evidence that the company is positioning itself as the indispensable, affordable backbone of the emerging physical and agentic AI ecosystem, suggesting it is currently undervalued.
- [00:12:17](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=737s) **Trust vs Functionality in AI Code Generation** - A discussion explores whether developers will accept AI code-generation tools without scrutinizing trust and governance, referencing IBM’s research history and the early adoption of technologies like GPS that users eventually embraced despite initial skepticism.
- [00:15:25](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=925s) **Overconfidence Risks in AI Code Generation** - The speaker warns that blindly trusting model-generated code, especially when the model also executes it, can introduce bugs, emphasizing the need for human skepticism and review.
- [00:18:31](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=1111s) **Challenges of AI-Generated Code Review** - The speaker highlights the difficulty of ensuring quality when most code is produced by AI copilots, emphasizing the need for deep human understanding, effective questioning, test-case design, and automated peer-review agents to handle the problematic last 30% of code.
- [00:21:34](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=1294s) **Evaluating Apple's On-Device AI** - The speakers discuss whether Apple's smaller, on-device models meet expectations, noting trade-offs in performance, security, and battery life.
- [00:24:38](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=1478s) **Apple’s Rushed AI Debut** - The speaker critiques Apple’s hurried launch of a limited-functionality AI feature, blaming insufficient back-testing, resource-constrained edge integration, and a need to catch up in the competitive AI market.
- [00:27:46](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=1666s) **Apple’s Open-Source Screen Understanding** - The speaker praises Apple’s low-power, open-source tools for interpreting screen elements, discusses the incremental improvements expected in small AI models for diverse news domains, and highlights the challenges of achieving nuanced comprehension on mobile devices.
- [00:30:50](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=1850s) **Debating AGI Definitions Post-Altman Blog** - The panel reviews Sam Altman's reflective post on ChatGPT, noting his continued AGI focus and stressing the community’s need for clearer, shared definitions of artificial general intelligence.
- [00:34:23](https://www.youtube.com/watch?v=KwGg5WhxuFY&t=2063s) **Rapid AGI Progress and Industry Leadership** - The speakers laud Sam's company for swiftly advancing toward AGI, expressing amazement at the pace of development and its position as the industry trailblazer.

Full Transcript

0:00What are you most excited about coming out of CES? 0:01Shobhit Varshney is Senior Partner Consulting on AI for 0:04US, Canada, and Latin America. 0:06Shobhit, welcome back to the show. 0:07What do you think? 0:08NVIDIA's DIGITS. 0:10The supercomputer right next to my laptop. 0:12Mwah. 0:13Love it. 0:14Great. 0:14Uh, Skyler Speakman is a Senior Research Scientist. 0:17Skyler, welcome back. 0:18Uh, what are you most interested in coming out of CES? 0:20As a longtime PC gamer, absolutely the new line of graphics cards coming out. 0:25And finally, last but not least, Volkmar Uhlig is Vice President, 0:28AI Infrastructure Portfolio Lead. 0:30Volkmar, uh, what do you take out of CES this year? 0:33I'm with Shobhit, it's the 0:34DIGITS. 0:35All right, all that and more on today's Mixture of Experts. 0:42I'm Tim Hwang, and welcome to Mixture of Experts. 0:45Each week, MoE is dedicated to bringing you the debate, news, and analysis 0:48you need to keep up with the top headlines in artificial intelligence. 0:52Today, we're going to be talking about a new report on developer use 0:55of AI tools, some trouble with Apple Intelligence, and Sam Altman's reflections 0:59on the second anniversary of ChatGPT. 1:01But first, let's get started. 1:02Let's talk a little about CES and Shobhit, maybe I'll kick it to you first. 1:07We're all very excited by DIGITS. 1:08For those of us who are not obsessively watching all the headlines coming 1:11out of CES, what is DIGITS and why are you so excited about it? 1:15So the intention here is, uh, shrinking their DGX, uh, all 1:19the way down to a small machine. 1:23So NVIDIA has figured out a way to squeeze in a lot more, uh, firepower. 1:28Their graphics, uh, GPU card with an insane memory, 120 GB attached to it. 1:33So you can start to run some really large AI workloads on your desktop. 1:38Imagine a 200-billion-parameter model, which is way bigger than what ChatGPT was 1:44when it came out two years back, right?
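As a rough check on the memory claim above, the weights-only footprint of a model can be estimated as parameter count times bits per parameter. The sketch below is a back-of-the-envelope calculation, not a statement of how DIGITS actually stores models: it assumes decimal gigabytes, ignores activations and KV cache, and takes the 120 GB figure from the panelist rather than from a spec sheet. The function and constant names are made up for illustration.

```python
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Weights-only memory footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


MEMORY_GB = 120      # assumption: the memory figure quoted on the panel
NUM_PARAMS = 200e9   # the 200-billion-parameter model from the discussion

# Compare common weight precisions: 16-bit, 8-bit, and 4-bit.
footprints = {bits: weight_footprint_gb(NUM_PARAMS, bits) for bits in (16, 8, 4)}

for bits, gb in footprints.items():
    verdict = "fits" if gb <= MEMORY_GB else "does not fit"
    print(f"{bits:>2}-bit weights: {gb:6.1f} GB -> {verdict} in {MEMORY_GB} GB")
```

Under these assumptions only the 4-bit case fits, which suggests that running a model of that size locally depends on aggressive quantization.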
1:45You're able to run that locally, right next to your machine. 1:48Now, it comes with a flavor of Unix on it, but you can obviously, instead of 1:52Linux, uh, you can have your Mac and Windows use that as a server, and you 1:56can do some really cool things, right? 1:58But now you're talking about having personal supercomputers that you 2:02can literally keep on your desk or potentially even carry with you. 2:06It won't be out till May. 2:07It's about $3,000, which just looking at the hardware that's going in that 2:12itself is a ridiculously great price point to go deliver that, but this 2:16starts to move computing from the cloud supercomputing all the way down to your 2:21desk, so petaflops of compute at your desktop, and that just is an insane value. 2:27Yeah, absolutely. 2:28I know Volkmar, you were saying that you were excited about this as well. 2:32I know, I think we've talked about it in the past, but if you want to 2:34give our listeners a little bit of an intuition of why is NVIDIA moving 2:37into this market at all, right? 2:39Like, arguably, doesn't this put them in competition with like, Apple and 2:43all these other kind of, you know, kind of desktop personal computer creators. 2:47Whereas NVIDIA's usual thing has of course been data centers. 2:49Do you have a sense of why they're moving into this market? 2:52Yeah, I would not say that NVIDIA traditionally is a data center company. 2:56They are a gaming company. 2:58So, and the data center kind of came along and hit them three, four years ago. 3:02Right in the face, yeah. 3:02Um, yeah. 3:03And you know, good for them, they captured it and was just visible 3:06in their market capitalization. 3:08Um, I think what NVIDIA is figuring out right now is that, um, the 3:14development market, uh, or developer market was kind of limited to, you 3:19know, buy an RTX and stick it into an, you know, developer machine. 
3:24Uh, and now they are effectively going all in and saying we need to 3:27cover this whole value chain creation. 3:30And I think it's very, very hard, um, like today, because you, in fact, you 3:35need to buy, um, like a Windows or Linux box and then, you know, you, you stick 3:40in a bunch of NVIDIA cards and, you know, you rack this thing up and now they are 3:44effectively coming out and saying, okay, here's a ready-to-go system, which is 3:47optimized for that specific workload. 3:49Um, I think, 3:51when you see what Apple did with the M-, you know, M1 to M4 now, 3:56they are effectively trying to capture that desktop market. 3:59And that is not CUDA, and that is not NVIDIA. 4:02And I think NVIDIA is doing a preventive strike here. 4:05And if you look from a pricing perspective, they're sitting right between 4:09the smaller Apple, you know, Apple Studio for $2,000 and the bigger Apple Studio 4:14for $4,000, and so they are at $3,000 and they have specs which are bigger. 4:19And so I think it's, um, and, and now it's also, it's an attachment, 4:22but at the same time, you can use it as your primary desktop. 4:26So I think they are effectively trying to cover their bases. 4:29What will be interesting to see is, you know, what people are 4:32now doing, if you can just, for $3,000, you can get that box. 4:35It's not a DGX, but you know, in many cases, it may be sufficient for 4:39running small-scale training jobs. 4:41Um, and so I can, you know, imagine that people are just buying them by 4:46the truckload and putting them up in data centers and giving their 4:49developers, not necessarily something on the desk, but, you know, maybe it's 4:52tethered, but it's on premises. 4:54Um, and so it's a really good way of actually getting 4:57that development loop going. 4:59And you could even use it for production use cases, right? 5:02So if you don't need a 19-inch rack server, you could use something smaller.
5:07They, I think at three different points in the press release from NVIDIA, they 5:11talk about how easy it is to take the models that you've trained on your small 5:14DIGITS and move it to NVIDIA's cloud. 5:18So I also think they're really pushing for this hook here in order to drive 5:22more business to their data centers. 5:24And one of these is start small on your own personalized local system and make it 5:30extremely easy for you to then scale that up onto, of course, their data centers. 5:35So I think that also plays a lot into the strategy of why 5:37they're really pushing this. 5:38Yeah. 5:39Shobhit, maybe to turn back to you, What do you do with a petaflop? 5:43You know, it's like, it's kind of funny, like, because it is very 5:46exciting, you know, a supercomputer literally on your desktop, but 5:48like, with that level of computing power, what, what do we use it for? 5:52I mean, is it, is it just gaming? 5:54Do you anticipate people doing a lot more homebrew AI stuff? 5:58Um, what, what, what does this unlock, right? 5:59If it just really becomes super successful? 6:01So I think that the two different markets here, one is enterprise, one is consumer. 6:04Right? 6:05I think, uh, from there will be some enthusiasts that'll, uh, 6:08on the consumer side that'll obviously gravitate towards it. 6:11But I think there's a huge potential on the enterprise side. 6:13Uh, what that gives you is being able to run compute that's closer 6:17to where the action is happening. 6:18So think about industrial applications where on the factory floor, you want 6:22compute to be right next to where the manufacturing of everything is happening. 6:26Or, one of my clients, uh, large auto industry, they have a lot of 6:30trucks and buses and things of that nature, and you would want to have 6:33some mobile compute that you can actually run a model on, right? 
6:37In a lot of these use cases, there's a lot of latency between calling 6:42a server or a cloud API and, again, getting responses back, right? 6:46Those are expensive. 6:47So imagine you're taking, say, a picture on the manufacturing conveyor belt, right? 6:52You want to be able to process those near to where the images are being captured. 6:56There's less latency, and there's a huge security concern here, right? You want to 6:59make sure that the data, especially if it is related to something that's very 7:03sensitive, you don't want that leaving your premises either. So you want to be 7:06able to run those closer to it. Same thing goes for, say, defense applications, where 7:11you are doing something more tactical in the field. You want to be able to 7:14compute all the images coming in from all the drones and stuff at the, at the 7:19particular place, because you may be in a territory where you really don't 7:22even have a cellular connection, right? 7:24So all of those are heavy computing workloads that used to traditionally take 7:29cloud environments to go scale up and run, 7:31that you're now being able to do closer to where the action is happening. 7:34That's a huge, huge unlock of value for enterprises. 7:37Today, we've been constrained by some cutesy little small, uh, models that'll 7:41be running on mobile devices and things of that nature, but we're not quite there yet 7:45where you can run a 200-billion-parameter model right next to where the action is. 7:49Yeah, that's really exciting. 7:50Well, a lot more to pay attention to. 7:52Um, I'm definitely going to get one as it sounds like many 7:54of the folks on this call are. 7:55So we'll definitely have to compare notes once they start arriving, 7:58uh, on our respective desktops. 8:00Tim, apart from the DIGITS there were some insanely good things that, uh, NVIDIA, 8:05uh, released, uh, during the keynote.
8:07There were, like, three different areas that, uh, Jensen wanted to 8:10ensure that people realize that this is what NVIDIA really does, right? 8:13So one was in physical AI, figuring out a way in which we can model 8:18the physical universe around us. 8:20A good set of starter AI open source that can understand the physics and we 8:25can start to train things around it. 8:28That leads to things like robotics and humanoids 8:32around us in our environments. 8:33Right? 8:34The second big area of unlock was AI 8:36automotive. 8:37Figuring out how do we do autonomous driving, and you need the whole pipeline 8:40of millions of sensor data coming in. 8:43How do you process that and make decisions on the, on the vehicle itself, right? 8:48And then the third one was around digital workers, agents doing regular 8:51day-to-day work as you and I do inside of all the software that we work with. 8:56Jensen spent 90 minutes on this on stage wowing the audience. 9:01That's no easy feat, right? 9:02If you analyze the entire 90-minute conversation you start to realize 9:05what an incredible communicator he is, breaking down 9:09complex concepts into such clarity. 9:12So in each of those different sections he proved that NVIDIA is in fact a leader. 9:18They are making some bold moves to ensure that the ecosystem comes along with them. 9:23They just bought Run:ai for maybe $700 million, and they turned 9:28around and open sourced it. 9:30It's such a baller move, $700 million and then you open source it, right? 9:34So they're trying to ensure that the entire industry moves closer 9:38to this physical AI and agentic AI and autonomous driving era. 9:42And they want to be the backbone across each one of them. 9:45Uh, last year they had, uh, in the, in the gaming industry and, and Skyler's 9:50going to, uh, chuckle on this, right? 9:51The 40 series of their, uh, of 9:55their chips used to be $1,600.
9:58They just released equivalent compute for $550, right? 10:02So just imagine, Apple will never do this. 10:04They'll never take a $1,600 thing and make the next iteration 10:08a third of the price, right? 10:09So you're getting to this point where NVIDIA wants to make sure that the 10:13compute is as easily accessible and democratized as plugging into electricity. 10:19But they want to be the electric superpower of the entire world. 10:23And if you look at those three different areas, my hot take, 10:26NVIDIA is undervalued right now. 10:30IBM 10:33is out with a new developer report, um, taking a look at, uh, developers' views 10:38on the use of AI tools in their workflow. 10:41Um, a couple of very interesting data points, but I think the place I wanted 10:45to start is, uh, really, I think on this really interesting result where, you 10:49know, the developers were asked, okay, so what do you want most out of an AI tool? 10:54You know, the comment was, well, we want things like trustworthiness in 10:56the AI, and it should be reliable and all the things that you would want. 10:59And then they were asked, well, what are the current problems 11:01with the existing AI tool set? 11:03And it was exactly those same things. 11:05And so I do want to really ask this kind of question of the group, which is, does 11:08it feel like, despite all the hype around code assistants and agents in 11:14development and all this kind of stuff that we've been talking a lot about, 11:17um, ultimately there still is this big trust 11:20gap and it is actually preventing adoption of a lot of these tools? 11:24And I guess maybe Skyler, I'll turn it to you first is, you know, 11:27do you see that as a big problem? 11:28Like, do you think that it's ultimately going to kind of put a 11:30ceiling on the use of these tools? 11:33Um, and, and what, what should we make of this? 11:35Like I'm, I'm kind of, it was sort of an interesting result for me.
11:38I'm not sure a ceiling is the right term, but certainly delay. 11:42Um, I spent, uh, a good time, uh, last year, end of last year, um, in 11:47San Francisco at this International Network of AI Safety Institutes. 11:50So this big congregation. 11:53And the topics are, of course, around safety, robustness, trustworthiness, 11:57and those are the topics of the day in this and, uh, here when I talk 12:02to would-be clients, they aren't concerned about overall accuracy. 12:06That's not their concern. 12:07It's how are these machines reaching their conclusions and can we trust them? 12:12That's the back and forth we have now, not accuracy or even costs. 12:17So it's a concern at a global level and even at just kind of an 12:21individual client engagement level. 12:23So yes, it's been part of an IBM research strategy for many years now. 12:28What can we do with trust and governance in this space? 12:31Lots and lots of work to be done there. 12:33Yeah, that's right. 12:34And I think there's kind of one point of view and Volkmar, I don't know if 12:36you agree, you know, working with a lot of folks who are kind of in the nitty 12:40gritty of the technical aspects of this is, you know, I think the AI person's 12:44response also is, well, what do you care about, like trust or reliability, if it 12:48just works, then it just works, right? 12:50Like, you kind of think about like the early days of like Google, 12:53where it's like, oh, the Google Image Search, there's this GPS thing 12:55that's going to tell you where to go. 12:57Yeah, sure, I don't trust that. 12:58And then over time it just turns out like the fastest way to get from point 13:01A to point B is just to put it into GPS and kind of people get over, uh, like, 13:05their fear about not really knowing how these systems make decisions. 13:08Do you think that'll kind of be the case here with kind of these, 13:11all these developer tools that say we're going to do code gen?
13:14Uh, and you're like, I don't really need to understand cause it 13:16just works and I'm moving faster than developers that are not. 13:19I don't think so. 13:21So the way right now the development works usually is, and I hope this is 13:27how it works for most companies, is you use the code generation kind of as like, 13:33okay, I know what algorithm I want. 13:35And I can, I can proofread it. 13:38So I can proofread code about 10 times faster than I can write code. 13:41Right. 13:42And so if I go and I need to build something, I'm just going to an agent 13:47and then the agent produces the code. 13:49I'm still checking that the code works and there is still, you know, an architecture 13:53behind it where you are saying, you know, you're kind of interacting with the 13:57system and you're, you almost have a, um, you know, uh, an engineer at your hand 14:01who is very fast and doesn't get tired. 14:04So, uh, and, and you still need to do all the engineering practices we have. 14:09You still need to write unit tests, you know, you still need 14:11to write integration tests. 14:12And so there is a, there is a rigor to it. 14:15Now, if you have bad engineering practices and you don't write unit and integration 14:19tests, then you may actually put, you know, litter your code base with bugs. 14:23But that's more of an organizational, structural problem, right? 14:26So do you allow code which is untested in your code base? 14:30And, you know, a developer can make mistakes and the model can make mistakes. 14:34And we are primarily now asking, you know, who has the higher 14:38likelihood to get it right? 14:40Um, in the end, confidence in your code base will always come from, you 14:44know, test coverage and reviewing that the tests are written well. 
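The point that confidence comes from tests rather than from who wrote the code can be illustrated with a minimal sketch. Here `median` stands in for a function that a code assistant might have produced; the name and the test cases are hypothetical and not taken from the episode. The tests are deliberately plain input/output assertions so that a reviewer can verify them at a glance.

```python
def median(values):
    """Candidate implementation, e.g. produced by a code assistant."""
    if not values:
        raise ValueError("median() of empty sequence")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:  # odd number of items: take the middle one
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even: average the middle two


def test_median():
    # Each check is a single, obviously correct input/output pair.
    assert median([3, 1, 2]) == 2
    assert median([4, 1, 3, 2]) == 2.5
    assert median([7]) == 7
    try:
        median([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")


test_median()
print("all checks passed")
```

Whether the function was written by a person or a model, the same tests gate it before it enters the code base.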
14:48And typically in engineering, you're saying, you know, your test should 14:51be 10 times easier to understand than the code you are actually writing 14:56so that you actually know it's easier to check that the tests are 14:59correct than that the code itself is correct. 15:02If you follow those practices, I think you will discover the 15:05bugs which get introduced, but if you don't, yeah, good luck. 15:08Yeah, definitely. 15:10So do you think that this report is mostly just revealing the fact that, 15:14you know, effectively the sort of AI engineering is still more buggy than 15:21humans, like that effectively, like, the lack of trust is kind of well warranted? 15:25I think we are not at the point that I can go blindly to a model 15:29and say produce me 10,000 lines of code and they will be correct. 15:33Um, I think the big challenge is that, um, you know, humans are lazy. 15:39Um, and so there is, there is a tendency that we are overconfident 15:43in what the model is doing. 15:45Uh, and if you do that and we are not very skeptical about the output and 15:48we don't, you know, review it, we will actually get bugs into the code base. 15:53Um, I would flip it around and say the more, um, like, open-ended question we have 16:00right now is where we are actually putting the model in the middle of the execution. 16:05So there's one is the, is the, you know, code generation, but I can review this. 16:09What if the, if the model actually executes code? 16:11And we see this right now already in ChatGPT. 16:14You ask it a random question, and it goes out, and it produces actually 16:17Python code, and then it runs the Python code, and it gives you an answer. 16:22But then you look at the Python code, it may be buggy, right?
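One way to reduce the risk of on-the-fly generated code returning a bogus answer is to validate inputs and surface failures explicitly instead of passing along whatever the code produced. The sketch below is illustrative only: `mean_by_column` stands in for a model-generated aggregation step and `run_generated` for a thin wrapper around it. Both names are hypothetical, and this is not how ChatGPT or any particular product executes code. A common failure of this kind is an aggregation over columns of unequal length; pandas, for instance, raises a `ValueError` when a `DataFrame` is built from a dict of lists with different lengths.

```python
def mean_by_column(table: dict[str, list[float]]) -> dict[str, float]:
    """Stand-in for a generated aggregation step: per-column mean."""
    lengths = {name: len(col) for name, col in table.items()}
    if len(set(lengths.values())) > 1:
        # Fail loudly instead of silently aggregating a ragged table.
        raise ValueError(f"columns have unequal lengths: {lengths}")
    return {name: sum(col) / len(col) for name, col in table.items()}


def run_generated(step, table):
    """Run a generated step and report failure instead of a bogus result."""
    try:
        return {"ok": True, "result": step(table)}
    except Exception as exc:  # broad on purpose: generated code can fail in any way
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


good = {"a": [1, 2, 3], "b": [4, 5, 6]}
bad = {"a": [1, 2, 3, 4, 5], "b": [1, 2, 3, 4, 5, 6, 7]}

print(run_generated(mean_by_column, good))
print(run_generated(mean_by_column, bad))
```

The wrapper does not make the generated code correct; it only ensures that a failure is reported as a failure, which still leaves a human or a reviewing step to judge the results that do come back.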
16:24And so sometimes the code doesn't, doesn't even, you know, like, um, when you do 16:29data aggregations, uh, you know, you have like a table and it has like, you know, 16:34five values in the first column and seven values in the other column, and then it says, 16:38oh, pandas, sorry, I got an exception. 16:41So that, you know, and this happens in real life. 16:44And so you get these answers, which are just bogus, uh, simply 16:47because the code generation and then the code execution is wrong. 16:50And so that's, I think, where it becomes much more scary, where we are 16:53doing this on-the-fly code generation. 16:56Uh, and I do not think that with the current accuracy, we are there yet. 17:00And so, for small things that may be okay, but for large things, I 17:04think you still need human eyes. 17:06Will that go away? 17:07Yes, probably. 17:09Over the next three, four years, we will get to a point that, you 17:11know, the code will be better than what a human can produce. 17:14Shobhit, to bring you back into this conversation, I gotta believe that 17:17this is like your life, right, is like, customers and clients saying, well, I 17:22don't know if I trust this stuff, and then you being like, no, the water's fine. 17:25Um, I'm curious how this is kind of playing out in your world, 17:28because it feels like this is like a conversation that you have 17:30day in, day out, all the time. 17:32So, from an IBM consulting perspective, right, we have very strict, uh, guidelines 17:38and warranties and things of that nature. 17:39For any code that IBM produces for an end client, we have to be bound 17:44by what our master service agreement says, and what will go into the 17:47code, is it copyright free, things of that nature as well, right? 17:49So there's a pretty high bar for when our team members are 17:52producing code for our clients.
17:53And I think over time, you're starting to see that the quality 17:57of the engineer that is leveraging these copilots matters a lot, right? 18:02If you are a software architect, somebody who's senior, who knows how 18:05to make interns work for you, right? 18:07So say we get you some brilliant software developers and they have these sparks 18:11of brilliance, they'll show you some code that's like, oh my God, I don't 18:13believe that this intern wrote this. 18:15And you realize that they actually copied it from Stack Exchange 18:18and they modified it a little bit or something of that sort, right? 18:21So it was brilliant, but it was because they had access to other 18:24things and stuff like that, right? 18:25But unless you know how to judge that piece of code, it's very 18:28difficult for you to even think about putting that into production, right? 18:31So the bar for the manager of an intern is pretty high. 18:37Similarly, when you get a copilot who is behaving like an intern, you're 18:40trying to ensure that the person who is, uh, who's using that copilot 18:43understands how code is written, right, to, uh, to the earlier points we've made. 18:48We need to know what good looks like, but if the code is being generated 18:52100 percent by, uh, by a copilot, then it's very difficult for you 18:55to understand what logic was used. 18:57Right? 18:58Earlier you said that you can just proofread code, but then you need 19:00to be really good enough and have done this over and over again before 19:04to understand what to even look for. 19:06What's happening in reality today, 70 percent of the code gets 19:09generated and it works pretty well. 19:11The last 30%, the last mile is where we get stuck. 19:14Right? It's an iterative process. 19:15It takes one step forward, but then it ends up taking two steps backward and 19:19may introduce some other bugs, right?
19:20So unless you really know how the code was written, how you would 19:25have written it yourself if you had the time, you're not really able to get 19:28to that 100 percent unlock of value. 19:30So this tandem between a human and copilot, we also need to figure 19:34out a little bit better on how to ask the right questions, how 19:36to create the right test cases. 19:39And I think having an agent that's going to go review and be the peer reviewer 19:43for the code that's being generated, we're moving towards that place. 19:47In a lot of our deployments with our clients, when we introduce other 19:50agents to review the code, review the errors, that multi-agent setup is delivering 19:54higher quality code for our teams 19:57than what we got from an LLM that we just have spit out the code. 20:00It's really interesting to think about this is like part of like 20:02a maturity of the overall AI tool chain that needs to happen. 20:05So like the lack of trust is the fact that we have this AI code gen thing, but it's 20:09really not connected to any other AI tools around it, sort of is what you're saying. 20:13It's a new year. 20:14We can be optimistic. 20:15One of the insights from the same study, what was the 20:19lowest item on this list of 10? 20:21The one that people, the developers don't think is a problem, and 20:24I think is really interesting, it was the quality of the LLM. 20:28So these developers are, I think, correct and convinced that the LLM 20:33quality is going to continue to increase. 20:35That's not one of their concerns. 20:37Um, and it's, it's really interesting to see that sort of play out here as 20:40the lowest of the 10 options given here. 20:43This came up about half as often as the trustworthiness issues did. 20:46Um, so I think that's, that's a pretty interesting takeaway from here. 20:49LLMs will get better.
20:50How we integrate them into the decision making process, that's a 20:54different story, but I think there is kind of a global optimism that these 20:57LLMs are going to become stronger. 21:03For our next segment, we're going to talk a little bit about Apple Intelligence. 21:07Uh, there was a really interesting news story that popped up in the last week 21:10about how, uh, this new summarization feature that was part of Apple 21:13Intelligence had been messing up, right? 21:16So this would be a summary of your voicemails, your text messages, and 21:20importantly, your news stories, your news headlines that you were getting. 21:24And they found that in many cases, Apple was actually summarizing incorrectly. 21:27It was hallucinating. 21:28So Apple apologized and promised that they'd be doing better on 21:32the version two of this feature. 21:35Um, I wanted to bring up this topic just because when we talked about this 21:38earlier last year, before the feature came out, you know, the opinion that we 21:41had was AI is going to be perfect for Apple, and they're going to get this so 21:45right, and it's going to be so targeted. 21:47So I wanted to just go back and talk a little bit about, 21:50were we right, were we wrong? 21:51And I guess, I don't know, maybe Shobhit, I'll throw it to you first 21:54on what your hot take is on that. 21:55I think it underperforms in a lot of different scenarios. 21:59I think Apple, uh, is using a lot smaller models to do this on device, dialing 22:05up on the security side of things to make sure that they're, uh, they're 22:08small, can run, they're not using some insanely large model to 22:11do the summaries and stuff like that. 22:13So a little bit of the performance hit, uh, I believe is happening because of the 22:16size of the models that they're using. 22:18And we see this in, in real world as well as we're building multi-agent 22:21systems and stuff like that too, right?
22:22So I think there's a little bit of a balance between, hey, 22:25should I make sure that everything runs on device, and I'm going to 22:28constrain it only to a few things? 22:29It cannot start draining the battery, and there are a few other 22:33things that they have to solve for. 22:34Or do I get a really intelligent model to go do these 22:37summarizations and things of that nature? 22:38Skyler, maybe do you have a similar take? 22:40I think in addition to Apple getting burned here, at 22:44least from what I've seen from the headlines, it's other news agencies 22:48that were using Apple technology. 22:50And so, for example, the BBC: you see this BBC breaking news coming up, 22:55and it's completely made up. 22:56And so the BBC is actually feeling quite burnt in this. 22:59It's not just Apple with egg on their face. 23:01It's partners that they've gone with, because now they're getting these 23:04headlines blasted to their customers with the BBC icon next to it. 23:09Gibberish. 23:09So I think Apple really has to think about this. 23:14Obviously there are the technical challenges of getting these 23:16hallucinations taken care of. 23:18But then how do you really pass that messaging on to the consumers going 23:21through another news agency the right way? 23:24Because I think they got hurt on this one. 23:26Yeah, that's right. 23:27Volkmar, one question I had in particular for you: I was having a 23:30conversation recently where a friend of mine was making the argument 23:33that Apple ultimately has, like, hardware brain, right? 23:38They do hardware, and he was saying that, you know, 23:41he's a machine learning researcher, and he's like, machine learning 23:45is very different, right? 23:46It's just like, you throw a bunch of data at it and then the machine 23:48just sort of figures it out. 
23:50And so the attitude is a lot more just like, you know, just try it. 23:54And then if it works, you know, it's basically a lot more shooting 23:57from the hip than the mentality it takes to do hardware. 24:01And so from this argument, he was saying, culturally, Apple's 24:04just not well positioned to play and win in this 24:07space because of how careful Apple is in a lot of ways. 24:12Do you think that's right? 24:13Like, is there a point of view here, which is that in some ways 24:15Apple was slow to launch the product, and then it just can't bear, 24:19organizationally, the risk of these things, and so it's always kind 24:23of kneecapped in really launching good features in the space? 24:26I don't think so. 24:27So, do you remember when Apple kicked out Google from 24:33their phone and did Apple Maps and it was a disaster, right? 24:37Yeah. 24:38And they took a lot of heat for it. 24:40And now it's, you know, one of the main routing applications, right? 24:45I think what's happening is Apple was kind of in a bind because 24:49they were late to the game. 24:51They didn't build a really strong AI team. 24:53This was very visible; you know, I was living in Silicon 24:56Valley and Apple was just not there. 24:58Like, they were not present. 24:59And now they were effectively in a bind of, okay, we need 25:02to bring something out. 25:03We need to make an announcement. 25:04So they made a big splash. 25:06And they, you know, shipped the product, they tried to keep the functionality 25:09really limited, but effectively made a strong statement: hey, we are going 25:13to put something on our devices. 25:15We are not missing the whole thing. 25:18And I think they had to rush it out. 25:21I think fundamentally there is a backtesting problem. 25:24Those things could have been found. 
25:26If they had done decent backtesting on very large scale 25:29data, and they have very large scale data on the devices. They didn't, 25:32and so now they get burned. 25:34Do I believe that will get fixed? 25:36Yes. Yeah, I think what Apple is doing is, it's defining on edge 25:42devices, you know, how you do deep integration. I think it's still clunky, 25:47like the whole, you know, rewrite my email and rewrite my text messages. 25:50It's not good. 25:51The models are not good yet. 25:53You know, we have much better models out. 25:55I think figuring out how to squeeze something into that form factor with 25:59the resource constraints you have and the power constraints you have 26:03is really the tough challenge. 26:05On the flip side, what we are seeing now is, with every generation of a 26:09new model, we pretty much get the same capabilities from the next smaller model. 26:13So the 70 billion parameter model gets to a 20 billion, and the 20 26:17billion parameter model becomes 13 billion, and the 13 billion parameter model 26:20becomes a 7 billion parameter model. 26:22And so, you know, just by waiting 6 to 12 months, we will see capabilities which, 26:29you know, have traditionally only been possible in the cloud 26:33on, you know, like two GPUs or so become possible to run on a phone. 26:37But if they had waited the year, you know, they 26:42would have lost the market. 26:43So I think they were in this bind. 26:45It's like, okay, technology's almost there. 26:46There's a lot of hype around it. 26:48We need to do something, so let's get something out. 26:50And now they burned their fingers. 26:51Mm. 26:51Yeah. That's interesting. 26:52But it's cool to think that this is basically like Apple Maps 26:55again. As someone who just switched actually from Google Maps to 26:57Apple Maps, I'm like, wow, this is actually in fact way better. 
27:00But it was such a funny thing, 'cause I remember the initial 27:02reputation of it was terrible. 27:03So I didn't touch it for... 27:04It was terrible, like, you're driving into the ocean. 27:06That's why I didn't touch it for, so... 27:07And by the way, it's still like that 27:09for anywhere except for their Apple offices. 27:12So we went to Japan and, you know, Apple Maps sends you into the forest. 27:19That's amazing. 27:20Shobhit, do you agree with that? 27:22It seems like Apple is the most disappointing, but it seems 27:25like what Volkmar is saying is, give it time, they will eventually win just 27:29because you can't beat Apple. 27:32So if you look at it, actually, Apple did a lot in the open source community 27:37last year, and it's fairly impressive what they did with their Ferret-UI models. 27:41They have these smaller adapter models that can run on device 27:45and things of that nature. 27:46The power envelope is pretty low, so they've done an incredibly good job and 27:50open sourced a lot of that, right? 27:51There are a few things where I think Apple has a 27:54lead over some of the other mobile manufacturers and stuff. 27:58Understanding of what's on the screen, as an example. 28:00They have some brilliant work that they've open sourced that lets you 28:03understand the different elements, so you can then build on top of that 28:06and create apps that can take actions on the screen, things of that nature, right? 28:09So they've done some really good fundamental work in 2024, and I'm 28:14expecting that in 2025 they're going to start making better use of the compute 28:18power going up, as well as the fact that now they've learned so much about the challenge 28:23with a really, really small model. And as you said earlier, this year the 28:29small models will get better than where they were last year and so forth. 
28:32So we're seeing that that'll get incrementally better. 28:35But the fact that you are picking a small model to cover news articles from 28:39every domain, that is a challenge, right? 28:42If you're asking a small model to do a bespoke piece of domain 28:47expertise, that works really well when we deploy this for our clients. 28:50But on an Apple phone, you're expecting it to understand the nuances of negation 28:55and things of that nature on a news article that could be around biology 28:59or it could be around some politics or sports and things of that nature, right? 29:04It needs to have the understanding of every term that's used in golf. 29:08That's different from the way you talk about it when 29:10you talk about soccer, right? 29:11Soccer versus football. 29:12Things of that nature, right? 29:13So you do need a larger model to do the summary, but that's the balance 29:16there that they're trying to make. 29:18And I think they will catch up in 2025, but the 2024 fundamental work that 29:22they did was really, really good. 29:24I really agree with Shobhit here. 29:26Like the foundational work: how do you think about UI integration, how 29:31to think about on-device processing and also the offload, and then also, 29:36you know, the 29:37cross-data-domain integration, you know, understanding maps, understanding 29:42your calendar, understanding your email, all that foundational work. 29:47I think it's incredible what they did. 29:49And so my expectation is that we will get AI kit too, where also 29:54people can bring their own adapters. 29:56Right now you can't, but that's just the next logical step because, you know, you 30:00can't have 20 big models live on a phone because you just don't have the memory capacity. 30:05And so. 
30:06The next logical step is like, okay, I can take the Apple model and I can fine 30:11tune it for my specific domain and I can load my adapter into it so that I can 30:15bring, you know, new AI capabilities on device but have shared base weights. 30:20And so this is where I think Apple did this foundational work, 30:24by saying, hey, we are providing this as part of the operating system 30:27that, you know, people can build on. 30:29And this is, I think, their strength. 30:31So they will do that ecosystem play and give access to it. 30:35But, you know, Apple always starts with the walled garden, you know, and nobody 30:39can do anything until they've figured it out by themselves, until they've enabled all 30:43the applications, and then it will become kind of obvious how you build this. 30:47And then we'll run it on our DIGITS, you know. 30:49Right. Exactly. 30:50Yeah. 30:52Exactly. 30:53Yeah. 30:54Exactly. 30:55So 31:01for our last segment, let's do a little final around the horn. Sam 31:04Altman, on his personal blog, put out a reflections blog post looking 31:08back at the last two years of ChatGPT. 31:10There's a lot in it. 31:11It's a very long blog post. 31:13I think the big thing that came out of it for me was really just 31:16the degree to which Sam still really believes in AGI as the mission of OpenAI. 31:21He hits on it multiple times, and it's still the big thing he's 31:24rallying the company towards. 31:26But I kind of wanted to get the view of all of you on the panel on, you 31:28know, what you thought was surprising, what you thought was interesting. 31:31Shobhit, I'm curious if you have any thoughts on the blog post, and 31:33if there's anything that you thought was surprising or kind of worth 31:35it for people to pay attention to. 31:37Yeah, so he talked a lot about AGI, and I think we as a community 31:41have not agreed on what the levels for defining AGI should be. 
31:46So I think we need to do a better job before we can even 31:48evaluate people's opinions on whether AGI is achievable or not. 31:52If we don't agree on a definition of artificial general intelligence, 31:56even between humans, right? 31:57How do you even evaluate human intelligence? 31:59Ten kids in a classroom or in high school or in college? 32:03It's very difficult for us to have a good measure for that. 32:06So the community in 2025 needs to have better definitions, just like we did with 32:10autonomous driving: different levels, and here are the scenarios, here are the test cases. 32:14We should do a little bit better job of defining that before we can evaluate 32:18if Sam is really telling the truth about how far we are from AGI. 32:24Yeah, for sure. 32:24Skyler, any thoughts? 32:25Hot takes? 32:26Opinions? Yeah, slightly humorous take. 32:28I had forgotten that he was fired and hired back. 32:32So kudos to the PR team for that, and it wasn't until reviewing the 32:36blog that I had that trigger again. 32:38You're like, oh yeah, he was briefly not CEO. 32:40Exactly. 32:40I had forgotten about that. 32:42And so I guess, if you're asking for a hot take of reflection 32:45on reading that, that's probably what jumped out at me, as it 32:48just triggered that memory again. 32:49So, yeah, that's my hottest take on that. 32:53That's right. 32:54Yeah, that's such a funny thing because that was such a big story and 32:57I had a very similar experience where I was like, oh, yeah, that was last year. 33:00So, last but not least, Volkmar, curious if you've got any takes. 33:04Yeah, I think it's a mix of both. 33:06So I think, you know, having been in, you know, startups and venture 33:11capital for more than 10 years, 33:14you know, I can feel for the pain he is going through and, you know, the ups 33:18and downs, and, you know, getting fired from your own company is really not fun. 
33:24But I think that it's really interesting to see the product 33:29evolution they are going through. 33:31And, you know, he is pointing this out, like, you know, we did ChatGPT 33:35and we released this thing into the wild and, you know, it's the fastest 33:37growing consumer product ever. 33:40Right? 33:41So that's really amazing to see, you know, how AI took off. 33:45I think in the end, OpenAI created, 33:49you know, this new wave. They took the risk, you know, they figured 33:54it out, you know, kudos to him, and now it's really the question, like, 33:59I mean, they have this really big north star of, like, we want 34:04to get to AGI, and if you look at, you know, 2024 with o1, where 34:11they actually say we want to get to, you know, human level reasoning, and they 34:15are still innovating. And it's really impressive that, you know, if you look at 34:19OpenAI, they are clearly the leader, I think, in this industry right now. 34:23They are defining, you know, the next steps, and I think it's part of Sam's 34:28vision to say we want to get to AGI in a, you know, human scale time frame. 34:33So, you know, every time they're releasing a new product, 34:37it's like, wow, this is possible? 34:39You know, I think they are still driving the industry and 34:42everybody else is a follower. 34:43So that's really impressive. 34:45Yeah, I love that. 34:46Yeah, I think that was one big reflection on the blog post, just 34:49that this guy who's running this company seems himself kind of surprised 34:52about how fast things are moving. 34:54You know, it's like, oh, yeah, wow, like we're doing this thing. 34:57It's only been two years. 34:58And that's very fun to see, that even he is continually 35:02confounded by how things are happening. 35:05So, well, great. 35:08Well, thanks for joining us. 
35:09Shobhit as always, Volkmar as always, and Skyler as always. 35:12It's a pleasure to have you all on the show. 35:14And thanks for joining us, all you listeners out there. 35:17If you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, 35:20and podcast platforms everywhere. 35:22And we will see you next week on Mixture of Experts.