ChatGPT 5.1: Conversational Style Focus
Key Points
- The community sees the recent GPT‑5 updates as a mixed “fix” that may prioritize cost optimization over genuine improvements in model warmth and performance, especially compared to earlier models like GPT‑4o.
- “Mixture of Experts” introduces a weekly panel of AI thought leaders—including Kaoutar El Maghraoui, Aaron Baughman, and Mihai Criveti—to dissect key developments in artificial intelligence.
- This week’s AI news roundup highlights Anthropic’s $50 billion U.S. data‑center investment, Elon Musk’s AI‑chip fab for autonomous vehicles, Baidu’s new AI chips, and a Tucson restaurant deploying AI‑driven robotic cats for food delivery.
- OpenAI launched two new ChatGPT 5.1 variants—Instant (fast) and Thinking (advanced)—emphasizing a shift from benchmark bragging to improving conversational style and user enjoyment.
- Panelists debate why conversational style now tops performance metrics, suggesting users value engaging, “human‑like” interactions as much as raw intelligence in modern AI systems.
Sections
- Mixture of Experts: AI Model Updates - A panel dives into mixed community reactions to GPT‑5’s recent fixes, previews new model releases like ChatGPT 5.1 and Kimi K2, and highlights industry headlines such as Anthropic’s $50 billion U.S. data‑center push and Elon Musk’s AI chip fab plans.
- New Model Release or Just Tweaks - The speaker questions whether the latest GPT‑5‑based update is a genuinely new model or merely fine‑tuned guardrails, prompts, and UI changes, noting cost‑vs‑speed trade‑offs and mixed community attitudes toward paying premium for maximum performance.
- IQ vs EQ AI Market Segmentation - The speaker argues that AI model differentiation will revolve around customization, cost‑performance, trust, and user preferences for raw intelligence versus emotional intelligence, creating distinct market segments and specialized offerings.
- Open‑Source AI Milestone Shifts Landscape - The speaker explains that a new open‑source trillion‑parameter MoE model achieves competitive performance and efficiency, challenging proprietary AI dominance and marking a “Linux‑like” shift toward shared, permissively‑licensed ecosystems, particularly driven by China.
- Skepticism Over Open‑Source Model Trust - The speaker questions Kimi K2’s performance and transparency, calls for third‑party benchmarking, and advocates using managed AI services over self‑hosting due to trust, tool integration, and reliability concerns.
- Enterprise AI Agents and Open‑Source Competition - The discussion highlights open‑source models now rivaling frontier AI, raises safety concerns about claims of invoking hundreds of tools, and details Microsoft’s announced plan to roll out autonomous enterprise AI agents with unique identities that can access corporate systems, attend meetings, edit documents, and collaborate with humans and other agents.
- AI Agent Identity and Compliance Risks - A discussion about Microsoft's AI agents becoming user-like entities, raising data integrity, governance, and accountability challenges for CIOs and CISOs.
- Proliferating Autonomous Agents in Workplaces - The speakers predict a future where countless AI agents, governed by zero‑trust frameworks, blur the line between human and machine in the office, creating hybrid collaborations.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=5sFJVAoafFI](https://www.youtube.com/watch?v=5sFJVAoafFI) **Duration:** 00:31:45

Timestamps:
- [00:00:00](https://www.youtube.com/watch?v=5sFJVAoafFI&t=0s) Mixture of Experts: AI Model Updates
- [00:05:09](https://www.youtube.com/watch?v=5sFJVAoafFI&t=309s) New Model Release or Just Tweaks
- [00:08:34](https://www.youtube.com/watch?v=5sFJVAoafFI&t=514s) IQ vs EQ AI Market Segmentation
- [00:13:55](https://www.youtube.com/watch?v=5sFJVAoafFI&t=835s) Open‑Source AI Milestone Shifts Landscape
- [00:17:36](https://www.youtube.com/watch?v=5sFJVAoafFI&t=1056s) Skepticism Over Open‑Source Model Trust
- [00:21:05](https://www.youtube.com/watch?v=5sFJVAoafFI&t=1265s) Enterprise AI Agents and Open‑Source Competition
- [00:24:36](https://www.youtube.com/watch?v=5sFJVAoafFI&t=1476s) AI Agent Identity and Compliance Risks
- [00:27:48](https://www.youtube.com/watch?v=5sFJVAoafFI&t=1668s) Proliferating Autonomous Agents in Workplaces
I think it's more like a fix. They've been trying
to fix the issues since they launched GPT-5. I think
the overall community has mixed feelings about it. They're still
attached to the performance they were getting out of models
like GPT-4o. Some of the community feels it's more of
a cost optimization as opposed to really an issue with
how warm the model is responding. All that and more
on today's Mixture of Experts. I'm Tim Hwang and
welcome to Mixture of Experts. Each week Moe brings together
a panel of the finest minds in technology to distill
down what's important in artificial intelligence. Joining us today are
three incredible panelists. We've got Kaoutar El Maghraoui, Principal Research
Scientist and Manager, Hybrid Cloud Platform; Aaron Baughman, IBM Fellow
and Master Inventor; and Mihai Criveti, Distinguished Engineer, Agentic
AI. Alright, this episode we're going to be covering a
lot of interesting developments in the model space. We'll be
talking about OpenAI's release of ChatGPT 5.1, the incredible performance
we're seeing out of Kimi K2 Thinking. And we're going
to end with a sort of interesting story about Microsoft
and its release of agentic users. But first we've got
Aili with the news. Hey everyone, I'm Aili McConnon, a
tech news writer for IBM Think. Here are this week's
AI headlines. AI startup Anthropic said on Wednesday it would
invest $50 billion in building data centers in the U.S.
Elon Musk plans to build a massive AI chip fabrication
plant to create chips for self-driving cars and robots.
Baidu unveiled two artificial intelligence chips as Chinese tech giants
ramp up their chip making efforts. A new Tucson restaurant
is using AI robotic cats to deliver food to customers'
tables. For more, subscribe to the Think newsletter linked in
the show notes and now let's see what our Experts
think of ChatGPT 5.1. Let's start with ChatGPT 5.1, which
by default is the biggest story of the week. The
big news here is that OpenAI has announced two of
its kind of like latest editions of its model. So
this will be ChatGPT 5.1 Instant, which is their sort
of fast model and ChatGPT 5.1 Thinking, which is their
sort of like advanced technology deluxe model. And actually I
think Aaron, I want to start with you. I think
one of the most interesting things about this is typically
when companies have touted new models in the past, they
have tended to tout the fact that, look, they're so
good at reasoning and they're so good against all these
benchmarks. But the thing that OpenAI leads with, and I'll
quote the blog post, is actually conversational style. So OpenAI
says, quote, we heard clearly from users that great AI
should not only be smart but also be enjoyable
to talk to. GPT 5.1 improves meaningfully on both intelligence
and communication style. And I guess, Aaron, I'm curious about
what you think is leading to that, right? Are people
just not very impressed by performance on benchmarks anymore? Why
is style such an important part of this launch? Yeah,
style is critical. I think it develops a sense of
empathy with the user and trust so that if the
model can have a more warm type personality and, you
know, respond in a way, then it develops that relationship
further. Which I think we'll talk more a bit about
that later in the podcast. But I want to mention
what I really like about GPT 5.1 is this router
mechanism that whenever you're speaking with it or having a
conversation with it, with the style of which they have
infused into this model, it goes into one of the
variants that it has. It can go into an instant
or thinking type variant, which is great, right? Because if
I want to have a very quick instantaneous response with
low response time, if that's the use case, then I
more than likely can also coerce that router to go
into that particular variant. If I need it to go
into a deeper chain of thought, then it can go
that way too. But then it joins back up in
the middle and that's where that stylistic choice comes in
to help develop, you know, that said relationship with the
user. So it becomes more fluent. Yeah, that's great. And
I did want to talk a little bit more about
that. I mean, Mihai, the question, kind of cheeky question
I was going to ask is like, should we be
covering this at all? You know, the movement from 5
to 5.1 is like maybe a little bit incremental. How
big of a deal is this launch, you think? I
think it's more like a fix. They've been trying to
fix the issues since they launched GPT-5. I think the
overall community has mixed feelings about it. They're still
attached to the performance they were getting out of models
like GPT-4o, and some of the community feels it's more
of a cost optimization as opposed to really an issue
with how warm the model is responding. Like even I
feel like, am I getting gaslit here? Is it like,
you know, it's not that the model is bad, you
see, these results are great. You don't like the results
because it's so direct. Can you help me with this?
No. So I was always wondering, is this actually a
new model release or have they just fine tuned the
model or did some slight changes to the guardrails, or
did some slight changes to the prompts, or the way
they're, you know, exposing the UI and APIs and all
these other kind of things. It does look like they've
actually trained a new model, or at least iterated on
the same GPT-5 family. So there are definitely some changes
there. But to me, this still feels like they're trying
to address the issues with the GPT-4o transition. There
are some cost optimization challenges where instant obviously provides much
faster responses and this router provides cost efficiency. But there
are mixed feelings within the communities. Like, for example, when
I use one of these models, just turn it up
to 11. Just give me the good results. I don't
care. Just think as much as possible and give me
something
that works. So that's going to get expensive fast. Yeah,
that's right. I have a friend who is just like,
I just love the idea that when I push the
button, it's working as hard as possible for me, even
if the task is like, very, very simple. So, like,
that psychology, I think, is very fun. I'm paying, what,
240 Euro, whatever with tax for the GPT 4 Pro?
I'm sorry, GPT? Take your money's worth. I'm going to
get my money's worth. Kautzer, what did you think? Have
you played with this model yet? What's your vibe? Check
on 5.1. Yeah, I played with it a
little bit. I think I agree with what my colleagues here
are saying, but I feel they're trying to find a
differentiation through the user experience and the empathy and the
customization, especially in a world where raw intelligence is becoming
a commodity, thanks to models like Kimi K2. So they're
trying to focus on maybe the fluid conversation, the daily
use. One of the features that I liked is the
adaptive reasoning. So deciding when to think before reasoning, before
responding basically to these complex questions, which seems to lead
to better accuracy than the previous fast models, while trying
to remain quick on these simple tasks. But also the
tone that they're saying, designed to be warmer or even
playful, reflecting basically this strategic choice to improve the conversational
feel. So I don't know, are we seeing here segmentation
of the markets, models that are focused on efficiency, like
what we're seeing in the open source with Kimi K2,
or models that are trying to win the user experience,
the personality. So it's interesting to see here. Yeah, that's
right. And I think I did want to pick up
a little bit before we move on to talking about
Kimi K2 Thinking on this point about customization. I think it's
very, very interesting that they kind of really sell the
point, like, oh, we are trying to make these models
more customizable for you as a user, which is a
little bit different, particularly in an enterprise business case. It's
not like Microsoft Word is like, oh, well, Microsoft Word
is going to be customized for you specifically. But in
AI, it does definitely feel like we're headed towards a
world where everybody's experience of, say, ChatGPT is going to
feel pretty different over time as they allow for more
and more customization and I guess, Kalto, how do you
feel about that? Do you think that's going to be
just where the market's going on, some of this stuff?
Yeah, I think so. Definitely customization is going to be
an important piece of it. And whether there's also the
cost per performance, the user experience, all of these things
kind of will segment the market here. Which models are
we going to go to? But I think also trust
and governance and compliance, those also will be very important.
So it's going to be interesting to see how these
things evolve. But definitely there is a war here. Is
it the IQ war or the EQ war? So is
it the, you know, intelligence quotient or the emotional quotient
here? So are we going to segment along those two
dimensions? Yeah, we'll see. I don't know. I think it'd
be very funny if it just turns out that there's
going to be kind of a battle for right brain
users who just want the model to be as smart
as possible. And then a battle for left brain. I
got it flipped. Right. And there'll be a battle for,
I think some users who just want like much more
natural conversation. And that's actually how the market will sort
of divide over time. Ideally you'd want a model that
does both really, really well. But it seems like the
companies are trying to specialize a little bit over time.
Yeah, I mean, I could certainly see a reality where
you could bring your own style or bring your own
behavior through, like, a LoRA weight or something like that.
Right. So you bring your own adapter and then you
can upload it, plug it in, or even mix together
which would be pretty interesting. And this might be somewhat
aware of this warmer tone, more conversational piece is going
and I wonder if GPT 5.2 or GPT 6 might
push the ecosystem a bit further down towards that way.
I do want to say this is raising all sorts
of red flags with me. I'm definitely in the camp
of technical person that the only smart device in my
home is a printer and I keep a gun next
to it in case it makes a funny noise. I
have an inherent distrust of any system that learns about
me, learns about my behavior, adapts over time because the
only thing I'm seeing in my mind is, like, advertising
influencing my decisions, learning and optimizing its responses to drive
my behavior. So I actually like simple systems. I like
my AI like I like my headphones with a wire
with switches I can toggle on or off. I want
to be the one in control. I don't want the
router, I don't want the memory, I don't want it
to learn about me. I want to tell it what
it needs to know every time. So I think there's
a balance there to strike. Yeah, yeah. You're actually even
against the router. So I guess like the thing that
Aaron finds so interesting, you're kind of like, I don't
want it to decide how much to
think. Yeah, this is something happening whether we want it
or not, unfortunately, these customizations and adaptation and so on.
The more we use these systems, the more
they're learning about our behaviors and so on. And I
think whether we have control or not, I don't know
these systems, they're not giving us control. And that's something
maybe another design point. Can these AI systems give the
users control whether they want to learn about our behavior,
we want them to learn about us and things like
that and adapt, or we want maybe just simpler interactions
and kind of robotic ones without any kind of implied
intelligence. Yeah, exactly. We don't want it to be too
smart in a certain way. I'm going to move us
on to our next topic. Another model I think to
cover, which I think is a really interesting counterpoint to
the ChatGPT 5.1 story, is the hype around Kimi K2
Thinking. So just to kind of quickly review, Kimi K2
is a model produced by a Chinese AI startup called
Moonshot AI and they dropped this model, which is an
open source model. Which incredibly, has been able to claim
superior performance against even proprietary models on a set of
pretty big benchmarks. So on Humanity's Last Exam, they're doing
great. On BrowseComp, they're doing great. On SWE-bench,
they're doing great. And this is a pretty, I think,
interesting story, right? Which is, I think for a long
time on MoE, you've talked about when will open source
triumph over the proprietary models? And this seems to be
a case where the open source model is doing at
or better than all the proprietary models. And I guess,
Mihai, I'll give you a chance to kind of lay
out your conspiracy theory, because before the episode you were
saying, you know, maybe the timing of ChatGPT 5.1 is
a little. Suspicious with K2 thinking. Do you want to,
do you want to just quickly lay that out? Yeah,
let's quickly increment that one. Because something big is coming
in the open source space, and that's my theory there,
that there's definitely a response in the market to this
very, very powerful open source model. Yeah, and I think
that's like, I mean, that's maybe the cynical view on
sort of like, oh, you're going to tell all these
style and emotional communication things because now you're getting beat
on all the benchmarks, I guess. Kaoutar, maybe let me
take a step back though, is like, are benchmarks still
a useful way of even looking at this? Right. So
obviously the companies care a lot about it, but I
think on the show we have talked about like, well,
are we kind of reaching the end of the usefulness
of some of these benchmarks in terms of showing performance?
Obviously this is a milestone for open source, but does
it really say that open source is now better than
proprietary models? How do you read these results? I think
if we look at the results and the benchmarks, I
mean, it's saying something. So there must be a way
to evaluate these models. And right now the only way
that's kind of viable is benchmarking, trying these things and
also on these standard benchmarks. So what this is saying
is this is actually a big open source milestone. It
is a challenge to the entire closed AI economy. So
if the best model in the world is open weights,
the center of gravity in AI shifts from secret models
to shared ecosystems. And so this is also, this move
is positioned in China also as a serious contender in
this global open model race, which is paralleling kind of
the Linux moment in the AI era. So there, I
think the results are showing really superior performance using the
MoE architecture, with 1 trillion parameters and only 32
billion activated per token at inference. So there is a big
focus here on compute efficiency as much as also the
capability one. And another thing is the license that they
have is very permissive but also strategic. So they're saying
if you're doing this at a massive scale, you have
to mention Kimi K2. I think the rule is, I
have written down here is 100 million monthly active users
or 20 million USD per month in revenue. Yes, yes.
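The sparse Mixture-of-Experts activation pattern described a moment ago (a trillion total parameters with only about 32 billion active per token) can be sketched as top-k expert routing. Everything below uses toy sizes and random weights purely for illustration; it is not Kimi K2's actual architecture:

```python
# Illustrative sketch of sparse MoE routing: a gate scores all experts,
# but only the top-k of them actually run for a given token. The
# compute saving comes from the idle experts.
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route a token vector through only the top-k of many experts."""
    logits = x @ gate_w                       # one score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the selected k
    # Only the chosen experts execute; the rest are never called.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                          # toy sizes, not K2's config
gate_w = rng.normal(size=(d, n_experts))
# Each "expert" here is just a toy linear layer.
expert_mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, m=m: x @ m for m in expert_mats]

y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
```

With k=2 of 16 experts active, only one-eighth of the expert parameters participate per token, which is the same ratio-style argument behind the 32B-of-1T figure.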
So it's like anything that is related to massive scale.
Give us the attribution. And so for enterprises, this open
way dominance really means that you can finally bring top
tier reasoning in house with much lower costs. And this
is kind of could be the start of this open-weights
era, where the proprietary advantage is decaying faster than ever.
So I think for big tech, open models outperforming, like
K2 Thinking, are forcing the question: how do you
justify, like, a 20 million monthly burn rate when community
models can really close the gap? And I think here
the next frontier may not be the model
quality, but also the integration: who builds the most trusted,
compliant and secure deployment pipelines. So I think that's going
to become super important here. So it's not just about
building the most intelligent and most powerful model, but also
about building a model that you can integrate and trust
and it is compliant and you have these secure deployment
pipelines. Aaron, do you want to make a forecast for
where we're going from here? I mean, I think the
old rule used to be, well, open source is going
to lag behind, quote, state of the art by X
months. Now we're in a world where if you buy
Mihai's theory, it's basically like now open source is ahead
in some ways of the proprietary models. Are we in
for a long period of kind of rough parity, I
think between open source and proprietary or do you feel
like actually over time open source is now going to
even accelerate further? We'll start talking about like, oh well,
how long is it going to take for OpenAI to
catch up relative to open source? Curious about how you
think about that. Yeah. So real quick, before I answer
that, I wanted to just pull the thread a bit
on what about these standards and benchmarks, you know,
around this? I want to make the point
that I think a third party independent assessment needs to
be made right around this model because I was looking
around and we can make stats tell us anything. Right.
You know, I can say, hey,
if I walk outside and it's zero degrees, you know,
Celsius, for example, that it's healthy. Right. So I mean
I'm not sure how much I would really buy, you
know, lots of the performance that Kimi did on, like,
BrowseComp, SWE-bench, LiveCodeBench,
you know, those pieces and elements, until a third party
tests this model. Right. So that's one point. Right.
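The call for third-party validation has a simple statistical angle: a single headline pass rate comes with sampling noise. A minimal sketch, assuming a made-up pass/fail record of 80 tasks (not Kimi K2's real results), shows how wide a 95% bootstrap confidence interval can be:

```python
# Why a single benchmark number deserves skepticism: resampling a toy
# pass/fail record gives a confidence interval, not just a point score.
# The data below is invented for illustration.
import random

def bootstrap_ci(passes, n_boot=10_000, alpha=0.05, seed=42):
    """95% bootstrap CI for a pass rate over per-task 0/1 outcomes."""
    rng = random.Random(seed)
    n = len(passes)
    rates = sorted(
        sum(rng.choice(passes) for _ in range(n)) / n
        for _ in range(n_boot)
    )
    lo = rates[int(alpha / 2 * n_boot)]
    hi = rates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# 60 passes out of 80 tasks: a 75% headline score...
results = [1] * 60 + [0] * 20
low, high = bootstrap_ci(results)
# ...but the interval spans several points either side, so a small
# lead over another model may not be meaningful.
```

On 80 tasks the interval is roughly ten points wide, which is exactly why independently rerun benchmarks matter more than a vendor's single reported score.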
The other is that I don't think this is an
"open source wins, proprietary is finished" argument. I do think
we're in a clear inflection point where open source models
can compete and complete some of the highest levels of
reasoning tasks. But I always like to think this, you
know, Kimi K2 isn't just thinking, it's overthinking for all
of us. Right. So that being said, you know, there's
no, no free lunch. Right. We need to pick the
right tool for the problem that we're going to use
and put them together. Right. And I think one of
the biggest areas that Kimi has an issue with
is trust and transparency, even though it is open source.
Right. Again, I want to see a third party show
us what the standards are and what the benchmarks are
like within the real world here and then also the
ecosystem. I used to host my own models. Very difficult
to do. I like to go and use, like, a
watsonx, for example, or a Bedrock that hosts models
for me so I can use the tooling that's available.
And I think, for example, we just talked about GPT
5.1. I think the tooling and that part is very
much a hedge. So I think there's pros and cons
for all of these and we're moving into a world
where we're going to ensemble these types of models together
within these very large graphs that are conversational, and then
they use these A2A protocols to communicate together. I guess
maybe a final question to Mihai before we move on
to the last topic I want to cover was I
know the reaction you just had to 5.1 was it's
learning about me, it's deciding how much to think. I
just want it to be simple. Right. I just don't
want that much. And is it ultimately, I guess, would
a proprietary model. I guess, Mihai, the question, just
to kind of get it to a question, is like,
how common do you think you are? Right. Do you
feel like the average consumer wants this Level of control?
No. Okay. I don't think they're necessarily aware of it,
but as a developer or somebody who builds AI agents,
who builds AI tools, I'm looking at Kimi K2 and I
think, look, 300 sequential tool calls, 256K of context, 10
times cheaper than GPT-5. And I can run it there,
okay, at one token per second with a distilled model.
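Running an open-weights model locally, as described here, typically means pointing an OpenAI-compatible client at your own machine. This is a hedged sketch: the endpoint URL, port, and model name are placeholders for whatever a local server such as llama.cpp or Ollama exposes, not values from the episode:

```python
# Sketch of calling a locally hosted model through an OpenAI-compatible
# chat-completions endpoint. Nothing leaves the machine: no cloud API,
# no server-side router deciding how much to think.
import json
import urllib.request

LOCAL_ENDPOINT = "http://localhost:8080/v1/chat/completions"  # assumed port

def build_request(prompt, model="local-model", temperature=0.2):
    """Assemble a chat-completions payload for a local server."""
    return {
        "model": model,          # placeholder name; your server defines it
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask_local(prompt):
    req = urllib.request.Request(
        LOCAL_ENDPOINT,
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# ask_local("Summarize this file")  # requires a running local server
```

The same client code works whether the weights behind the endpoint are a 20B distilled model on a laptop or a full-size model on a home server; only the latency changes.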
And it's a 1 TB download. And my wife is
going to kill me when I start the servers, but
I can run it, I can run it myself. It
goes through no API. Nobody's looking at my data, nobody's
putting a router in front of it. Nobody gets to
make those choices for me. And it was kind of
funny when I was flying out last week. I had
GPT-OSS 20 billion on my laptop and I was
able to do coding like that. That level of control, the
ability to run these models locally, I think, is priceless.
And the fact that we now have models within the
open source space that can compete, genuinely compete with some
of the frontier models, I think it's just awesome. Yeah.
One of the biggest claims was that it could call
200 to 300 tools. Right. That's a big claim. And
this long horizon reasoning, I would really like to see
that validated. And you're calling external tools, that's a safety
issue as well. So that's something to be very cognizant
about. All right, last story of the day that I
want to cover is kind of this fun story. The
Register, the kind of tech news site, basically reported on
this. Kind of interesting, not really a leak, but some
sort of teasing that Microsoft is doing about what it's
working on in terms of agents for the enterprise. And
specifically it's releasing or planning to release what they're calling,
quote, a new class of AI agents that operate as
independent users within the enterprise workforce. So the quote here,
which I think is just fun to read, is like,
each embodied agent has its own identity, dedicated access to
organizational systems and applications, and the ability to collaborate with
humans and other agents. These agents can attend meetings, edit
documents, communicate via email and chat, and perform tasks autonomously.
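The "agents as independent users" idea quoted above can be pictured as a directory record with its own identity and scoped access. The field names here are hypothetical illustrations, not Microsoft's actual schema:

```python
# Sketch of what an "agentic user" directory entry might hold: a
# distinct identity, a human owner, and an explicit list of systems
# it may touch. All names are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class AgentUser:
    agent_id: str                       # its own identity, like a human account
    display_name: str
    owner: str                          # the human accountable for it
    scopes: set[str] = field(default_factory=set)  # systems it may access

    def can_access(self, resource: str) -> bool:
        # Least-privilege check: no explicit scope, no access.
        return resource in self.scopes

bot = AgentUser("agent-0042", "Minutes Bot", owner="alice@example.com",
                scopes={"calendar.read", "docs.edit"})
```

The owner field is the part the later compliance discussion keeps circling back to: someone human has to remain answerable for what the agent does.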
So this is a kind of fun dream. We've been
talking about AI agents all year, but this is maybe
like the first one where a company's starting to make
the claim, like, oh yeah, we're going to just have
like a drag and drop coworker who will be an
agent that will operate on the enterprise exactly the same
way as any other user does. And so I guess,
Mihai, I see you making a face, so maybe I'll
call on you first. Particularly someone who works on agents
all the time. Is this marketing hype? Is this a
good idea? Is this a bad idea? As I commented on
printers: just a printer with a gun next to it
in case it makes funny noises. As somebody who's actively
building security software for AI agents, I'm building ContextForge, which
is a gateway for agents and MCP servers. This is
great news. I mean, I'm sure there's going to be
hundreds of clients interested in how do we secure authentication,
authorization, governance, how do we ensure PII data doesn't leak.
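One concrete control behind the authentication, authorization, and governance concerns raised here is an audit trail that records whether a human or an agent acted. A minimal illustrative sketch, with invented function and field names rather than any product's real API:

```python
# A minimal audit trail that tags every action with its actor type,
# so auditors can separate human actions from agent actions.
# In practice this would be append-only, tamper-evident storage.
import datetime

AUDIT_LOG = []

def record_action(actor_id, actor_type, action, resource):
    """Append one audited action; actor_type must be 'human' or 'agent'."""
    assert actor_type in ("human", "agent")
    AUDIT_LOG.append({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "actor_id": actor_id,
        "actor_type": actor_type,   # the field that makes attribution possible
        "action": action,
        "resource": resource,
    })

record_action("alice@example.com", "human", "approve", "po-1138")
record_action("agent-0042", "agent", "edit", "minutes-2024-11.docx")

# An auditor can now answer "what did the agents do?" directly.
agent_actions = [e for e in AUDIT_LOG if e["actor_type"] == "agent"]
```

A PII filter or policy engine would sit in front of `record_action` in a real deployment; the point of the sketch is only that actor attribution has to be captured at write time, not reconstructed later.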
But from an enterprise perspective, it's like this can be
a security nightmare. Not only do you have to manage
the user's identity, now you have potentially hundreds or thousands
of agents who are moving data left and right with
no accounting for governance compliance, the EU AI Act, GDPR, not
filtering necessarily your PII data, with no clear way to
do evaluations or to evaluate their performance. And it won't
be long before you know your boss is now Cortana.
You're reporting into an AI agent. Your reports are AI
agents. You're bringing your agents into the conversation. It's not
necessarily something new. So even on GitHub today, I can
trigger GitHub Copilot to help review my PRs. There are
already agents on Microsoft Teams. I'm building one of these
agents for Office 365 that calls our Consulting Advantage platform. So
these things kind of already exist in many organizations. The
part I don't necessarily feel comfortable with is what level
of control do organizations have when all these hundreds of
agents start popping around in various catalogs, in teams, in
all the product suites and CIO offices are running around?
How do I turn this off? How do I turn
this off? How do I ensure data integrity? So I
think this is inevitable. The only thing that feels off
about this story is how much choice is this, like
a new product that we can buy? Or is this
something that's just gonna happen to us from now on?
There's gonna be that cute dog from Windows XP bouncing
around and saying, I'm here to help you search.
Maybe good for business teams, but a nightmare
for CIOs and CISOs. But isn't Microsoft? I mean, Microsoft
is the world leader at this kind of compliance. Why
are they pushing down this route if it comes with
all of the kind of crazy security risks that Mihai
is pointing out? Yeah, it's really interesting to see this
new direction that Microsoft is pushing forward. You know, this
is, you know, we're seeing here the shift from having
a tool, AI is a tool to AI as a
teammate. And I agree, you know, here with what Mihai said:
this can be like a compliance nightmare here. So because,
you know, governance and auditing. So if it's an agentic
user with its own identity and if it violates a
compliance policy, who is accountable? Is it the admin who
created it, the human who trained it? The organization really
needs a unified oddity log that can differentiate between human
and agent actions. And so I think there are a
lot of interesting implications here. This is pretty disruptive with
these agentic users because they're really full fledged user objects.
They have email identity and so on. Yeah, they can
do everything right now. Yes, currently they might be augmenting us, but at some point they might also be replacing layers in the organization. So I think there are some interesting pieces to this, but also scary pieces that we have to be watching for. And I think the discussion here is about the organizational and legal implications of giving AI a kind of corporate identity. This is a really profound shift from tooling to teammates and coworkers, and what the implications of this are for HR, for governance, for auditing. There's also maybe a cultural shock to this: how will human employees react to teammates that don't need breaks, work 24/7, and potentially access all their files? So this is a major change management challenge that I think
HR departments are not ready for. Yeah, and I want to end the episode by taking maybe a little bit of a step into the future. Because, Aaron, where my head goes on
this is: if you work at a big company, there are people you work with that you maybe never meet in person, people you largely experience through Slack. I just have this vision that in the future someone's trying to start an office romance, only to discover that the person they've been working with is actually an agent. That may very well be in our future.
The minute you instantiate user agents that operate like this
in the office. I don't know. Do you have any
wild predictions, Aaron, for the future of the Office place
cook breakfast. That'd be fantastic. Right? Yeah. So I did
look around at what Microsoft's Strategy looked to be and
it looks like this somewhat started with a paper that
they had called the Agentic Economy. But then I think
as the field noticed that there are a lot of security pieces around this, they then had groups working on zero-trust agents and on securing and governing autonomous agents with Microsoft Security. So they're attacking, I think, both sides to try to create this agentic architecture with eval frameworks and governance to help people feel more comfortable. But I could see
a future where, with what, over 8 billion people on the planet, there are more agents than there are people. Well, if you count Copilot as an agent, the population of agents operating in society is actually quite large now. Right? That's right. You can copy and paste agents, you can clone agents, and agents can potentially create other agents. So this fractal piece goes on and on. But
I do think the blurring between what is a human,
what is an agent is going to become very, very
difficult. I think there are going to be hybrid humans and agents working together, where even if you are talking with a human, are you really talking to the intent of the human, or is the human simply almost like a puppet of the agent, which feeds them what to say? So that trust and transparency about this blending of these biomimetic pieces all working together might be hard to parse, but I think it's something that we need to get ready for, and even to prepare our kids for what they could be in for. Yeah. How do you deal with it when your
boss is a real jerk but he's Cortana, you know.
That's right. That's right. And the other part, too, is that these agents are going to manifest themselves within the physical world. So the way that they change the physical world is going to give us other confusing signals: what changed the color of my TV, or what changed the channel, or what changed the color of this light, for example, these small things. So it's going to be really interesting. The effect of computing, in essence, I think is going to be combined with agentic AI. But yeah, for sure it'll be fun, concerning and interesting, all
of it together. All at the same time. Mihai, do you want to have the final word here on the episode? With consumption-based pricing, how long before your agents start to attend every meeting or email everyone, and they're going to charge you 0.01 cents for every interaction and maybe even impersonate humans? Like you've seen on YouTube. That is, until the EU AI Act starts ramping up some of those fines. Yeah. What I was also thinking on the agentic
economy is that agents will pay us. Right. That we
become businesses. Right. And then we take payment from agents
to get access to a certain element. That would be
nice. The agent has decided it's more cost effective to
delegate to a human. Yeah, exactly. It's a dark and
dangerous future. All right, well that's a great note to
end on and that's all the time that we have
for today. So Kautar, Aaron, Mihai, always great to have
you on the show and thanks for joining all you
listeners. If you enjoyed what you heard, you can get
us on Apple Podcasts, Spotify and podcast platforms everywhere. And
we'll see you next week on Mixture of Experts.