Copilot vs Clippy: Agent Battle
Key Points
- Vyoma Gagyar argues Microsoft Copilot is a sophisticated code‑translation and coordination tool, not a revival of the outdated “Clippy” assistant.
- Volkmar Uhlig notes the industry is in a “training‑wheel” phase where AI agents act as copilots under human supervision, but will eventually evolve into fully autonomous pilots.
- The imminent “agent jungle” sees major players like Microsoft and Salesforce deploying competing enterprise‑agent platforms, sparking a 2025‑era battle for market dominance.
- Current experimentation focuses on user‑interface designs for these copilots, as developers test how much human oversight is needed before moving toward screen‑less, fully autonomous systems.
Sections
- Copilot Not Just Clippy - In a podcast intro, AI experts discuss why Microsoft’s Copilot differs from the nostalgic Clippy, framing it within the emerging 2025 rivalry between enterprise AI agent platforms from Microsoft, Salesforce, and others.
- Experimenting with AI Agent Interfaces - The speakers compare today’s AI assistants to early tools like Clippy, noting that firms are still in a “training‑wheel” phase, experimenting with user interfaces while leveraging extensive legacy interaction data.
- IBM’s Green Tech Outlook - The speaker discusses IBM’s commitment to minimizing the environmental footprint of AI and data centers, noting current energy use and projected increases as high‑power GPUs become more common.
- Sustainable AI Powered by Nuclear Energy - In 2024, major tech firms and clients are increasingly prioritizing greener AI compute by leveraging nuclear power and government tax incentives to reduce emissions, avoid pipeline interruptions, and improve efficiency.
- Inference Market as Commodity - The speaker explains that AI inference functions as a token‑priced, perfectly competitive commodity market, driving a race‑to‑the‑bottom across models, hardware, and power generation, which spurs broad innovation and involves billions of dollars in economic stakes.
- AI‑Driven Interface Innovation Discussion - The speakers debate a new AI‑powered computer-use model’s promise for training, enablement, and disability support, noting the ironic shift from human‑designed GUIs to machines now steering the very interfaces they created.
- AI for Software QA Automation - The speaker proposes leveraging the demo technology to automate quality‑control and debugging tasks in software development, such as visual UI checks and cross‑browser validation, rather than focusing on generic machine‑to‑machine interactions.
- Bridging Legacy Systems with AI - The speaker explains how to introduce AI to tech‑focused clients still using legacy infrastructure by leveraging retired experts’ knowledge and historical logs to enrich new software, demonstrating value and easing the transition.
- Why Text Watermarking Matters - The speaker argues that text watermarking is essential for establishing ethical standards, building client confidence, and enabling future regulation despite industry concerns about user adoption and detection.
- Tool Tagging and Societal Split - The speaker argues that tagging every use of amplified tools—like large language models—creates a divided society between users and non‑users, while regulators seek to protect the latter despite the inevitability of technological adoption.
- Cautious Optimism on AI Adoption - The speaker argues that AI is still in its infancy, urging continued experimentation before widespread comfort and regulation can be achieved, much like earlier technological revolutions.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=HYHgJkWnPdQ](https://www.youtube.com/watch?v=HYHgJkWnPdQ) **Duration:** 00:32:49
Section timestamps:
- [00:00:00](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=0s) Copilot Not Just Clippy
- [00:03:03](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=183s) Experimenting with AI Agent Interfaces
- [00:06:07](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=367s) IBM’s Green Tech Outlook
- [00:09:12](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=552s) Sustainable AI Powered by Nuclear Energy
- [00:12:21](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=741s) Inference Market as Commodity
- [00:15:23](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=923s) AI‑Driven Interface Innovation Discussion
- [00:18:26](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=1106s) AI for Software QA Automation
- [00:21:31](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=1291s) Bridging Legacy Systems with AI
- [00:24:35](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=1475s) Why Text Watermarking Matters
- [00:27:38](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=1658s) Tool Tagging and Societal Split
- [00:30:43](https://www.youtube.com/watch?v=HYHgJkWnPdQ&t=1843s) Cautious Optimism on AI Adoption
Is Microsoft Copilot just like Clippy 2.0?
Vyoma Gagyar is an AI
technical solution architect.
Vyoma, welcome to the show for the first time.
Tell us what you think.
Thank you.
I do not think that it is Clippy 2.0. Microsoft Copilot has been one of
the pioneers in the field of code
translation, extraction, and coordination.
Volkmar Uhlig is Vice President,
AI Infrastructure Portfolio Lead.
Volkmar, welcome to the show.
Uh, what do you think?
I think the jury is still out.
I'll wait for 2.5.
All that and more on today's Mixture of Experts.
I'm Tim Hwang and welcome to Mixture of Experts.
Every week, we're going to bring you the
world class analysis, debate, and thinking
you need to navigate through the rapidly
changing universe of artificial intelligence.
We've got a discussion about nuclear
power, AI using computers, but first
we really want to talk about the
rumble happening in the agent jungle.
Um, the question of whether Copilot is
just like Clippy 2.0 was inspired by a spicy
tweet from Marc Benioff.
Um, but I think more generally we want
to focus here on a mixture of experts
on kind of taking a look back over the
last few months and the fact that, um,
Salesforce has launched an agent platform.
Microsoft has launched an agent platform.
Really kind of 2025 is shaping up to be
like a battle of competition over agents
and specifically agents in their enterprise.
And so I really want to spend a little bit
of time talking about that and giving all
you listeners out there an intuition of what
to expect over the next 12 months or so.
And maybe Volkmar, I'll turn to you first.
You know, I think what's most interesting
is that, you know, now there's going
to be so many different agent
platforms to choose from.
Um, do you see different companies taking
different approaches to kind of offering
these technologies to the enterprise?
What do you think are like the
big kind of, you know, competitive
dynamics that are playing out here?
All companies are trying to experiment.
Um, and we are in a world where the
training wheels are on, where the systems
get supervised by humans, and right
now the system is in the passenger seat.
That's why it's the co-pilot and not the pilot.
Um, then at some point I think there will be
a switchover, where the systems are more
powerful, more trustworthy, and then the system
becomes the pilot and the user becomes the co-pilot.
And at some point we can get the
co-pilot out of the seat and
the systems can be fully autonomous.
So I think we are in a progression of
how the technology is evolving.
But I think at this point in time,
human eyes are required on the systems.
And I think the big experimentation right
now is what these user interfaces look like.
We kind of know what the fully
autonomous systems look like.
You know, there's not even a screen.
Or, you know, in cars there's
no steering wheel anymore.
But in the systems today, we are experimenting.
If you look at Microsoft, they
integrated it sometimes as a chat agent.
Uh, on the side, sometimes
directly in, in the applications.
Um, Apple took a different approach.
Salesforce is taking different approaches.
So everybody's, is experimenting with the
user experience at this point in time.
But, you know, the technology is not
there yet; the training wheels are still on.
And so we are going through
the training wheel phase.
Yeah, for sure.
And I think it's so interesting is like how
much some of the competition is just happening
on the level of like the interface, right?
It's just like we don't even know how to
effectively interact with these agents.
Um, I think you bring another angle to this
question that I think is worth touching upon
though, because, you know, in some ways, right?
Like I think for the kind of, you
know, outside observer, they take
a look at some of this stuff.
And I think they occasionally
are like, this is just Clippy 2.0, right?
Like this is back in like
the 90s or early 2000s.
And you know, we're just talking to a
paperclip on a word processor asking
me whether or not I'm writing a letter.
But it kind of sounds like one reason you
think that this is a genuinely different
thing, like what's happening in this market,
is also that there's a lot of experimentation
happening under the hood as well.
Is that right?
That is correct.
And think of all the information, the legacy information
that has been gathered from Clippy; you see
that Microsoft has been a great company,
which has been operating seamlessly for years.
Imagine the amount of data that it has gathered.
Beyond the Clippy data, as everyone is calling it,
there is so much other information
around from other platforms such as
GitHub, et cetera, as well.
Um, Imagine feeding all of that information
into a large language model and making
your day to day life much better.
So I feel that is what we are aiming at.
There are just a couple of, uh,
problems or a couple of solutions
that we want to get from this.
And first being enhanced productivity.
And I think Microsoft Copilot helps you do that.
And it also gives us a lot of our free time back
to do something more productive and creative.
Yeah, that's great.
And I do think, you know,
particularly Volkmar, I know your background
is working on autonomous vehicles, and
there's this kind of model where the agents
are the next level of autonomy, but we're
getting people to trust the technology enough
to be able to take it to the next autonomous level.
I think it's like a really interesting
set of problems that we'll see,
uh, kind of play out in the space.
The nice thing here is your
life doesn't depend on it.
Yeah, that's right.
All that will happen if this technology
fails is, you know, the code breaks, or
you send a really awkward email to someone.
So the stakes are a little bit lower.
Well, perfect.
I think one of the topics I really wanted to
touch on, uh, moving on to the next segment.
Um, is on the topic of AI and energy.
A few weeks ago, the news kind of leaked out
that Microsoft was considering restarting
the Three Mile Island nuclear power plant.
And, you know, all the current
projections suggest that future models
are going to need, you know, gigawatts
of power in a data center to run.
Um, and I think we've danced around this topic
in previous episodes of Mixture of Experts.
Um, but I think I wanted to kind of
just tackle it head on is, How are we
thinking about dealing with kind of the
environmental impact of these models and
how much energy is going to be required
to kind of unlock all of their potential?
You know, as someone who's very excited
about the technology but also kind of
concerned about climate change, you
know, it's a topic that I think is
really near and dear to my heart.
And, and I am really sort of interested
in, you know, the approaches that people
are thinking about and, and trying.
Um, I guess, maybe Volkmar, I'll start with
you. I'm curious about
how IBM is thinking about it, but in
general how you're seeing the space
evolve around this tricky problem.
So, in general at IBM, we are trying
to look at, you know, leaving
a green fingerprint on the
planet when we are looking at tech.
So you're trying to be conscious; you know,
there is an environmental impact.
The power consumption right now
for data centers stands at about 1.5 percent of total power
production in the United States.
So it's, it's tiny, right?
And then, with the expected
growth in AI, the projections
are kind of not really friendly.
Uh, assuming that, you know, H100s draw
seven, eight, nine hundred watts,
and then the next ones AMD is
producing are like 2,000 watts.
I think we have not yet done the
projections of technological improvements.
And so I do not believe that we will
see these high-powered cards
in the long run. I think
it's just a moment in time.
But even if we stay on that projection,
then the total power consumption we are
going to have is an increase from 1.5% to 4%.
Okay?
Well, take the population growth
of the United States right now.
Um, that's nothing, right?
So it's just the population growth is
already bigger than what we are adding here
in total data center power consumption.
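The shares Volkmar quotes can be sanity-checked with a quick back-of-envelope sketch. The 1.5% and 4% figures are his from the discussion; the total US generation figure below is a rough assumed round number, not from the episode:

```python
# Back-of-envelope check of the data-center power shares discussed above.
# Assumption: roughly 4,200 TWh/year of total US electricity generation.
TOTAL_US_TWH = 4200

current_share = 0.015    # data centers today, per the discussion
projected_share = 0.04   # projected share with the AI build-out

current_twh = TOTAL_US_TWH * current_share
projected_twh = TOTAL_US_TWH * projected_share

print(f"today:      {current_twh:.0f} TWh/yr")
print(f"projected:  {projected_twh:.0f} TWh/yr")
print(f"added load: {projected_twh - current_twh:.0f} TWh/yr "
      f"({(projected_share - current_share) * 100:.1f} percentage points)")
```

Even under these rough assumptions, the jump from 1.5% to 4% is a few hundred terawatt-hours per year, which is the scale that makes the nuclear conversation concrete.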
So I think that the moment right
now is that there is a concentrated
interest in very rapid build out.
And we are actually putting the discussion
about what constitutes green energy
and efficient energy back on the table.
And I do not think that has anything to do
with AI, but it's actually a key moment,
a tipping point, where we can actually
have a conversation about nuclear power in
the United States. And I'm really excited
about that because this is one
of the cleanest power sources, and
tech companies trying to
put nuclear power online, and
actually doing that in a
careful, you know,
orchestrated way, is a good thing.
And, you know, if the conclusion is then,
oh, we should still not do it, then,
you know, that's a consensus
among the people who have
these power plants in their backyard.
But I think the discussion needs to
happen in a rational way, and I think
over the last 50 years it was irrational.
Yeah, for sure.
And I think that'll be the
most interesting thing.
I mean, like, so often happens in AI.
It's almost like the, the AI isn't the
thing, but it is triggering the bigger
discussion, which I think is fascinating.
I guess, Vyoma, you work a lot
with customers and clients.
Is, is the environmental
discussion kind of popping up?
Like, are clients raising it?
Or, you know, people looking for solutions
on the kinds of solutions that you work on
saying, I want you to deliver this, but we
have to make sure that the emissions are,
are good, you know, on, on what you deliver?
I'm just curious about what you're
seeing kind of on the front lines there.
Yeah, of course.
Yeah, that's a good question.
In 2023, we were just getting
up to speed with this technology.
People wanted to know more about it.
But in 2024, now we see that so many of our
clients want to make it much more sustainable.
As you see with these clients and companies:
Microsoft, and Sam Altman, who is also
investing in a company called Oklo.
Google has its own approach, and Amazon has
its own different ways of investing
in some of these nuclear plants.
But as you see, they are trying
to make this more sustainable.
They're trying to avoid the lag.
Because if something like models or
AI runs on nuclear energy,
they run much faster, more seamlessly.
There is far less chance of it
breaking in the middle so that you
have to rerun those pipelines, which take
hours and hours of compute and resources.
So that is something that we are
making clients much more aware of.
I remember I was at a client location
two weeks ago and I was telling them
that right now, 15 to 20 percent of our
electricity comes from nuclear plants.
That's something that we have to look into.
The government is also helping you with
the Inflation Reduction
Act, giving you tax credits
for that as well, because, as mentioned, we
have a much better structure around it.
Um, the technology has evolved, trust has evolved,
and we should be doing fine.
And one thing that I wanted to add here:
not everyone wants to be leveraging these
large language models to do their jobs.
People are pivoting towards having
a smaller model which can do just
the job right by techniques such as
fine tuning or even prompt tuning.
So I feel that is also a trend
that I'm seeing nowadays.
Yeah, for sure.
And I think you
and Volkmar actually represent really
two sides of a very interesting coin.
Take the argument
that you just made.
Actually, customers are thinking about
smaller models as a way of reducing
their kind of like energy footprint for
the deployments that they want to do.
And Volkmar also says, well, look, a lot of
the projections are based on the idea that
the chips that we're getting, the boards
we're getting, like that energy consumption
is just going to be the case forever.
But it's actually likely that
the next generation will actually
consume a lot less energy as well.
And so there's actually this really interesting
interplay of, basically: does the
model need to consume as much energy,
does the actual hardware need
to consume as much energy, and what are the
efficiencies that you're going to get accordingly?
Um, I could see, basically,
a world where, I guess, Volkmar, what you're
saying doesn't come to pass for some time.
And so customers increasingly want
smaller models to deal with this question.
I can also imagine a world where there's
some breakthrough where the next generation
of boards is like so energy efficient
that people are like, let's just run
the biggest model that we can because it
costs a lot less energy than it used to.
It's just way more efficient.
Um, I think it'll be really interesting
to see that play out, but I'm
curious if either of you have an
impression of what's going to hit
first; that seems to be the question.
The moment you have something which is so
dominant in the market, uh, and costs so
much money but has a huge upside potential,
uh, innovation will take place, right?
And we are already,
like, if you look at inferencing,
in a perfect market, right?
It's a commodity.
Uh, you pay by, by tokens, um, and so you now
have price competition, and so the race to the
bottom is on, right, and so the race to the
bottom is, uh, across different disciplines, so
I can make smaller models, they run faster, I
can make faster inference, um, or I can produce
power more cheaply, right? And what I'm
expecting is that each participant in this
market, because it's such a big market, right?
If you consider something being, you know,
2, 3 percent of total power production
or consumption in the United States, um,
that is billions of dollars at stake.
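The token-priced commodity framing can be made concrete with a small sketch. The provider names and per-token prices below are entirely made up for illustration, not any vendor's actual rates:

```python
# Illustrative only: hypothetical per-million-token prices for three
# interchangeable inference providers in a commodity market.
providers = {
    "provider_a": 15.00,  # $ per 1M tokens (made-up number)
    "provider_b": 10.00,
    "provider_c": 2.50,
}

monthly_tokens = 500_000_000  # assumed 500M-token/month workload

# For an undifferentiated good, demand moves to the cheapest seller,
# which is what drives the race to the bottom across the market.
costs = {name: price * monthly_tokens / 1_000_000
         for name, price in providers.items()}
cheapest = min(costs, key=costs.get)
print(f"cheapest: {cheapest} at ${costs[cheapest]:,.0f}/month")
```

The point of the sketch is the sorting step: once tokens are interchangeable, price is the only axis, so every participant has to innovate on models, hardware, or power to stay the minimum.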
And so I think each of them will innovate.
You know, the model people will innovate
on the models, the hardware people will
innovate on the hardware, and the power plant
people will innovate on the power plant.
So I think overall we are better off.
But, you know, because now there is
a very specific problem which then
radiates into the rest of the economy.
So if we can suddenly make power at
half the cost, that's wonderful, right?
It will make, you know, a model cheaper.
Yeah, there's like other
reasons why we want to do that.
Right, exactly.
Volkmar, this is the time you
should get back to the Bay Area.
Startup idea.
Yeah, exactly.
Have you considered getting into fusion?
Yeah. There you go.
I'm going to push this on to our next
topic that I really wanted to talk about.
Um, Anthropic just last week, uh, launched
a new feature, uh, called computer use.
Um, and the basic premise of it is pretty
simple and it's kind of a fun feature.
It's basically the idea that, you know,
ultimately, your AI, your agent, will
be able to take over your mouse and pilot
your cursor around and do things for you
as if it were a user on the screen.
Um, and uh, and this generated all sorts of
really funny stories I want to talk about.
You know, one of them is that they talked about,
during testing, how the computer
use feature would occasionally get distracted.
So, partway through doing a task,
it would take a pause to
look at photos of Yellowstone National
Park for a while before going on to its
next task. These models actually
have these very funny
simulations of actual human behavior.
But I think I want to just first start
with, like, the business question.
Um, and, you know, maybe I'll
toss it over to you: why is Anthropic
working on a feature like computer use?
Like is it just a cool demo from a research
lab or is it actually really connected to
what they need to do as, as a business?
Look at Anthropic, look at agents, everything
that all these companies are trying to do
is come up with some sort of a symbiotic
relationship between humans and machines.
And whatever use case that you take
in this case, I think Anthropic
is just trying to do that.
I feel, um, with the
Claude models that are coming into play,
they are trying to help augment some of our
behavior and help us make our lives better
or help us be so much more productive.
I was just speaking about this, uh,
to my mother yesterday and she's
like, I need to book this ticket.
Help me.
And I'm like, I'm in the middle of a meeting.
I don't have time for this.
Just give me half an hour.
I was reading about it and I was like,
imagine if she had the computer use model.
It would help so many people in training,
enablement, people with disabilities.
It has a social impactful angle
to it, which just goes unseen.
And I feel those are the things that the
market, the people in the market, the clients
want going into the future.
So that's something that I
feel has a great potential.
Yeah, for sure.
I think Volkmar, this is kind of fun
because it does connect to what you're
talking about earlier in terms of like
innovation on the interface level.
Um, you know, I, I think what's really funny
is like we invented GUIs and the operating
system in part just because like we needed an
easier way for humans to interact with machines.
But now we have this very funny thing
where now the machine is taking over
that interface to pilot the machine.
Um, and it's kind of like a very funny
historical development that that ended up
being the case, but kind of curious about
how this fits into your earlier thoughts
about, you know, all this innovation
that we're seeing on the interface side.
Yeah.
So I think.
When I looked at their video where they demoed
it, um, it felt kind of useless at first.
Tell us more.
So, why?
And, um, but I think there is, like, a
certain level of smartness behind it.
So.
I believe that if a computer interfaces with
a computer, there are much
better ways you can actually do that.
You know, so, like, if you think
about it, they use the browser.
So, like, there's a browser, there's
an engine, that engine is JavaScript.
I can just directly hook
into a JavaScript engine.
I don't need to render something into pixels.
So that rendering effort, and then
translating that rendering
back, is just insane, right?
So if computer-to-computer
interaction happens, you do this through APIs.
Now, I think we are seeing something very
interesting emerging, which is the API to
the computer becomes the English language.
That's effectively what happens with large
language models too, right?
So I talk to you in English
and you're interfacing with
the outside world.
And the outside world, like, you know, if you
look at what ChatGPT is doing, is, you know,
they're, they're creating a Python script, and
they run the Python script for you to automate
a task, and they pull data out of the internet,
uh, and then, you know, they convert it into
JSON, and then they give you an answer back.
So they are the
translator in the middle.
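The "translator in the middle" pattern described here, where a small generated script turns raw data into structured JSON for the user, can be sketched as follows. The rows are inlined stand-ins rather than data actually pulled from the internet:

```python
import json

# Stand-in for raw data a model-generated script might pull from the web.
raw_rows = [
    "2024-10-01,copilot_launch,enterprise",
    "2024-10-22,computer_use_demo,research",
]

def translate(rows):
    """Convert raw comma-separated rows into a JSON string, the way a
    generated helper script converts scraped data before answering."""
    records = []
    for row in rows:
        date, event, category = row.split(",")
        records.append({"date": date, "event": event, "category": category})
    return json.dumps(records)

# The structured result that would be handed back to the user.
answer = json.loads(translate(raw_rows))
print(answer[0]["event"])
```

The model never touches the raw rows directly in this pattern; it writes and runs the translator, then reasons over the JSON.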
So I think there is the ability to
actually interface with the other channel of
human perception, which
is the visual one and not the, you know,
text-based one, which is usually auditory or,
like, reading letters.
And so suddenly, if I can understand
the visual domain a human is consuming,
now I can actually interface with that.
So if I were a business, uh, and I were doing
what Anthropic is doing, my guess would be that
they're probably looking at, uh, automating
development processes and automating debugging.
Right?
So what the demo is effectively
showing is, like: hey, look, we can do this.
But if you convert this into something
which has economic value, it is probably
in automating the testing and quality control, the QA, of
software development, which, you know, has
millions of people employed today.
So I think, from a, you know,
business value perspective,
that's the direction I would take this.
And then it's no longer
about, um, you know, replacing
machine-to-machine interaction, but actually
doing what the human is doing and asking,
okay, are all my buttons correctly aligned?
Is my text formatted correctly?
And now then, then it makes sense, right?
So it's effectively in that realm.
Quality control, potentially data generation,
where you can actually visually
inspect whether your code generation
was correct, whether the webpage renders
correctly in all browsers, et cetera.
That's where I can see
where you could take this.
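A minimal version of that visual QA idea, comparing a rendered screenshot against a reference and flagging differing pixels, might look like the sketch below. Real pipelines would use browser automation and image libraries; the tiny grayscale "screenshots" here are just nested lists of pixel values:

```python
# Tiny grayscale "screenshots" as rows of pixel values (0-255).
reference = [
    [255, 255, 255],
    [  0,   0,   0],
]
rendered = [
    [255, 255, 255],
    [  0,  30,   0],  # one pixel rendered differently
]

def diff_pixels(a, b, tolerance=10):
    """Count pixels whose values differ by more than `tolerance`."""
    return sum(
        abs(pa - pb) > tolerance
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )

mismatches = diff_pixels(reference, rendered)
print(f"{mismatches} pixel(s) out of tolerance")  # a QA gate could fail on > 0
```

The tolerance matters in practice: anti-aliasing and font rendering differ slightly across browsers, so a strict equality check would flag every page.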
Yeah, that's really interesting.
Yeah, it's kind of a debugging thing.
I think it's fascinating that
their stated reason
for releasing it is not really,
ultimately, the business purpose.
Um, I mean, one angle, which I don't know if
you buy, Volkmar, is, you know,
we don't live in a world with perfect APIs.
Right.
Um, and it is possible that you could
imagine these kinds of models being helpful
for, you know, facilitating interactions
when, you know, there's no clean API
for the system to talk to a system.
I don't think you would do this in the
visual domain, rendering something in a
browser and having a laptop somewhere.
I think it's still like a crazy way to do it.
Yeah, it's just such an inefficient way.
How do I convert, you know, like 10
characters of JSON into a million
pixels and then try to understand that?
Um, and so the, I think there will
be a different layer, but I think
each of these layers has a value.
And so, I mean, if you want to make it
efficient, you could also have the code
for the API generated by
the large language model, right?
But now you can go one layer up, and you say,
okay, I run a JavaScript engine, and then the
next layer up is, I run the output of
the JavaScript engine in a web browser, and
I'm reading the pixels off the screen, right?
So, well, read the DOM; you know,
they could have just read the DOM out instead
of actually converting the DOM into pixels.
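Volkmar's point about reading the DOM instead of pixels can be sketched with the standard library alone: the structure an agent wants is already in the markup, no rendering required. The page and button ids here are hypothetical:

```python
from html.parser import HTMLParser

PAGE = ('<html><body><button id="save">Save</button>'
        '<button id="cancel">Cancel</button></body></html>')

class ButtonFinder(HTMLParser):
    """Pull button ids straight out of the markup -- the same information
    a screenshot-reading agent would have to recover from pixels."""
    def __init__(self):
        super().__init__()
        self.buttons = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self.buttons.append(dict(attrs).get("id"))

finder = ButtonFinder()
finder.feed(PAGE)
print(finder.buttons)
```

A few dozen bytes of markup yield the same two buttons that a vision model would otherwise have to locate in a million-pixel screenshot.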
But, you know, that's why my
immediate reaction was like,
oh, yeah, this is kind of weird.
I think from a quality control
perspective, that's huge, right?
And then now you can also say, okay,
please judge whether this interface
is better than that interface.
So suddenly you can do experimentation.
And I think that's where the true value comes.
If you can actually understand the screen.
Well, we actually have, I
think, a little bit of a difference
of opinion between you and Vyoma.
'Cause I think, Vyoma, you made an
argument a little bit earlier, which is,
this is, like, amazing as a way of interfacing
with agents for, like, your mom, right?
I'm kind of curious; it seems
like Volkmar is taking a very technical
approach, which I think is very genuine,
right, which is that there are much more efficient
ways of doing what computer use is doing.
I think one of the things that you're making
an argument for, though, is it like might
help people understand and interface with
these systems better, even though it's
kind of like technically less efficient.
I don't know if you would
agree with that at all.
Yeah, there are two caveats to this. Right now we belong to the tech space; this is something we do day in and day out. But when I go out and talk to clients, they have not even embarked on this journey of AI. They're still working with traditional legacy models, legacy systems, where they do not even know what AI does or where to go from here. So to onboard these clients and these use cases, I feel this is a great starting point to show them the value and then get them excited about it.
One of the use cases that I have seen in the past couple of days is this: there are people who are retiring and who have a lot of information about COBOL or legacy systems or network issues, et cetera. Where does all of this legacy system information go now? Their companies are concerned about how to reuse all of this information, and about how, before someone retires, we can fold that information into the new systems we are building. Imagine if you have something like computer use which looks at, okay, these are the logs or network issues that have been logged over the past couple of years, and this is how we can embed them into our new software. And it helps people understand through that process that this is not something which is going to try to replace you, but something that is going to make your life much easier and bring back all the lost information.
So, code translation and code understanding, sure, are great use cases. Validation and testing are great use cases. And the other main use case that I see with computer use in this entire process is understanding: understanding the language, understanding the code. Let's say someone built something like a 70-year-old COBOL function. It will tell you, or anyone, step by step: this is what is going on, this is how it's going to work, go to the next step, et cetera. So it can be broken down into multiple steps.
That's great.
Well, we'll have to see how this evolves, and I guess we'll have a long-running bet on whether this ends up being a debugging feature or a user-facing feature.
The final story I wanted to focus on today was a really interesting one that came out of Google. They announced an advancement they were working on called SynthID Text, which they've integrated into Gemini. The whole idea of SynthID Text is to help watermark AI-generated text.
And if you're familiar with this space, traditionally the problem is that if you watermark text this way, you have to force the model outputs into shapes that are often not great for actually solving what you need to solve, right? Their claim is that this methodology is better because you can do the watermarking, that is to say, identify which text was created by AIs, without compromising the quality, accuracy, creativity, or even speed of the text generation.
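For listeners unfamiliar with how text watermarking can work at all, here is a toy sketch of the general family of techniques: a "green-list" sampling bias, in the style of academic watermarking schemes. This is NOT Google's actual SynthID algorithm, whose details differ; it is a minimal illustration of the idea that the generator nudges sampling toward a pseudorandom subset of tokens keyed on the previous token, and a detector later counts how often that subset appears.

```python
import hashlib
import random

VOCAB = [f"tok{i}" for i in range(1000)]  # toy vocabulary, not a real tokenizer

def green_list(prev_token: str, fraction: float = 0.5) -> set:
    """Pseudorandom 'green' subset of the vocabulary, keyed on the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * fraction)))

def generate(length: int, watermark: bool, seed: int = 0) -> list:
    """Toy 'model': samples uniformly, but biased toward green tokens when watermarking.
    A real model would bias the logits before sampling, not resample like this."""
    rng = random.Random(seed)
    out = ["<s>"]
    for _ in range(length):
        greens = green_list(out[-1])
        if watermark and rng.random() < 0.9:   # strong bias toward the green list
            out.append(rng.choice(sorted(greens)))
        else:
            out.append(rng.choice(VOCAB))
    return out[1:]

def green_fraction(tokens: list) -> float:
    """Detector: fraction of tokens that fall in the green list of their predecessor.
    Unwatermarked text hovers near the base rate (0.5); watermarked text sits well above."""
    hits = sum(1 for prev, tok in zip(["<s>"] + tokens, tokens) if tok in green_list(prev))
    return hits / len(tokens)

marked = generate(200, watermark=True)
plain = generate(200, watermark=False)
print(f"watermarked green fraction: {green_fraction(marked):.2f}")  # well above 0.5
print(f"unmarked green fraction:    {green_fraction(plain):.2f}")   # near 0.5
```

The quality trade-off the host mentions shows up here too: the harder you bias sampling toward the green list, the easier detection becomes, but the more you constrain what the model can say; Google's claim is essentially that their scheme keeps that constraint imperceptible.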
Okay.
And so, Vyoma, maybe I'll kick it over to you first: why is something like this important? Do we need watermarking for text? What's it for, even?
Let me answer this one by one. We do need watermarking for text. And again, it is quite controversial that I've said that. Google has been very bold to at least come out with this product and be so vocal about it. There are companies who've been experimenting with this; I know OpenAI has been experimenting with it, but they've not brought it out to the public yet.
Because some of these companies fear that people will stop using their models now that there's a watermark angle to it. Writers, et cetera, will think, oh, now I'll be caught, or something like that; that really runs in the back of their minds. But I feel watermarking is not there to judge you. It creates some sort of ethical standard, a standardization, and that is something everyone is trying to move toward: some sort of regulation that if X amount of tokens are generated by Y amount of models, then this is what we saw, and this is how it should be watermarked.
There is some sort of logging that we are doing on top of it. And I feel that is what brings a lot of confidence to clients, and a lot of confidence to people as well: that whatever model I'm using, whatever text has been generated, there are some marks or metrics attached to it. That is the angle I like to take on this, because I work very heavily in AI ethics, standards, and policies, and this topic comes up every other day: how do I know whether a decision it takes, or text that has been generated, is right or wrong? There are teachers who come up to me and say, oh, I don't know if the student copied this assignment. It is going to help all of us, students and teachers, create a healthier environment to sustain AI.
Yeah, no, I think it's great. Volkmar, I'm curious if you have any thoughts on this. I mean, clearly this is not the kind of thing that's going to solve the use of these models for spreading fake information or something like that, right? But I don't know if you agree that these kinds of measures are really necessary to make this technology be used in an ethical manner.
So I'm on the totally opposite side of this.
Yeah, let's hear it. Somehow I knew. I knew going into it.
So, I have two school-aged children, and the schools are desperately trying to prevent kids from using ChatGPT to write their essays. And I believe they should just do everything in ChatGPT.
And the reason is that ChatGPT does not substitute for thinking, right? It just substitutes for, or rather enhances, the process of content creation.
So, what we are now arguing is: I have a tool, and I need to tag everything which has been produced with that tool. But if I use a power drill instead of drilling the hole by hand, I'm not tagging every hole I drill into a wall saying, wow, I used a power drill to make this hole and therefore I need to tell you. I used a tool which amplifies my personal capabilities. And it's not that every time I walk somewhere, I say, well, I drove here by car, and I need a tag that I arrived by car and used energy which came out of fossil fuels, and therefore I need to announce it to the world.
So I think we are in a world right now which is bifurcated. We have a society which is kind of split. There's the part of society which actively uses large language models and harnesses their power, and there's the part which doesn't. And then we have the people who want to regulate everything and want to tell everybody how to live, right? So now it's like, oh my God, we need to protect the people who are not using large language models.
And the poor teachers, they need to change their way of educating the kids, and it will only take a hundred years until they get there. So let's give them tools so that they can keep doing the useless teaching they've been doing for a hundred years, and then we can figure out if someone is actually using tools of the 21st century, so that the teacher can punish them for it. It's like saying, well, I need to walk to school even though my parents could drive me and I could save 20 minutes, right?
So I think we are in a world right now which is still split, and we are at a breaking point. The technology is not yet widely adopted, but a good chunk of society, the early adopters, in particular children, are using it. I mean, ChatGPT probably grew like crazy when the first kid found out it could write an essay with it. So I think we need an education system which embraces it, and we need a corporate system which embraces it.
The second thing is that there's a certain arrogance by Google to say, oh look, we can watermark it. I was like, yeah, I'll use another chat agent, which I can just download from the internet, which removes your watermark, and you're done. Even the thinking that a company has such broad distribution that they can actually push watermarking onto the world.
It just tells you, okay, there will be models of different value. There's the Google model, which watermarks everything, and there's the non-watermarking model, which is actually much more valuable, because nobody can see that I used the tool, right? And so, of course, you just create an economy of cheating, because you are trying to tag everything, except that you, being Google, have the watermarking for your own purposes. So just the idea that you could actually do this is ridiculous, from my perspective.
We can agree to disagree on this. There are two caveats to what Volkmar mentioned. There are people who know AI and understand AI, and there are people who are scared to use it. So I feel the merger point, a point where everyone's comfortable with it, comes when all of these techniques and tools have been experimented with for a while. I still feel we are a little fresh into this. Look at the internet revolution, and then look at how recently ChatGPT arrived, or how recently we've made ChatGPT agentic.
Since this whole large language model boom came in, it has been such a short period of time that, to be completely honest, there haven't been enough products or use cases that have gone into full-fledged production yet. So until we reach a point where we see the effects, the long-term effects, of all of these techniques, we can keep thinking about the best ways to regulate or not. But till then, just keep experimenting, keep working on this, and I feel we'll all come to a merging point where everyone is comfortable.
I mean, this is true for every technology which has been invented by humanity. If something is only three years old, we do not know yet, so let's experiment with it. The U.S. in general is always: we first try, then we figure out what works and what doesn't work, and then we regulate it; not: let's first anticipate every bad problem that could occur and regulate it before anything has happened. So I think the U.S. will probably be reactive in regulation. Typically, regulators are ten years behind. So let's build something valuable first before we try to figure out how to put guardrails around it.
We could go much longer on this.
Uh, Vyoma, we'll have to
have you back on the show.
Thanks for coming on.
Um, and, uh, Volkmar, it's a pleasure as always.
Thanks for joining us.
If you enjoyed what you heard, you can get
us on Apple Podcasts, platforms everywhere.
And listeners, we'll see you next week.