
AI, Tennis, and the Future of Journalism

Key Points

  • The panel emphasizes that despite AI advances, the human element remains essential, especially for sports journalism.
  • Economic incentives shape whether users are treated as customers or products, influencing AI deployment decisions.
  • Experts advise against asking LLMs for information you already know, suggesting LLMs should augment rather than replace personal knowledge.
  • The podcast “Mixture of Experts” ties AI discussion to the ongoing U.S. Open, using tennis as a real‑time test case for AI‑generated content.
  • Experiments at the U.S. Open involve hybrid workflows where human writers and large language models collaborate to produce both short and long‑form match reports.


# AI, Tennis, and the Future of Journalism

**Source:** [https://www.youtube.com/watch?v=IKaj8gATzoY](https://www.youtube.com/watch?v=IKaj8gATzoY)
**Duration:** 00:40:44

## Sections

- [00:00:00](https://www.youtube.com/watch?v=IKaj8gATzoY&t=0s) **Untitled Section**
- [00:03:12](https://www.youtube.com/watch?v=IKaj8gATzoY&t=192s) **From Live Sports Data to AI‑Generated Reports** - Aaron explains how their message‑driven system ingests real‑time match events from multiple feeds, formats them as JSON, and feeds them to a large language model to instantly generate written summaries.
- [00:06:21](https://www.youtube.com/watch?v=IKaj8gATzoY&t=381s) **Agentic AI for Sports Commentary** - The speakers explore how agentic LLM architectures can mine and distill quirky, edge‑case sports statistics into engaging, human‑like live commentary.
- [00:09:54](https://www.youtube.com/watch?v=IKaj8gATzoY&t=594s) **AI-Enhanced Sports Commentary Discussion** - The speakers explore using AI agents to generate statistical insights and narrative content for sports broadcasts, debating which sports, such as tennis, are most suitable for these applications.
- [00:12:59](https://www.youtube.com/watch?v=IKaj8gATzoY&t=779s) **AI-Driven Sports Insights Platform** - Panelists discuss integrating generative AI tools like Perplexity to deliver real‑time, conversational tennis coverage, illustrating how concrete applications make abstract AI tangible.
- [00:16:05](https://www.youtube.com/watch?v=IKaj8gATzoY&t=965s) **Ad Models Drive Attribution Tech** - Participants discuss how reliance on advertising limits subscription growth yet spurs investment in source‑attribution technologies for LLMs, potentially enhancing trust and mitigating hallucinations.
- [00:19:15](https://www.youtube.com/watch?v=IKaj8gATzoY&t=1155s) **Balancing Revenue, Trust, and AI Transparency** - A panelist argues that ad‑based AI businesses must provide advertisers with guarantees while generating enough revenue to sustain the industry, yet they must also preserve user trust and transparency to keep highly educated users engaged.
- [00:22:24](https://www.youtube.com/watch?v=IKaj8gATzoY&t=1344s) **AI Authority and Critical Thinking** - The speakers warn that persuasive, confident language‑model answers can erode users’ ability to trace sources and think critically, highlighting the need for transparency, advertiser accountability, and new filtering strategies compared to traditional Google searches.
- [00:25:27](https://www.youtube.com/watch?v=IKaj8gATzoY&t=1527s) **Beyond Copilot: Code Understanding** - The speakers debate whether Copilot is already obsolete, stress the trust challenges of LLM auto‑completions, and argue that the next breakthrough in AI for software engineering will be tools that help developers comprehend massive, unfamiliar codebases.
- [00:28:46](https://www.youtube.com/watch?v=IKaj8gATzoY&t=1726s) **From Prompts to Agentic AI** - Speakers examine the transition from larger‑context prompting toward autonomous, planning‑driven AI agents and debate the balance between human‑in‑the‑loop oversight and fully self‑directed operation.
- [00:32:04](https://www.youtube.com/watch?v=IKaj8gATzoY&t=1924s) **Balancing LLMs with Engineer Skill** - The speaker acknowledges that large language models boost code quality and productivity but warns they can erode essential coding expertise and creativity—especially for new paradigms like quantum computing—so they advocate limiting tool usage to preserve engineers' sharpness and long‑term team continuity.
- [00:35:16](https://www.youtube.com/watch?v=IKaj8gATzoY&t=2116s) **Legacy Code Barriers and IBM Fellows** - The conversation explores the challenges of modeling legacy and esoteric languages such as COBOL and PL/I—highlighting financial incentives to overcome these barriers—before host Tim Hwang introduces three distinguished IBM Fellows and outlines the prestigious program’s goals and future direction.
- [00:38:20](https://www.youtube.com/watch?v=IKaj8gATzoY&t=2300s) **Technical Fellows as Guardians** - The speakers reflect on the honor and responsibility of being a technology fellow, emphasizing their role as a check‑and‑balance for business decisions and as an inspirational benchmark for others.

## Full Transcript
0:00 Tim Hwang: So is AI going to wipe out all sports journalists?

0:02 Aaron Baughman: No matter the sport, you know, we're always working with the same constant. That's the human.

0:07 Tim Hwang: Seems to me that paid search and Perplexity poses some really big questions.

0:11 Trent Gray-Donald: It's all about simple economics and who's incented to do what, and are, are you the customer or are you the product?

0:19 Tim Hwang: Should I be using Cursor?

0:22 Kush Varshney: You shouldn't ever ask a question of an LLM these days, at least, that you don't already know kind of the answer for yourself.

0:28 Tim Hwang: All that and more on today's episode of Mixture of Experts. I'm Tim Hwang, and I'm joined today, as I am every Friday, by a world-class panel of engineers, researchers, product leaders, and more to hash out the week's news in AI. On the panel today: Aaron Baughman, IBM Fellow; Kush Varshney, IBM Fellow; and Trent Gray-Donald, IBM Fellow. So to kick us off, the U.S. Open is this week. Um, and as usual on Mixture of Experts, we're, of course, excited about the tennis, but we're really excited about, like, the AI, and I really want to talk about the role of AI in the U.S. Open. Uh, but first to kick us off, because I personally am a huge tennis fan, let's just go quickly around the horn. I want everybody's, uh, nominee for the best tennis player of all time. Um, uh, Aaron, we'll start with you.

1:26 Aaron Baughman: Yeah, so that's a great question. Easy answer. Ben Shelton.

1:30 Tim Hwang: Great. Kush.

1:31 Kush Varshney: Leandro Piz.

1:33 Tim Hwang: All right. I like that one. Very good. Very good. And, uh, and Trent, how about you?

1:37 Trent Gray-Donald: Oh, I, I prefer squash. So Jonathan Power.

1:40 Tim Hwang: Okay, great. Well, thanks. Well, uh, I asked that question to kind of kick us off on the discussion today. Of course, the U.S. Open is happening right now as we record, uh, this episode. And as usual on Mixture of Experts, we're excited about the tennis, but what we really want to talk about is the AI. Um, and Aaron, in particular, I wanted to kind of have you on the panel and for you to kick off this section, because I understand that you've been experimenting with using language models to generate both long- and short-form stories, uh, for the Open. Um, and I wanted to talk a little bit about what you're discovering, like, uh, like what's working really well, uh, in these experiments that you've been trying out.

2:18 Aaron Baughman: Yeah. Well, thanks for having me. It's really fascinating to watch how we apply these AI technologies, in particular these agentic architectures with a diversity of large language models deployed out at scale, uh, to the U.S. Open that's happening right now. So if you go to dub dub dub dot usopen dot org and even go to news, you can see a lot of our stories that are created with both human and large language models together. But in general, we have two different types of projects. One of them is creating hundreds of match reports, pre and post, long and short form, for 255 of these different matches. And then the second project that we have is called AI commentary, where we take stats and we transform that into a different data representation like JSON, and then input that with the prompt to get out text, and then that's voiced over with text-to-speech and that's embedded into these highlight videos.

3:12 Tim Hwang: Yeah. That's really cool. And tell me a little bit more about like how that works exactly. So how do you go from a game to like a report about a game, right? Presumably there has to be a feed about like, oh, this person just, you know, had a, had a great serve, for instance. Um, like how do you do that conversion? Right.
3:27 Because I think what's interesting is you're going from, you know, video and visual, uh, to a written medium, and I'm kind of curious about how you guys approach that problem.

3:35 Aaron Baughman: Yeah, it's really neat. This is all about message-driven architectures, where whenever we get a score, you know, for example, when a match ends, then we get a message, and within seconds, less than seconds, we'll then take that message and we'll pull in from about 14 different feeds that have raw data that describes the players, the match, where they are, and also what they've done in the past. And we also forecast what's going to happen in the future. And we take all of that and we turn it into a representation that a large language model can understand, right? And we put it into the context of a prompt. So it could be JSON elements that describe, you know, with key values, what's happening in tennis. So like how many aces is somebody getting, or how many breaks has somebody won in the match, right? And then all of that is packaged together, and then we push that into the scaled-out architecture that we have, Granite, for example, um, and we pass it in with the prompt, and then the output would be a fluent text that describes the scene that's either just happened or that's coming up. And it's, it's really cool to see it, you know, live as it happens. And there's all sorts of fact checking that happens, and quality checks, novelty pieces, and similarity, to make sure that it's up to par. And I use the word par on purpose, because we also do some things for golf as well, uh, which, um, uh, is, uh, part of our over three-year story, you know, that, uh, has evolved, uh, into the U.S. Open.

5:12 Tim Hwang: That's great. And Trent, I saw you nodding on the mention of Granite.
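As an editor's illustration of the flow Aaron describes (score event arrives as a message, raw feed data becomes JSON key-values, the JSON is embedded in a prompt, and the generated text passes quality checks), here is a minimal sketch. Every name in it (`build_prompt`, `fact_check`, the payload fields) is invented for illustration, not the U.S. Open team's actual code, and the model call itself is omitted:

```python
import json

def build_prompt(event: dict, stats: dict) -> str:
    """Package match context as JSON key-values and wrap it in an instruction."""
    payload = {
        "event": event,  # e.g. {"type": "match_end", "match_id": 1312}
        "stats": stats,  # e.g. {"aces": {"Shelton": 11, "Opponent": 4}}
    }
    return (
        "You are a tennis reporter. Using only the JSON below, "
        "write a short, fluent match summary.\n\n"
        + json.dumps(payload, indent=2)
    )

def fact_check(report: str, stats: dict) -> bool:
    """Toy quality gate: every number the report mentions must exist in the data."""
    numbers_in_data = {str(v) for v in stats.get("aces", {}).values()}
    numbers_in_text = {
        tok for tok in report.replace(",", " ").split() if tok.isdigit()
    }
    return numbers_in_text <= numbers_in_data

prompt = build_prompt(
    {"type": "match_end", "match_id": 1312},
    {"aces": {"Shelton": 11, "Opponent": 4}},
)
# The prompt string would then be sent to a hosted model; the call is omitted.
```

In the production system Aaron describes, the prompt would go to a scaled-out inference service (he mentions Granite models), and the output would additionally pass the novelty and similarity checks he refers to.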
I don't know if you've got a connection to Granite as a project, but, uh, I'm wondering if you can kind of paint a picture of where you see some of all this going, right? So, assuming we do these experiments in golf, sorry, these experiments in tennis, should we expect to see in five years that, like, you know, a lot of sports coverage, a lot of sports summaries and commentary, really will be AI-generated? Or do you think this is more of like a, a sport-specific thing, for instance?

5:38 Trent Gray-Donald: I, I think that this is just the beginning of a lot of different initiatives. Uh, the, the reason I'm nodding is that Aaron and I actually, so I run the watsonx, uh, service that does the inferences that Aaron causes. So he's basically calling my service when he does the work. This is your baby. Well, right, but I, I'm always in the, like, I'm the plumbing, right? Yeah, sure. He does all the, the interesting domain-specific work around tying together all the data sources, and it just ends up, you know, coming into our service. So he and I worked together on figuring out, okay, how do we make sure that we can handle the capacity and the latencies and all those different things. Right. But in general, how, how Aaron's built it, and the, I see this whole agentic universe. Uh, I mean, there's, there's from highly scripted through to let the LLMs do what they'll do. And there's obviously a big middle, um, uh, there's a lot of different points in that spectrum. And I think that for live events, for, uh, more and more human things like sports, we're going to start seeing increasingly interesting agentic architectures emerging that will extend beyond a given sport into more and more. I could, I could see that. I think the, the interesting question is always, uh, can you find the right unique snippets to tell people.
7:16 Like one of the, one of the jokes that we have is, we're big baseball fans, and we're listening to the play-by-play, and they come up with these ridiculous statistics: this is the third player since 1943 who stood on their left foot and wiggled their ear.

7:31 Tim Hwang: That's right. Yeah, yeah. I've come to expect that. I mean, I watch a lot of, like, um, soccer, right? And like, it feels like the commentators, just to fill space, just have this remarkable bank of like the most edge-case statistics you could think of.

7:43 Trent Gray-Donald: Well, exactly. The question is, can, can we capture and distill that? Like, obviously there's a lot of data mining going into producing those right now. It's, okay, how do we connect those and make it engaging and interesting and human?

7:55 Tim Hwang: Yeah, for sure. Um, and I guess, Kush, curious about how you think about this. I know one aspect of your, uh, fellow work, right, is that you think a little bit about AI governance, which ultimately is kind of like, how do we think about the influence of these systems on people. Um, and, you know, I think one response always is like, okay, well, what is a sports journalist supposed to do in the future, in this world where a lot of the work that they currently spend time on is, you know, generating this coverage, generating this commentary? Uh, curious about your thoughts on like how that all looks, right? Because I think as Aaron has already said, there's like ways of getting humans and machines to kind of work together on this front. Um, but I would love to kind of hear a little bit about like how you sort of see that relationship evolving, and, and is there a role, right, uh, I guess, for humans in sort of an AI-enabled, you know, sports future.

8:41 Kush Varshney: Yeah, I think we're going to talk more about this towards the end as well.
8:44 I mean, different sorts of human collaboration. And, um, the way I think about it, it's not so much of, uh, what is it about the job that, uh, we're trying to automate away and these sorts of things, but really the question of the dignity of the humans involved in this. Because, um, uh, if you're the human and you're subservient to the AI, I mean, you have no dignity left in many ways. Um, so what are kind of the workflows that we can set up such that, uh, you get a better product, still are getting the advantages of automation, but still leaving the dignity of the human intact? And, uh, one way to think about it is, uh, like, if you remember, um, House MD, Dr. House, the TV show, and, um, he had his, uh, whatever, residents, and, um, they were, like, doing stuff for him, conducting tests, whatever. Um, but, uh, it was very much an adversarial sort of relationship, so, um, like, they were always trying to prove him wrong. And if we can get the AI systems to be in that mode, working with the humans, then the human still stays, uh, with the agency, with the dignity. But, um, so that's the benefit of, uh, of all of the AI technologies. So, uh, I think something like that, um, could, uh, could play out, I mean, as we, we go forward with, uh, with a lot of different AI, um, human collaborations.

10:03 Tim Hwang: Yeah. I love the idea that in the future, there's going to be like a sports commentator that has, specifically, an agent that generates those, like, weird statistics that Trent was mentioning. It's just like an expert on finding and identifying those as the action kind of evolves. Well, before we move on to the next segment, uh, Aaron, maybe we'll close this segment with you, because, uh, this is some of your work that's getting some shine at the, uh, at the Open.
10:24 Um, I'm kind of curious if there's, like, you know, sports that are going to be easier or harder to do this kind of work that you're doing at the Open with. Um, you know, I think a little bit about, like, you know, is this something where theoretically any sport is going to be easy and amenable for the kinds of, sort of, story generation that you're working on? Or if there's certain aspects of, you know, say, tennis or golf, uh, that really make it sort of like ideal for your application case. Um, I guess what I'm asking ultimately is like, did you pick this largely because, like, you love tennis, uh, or, or are there actual kind of, like, scientific reasons for why this ended up being a really good test case?

11:00 Aaron Baughman: Yeah. You know, um, you know, the no-free-lunch theorem, where, you know, there's not a perfect solution for every problem, I think is applicable here, um, because, you know, every sport has a pro and con, right? And it all comes down to what data is available and what is the scale, and what's the use case that fans want to see. So, so I wouldn't say that there's a perfect sweet spot in any one singular sport. There's always a challenge. Yeah. Um, some, some of the challenges I think we've already discussed here is just making sure that we have meaningful stories and stats that bubble up. And some, some of the things that we do is we use, like, standard deviations around, let's say, aces, right? Because you can't say that, um, a pure number of aces is significant. It all depends on how many sets have been played, the gender of the, of the match, who's playing. So we have to break that down and apart. And if we go to, like, racing, we go to football, we go to soccer, you know, it's all very similar, but you apply the same mathematical techniques
to this, to the stats that then can bubble up. Um, one of the other challenges, um, and, and I would say one of the other areas that's real exciting, is getting human and machine working together, because there's this pendulum of how creative do you want these large language models to be, as opposed to how prescriptive do you want to be, you know, with this few-shot learning, for example. And we tend to go somewhere in the middle, but it's all experimental. You know, it's almost like the theory of mind, right? We want to be able to predict what action is a human editor going to take, so that we can meet their expectations whenever we generate, um, that said text. And so no matter the sport, you know, we're always working with the same constant. That's the human. Um, and then the other constant is data, right? We need to make sure that we have access to the data. Um, but it's, it's, it's fun, right? And it's very impactful. And it's a way to bring people together irrespective of creed, gender, and race. And, um, it's just really exciting to use a lot of Trent's work and a lot of Kush's work and bring it together for the world to see.

13:07 Tim Hwang: Yeah, for sure. And I think this is where the magic happens, right? It's like, AI can be very abstract for people. It starts to become very clear if it's got an application like this, right? It's like, oh yeah, I already love this thing, and AI is really helping me, you know, enjoy it more. It makes a huge difference.

13:20 Aaron Baughman: Yeah, yeah, yeah. And, and, and I do encourage you to check out, you know, usopen.org, so you can see our work live in real time and listen to commentary, read the match reports. I mean, it's, it's fascinating to watch the field evolve.

13:38 Tim Hwang: I'll introduce this by talking a little bit about Perplexity.
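Aaron's point above about using standard deviations around a stat like aces (a raw ace count means little without knowing how many sets were played or who is playing) can be made concrete with a z-score on the per-set rate. This is an editor's sketch only; the function name and the baseline numbers are invented for illustration, not real tour statistics:

```python
def ace_z_score(aces: int, sets_played: int,
                mean_per_set: float, sd_per_set: float) -> float:
    """Z-score of a per-set ace rate against a baseline for comparable matches.

    Twelve aces across five sets is routine; twelve across two sets may be
    a story. Normalizing per set and comparing against a baseline (which
    would differ by tour, surface, etc.) lets a pipeline decide which
    stats should 'bubble up' into a report.
    """
    rate = aces / sets_played
    return (rate - mean_per_set) / sd_per_set

# Invented baseline: mean 2.5 aces per set, standard deviation 1.0 per set.
z = ace_z_score(aces=12, sets_played=2, mean_per_set=2.5, sd_per_set=1.0)
noteworthy = abs(z) > 2.0  # flag stats more than ~2 SDs from typical
```

Here the per-set rate is 6.0 against a baseline of 2.5, so the stat would be flagged; the same 12 aces over five sets would not be.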
13:40 Um, so Perplexity is one of these leading companies in the sort of generative AI movement. Um, what they are largely providing is kind of language models as an interface for search. So the idea is, in the future, you'll be able to have much more sort of conversational search experiences, uh, than you have right now with something like a, a Google or something, right, where you kind of type in a search query and you get like a bunch of responses, um, back. And Perplexity has been kind of, in my mind, one of the best products in the space. It's like one of the few ones that I actually pay for and that I actually use on a week-to-week basis. And there was an interesting news story that just popped up in the last few weeks, where Perplexity announced that it would be finally moving towards a model where they roll out paid search. So the background on all this is, in the past, you have had to subscribe to Perplexity, and you pay them a monthly fee. Um, but now they're saying, hey, we're gonna monetize by allowing people to, uh, buy ads on our platform. So if you go, say, search for, you know, what is the best exercise machine, uh, you might see an ad from something like a, like a Peloton. Um, and so this is like a big shift. I think one of the big hopes about this technology was not just that conversational interfaces would be better, but that we might move away from ads to a world of subscription, um, and, and as a result, maybe have a little bit more sort of faith, trust, confidence in, in the search results. Um, and so I guess I kind of want to ask, maybe Trent, I'll start with you, because you're our new addition to the roster of folks at Mixture of Experts. How do you feel about this? Like, does it make search less trustworthy? Like, should we be concerned about this kind of shift on the part of Perplexity?
15:13 Trent Gray-Donald: Well, in my view, yes, absolutely. I'm a big fan of, say, follow the money. And, and it's, it's, it's all about simple economics and who's incented to do what. And are, are you the customer or are you the product? And it's very simple. As you shift to paid search, you become more of the product instead of the customer, and so my, my usual reaction is that this is not going to bode well for, for us as consumers.

15:46 Tim Hwang: Yeah, for sure. I just remember, I mean, famously, there's this essay that was written by, uh, Larry and Sergey, right, who founded Google. And it's like, their essay that they wrote, I think, when they were still at Stanford, and they're describing the PageRank algorithm. And at the very end, they're like, and no search engine should ever use ads, because it would be the most terrible thing for a search engine to do. And of course, lo and behold, right, like, Google is like a 90 percent ad-based company. But I guess it's very hard, Kush, isn't it, to, like, kind of avoid these incentives, right? Like, the problem with subscription is that people need to pay to use your product. Um, and so it does kind of limit user growth and all these other things. Um, is there any way you think of escaping ads as a business model in the space?

16:25 Kush Varshney: Um, I'm really not sure. I mean, but, uh, one thing that I did want to point out, maybe a little bit counter to what Trent is saying, is, um, uh, the investment into an ad-based sort of approach should actually also lead into investment into certain technologies that do help with trustworthiness. So source attribution is a big problem with LLMs. You don't know kind of where the information came from that's in the generative output.
16:54 And, um, if that's what's part of the monetization, then there will be a lot more investment into the techniques, the scalability, of the source-attribution sort of things. And that can actually increase the trust, um, maybe not necessarily always just for the, uh, the ad-driven sort of, um, uh, platforms, but in general. Because the more, um, and better techniques that we have, uh, and we do have better trust for, um, or where the information came from, you can then, uh, go back, trace through, um, different, uh, possibilities for hallucination, things. So I think, uh, incentives can kind of work in weird, roundabout ways. So, um, uh, the ad-driven aspect maybe will or will not, uh, do a good thing for trust, but maybe that'll lead to investment into certain things that do.

17:41 Trent Gray-Donald: Yeah, I think so. I agree in theory, but in practice, what incentive does Perplexity have in providing attribution in a better way? And do they just start obscuring it? And who's to, like, who's got the leverage to not have it obscured, right? I mean, that's always the fundamental thing, is there's always, we could, but we don't.

18:04 Tim Hwang: I mean, I think there's also maybe another element, which is, I don't know if you buy the argument that, like, in a world of chat-based search, the trust problem is particularly bad. Right. Because, like, in the past with Google, you have 10 blue links. You can say, well, why are you giving me this link versus that link? Uh, but in the very least, if we say, we can maybe agree that, like, oh, all the sponsored links should have, like, a little label and a box around them, or something like that. But in a world where it's just, like, a paragraph, I guess you can offer citations, but who's going to actually click through those, you know?
18:30 Um, but I'm curious if you want to kind of respond to, to Trent's thinking there.

18:33 Kush Varshney: Yeah. I mean, first I'll respond to Tim. I mean, I never click on the I'm Feeling Lucky button, because, I mean, I always want to see the 10 results, right. That's right. Yeah. Um, but, uh, yeah, I mean, I think the point that I was trying to make is, um, that, uh, whoever's paying for their stuff to appear, um, needs to be ensured that, yes, I mean, um, the right thing is coming. So, uh, if you're still using a language model in between, then even the ad, um, uh, needs to get through the language model to appear in the output. Just making sure that that happened, um, is going to be needed. And, uh, that same technology can then be used to trace other information or other facts or other stuff. So, uh, what I'm saying is, uh, the reason it needs to be there for an ad-based business is because, um, uh, the people paying for the ads need to have some, uh, sort of guarantee that, uh, that their stuff will appear.

19:30 Tim Hwang: Aaron, I'm not going to let you get away with being quiet on this segment. Curious if you've got any thoughts, uh, if you're on team Kush or team Trent, or, or neither, I suppose.

19:38 Aaron Baughman: Yeah. I mean, I think that the mixing of trying to drive revenue with trust and transparency could be potentially dangerous. You know, it could be used for, um, you know, potentially, um, alternative, you know, methods here, but it is about balance. You know, um, I read this article a while ago about Goldman Sachs, where they said that there's too much AI spend and too little benefit, but in order to keep AI as an industry solvent, that there needs to be revenue, and there's a large revenue gap, you know, today, and it could potentially be growing. Right.
20:11 I, um, I know on this, uh, Mixture of Experts, we talked about the, what, 600 billion gap with Sequoia, you know, a while ago, you know, and, and so that, that really stuck with me. But on the other hand, we need the trust and transparency to maintain users and demand, because once people lose that trust, they're not going to use these systems, or at least I wouldn't, right? And one point I did want to make is that lots of the users for Perplexity, it seems, are, you know, very highly educated, you know, they're high-income, you know, earners as of now. Um, and so, and so they're very, if you can influence, right, that group of people, um, to, to walk down a certain way, then that can influence, you know, um, lots of other people, because they tend to be sort of the leaders in fields. And so, and so just making sure that, you know, Perplexity, a, they publish their papers that describe, you know, their algorithms, the systems, that we can easily access and read, much like Google did, I think is important; creating this digital passport, you know, that describes where the data is coming from, um, so that it's at least available. Um, and then it's up to us as a group, IBM Fellows, you know, to educate: hey, you know, if you're using these AI systems, you know, you need to do your own due diligence as well. You know, um, still maintain your posture and, you know, your own belief system, um, and understand that you're using these tools to help you, but you still need to be a critical thinker.

21:36 Tim Hwang: Yeah, that's well warranted. I mean, I think just to put myself in the shoes of Perplexity, if they were here in this conversation, I think they'd say something like, well, why are we being held to such a high standard?
21:44 I mean, Google's been monetized on, you know, ads for all these years, and people still use it with no problem. You know, why is AI sort of, like, special in that respect? Um, and I, I suppose part of the worry here that, Aaron, you're bringing up, which I think is good, is, you know, this goes to, I guess, whether or not you think that people will be critical thinkers with regards to the technology, right? Like, that maybe the AI, uh, makes it all a little bit too easy, um, in a way that, you know, maybe, like, actually limits in practice how much people will actually click through to the links. I mean, I, I, no, I certainly don't, right? Yeah.

22:16 Aaron Baughman: Yeah, I will, I will say that, that whenever I use, when I'm driving and I'm using, like, a map software, whether this is Google Maps, I completely forget where I'm going. And I probably couldn't retrace where I went, because I don't pay attention, right? So, so there's a danger of not being a critical thinker, because the information just becomes so easy to get. And, and I think we all just need to be careful.

22:37 Tim Hwang: That's right. Yeah, I had an incident a few weeks back where I, like, left my phone in the restaurant. I hopped into the car and started driving. And then I was like, I don't, I don't exactly know how to get back to the restaurant now. It's very embarrassing, so. Um, any final thoughts on this, Trent?

22:50 Trent Gray-Donald: Uh, I, I think some really good points there, and I think Kush's point, about the advertisers are going to want to see where their money is going, is actually an interesting loopback, that is the, an incentive that brings towards, uh, being a little more transparent. But at the same time, like, we're used to Google coming back with a list, and it's up to us.
The problem with chat is that it's more opinionated, and, for lack of a better term, it's got that humanness to it: it feels much more like somebody is just talking to you. And we all know that LLMs talk with authority and with tremendous confidence, even when it's not warranted. So it's going to be interesting to see how we develop the right filters. We all know how to deal with the Google page: you scroll past the first four items, or whatever it is.

Tim Hwang: Right.

Trent Gray-Donald: It will be interesting to see how we build defenses here, and whether they're harder to build.

Tim Hwang: Yeah, I think that is going to be a big open question. We're going to have to learn as a society, right? Just like when the first ten blue links emerged: that was also a whole process, and it feels like we're turning that wheel again.

Trent Gray-Donald: Yeah, exactly.

Tim Hwang: Well, great. I'm going to move us on to our third story of the day. Former Tesla and OpenAI leader Andrej Karpathy tweeted out his love for a product called Cursor, and set off a whole new discourse around the role of AI in software engineering and programming. The unique thing about Cursor, in contrast to something like Copilot or Cody, another company operating in the space, is that it's basically an entirely separate product, a standalone IDE: they forked VS Code and said, okay, we're going to rebuild it from the ground up using this AI stuff.
And one of the most interesting parts of the discourse, if you will (if you follow Twitter; it's a waste of time, but if you do follow it), is that people were making the argument that Cursor is particularly interesting because it's trying to get past the paradigm that Copilot set down. When Copilot launched, the idea was: autocomplete is the way we should think about AI assistance in software engineering. Whereas Cursor is playing around with all sorts of things, right? They're playing around with diffs on your code, they're playing around with chat interfaces, which you've seen elsewhere. They're actively trying to push beyond autocomplete as a paradigm. And I'm curious, maybe Kush I'll turn to you: do you buy that? Is Copilot kind of old school already? Is it already becoming version 1.0 of how we thought about AI in software engineering? Do you think we'll look back in ten years and no one will even think about using a Copilot-like interface to integrate LLMs into their workflow?

Kush Varshney: Yeah, that's a great question, and I think it relates to what we've already been talking about today. Do you trust this thing? Are those autocompletes
the things that you can verify yourself? Because you shouldn't ever ask an LLM a question, these days at least, that you don't already roughly know the answer to yourself. Some folks on my team have been doing user studies, asking people which features they would actually want to benefit from in the AI-for-code space, and what we're finding is that it's actually the code understanding problem. When you're given a dump of a new code base, just making sense of it is the biggest problem. It's thousands or millions of lines of code and all sorts of weird configurations. And say you don't even know the language; say it's COBOL or something like that. How do you get a sense of where things are, how it's organized, and what it does? That sort of thing, I think, is an even more powerful use, because once you're at the level of knowing that this is a line or a block that I need to write, you're already well versed in what you need to do. So yes, it can speed things up, but even getting started, I think, is the bigger problem.

Tim Hwang: Yeah, it's funny to think that we've had so much focus on AI literally generating code, but what you're saying is that the future of AI in software engineering is better documentation: the thing that is always difficult to do and that no one wants to spend time on. Trent, doing the watsonx stuff, I'm sure you're interested in that interface. I don't know if you agree with Kush here that it's really this understanding and documentation layer that ends up being the most important thing.
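Kush's "make sense of a dump of a new code base" problem starts with a purely mechanical survey. As a hedged illustration (not anything the panelists describe building), a first pass might just tally files and line counts by extension to see how an unfamiliar repository is organized:

```python
import os
from collections import Counter

def survey(root):
    """Tally file counts and total line counts per extension under root,
    giving a rough map of how an unfamiliar code base is organized."""
    files, lines = Counter(), Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1] or "(no extension)"
            files[ext] += 1
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" so binary or oddly encoded files don't abort the scan
                with open(path, errors="ignore") as f:
                    lines[ext] += sum(1 for _ in f)
            except OSError:
                pass  # unreadable file; skip it
    return files, lines
```

Running `survey(".")` against a repository root returns two `Counter`s; printing `files.most_common(10)` alongside the line totals shows at a glance whether you are looking at mostly COBOL, mostly Python, or mostly configuration.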
Trent Gray-Donald: Absolutely. One of my day jobs is Chief Architect for watsonx Code Assistant, so this is very much my day job. I view this as a very, very young space, and everybody's trying different interfaces and different ways to do it. I see all the statistics, and the number of people who are using chat, or the chat-like things that Cursor makes easy, is very large; it's definitely one of the first features asked for. Everybody thinks for a little while that they want to do code gen, and there is a constituency that does want that, but most people actually revert back to: can you just tell me what the hell my code's doing and help me put it together? That is a big part, and then there's figuring out how to do that and get the appropriate amount of context. Now we have LLMs with larger context windows, and we're getting better and better techniques for building intelligent prompts. But this is going to keep evolving. And then the bigger one, to be honest with you, is going to be the evolution toward agentic systems, where it's much more planning and discussing in the large. And the question is: is it going to be human in the loop, or is it just going to be prompt and send it off?

Tim Hwang: Like, I just want an app that does this, and it just goes and does it.

Trent Gray-Donald: And, going back to the dignity comment, having the human in the loop is where you have a helper that says: you're trying to do this big thing; I think I've broken it down into these six steps. Human, do you agree? And you look and say, oh man, okay, it went right off the rails at step four here. Let's fix that up. Tweak, tweak, tweak. Off we go.
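The human-in-the-loop pattern Trent sketches (planner proposes steps, human approves or edits them, only then does anything execute) can be shown in a minimal, hypothetical form. Everything here is a stand-in: `propose_plan` would be an LLM call in a real system, and "execution" is stubbed to strings:

```python
def propose_plan(task):
    """Stand-in for an LLM planner; a real system would call a model here."""
    return [f"step {i}: part {i} of {task!r}" for i in range(1, 4)]

def run_with_approval(task, review):
    """Show the proposed plan to a human reviewer, who may edit the steps
    or reject the plan outright; only the approved plan is executed."""
    plan = propose_plan(task)
    approved = review(plan)  # human checkpoint: may return edited steps or None
    if approved is None:
        return []            # human rejected the whole plan; nothing runs
    return [f"done: {step}" for step in approved]  # stubbed "execution"
```

A reviewer who spots that things "went off the rails at step four" simply returns a list with that step replaced, e.g. `run_with_approval("migrate service", lambda plan: plan[:1] + ["step 2: corrected"] + plan[2:])`, and the corrected plan is what executes.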
There's the whole Devin universe and whatnot, right? Everybody's experimenting, from really, really tiny baby steps to the other end, which is: hey, let it all fly. And exploring this problem space is going to be fascinating for the next several years, because nobody's quite figured it out and the models are getting that much better. So I'm super excited about where this all goes, and I really welcome the exploration that Cursor is doing around innovating on the interface.

Tim Hwang: For sure. I think it's very exciting. And the joke will almost be that everybody in the future will be an engineering manager; basically, it turns everybody into an EM over time. Aaron, I don't know, are you a VS Code guy? I'm curious, because one of the bets of Cursor, which I think is very intriguing, is this: people are very comfortable once they've set up their IDE. It's almost like setting up your office. You want it to be comfortable, you want to know where everything is, and you want the bindings to be a particular way. What Cursor is attempting in the market is kind of wild: these AI features will be so killer that you would be willing to abandon all that, right? Or at the very least get over the hump of having to spend an afternoon twiddling it to get it comfortable. As someone who builds these systems and works on these technologies, is that prospect attractive to you? Have you tried Cursor? Would you jump onto Cursor?
I actually don't know what your daily setup looks like, but part of it, to me, is whether that value proposition is strong enough to get people to make that shift.

Aaron Baughman: Yeah. I write code every day, and the VS Code IDE is my tool of choice. First, I'm a big fan of pair programming and pair testing: having multiple people work together, maybe on a single task, or a group of people working on an experiment, because it does a couple of things. One, it improves code quality and engineering quality; that's the scientific process. But it also creates long-lasting teams that stay together for years and years, and continuity of people on a team, I think, is important. So relegating software and science to, maybe, prompt engineering has some cons, to me. Of course, the pros are that it accelerates productivity, it can help us code-complete, and it can create different types of comments so we can understand code. So there's certainly a place for it. However, I do think we want to make sure that our engineers and scientists still understand code, can write algorithms, and can create code. There are new programming languages and new compute paradigms, for example quantum; that's a new paradigm where I don't think LLMs can help much yet, though maybe with Qiskit and being able to create Python code. But all these new languages are popping up, and LLMs have to be trained on something, on some pile of data. And if a human can't create that pile of data in a trustworthy way, then I think some of the creativity and skill of the engineer might be lost.
So the hype around Cursor, I think, is real, and it's a very powerful product. But I would encourage folks to put a time limit on how much we use some of these tools, so that we can maintain a sharp blade for whenever we really need to do some engineering, so we don't all become just prompt engineers, right? That's my caution and my thought. And yes, I do use watsonx Code Assistant pretty much every day, through the plugin in VS Code, and it helps a lot; it's really good. It creates different types of comments. I also use Google: I'll go on Google and use the gen AI feature to give me ideas on how to write code better. But I always try to limit myself and my team: hey, let's do 20/80 or 50/50, and let's make sure we're still communicating as a team. That human interaction, to me, is important.

Tim Hwang: Yeah, that implies almost two really interesting things. One is that in the future there will be something almost like screen time for these features; it will basically say, you've hit your limit for the week, no more AI for you. The other is that there's been some discussion about whether these systems will eventually get so good that they actually replace a lot of engineering jobs. But it almost feels like there will be this constant pressure to learn more and more obscure languages, because those will be the areas AI can't touch, since the data sets are more obscure. Which I think will be really interesting to see.
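Aaron's 20/80 rule and Tim's "screen time for AI features" riff both amount to a usage budget. A toy sketch (entirely hypothetical, not a feature of any product mentioned here) might gate assistant calls behind a weekly quota:

```python
class UsageBudget:
    """Toy weekly quota for AI-assist calls: once it is spent,
    the tool refuses and the engineer works unassisted."""
    def __init__(self, weekly_limit):
        self.weekly_limit = weekly_limit
        self.used = 0

    def try_use(self):
        if self.used >= self.weekly_limit:
            return False  # over budget: fall back to doing it by hand
        self.used += 1
        return True

def assist(budget, prompt):
    """Return an AI draft while the budget lasts, otherwise decline."""
    if not budget.try_use():
        return "limit reached; write this one yourself"
    return f"AI draft for: {prompt}"  # stand-in for a real assistant call
```

The point of the design is that the refusal path is deliberate: the tool declining is what keeps the "sharp blade" Aaron describes, rather than an error condition to engineer away.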
Trent Gray-Donald: There's no surprise or secret that IBM has been around for a while and has created languages that may be a little long in the tooth, like COBOL or PL/I. And sure enough, the amount of code on the internet that most of these models are trained against includes very, very little of them, and the models can't do these languages at all. So one of the things we've done, of course, is use the fact that we have more COBOL code, more PL/I code, more of these things, so we can build better models for them. And companies are approaching us with: hey, we built this weird esoteric language, can you help us do the same there? So while there is a barrier, wherever there's a barrier there are typically financial incentives to do something about it. Esoteric languages are going to be a bit of a barrier. It's going to be tough.

Tim Hwang: Well, great. I want to tie things up today, because we have the very unique pleasure of having the three of you on this episode. As you may have heard when I introduced these three guests, they are all IBM Fellows. For those of you who don't know the IBM Fellows program, I didn't know much about it either, but it's a remarkable program: the idea is to bring together some of the brightest minds in technology to work on projects at IBM. I was looking on the website, and the Fellows have included a U.S. Presidential Medal of Freedom winner, five Turing Award winners, and five Nobel Prize winners. So I figured we would take the last few minutes for people to hear a little about the program, what you've learned, and where you think the program might go in the future.
And Aaron, maybe I'll toss it to you, since you kicked off our first session; I'll bring you back into the conversation here. I'm curious what your experience with the Fellows program has been like, and what you've learned.

Aaron Baughman: Yeah, becoming an IBM Fellow is one of those seminal moments; it was very surreal when it happened. My first thought was: wow, I really hope I can live up to those who came before me, and also be an example to those who will come after me. I'm sort of in the middle, and I want to make sure I can keep the projection of what's happening and what's going to happen. I take that as a big responsibility: we need to keep up to date with science and engineering and push them forward in a responsible way, but also usher in the next generation of IBM Fellows who will come after us. And I found the process of becoming a Fellow very rewarding, because it helped me reflect on all the people who helped me achieve something I didn't know was attainable. Being with Trent and Kush is one of those things where it's like, wow, I always knew and followed their work, and I did not know they were going to be IBM Fellows until it was announced. It was great to hear that I'm in the same class as Trent and Kush; it couldn't be better, in my view.

Tim Hwang: Yeah, that's great. Trent, Kush, any other reflections?
Trent Gray-Donald: I think it's very important that companies in the technology space have leaders who are effectively pure technologists and who can be the right balance to the business at times. One of the unspoken, or actually spoken, things about Fellows is that they are supposed to be a bit of a check and balance on what we can, or what we should, be doing in a given space.

Tim Hwang: Right, you're like the keepers of the technical flame.

Trent Gray-Donald: Because sometimes that's necessary. But it's a huge honor to have become a Fellow, and the number of people who've come before, whom I very much look up to, is very large.

Kush Varshney: Yeah, it is extremely humbling. If you look at the list of all these people, as you mentioned, with Nobel Prizes, and inventing all sorts of things that we take for granted, whether it's DRAM or all sorts of different things, it's just crazy to be thought of in that same light. It's been a few months now, I guess, for the three of us. And one thing I've learned, in some of the places I've traveled both within IBM and outside it, is that people do look up to this position; it's something people see as an inspiration. I hadn't thought of it that way. It's, as Aaron said, a responsibility, and, as Trent said, a way to have this check and balance as well. All of that in one role; it's just crazy. So yeah, I think the three of us are going to do our best and keep this tradition alive.

Tim Hwang: That's great.
Tim Hwang: Well, look, it's an honor to have the three of you on the show, and I hope we'll be able to get all three of you back on a future episode of Mixture of Experts. But that's where we'll wrap it up for today. Thanks, everybody, for joining, and thank you, all of you listening, for joining us again for another week of Mixture of Experts. If you enjoyed what you heard, you can find us on Apple Podcasts, Spotify, and podcast platforms everywhere, and we'll see you next week.