OpenAI vs Google Showdown

Key Points

  • The “Mixture of Experts” podcast episode focuses on the latest showdown between OpenAI and Google, dissecting their recent flood of announcements and what they signal for the AI industry.
  • Host Tim Hwang is joined by returning panelists Shobhit Varshney (senior AI consulting partner) and Chris Hay (distinguished engineer/CTO of customer transformation), plus first‑time guest Brian Casey (director of digital marketing), who is slated to give a lengthy monologue on AI and search.
  • The discussion organizes the news around three major themes: multimodality (both firms pushing models that handle video, image, audio, and text inputs), latency and cost reductions (faster, cheaper inference that could unlock new downstream applications), and a flagship Google reveal that could become many users’ first exposure to the company’s next‑gen AI offering.
  • Throughout, the panel debates which announcements are truly impactful versus hype, aiming to clarify which technologies are “cool” and which are “cringe” for developers and enterprises.

Full Transcript

**Source:** [https://www.youtube.com/watch?v=T6DGGHlkYa0](https://www.youtube.com/watch?v=T6DGGHlkYa0)
**Duration:** 00:41:00

## Sections

- [00:00:00](https://www.youtube.com/watch?v=T6DGGHlkYa0&t=0s) **AI Showdown Intro: OpenAI vs Google** - The host opens the "Mixture of Experts" podcast, previews a debate on the week's major OpenAI and Google announcements, and introduces returning panelists Shobhit and Chris alongside first-time guest Brian Casey.

## Full Transcript
[Music]

**Tim:** Hello, and welcome to Mixture of Experts. I'm your host, Tim Hwang. Each week, Mixture of Experts brings together a world-class team of researchers, product experts, engineers, and more to debate and distill down the biggest news of the week in AI. Today on the show: the OpenAI and Google showdown of the week. Who's up, who's down, who's cool, who's cringe, what matters, and what was just hype? We're going to talk about the huge wave of announcements coming out of both companies this week and what it means for the industry as a whole.

For panelists today, I'm supported by an incredible panel: two veterans who have joined the show before, and a new contestant has joined the ring. First off, Shobhit Varshney, who is the senior partner for AI consulting in the US, Canada, and LatAm. Welcome back to the show.

**Shobhit:** Thanks for having me back, Tim. Love this.

**Tim:** Yeah, definitely glad to have you here. Chris Hay, who is a distinguished engineer and the CTO of customer transformation. Chris, welcome back.

**Chris:** Hey, nice to be back.

**Tim:** Glad to have you back. And joining us for the first time is Brian Casey, who is the director of digital marketing, and who has promised a 90-minute monologue on AI and search summaries, which I don't know if we're going to get to, but we're going to have him have a say. Brian, welcome to the show.

**Brian:** We'll have to suffer through Shobhit and Chris for a little bit, and then we'll get to the monologue. But thank you.

**Tim:** Exactly, exactly. Well, great, let's just go ahead and jump right into it. Obviously there were a huge number of announcements this week: OpenAI came out of the gate with its raft of announcements, Google I/O is going on and they did their set of announcements, and so really more things, I think, were debuted and promised than we're going
to have the chance to cover on this episode. But from my point of view, and I wanted to use this as a way of organizing the episode, there were three big themes coming out of Google and OpenAI this week that we'll take in turn and use to make sense of everything.

The first is multimodality. Both companies are obsessed with their models taking video input and being able to make sense of it, and going from image to audio, text to audio, and I want to talk a little bit about that. The second is latency and cost: everybody touted the fact that their models are going to be cheaper and way faster. From the outside you might say, well, things just get faster and cheaper, but I think what's happening here potentially has a huge impact on downstream uses of AI, so I want to talk about that dimension and what it means. And then finally, as I've already previewed, Google made this big announcement that I think is almost literally going to be many people's very first experience with LLMs in full production: Google basically announced that, going forward, in the US market and then globally, users of Google Search will start seeing AI summaries at the top of their search results. That's a huge change; we're going to talk about what that means, and whether it's good, I think, is a really big question. So, looking forward to diving into it.

[Music]

**Tim:** All right, so let's talk a little bit about multimodal first. There were two showcase demos, from Google and OpenAI, and I think both of them roughly got at the same thing, which is that in the
future, you're going to open up your phone, turn on your camera, wave your camera around, and your AI will basically be responding in real time. Shobhit, I want to bring you in, because you were the one who flagged this as something we should really talk about. The big question I'm left with is: where do we think this is all going? It's a really cool feature, but what kind of products do we think it's really going to unlock? Maybe we'll start there, but I'm sure this topic goes in all different directions, so I'll give you the floor to start.

**Shobhit:** So Monday and Tuesday were just phenomenal inflection points for the industry altogether. Getting to a point where an AI can make sense of all these different modalities is an insanely tough problem. We've been at this for a while and we've not gotten it right; we spent all this time trying to create pipelines to do each of these, speech to text and understanding and then text, and it takes a while to get all the processing done. The fact that in 2024 we are able to do this: what a time to be alive, man. I just feel that we are finally getting to a point where your phone becomes an extension of your eyes and your ears, and that has a profound impact on some of the workflows in our daily lives. Now, at IBM I focus a lot more on enterprises, so I'll give you more of an enterprise view of how these technologies are actually going to make a difference, or not, in both cases: Gemini, and OpenAI's 4o. And by the way, in my case the "o" in 4o does not stand for Omni; for me it means "oh my God, it was really that good." So we're getting to a point where there are certain workflows that we do with
enterprises, like transferring knowledge from one person to another. Usually you're looking at a screen and you have a bunch of "here's what I did, here's how I solved for it." We used to spend a lot of time trying to capture all of that on the desktop in classic BPO processes; these are billions of dollars of work that happens.

**Tim:** I want to pause you there, because I'm curious if you can explain, since this is not my world, and I'm sure for a lot of listeners it isn't their world either: how did it used to be done? If you're trying to automate a bunch of these workflows, is it just people writing scripts for every single task? I'm just curious what it looks like.

**Shobhit:** Yeah, so Tim, let's pick a more concrete example. Say you are outsourcing a particular piece of work, and you have finance documents coming in; you're comparing them against other things, you're finding errors, you're going to go back and send an email, things of that nature. We used to spend a lot of time documenting the current process, and then we'd look at that 29-step process and say, I'm going to call an API, I'm going to write some scripts, and all kinds of issues used to happen along the way: unhappy paths, and so on and so forth. So the whole process used to be codified in some level of code, and then it's deterministic. It does one thing in a particular flow really well, and you cannot interrupt it; you can't just barge in and say, "no, no, no, this is not what I wanted, can you do something else?" We're now finally getting to a point where that knowledge work, the work that used to get done in a process, will start getting automated significantly with the announcements from both Google and OpenAI. So far people would solve it as a decision
step-by-step flowchart, but now we're at a paradigm shift where I can interrupt in the middle of it and say, "hey, see what's on my desktop and figure it out." I've been playing around with OpenAI's 4o and its ability to look at a video of a screen and things of that nature; it's pretty outstanding. We're coming to a point where the speed at which the inference happens is so quick that now you can actually bring these models into your workflows. Earlier it would just take so long; it was very clunky, it was very expensive, so you couldn't really justify adding AI into those workflows. You'd do labor arbitrage or things of that nature versus trying to automate. So infusing AI into these kinds of workflows, into this entire process, is a phenomenal unlock.

One of my clients is a big CPG company, and as we walk the aisles they do things like planograms, where you're looking at a picture of the shelf. These consumer product goods companies give you a particular format in which they want you to keep different chips and drinks and so on. If some of those labels are turned around, or they're in a different place, you have to audit and ask: am I placing things on the shelf the way the consumer product goods company wanted? That's the whole planogram idea. Earlier, we used to take pictures, and a human would go in and note things and say, yes, I have enough of the bottles in the right order. Then we started taking pictures and analyzing them, and you run into real-world issues: you don't have enough space to back up and take a picture, or you go to the next aisle and the lighting is very different, and so on. So AI never quite scaled.
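The audit itself is easy to state in code; what never scaled was the perception step. A toy sketch of the comparison, with `observed` standing in for whatever a vision model reads off the shelf video (all product names here are invented for illustration):

```python
# Toy planogram audit: compare the product order a vision model reports
# against the layout the brand specified. The model call itself is out of
# scope; `observed` stands in for its output.

def audit_shelf(expected: list[str], observed: list[str]) -> dict:
    """Report products that are missing entirely or sitting in the wrong slot."""
    missing = [p for p in expected if p not in observed]
    misplaced = [
        (slot, want, got)
        for slot, (want, got) in enumerate(zip(expected, observed))
        if want != got and got in expected
    ]
    return {"missing": missing, "misplaced": misplaced}

planogram = ["chips-a", "chips-b", "soda-a", "soda-b"]  # brand's required order
observed = ["chips-a", "soda-a", "chips-b"]             # what the model saw

report = audit_shelf(planogram, observed)
print(report)  # {'missing': ['soda-b'], 'misplaced': [(1, 'chips-b', 'soda-a'), (2, 'soda-a', 'chips-b')]}
```

A real pipeline would also track quantities and facings, but the missing/misplaced report is the core of the check; the hard part has always been getting a reliable `observed` list out of messy aisle footage.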
This is the first time we're looking at models like Gemini and others where I can just walk past the shelf, create a video, and feed the whole five-minute video in. With a context length of two million tokens and more, it can actually ingest it all and tell you what's missing. Those things that were very, very difficult for us to do earlier are becoming a piece of cake. The big question here is: how do I make sure that the phenomenal AI stuff we're seeing is grounded in the enterprise? It's my data, my planogram style, my processes, my documents, not knowledge from elsewhere. In all the demos, one of the things I was missing was how to make it go down the particular path that I want. If the answer is not quite right, how do I control it? So a lot of the open questions are around how I bring this to my enterprise clients and deliver value for them.

**Tim:** I totally do want to get into that. I see Chris coming off mute, though, so I don't want to break his roll. Chris, do you have a view on this, or do you disagree? Like, "ah, it's actually not that impressive, Google Glass is back, baby"?

**Chris:** Yeah, yeah. No, I think multimodality is a huge thing, and Shobhit covered it correctly. There are so many use cases in the enterprise, but also in consumer-based scenarios, and one of the things we really need to think about is that we've been working with LLMs for so long now, which has been great, but the 2D text space isn't enough for generative AI. We want to be able to interact in real time, we want to be able to interact with audio. You can take that to things like contact centers, where you want to transcribe that audio, you want AIs to respond back in a human way, and you want to chat with the assistants like you saw in the Open
AI demo. You don't want to be sitting there going, well, my conversation is going to be as fast as my fingers can type. You want to be able to say, "hey, what do you think about this? What about that?" And you want to imagine new scenarios: what does this model look like, what does this image look like, tell me what this is. You want to interact with the world around you, and to do that you need multimodal models.

And therefore, like in the Google demo where she picked up the glasses again (so I jokingly said Google Glass is back), it really is. If you're having a retail shopping experience and you want to look up the price of a mobile phone, for example, you're not going to want to stop, get your phone out, and type, type, type. You just want to interact with an assistant there and then, or see the price in your glasses. And I give the mobile phone example for a reason: the price I pay for a mobile phone isn't the same price as you would pay, because it's all contract rates. If I want to find out how much I'm paying for that phone, it takes an advisor like 20 minutes, because they have to look up your contract details, they have to look up what the phone is, and then they do a deal. In a world of multimodality, where you've got something like glasses on, it can recognize the object, it knows who you are, it can look up what the price of the phone is for you, and then it can answer questions that are not generic but specific to you and your contract. That is where multimodality is going to start to
come in.

**Tim:** Totally. This is one of the questions I want to pitch to both of you, Shobhit and Chris. My mind goes directly back to Google Glass: the bar where the guy got beat up for wearing Google Glass years ago was around the corner from where I used to live in San Francisco. There's just been this dream, and obviously all the OpenAI demos, and the Google demos for that matter, are very consumer: you're walking around with your glasses, you're looking around the world, you're getting prices, that kind of thing. This has been a long-standing Silicon Valley dream, and it's been very hard to achieve. One thing I want to run by you, and the answer might just be "both" or "we don't know," is whether you're more bullish on the B2B side or the B2C side. Because I hear what Shobhit is saying, and I can see why enterprises get a huge bonus from this sort of thing. And it's really funny to me, because there's one point of view which says everybody's talking about the consumer use case, but the actual near-term impact may be more on the enterprise side. I don't know if you buy that, or if you really think this is the era of Google Glass, it's back, baby.

**Shobhit:** So I can start first, Tim. We have been working with Apple Vision quite a bit at IBM with our clients, and a lot of those are enterprise use cases in a very controlled environment. Things break in the consumer world because you don't have a controlled environment; you have corner cases that happen a lot. In an enterprise setting, if
I'm wearing my Vision Pros for two hours at a stretch (say I'm a mechanic, I'm fixing things), that's a place where I need additional input and I can't go look at other things, like pick up my cell phone and work on it; I'm underneath, fixing something, in the middle of it. Those use cases, because the environment is very controlled, let me do AI with higher accuracy. It's repeatable; I can start trusting the answers because I have enough data coming out of it. You're not trying to solve every problem, but I think we'll see a higher uptake of these devices there. I love the Ray-Ban glasses from Meta as well; they're great for doing something quick when you don't want to switch devices. But I think we're moving to a point where enterprises will deliver these at scale, the tech starts to get better, and adoption then comes over on the B2C side. In consumer goods we'll have multiple attempts at this, like we had with Google Glass; it'll take a few attempts to get it right. On the enterprise side we will learn and make the models a lot better, and I think there's an insane amount of value that we're delivering to our clients with Apple Vision Pro today in enterprise settings. I think it's going to follow that path.

**Tim:** Totally, yeah. It's actually interesting; I hadn't really thought about this, but the phone is almost not as big a competitor in the enterprise setting. The example Chris gave was literally, is this multimodal device faster than using my phone in that interaction, which is a real competition. But if it's something like a mechanic, they can't just
pull out their phone. Chris, any final thoughts on this? And then I want to move us to our next topic.

**Chris:** Yeah, I was just going to give another use case scenario. I often think of things like the oil rig example: a real enterprise space where you're wandering around and you have to do safety checks on various things. If you think of the days before the mobile phone or the tablet, what they would have to do is go look at the part, do the visual inspection, and then walk back to a PC to fill that in. These days you do that with a tablet on the rig. But then you actually need to find the component you're going to look at, you have to do the defect analysis, you want to take pictures of it, you need the geolocation of where that part is so the next person can find it, you want to see the notes they had on it before, and then you've got to fill in the safety form; they have to fill in a ton of forms. So there's a whole set of information, and if you just think about AI, even your phone or glasses, being able to look at that part, have the notes contextualized in that geospatial space, fill in that form, and do an analysis, it has a huge impact on enterprise cases. Multimodality in that sense has probably got a bigger impact in the enterprise cases than the consumer spaces, even today, and I think that's something we really need to think about.

The other one, and again I know you wanted this to be quick, Tim, is that the clue in generative AI is the generative part. I can actually create images, I can create
audio, I can create music, things that don't exist today. And with the text part of something like an LLM, I can create new creative stuff: I can create DevOps pipelines, Dockerfiles, whatever. So there comes a point where I want to visualize the thing that I create. I don't want to be copying and pasting from one system to another; that's not any different from the oil rig scenario. As I start to imagine new business processes, new pipelines, new tech processes, I want the real-time visualization of that at the same time, or to be able to interact with it, and that's why multimodality is really important, probably more so in the enterprise space.

**Tim:** Yeah, that's right. Some of the experiments you're seeing with dynamic visualization generation are just very cool, because then you can say, "here's how I want to interact with the data," and the system generates it right on the fly, which I think is very, very exciting.

All right, so next up I want to talk about latency and cost. This is another big trend. I think it was very interesting that both companies went out of their way to say, "we've got this offering and it's way cheaper for everybody," which suggests to me that these big competitors in AI all recognize that your per-token cost is going to be a huge bar to getting the technology more distributed. Certainly one of the ways they sold 4o was that it was cheaper and as good as GPT-4; everybody was kind of like, "okay, well, why do I pay for Pro anymore if I'm just going to get this for free?" And Google's bid, of course, was Gemini 1.5 Flash, which is going to be cheaper
and faster again. Chris, you threw this topic out, so I'll let you have the first say, but the main question I'm left with is: what are the downstream impacts of this? For someone who's not paying close attention to AI, is this just a matter of things getting cheaper, or are these economics changing how the technology is actually going to be rolled out?

**Chris:** I think latency, smaller models, and tokens are probably among the most interesting challenges we have today. Think about GPT-4: everybody was saying, "oh, that's a 1.8-trillion-parameter model," or whatever it is. That's great, but the problem with these large models is that every layer in the neural network adds time to get a response back, and not only time but cost. If you look at the demo OpenAI did, for example, what was really cool about it was that when you were speaking to the assistant, it was answering pretty much instantly, and that is the really important part. In previous demos, if you were having a voice interaction, you'd be stitching together three different pipelines: you do speech-to-text, then you run that through the model, and then you do text-to-speech on the way back. So you're accumulating latency, latency, latency before you get a response, and because that timing isn't in the 300-millisecond mark, it was too long for a human being to interact with; you got this massive pause. So latency, and tokens per second, become the most important thing.
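The pipeline-stitching problem Chris describes comes down to simple arithmetic: a cascaded voice loop pays for every stage in series, while a natively multimodal model answers in one hop. The stage timings below are invented for illustration, not measurements of any real system:

```python
# Sketch of the latency arithmetic behind cascaded voice pipelines.
# All stage timings are illustrative assumptions.

CASCADE = {
    "speech_to_text": 0.4,  # transcribe the user's audio
    "llm_inference": 1.2,   # run the transcript through the model
    "text_to_speech": 0.5,  # synthesize the reply audio
}

ONE_HOP = 0.3  # a single multimodal model: audio in, audio out

def cascade_latency(stages):
    """Stages run in series, so response time is their sum."""
    return sum(stages.values())

def feels_conversational(latency_s, threshold_s=0.3):
    """Replies near the ~300 ms mark read as natural turn-taking."""
    return latency_s <= threshold_s

total = cascade_latency(CASCADE)
print(f"cascade: {total:.1f}s  conversational={feels_conversational(total)}")
print(f"one hop: {ONE_HOP:.1f}s  conversational={feels_conversational(ONE_HOP)}")
```

The point is not the exact numbers but the structure: shaving time off any single stage cannot beat collapsing the chain into one model call.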
If you want to interact with models quickly and have those conversations, that's also why multimodality is really important: if I can do this in one model, I'm not jumping pipelines all the time. The smaller you can make the model, the faster it's going to be. Now, if you look at the GPT-4o model, and I don't know if you've played with just the text mode, it is lightning fast when it comes back.

**Tim:** Very fast, yeah, and noticeably so. It feels like every time I'm in there, there are these improvements.

**Chris:** And this is what you're doing: you're trading off reasoning versus speed of the model. As we move into agentic platforms, as we move into multimodality, you need that latency to be super, super sharp, because you're not going to be waiting all the time. There will be scenarios where you want to move back to a bigger model, and that is fine, but you're going to pay the cost, and that cost is the price of the tokens in the first place but also the speed of the response. This is the push and pull that model creators are going to be playing against all the time. If you can get a similar result from a smaller, faster, cheaper model, then you're going to go for that; in the cases where you can't, you may need the larger model to reason. So this is really important.

**Tim:** Totally, yeah. There's a bunch of things to say there. One thing you've pointed out clearly is that this makes conversation possible, right: that you and I can have a conversation in part because we have low latency is
kind of the way to think about it, and now that we're reaching human-like parity on latency, these models can finally converse in a certain way. The other one, which I hadn't really thought about, is that there's almost a thinking-fast-and-slow thing, where the models can be faster but they're just not as good at reasoning, and then there's this deep-thinking mode which is actually slower in some ways.

**Shobhit:** So Tim, here's how we're helping enterprise clients keep that kind of focus. There's a split: there are two ways of looking at applying gen AI in the industry right now. One is at the use-case level, where you're looking at the whole workflow end to end, seven different steps. The other is looking at it at the subtask level. I'll pick an example and walk you through it. Say an invoice comes in, and I'm taking an application, pulling something out of it, making sure it matches the contract, and then I'm going to send you an email saying your invoice is paid; some sort of flow like that. Say it is seven steps, very simplified. I'm going to pull things from the backend systems using APIs. Step number three, I call a fraud detection model that has been working great for three years. Step number four, I'm extracting things from a paper invoice that came in. That extraction I used to do with OCR at 85% accuracy, with humans handling the overflow. At that point we take a pause and say: we have reason to believe that LLMs today can look at an image and extract this with higher accuracy. Say we get up to 94%; that's nine points higher accuracy of pulling things
out. So we pause at that point and say: let's create a set of constraints for step number four to find the right LLMs. The constraints could be: What's the latency, like we just spoke about? How quickly do I need the result, or can this take 30 seconds and I'll be okay with it? Second could be cost: if I'm doing this a thousand times, I have a cost envelope to work with versus a human doing it; if I'm doing it a million times, I can invest a little more if I can get accuracy out of it, so the ROI becomes important. Then you look at security constraints: does this data have any PHI or PII that really can't leave the cloud, so I have to bring things closer, or is this military-grade secret that has to be on-prem? So you come up with a list of five or six constraints, and that lets you decide what kind of LLM will actually check off all those constraints, and then you start comparing and bringing it in.

So the split we're seeing in the market is this. On one side, with LLM agents and these multimodal models, they're trying to accomplish the entire workflow end to end, like you saw with Google's returning-the-shoes demo: it takes an image of the shoes, goes and looks at your Gmail to find the receipt, starts the return, and gives you a QR code with the whole return process done. It figured out how to create the entire end-to-end workflow. But where the enterprises are still focused is more at the subtask level. At that point we're saying: this step, step number four, is worth switching, and I have enough evals before and after, I have enough metrics to understand it, and I can control and audit that much better.
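Shobhit's constraint checklist lends itself to a small selection routine: score each candidate model against the latency, cost, accuracy, and data-residency constraints for the subtask, and take the cheapest one that passes. Everything below, model names and numbers included, is hypothetical:

```python
# Sketch of subtask-level model selection by constraints.
# All candidate models and figures are invented for illustration.

from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    latency_s: float      # time to return a result
    cost_per_call: float  # dollars per invocation
    accuracy: float       # measured on the team's own evals
    on_prem: bool         # can it run where regulated data must stay?

def pick_model(candidates, max_latency_s, max_cost, min_accuracy, require_on_prem):
    """Return the cheapest candidate satisfying every constraint, or None."""
    viable = [
        c for c in candidates
        if c.latency_s <= max_latency_s
        and c.cost_per_call <= max_cost
        and c.accuracy >= min_accuracy
        and (c.on_prem or not require_on_prem)
    ]
    return min(viable, key=lambda c: c.cost_per_call) if viable else None

models = [
    Candidate("big-frontier-model", 4.0, 0.050, 0.96, on_prem=False),
    Candidate("small-fast-model",   0.8, 0.004, 0.94, on_prem=True),
    Candidate("legacy-ocr",         0.3, 0.001, 0.85, on_prem=True),
]

# Invoice-extraction subtask: 30 s latency is fine, but we need >= 94%
# accuracy and the data (PII) has to stay on-prem.
best = pick_model(models, max_latency_s=30, max_cost=0.01,
                  min_accuracy=0.94, require_on_prem=True)
print(best.name)  # small-fast-model
```

Step four of the invoice flow gets its own envelope; the other steps keep whatever already works (the three-year-old fraud model, the API calls), which is the subtask-level point.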
an Enterprise perspective 24:09these end to end multimodal models it'll 24:12be difficult for us to explain to SEC 24:14for example why we rejected somebody's 24:17benefits on a credit card things of that 24:19nature so I think in the in the 24:20Enterprise World we're going to go down 24:22the path of let me Define the process 24:25I'm going to pick small models to 24:27Chris's point to do that piece better 24:29and then eventually start moving over to 24:31hey now let me make sure that those that 24:34framework evals and all of that stuff 24:35can be applied to intoing multim models 24:38I guess I do want to maybe bring in 24:40Brian here you like release the Brian on 24:42this conversation um because I'm curious 24:44about like kind of like the marketer 24:46view on all this right because I think 24:48there's one point of view which is yes 24:50yes chrisit like this is all nerd stuff 24:52right like I yeah know it's like latency 24:54and cost and speed and whatever the big 24:57thing is that you can actually talk to 24:58these AIS right and I guess I'm kind of 25:00curious from your point of view about 25:02like I mean one really big thing that 25:03came out of like the open AI 25:05announcements was we're going to use 25:07this latency thing largely to kind of 25:09create this feature that just feels a 25:10lot more human and lifelike um than you 25:13know typing and chatting within Ai and I 25:16guess I'm kind of curious about like you 25:18know what you think about that move 25:20right like is that ultimately like going 25:22to help the adoption of AI is it just 25:24kind of like a weird sci-fi thing that 25:25open AI wants to do and also I mean I 25:27think if if you've got any thoughts on 25:29you know how it impacts the Enterprise 25:30as well was just like do companies 25:32suddenly say oh I understand this now 25:34right it's because it's like the AI from 25:35her I can buy this um just kind of 25:37interesting thinking about like the the 25:39sort of 
surface part of this, because it actually will really have a big impact on the market as well; it's kind of like the technical advances are driving the marketing of this.

I mean, I do think when you look at some of the initial reviews of, I want to say, the Pin and the Rabbit: I remember one of the scenarios that was being demoed was, I think, he was looking at a car and asking a question about it, and the whole interaction took like 20 seconds. Then he went through and showed that he could do the whole thing on his phone in the same amount of time. But the thing I was thinking about while watching that was: he just did like 50 steps on his phone, and that was awful, as opposed to just pushing a button and asking a question. It was very clear that the UX interaction of just asking the question and looking at the thing was a way better experience than pushing the 50 buttons on your phone, but the 50 buttons still won, just because it was faster to push 50 buttons than to deal with the latency impact of where we were before.

It reminded me a lot of the way I remember hearing Spotify talk early on about how they thought about latency, and the things they did to make the first 15 seconds of a song land so that it felt like a file you had on your device. Because from their perspective, if it felt like every time you wanted to listen to a song it was buffering, as opposed to sitting on your device, you were never going to really adopt the thing; it's a horrible experience relative to just having the file locally. So they put in all this work so that it felt the same, and that wound up being a huge part of how the technology and the product ended up getting adopted.

And I do think a lot of the stuff we're doing now is almost, I don't want to say back office, but enterprise processes around how people do things, operational things. But there are plenty of ways people are thinking about doing more with agents in terms of customer experience, whether it's support interactions or bots on the site, and you can clearly imagine that playing a bigger role in customer experience going forward. If you feel like every time you ask a question you're waiting 20 seconds to get a response from this thing, the person on the other end of that interaction is just getting madder and madder the entire time. Whereas the more it feels like you're talking to a person, and that they're responding to you as fast as you're talking, the more likely it is that people are going to accept that as an interaction model. So I do think that latency, to your point about human beings being zero-latency, is a necessary condition for a lot of these interaction models, and it's going to be super important going forward. And to me, coming back to the Spotify thing, the question is whether people are going to do interesting things to solve for the first 15 seconds of an interaction, as opposed to the entire interaction. There was a lot of talk about the OpenAI model responding with, like, "sure," or some space-filling entry point, so it could catch up with the rest of the dialogue. I think people will prioritize that a lot, because it'll matter a lot.

I love the idea that, to save cost, basically OpenAI is like: for the first few turns of the conversation we deliver the really fast model, so it feels like you're having a nice, flowing conversation, and then once you've built confidence they fall back to the slower model that has better results. Where you're like, oh, this person is a good conversationalist, but they're also smart too, right? That's kind of what they're trying to do by playing with model delivery. So, we've got to talk about search, but Chris, I saw you go off mute, so do you want to do a final quick hit on the question of latency before we move on?

No, I was just going to pick up on what Brian was saying there, and what you were saying, Tim. I totally agree: it was always doing this "hey" and then repeating the question. So I wonder if, underneath the hood, as you say, there's a much smaller classifier model that is just doing that "hey" piece, and then probably a slightly larger model actually analyzing the real thing. I do wonder if there are two small models, or a small model and a slightly larger model, in between there for that interaction. It's super interesting. But maybe the thing I wanted to add is: we don't have that voice model in our hands today; we only have the text model. So I wonder, once we get out of the demo environment, and in maybe three weeks' time or whatever we have that model, whether it's going to be super annoying that every time we ask a question it goes "hey" and repeats the question back. So it's cool for a
demo, but I wonder if that will actually be super annoying in two weeks' time.

All right, so the last topic, which we've got a few minutes for; this is Brian's big moment, so Brian, get yourself ready. I mean, Chris, you can get yourself ready too; everyone else can leave the meeting. Yeah, he's going to take our eyebrows off here with his rant. So the setup for this is that Google announced that AI-generated overviews will be rolling out to US users, and then to everybody, in the near future. And I think there are two things to set you up, Brian. The first is: this is what we've been talking about, right? Is AI going to replace search? Here it is, consuming the preeminent search engine. So we're here; this is happening. And the other is: I'm a little nostalgic, as someone who grew up with Google. The ten blue links, the search engine, it's a big part of how I experienced and grew up with the web, and this seems to me like a big shift in how we interact with the web as a whole. So I do want you to first talk a little about what you think it means for the market, and how you think it's going to change the economy of the web.

Yeah. So I follow two communities pretty closely online: the tech community, and then, as somebody who works in marketing, the SEO community. And they have very different reactions to what's going on. On your first question, though, of whether this is the equivalent of swallowing the web: what's funny is, from the minute ChatGPT arrived on the scene, people were proclaiming the death of search. Now, for what it's worth, if you've worked in marketing or on the internet for a while, people have proclaimed the death of search as basically an annual event for the last 25 years, so this is just par for the course on some level. But what's interesting to me is that you had this product, ChatGPT, the fastest-growing consumer product ever, reaching 100 million users faster than anybody else, and it speedran the growth cycle that usually takes years or decades; well, maybe not decades, but it takes a long time for most consumer companies to do what they did. The interesting thing about that is, if it was going to totally disrupt search, you would have expected that to show up sooner than it would with other products that had a slower growth trajectory. But that didn't happen. As somebody who watches search traffic super closely: there's been no chaotic drop. People have continued to use search engines, and one of the reasons, I think, is that people misunderstood ChatGPT and Google as equivalent competitors with one another. Google and OpenAI probably are competitors on some level, but I don't know that those two products are.

And the reason I was thinking about that is: if ChatGPT didn't disrupt Google within the time span we've had so far, the question is, why not? I think you could have a couple of different hypotheses. One, you could say the form factor wasn't right: it wasn't text that was going to do it; we needed Scarlett Johansson on your phone, and that's the thing that's going to do it, so maybe they're leaning into that thought process a little bit. You could say it was hallucinations: the content is just not accurate. That's a possibility. You could say it's learned consumer behavior: people have been using this stuff for 20 years, and it's going to take a while to get them to do something different. You could say it's Google's advantages in distribution: we're on the phone, we've got browsers, and it's really hard to reach the level of penetration that they have. I think all of those probably play some role, but my biggest belief is that it's actually impossible to separate Google from the internet itself. Google is kind of like the operating system for the web, so to disrupt Google you're not disrupting search; you have to disrupt the internet. And it turns out that's an incredibly high bar, because you're not only dealing with search, you're dealing with the capabilities, whether it's banks or airlines or retail, of every single website that sits on the opposite end of the internet. That's an enormous amount of capability built up there. So I look at that and say: for as much as this technology has brought to the table, it hasn't done that thing, yet, and because it hasn't, there hasn't been some dramatic shift.

The thing that Google search is not good at, though, and I think you see it a little bit in how they described what they think the utility of AI Overviews will be, is complex, multi-part questions. If you're doing anything from a buying decision for a large enterprise product to planning your kid's birthday party, you're going to have to do like 25 queries along the way, and you've just accepted and internalized that you have to do those 25 queries.

I like that. Search is basically one-shot, right? You just say it and the responses come back, so there's no... yeah, sorry, go ahead.

Yeah, and so the way I was thinking about LLMs is that they're kind of like SQL for the internet, in a way, where you can ask this much more complicated question and then actually describe the way you want the output to look. It's: I want to compare these three products on these three dimensions, go get me all this data. That would have been 40 queries at one point, but now you can do it in one, and search is terrible at that right now; you have to go cherry-pick each one of those data points. The interesting thing is that that's also maybe the most valuable query to a user, because you save 30 minutes. So I think Google looks at that and says: if we cede that particular space of complex queries to some other platform, that's a long-term risk for us, and if it's a long-term risk for them, it ends up being a long-term risk for the web. So I actually think it was incredibly important that Google bring this type of capability into the web, even if it ends up being a little bit disruptive from a publisher's perspective, because it at least preserves some of the dynamic we have now, of the web still being an important thing. And I hope that, to your point, I have, like,
present and past nostalgia for it, I would say.

Yeah, exactly. So I think it's important that it continues to evolve, if we all want the web to persist as a healthy, dynamic place.

Yeah, for sure. I think that's a great take on it. And Google always used to say, look, we measure our success based on how fast we get you off our website, right? And I think what you're pointing out, Brian, which I think is very true, is that what they never said was: there's this whole set of queries we never surface, that you really have to keep searching for. And that ends up being the search volume of the future that everybody wants to capture. Well, so, Brian, I think we also had a little intervention from AI, the thumbs-up thing we were joking about before the show.

Yeah, that's my ranking for worst AI feature of all time. But it'll make up the thumbnail on the video.

That's right, yeah, exactly. Well, great, so we've got just a few minutes left in the show, but any final parting shots on this topic?

Sure. So I'm very bullish. I think AI Overviews have a lot of future, as long as there's a good mechanism for incorporating feedback and making them hyper-personalized. Take a simple query like, I want to go have dinner tonight, say I'm looking for a Thai restaurant. If I go on OpenTable or Yelp or Google and try to find that, there's a particular way in which I think through it; the filters I apply are very different from how Chris would do it, right? So if somebody's making that decision for me the way I would make it, great. That's the reason TikTok works so much better than Netflix on average. I was listening to a video by Scott, and he mentioned that we spend about 155 minutes a week browsing Netflix on average in the US, something of that nature, a pretty outsized amount of time. TikTok, on the other hand, has completely taken that paradox of choice out. When you go on TikTok, they have picked the video: a 17-second video, an average of 16 minutes of viewing time across your TikTok engagement, and so many data points coming out of it every few seconds. So they have hyper-personalized it based on how you interact with things, because they're not asking you to go pick a channel or make a choice of that nature; they're just showing you the next, next, next thing in the sequence, hence the stickiness. They've understood the brains of teenagers, and that demographic, really, really well. I think that's the direction Google will go in: it'll start hyper-personalizing based on all the content. If they're reading your Gmail and finding out where the receipt for your shoes is, they know what you actually ended up ordering at the restaurant you went to. So with the full feedback loop coming into the Google ecosystem, I think it's going to be brilliant if they get to the point where they just make a prediction on which restaurant is going to work for me, with everything they know about me.

That's right, yeah. I mean, the future is they'll just book it for you, a car will show up, you'll get in, and it'll take you some place, right? They'll send a confirmation to your email. Exactly, right. Chris, 30 seconds, you've got the last word.

Thirty seconds: search is going to be a commodity, and I think, as we see the AI assistant era...

I dare you.

Yeah, but it will be a commodity, because we
are going to interact with search via these assistants. It's going to be Siri on my phone, which will be enhanced by AI technology; it's going to be Android and Gemini's version on there. We are not going to be interacting with Google search the way we do today, with browsers. That is going to be commoditized, and we're going to be dealing with these "Her"-style assistants, who are going to go and fetch those queries for us. So I think that's going to be upended, and at the heart of it are going to be latency and multimodality, as we said. So I think they've got to pivot, or they're going to be disrupted.

Yeah, I was going to say: if that happens, what's interesting is that all of the advantage Google has actually vanishes, and then it's an even playing field against every other LLM, which is a very interesting market situation at that point. Yeah, I'm going to pick that up next week; that's a very, very good topic, and we should get more into it.

Great, well, we're at time. Shobhit, Chris, thanks for joining us on the show again. Brian, we hope to see you again sometime. And to all of you out there in radio land, if you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and podcast platforms everywhere, and we'll see you next week for Mixture of Experts.