Computer Vision Returns via Meta SAM2

Key Points

  • Tim Hwang’s “Mixture of Experts” podcast opens with a panel of technologists (Vagner Santana, Kate Soule, Ami Ganan) to decode the latest AI headlines, especially Meta’s new Segment Anything Model 2 (SAM 2).
  • SAM 2, a next‑generation computer‑vision system, can segment and track objects in images and video, highlighting a resurgence of interest in vision AI alongside the current NLP hype.
  • The hosts stress that true open‑source AI now means more than just releasing model weights; Meta’s decision to also publish the training data sparks debate about the future importance of open data in democratizing models.
  • The episode notes a striking 30% abandonment rate for proof‑of‑concept AI projects, prompting discussion on whether this reflects optimism or underlying challenges in the industry.
  • Throughout, the panel emphasizes responsible AI development, generative‑AI research, and the strategic role of AI analytics in shaping the next wave of technology adoption.

Full Transcript

# Computer Vision Returns via Meta SAM2

**Source:** [https://www.youtube.com/watch?v=3mcLdfx6HTc](https://www.youtube.com/watch?v=3mcLdfx6HTc)
**Duration:** 00:28:55

## Sections

- [00:00:00](https://www.youtube.com/watch?v=3mcLdfx6HTc&t=0s) **Meta's SAM2 and AI Trends** - In a Mixture of Experts podcast, the hosts discuss Meta's new SAM2 “segment anything” model for image/video segmentation, alongside broader AI topics like project abandonment rates, notification overload, and the promise of AI hardware breakthroughs.

## Full Transcript
"Computer vision: is it cool again? Now we can take that and amplify it across different problems to be solved." "So, is friend.com AI hardware's breakout moment?" "We're already so addicted to notifications, and it's one more source of notifications for us." "We're estimating a 30% abandonment of proof-of-concept AI projects. Is that a bad thing?" "Yeah, I don't think it's as pessimistic as it could be." All this and more on today's episode of Mixture of Experts.

I'm Tim Hwang, and I'm joined today, as I am every Friday, by a world-class panel of technologists, engineers, and more to help make sense of a tidal wave of AI news. Today on the panel we've got Vagner Santana, staff research scientist and master inventor on the responsible tech team; Kate Soule, program director of generative AI research; and Ami Ganan, associate partner, AI and analytics.

For our first segment we're going to talk about SAM 2. Meta this week announced the release of the next generation of a model it calls Segment Anything; the Segment Anything Model is SAM, and this is the next generation of it. Specifically, the model allows you to segment imagery or video: you can select an object and track it over time. I really wanted to cover this because there's so much hype around NLP, and everybody's talking about chatbots all the time, but we should not forget that there are really exciting things happening in other domains of AI, and particularly in computer vision. So we're going to start off with a fun question, which is simply: is computer vision cool again? Kate: yes. Vagner: yes. And Ami: always has been. Yeah, don't call it a comeback, right? Well, with that violent agreement, let's get into this segment.
I really wanted to talk about this because, of course, it's another iteration of Meta playing in the open-source game, but what's really interesting is that it's also a marker in the ground for what "open source" exactly means in the AI space. If you haven't been watching this space very carefully: in the first versions of open source in AI, people said, well, we're going to open up the model, and there are going to be weights available. With SAM they're also, uniquely, releasing the data behind the model. So Ami, maybe I'll throw it to you as the new panelist: I'm curious how you see this. In the future, is open data going to be a big part of what makes a model truly open source? Talk a little bit about how you think through some of that.

Yeah, so listen, we love open source, right? And open source means different things to different people. It can be just releasing open data; it could be having open weights; there's a whole spectrum there. I'm really excited that Meta went ahead and did this under an Apache 2.0 license; it's fully open weight. There are a lot of computer vision problems we've been wrangling with for several years. I remember back in my grad school days, we would do traditional image processing: segmentation through watershed algorithms, drawing little boxes, things of that nature. It's very painstaking, extremely painstaking and extremely laborious. Fast forward to today, and it's super exciting to see something like this, which can operate at scale on huge videos. And think about this from an enterprise setting.
Right. I work with clients that have huge manufacturing operations going on. When you think about the supply chain, there are boxes that need to be moved in the warehouse, and there's computer vision going on, tracking those objects. Or look at the production settings of a lot of our clients: a huge assembly line of objects of different types that need to be tracked through multiple stages. Or look at some of our local governments. One of the things we've seen is that people tend to jump turnstiles when going through public transport, and that, surprisingly, is a huge cost to cities and local governments. For the city of New York, for instance, it's a cost of something like $750 million, and it becomes a big problem to solve. In the past, a lot of these problems have needed to be solved through very specific computer vision models, custom trained for those specific tasks. What SAM 2 enables is for you to rapidly build those computer vision models at scale, because now you can do these automatic segmentations of large videos. Whichever domain you throw at it, videos it hasn't seen before, domains it hasn't been trained on before, it's still able to do those segmentations and track those objects over time. So this gives us a very capable mechanism to build domain-specific computer vision models at scale. Short answer: really, really exciting, and that's why I think that open-source capability helps. Now we can take that and amplify it across different problems to be solved.

Yeah, for sure.
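To make the "segment and track over time" idea tangible: this is not SAM 2's actual mechanism (the model uses a learned streaming memory), but a toy version of tracking can be sketched as matching candidate segments across frames by intersection-over-union. All names below are hypothetical illustration:

```python
def iou(a, b):
    """Intersection-over-union of two segments given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def track(frames, target):
    """Greedily follow `target` through successive frames, where each frame is a
    list of candidate segments, by picking the highest-IoU match each time."""
    path = []
    current = target
    for segments in frames:
        current = max(segments, key=lambda s: iou(current, s))
        path.append(current)
    return path

# Toy example: a 2x2 "object" drifting one pixel to the right per frame.
f1 = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5)}]
f2 = [{(0, 1), (0, 2), (1, 1), (1, 2)}, {(5, 5)}]
target = {(0, 0), (0, 1), (1, 0), (1, 1)}
path = track([f1, f2], target)  # follows the drifting object, not the (5, 5) blob
```

Real video segmentation also has to handle occlusion and re-identification, which a greedy matcher like this cannot; it is only meant to make "track the object over time" concrete.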
I think that's one of the most interesting things, because, and this has been a theme in a number of our conversations, Meta in its blog post says this is so exciting because you could use it for AR glasses. One of the questions I had was: is this the technology that finally gets AR glasses to work? I don't know, Kate, Vagner, if you have opinions on that. There's almost one point of view which says: again with AI, the big application is going to be something like turnstile enforcement; it actually won't be these consumer elements. But I don't know if anyone wants to speak up for "no, actually, this is the moment that's really going to make AR glasses work."

I'm sure this helps us get closer, not farther away, but I'm always wary of anything that's claimed to be a silver bullet. I do want to get back, Tim, to what you mentioned earlier about open-sourcing the data, because I think it's really interesting to talk about Meta's strategy. In vision, they've released the data behind SAM 2, and the license of the model itself is Apache 2.0. Then you look at the Llama series (3.1 came out just last week), where it's under a specific Llama license, and there is absolutely no data that's released, or even really described, in terms of what was used in training.

Say a little bit more. For our listeners, I think they'd really benefit from hearing: what is the difference there, exactly, between Apache and what's happening with Llama? And I guess the question is, why aren't they consistent?

So, Apache 2.0 is a very popular, widely used open-source license.
It's been around for years and is considered a very permissive license: anyone can build on top of it for commercial or other uses without having to worry much about further restrictions on where things came from. With Llama, by contrast, when the models were released, Meta created a Llama license that is custom and bespoke, to handle the Llama weights. Another big differentiation is that Apache 2.0 is normally used for licensing software, and the data they released for SAM 2 was, I think, CC BY, which is similar in spirit to Apache 2.0 but commonly used for data. So there are different terms you want to govern different artifacts: Apache for software, CC BY often for data, and now, for model weights, people have started to come up with their own licenses, because model weights fit somewhere between software and data. It's a little bit unclear what governs them.

Yeah, I think that's such a great point to end on, and maybe let me take one more turn at it, because I think it's a really important part of this question. It strikes me that one of the reasons everybody's very excited about open source is the accessibility of the technology: this is not going to be something that a company just puts up walls around and then charges you for access to. But it also strikes me that part of the problem of open-sourcing is that it's a lot more difficult to control use. You suddenly have this technology that anyone can use, and some of the people that use it are not going to use it in the most responsible way. That feels like a really hard challenge, because democratizing the technology also creates tensions with how we enforce use cases.
I'm curious if the panel has any thoughts on that.

Yeah, I think open sourcing has been one interesting mitigation for these situations, because as the community notices that something is going wrong, or that there's a specific harmful use, the community takes action. We can look back to open-source operating systems: they are the most secure ones we have, because the community builds on top of this openness and tries to tackle and mitigate these issues. So in that sense I think open sourcing is a good strategy for mitigating the problem. If we're not transparent and open about the technologies that are available, or will become available as people continue to work in this area, there's no way for us to build regulations, awareness, and proper practices around them. So I'd much rather have this happening out in the open than behind some closed doors where we really don't have a good line of sight into what's going on.

That's right. I guess it moves us from a model of "just trust us" to a world where we can actually verify it.

Yeah, absolutely. Once you put it in the open, there are a lot more heads thinking through really tricky problems, and there's a lot more diversity in the solutions that come out for mitigating them. Rather than trying to force and control it, when you put it out in the open you'll have a lot more creative solutions coming in to solve these problems.

Okay, for our next segment I want to cover friend.com.
As you all may know, there's been a long-standing dream in the Valley that one of the really exciting things you could do with LLMs is, for the first time, create a fully fledged AI companion assistant. That dream has manifested in a bunch of hardware projects; the Humane Pin that came out earlier this year is a good example, and friend.com is the most recent iteration. Avi Schiffmann, an entrepreneur, launched it with a teaser trailer earlier this week, and Avi has taken a lot of criticism online. But I actually want to take this conversation in a slightly different direction. What's really interesting, and what's kind of on offer from friend.com, is the idea that maybe startups can actually start competing in the AI hardware space: that in the future you could launch an AI hardware project, even something as advanced as an AI hardware companion, as a small startup on your own. This is not just going to be a space where only the big companies can play; it might be a place where startups can play as well. So I want to put forward this idea, and Kate, maybe I can pick on you: do you buy the idea that the costs of AI are coming down so much that we're about to be awash in these types of things? That launching an AI companion product is not going to be something only the biggest tech companies in the world can do, but that you'll also have upstarts doing their own take on this space?

Yeah, I think it's a really interesting question.
We're getting so many, in a way, conflicting signals about what's going on in this space. Gartner just released a report a day or two ago saying they expect 30% of all PoCs in gen AI to never leave the proof-of-concept phase.

Yeah, definitely, we're going to talk about that later; I think that's going to be the final segment of the episode.

Okay, great. A lot of what they were talking about cites the costs, that we're not seeing the ROI offset the cost enough, and I think that certainly makes sense given what we're seeing. But on the other side, we're seeing models get smaller and smaller. There is this clear trend where we're able to pack more performance into fewer parameters, where we're getting to the point where these models can run on CPUs and we don't need advanced hardware to the same degree we did a year ago. Some of these scaling laws are really exciting in terms of how efficiently the technology is growing. So I don't think it's unreasonable to think we could get to a place where startups actually get into the hardware space for gen-AI-type deployments.

Yeah, and I think it's fascinating, because had you talked to me five years ago I would have said, oh yeah, the future is just one big company that has all the AI. But it kind of feels like we're going to be awash in intelligence; there will just be models everywhere, particularly with the developments in open source we were talking about. I don't know if Ami or Vagner have thoughts on just how accessible, and ultimately how competitive, a space this is going to be.

Yeah, I definitely agree with Kate there.
I think small language models are becoming way more powerful and way more popular, for a variety of reasons. In the consumer space, like you mentioned, there's a lot of competition around putting something on the edge. It could be a companion-type device; it could be something else you just want to run locally on your phone, or something you want to run on a Raspberry Pi you're tinkering with. There could be a lot of different variations where you're trying to run these models on the edge. That's definitely happening on the consumer side, but we're starting to see some of it on the enterprise side as well, because now enterprises are wondering: can I start building really domain-specific models? Small language models then come in and help power that. If I have data that I don't want to expose to the internet at all, but I still want these capabilities, and I have devices in my manufacturing plant where I want them helping my plant workers and things of that nature, then these become a solution. So small language models running on the edge, on local devices: that's definitely becoming popular, both in the consumer space and in the enterprise space.

Yeah. And thinking about the economics of this, the other thing I wanted to touch on with friend.com is that the product is being offered for $99 with no subscription, which is also very intriguing when you think about the business model. There's always been an assumption in the AI space that consumers are going to demand better and better models over time.
But I also think about the fact that I had a Tamagotchi as a kid, and I built a very deep emotional relationship with my Tamagotchi, and it's not like they sent updates over the wire to it. It was just a thing they printed in the factory, and it came to me. I actually wonder whether there'll be almost a similar dynamic in AI. To your point, there's almost an assumption that people will want the higher-capacity models over time, but I also think we may just get a retro-computing movement in AI, where people say, oh yeah, GPT-2, that's really where the peak of LLM creation was. Do you buy that? My weird take that I've been playing around with is that it may actually be possible to do non-subscription AI businesses, because if you have a model that someone really likes interacting with, they may not want it to change at all. I'm curious if folks have thoughts on that. Vagner, maybe I'll toss it over to you.

Well, I was reading a few pieces about the friend.com device, and one thing that at least looks interesting is that it says it's not processing anything beyond the context window. If you think about small language models, imagine we could have one hosted on your mobile phone; then this could be possible. But friend.com nowadays uses Claude 3.5, so it's processing elsewhere: it's a device communicating via Bluetooth with your mobile phone. And again, to your point about the Tamagotchi, I think this is a lot different. It's feeding on people's loneliness; that's the model, basically.
That's different, because the whole dynamic is different. Before, we had to take care of the Tamagotchi, and that was the relationship. With this specific device it's the other way around (I'm holding myself back, because I have so many things to say about this): we already are so addicted to notifications, and it's one more source of notifications for us. And I read one really interesting analogy for this: treating loneliness with this device, offering it as if it were a real friendship, is like giving junk food to someone who's starving. Okay, it may help right now, but it's not a solution in the long run. So again, to your point, I think that thinking about small language models without transferring the data elsewhere is an interesting way of thinking, especially for startups creating new technologies, but about this specific use I have so many concerns.

I think GPT-2, or GPT-3.5-level capabilities, is enough for generic conversation. You can have a quantized version, and I think you can have small language models operating to a good degree for just general conversational capabilities, and you could stop there. But the moment you're trying to get to something specific, something domain-specific, when you try to have a deeper conversation, then I think you still need to get to some of the larger models.
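For readers wondering what a "quantized version" involves, here is a minimal sketch of symmetric 8-bit linear quantization, one common scheme (not any particular product's implementation): weights become int8 plus one scale factor, a 4x smaller footprint than float32 at the cost of a small rounding error.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization of a float array to int8 plus a scale."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)   # stand-in for a weight tensor
q, scale = quantize_int8(w)

# int8 storage is 4x smaller than float32, and the round-trip error is
# bounded by half a quantization step, i.e. scale / 2.
err = np.max(np.abs(dequantize(q, scale) - w))
```

Production schemes (per-channel scales, 4-bit formats, quantization-aware training) are more elaborate, but the size-for-accuracy trade is the same.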
I think where it will lead is that solutions like this can give you superficial, shallow conversations, but the moment you try to go deeper and deeper, you may have to get out of those smaller language models, at this point in time at least.

I don't know, there was something very satisfying to me in hearing that it wasn't going to be a subscription, that it wasn't going to try to be a large model that had deeper conversations. To me it meant it was almost more like a meditative tool for the near term: my data is not going anywhere; it's not trying to be a real human. I really appreciated how much it constrained the scope of the use cases and what this can do, by saying: look, it's a device, we're not going to update it, and it's going to be running locally.

Yeah, for sure; that's actually sort of interesting. I think all these points come together: oddly, the fact that it is not updated and does not go to the cloud almost presupposes a limitation on how far the relationship can go. Vagner, to your point, maybe it's actually the most ethical way of designing this architecture, an intentionally limited system. We would actually be worried if they were going to push updates and it just got better and better and better, and you built this massive parasocial relationship with a thing that's not a real person.

I'm going to move us on. Our next story, and Kate already anticipated me a little bit on this, is that Gartner, the industry research group, came out with a report this week estimating that about 30% of gen AI projects will be abandoned after their initial proof of concept by the end of 2025.
They cite a number of reasons for this: poor data quality, inadequate risk controls, escalating costs, or unclear business value. This follows a string of reports in a very similar vein; just a few weeks ago we talked about the Goldman Sachs report and the Sequoia report. But for this segment, the first place I want to start is: is 30% all that bad? I was looking at that and thinking, if we're doing 30%, then for a new technology we're killing it.

I had the same reaction. When I first looked at it I thought, wait, are they saying 30% will succeed, or 30% will be abandoned? Because I honestly assumed it would be the inverse. So, I buy it, but I also don't think it's as pessimistic as it could be. And I think it's valid in that, look, right now we're in a period where the costs are difficult, and we need more refined approaches for picking PoCs; identifying and understanding the lifetime cost and lifetime value of PoCs is going to be really important. But also, like we were talking about earlier: if you look at what it cost to do something a year ago versus what it costs today, and the rate at which that's changing, I think we're honestly in a fairly optimistic place as we talk about emerging technologies and where gen AI is headed.

Yeah, that's actually a very powerful argument. If I hear you right, you're saying that even if the benefit of AI stayed fixed, the fact that the costs are dropping so dramatically will almost end up justifying the technology; it's the costs changing, rather than the benefits changing, over time.
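The "fixed benefit, falling costs" argument can be made concrete with a toy break-even calculation (all numbers hypothetical, purely illustrative):

```python
def first_profitable_year(benefit, cost0, cost_decline, horizon=10):
    """Return the first year a project's fixed annual benefit exceeds its
    annual cost, assuming cost falls by a constant fraction each year.
    Returns None if that never happens within the horizon."""
    cost = cost0
    for year in range(horizon):
        if benefit > cost:
            return year
        cost *= (1.0 - cost_decline)
    return None

# Hypothetical project: $1M/yr of benefit, $2M/yr of cost today,
# with costs falling 40% per year.
year = first_profitable_year(benefit=1.0, cost0=2.0, cost_decline=0.4)
```

Under these made-up numbers the project flips from unprofitable to profitable purely because the cost curve moves, which is exactly the shape of the argument being made.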
I never really thought about it like that. That's really interesting.

I have a slightly different take on this; maybe I'm the contrarian here. I think when we say "gen AI projects," there's a little bit of confusion and misinterpretation about what those mean. What we've realized, especially working with enterprises, is that the impact comes when these gen AI projects are trying to solve specific problems in specific workflows and subtasks. The projects and solutions that go in laser-focused on solving a specific subtask, those are being incredibly effective, from what we're seeing. So when we quote 30% abandonment of gen AI projects, there's probably a bit of a mixture in what those projects mean: it could include really broad-based efforts that aren't focused on specific workflows or specific tasks. That's how I view it. I fully agree that there's a focus on value; enterprises definitely look at it and ask, when I put an investment into gen AI, am I deriving value out of it? One hundred percent on that. But when we cite a certain abandonment rate, I think it depends on what exactly you're measuring: are you measuring the things that are solving specific subtasks and problems, automating a workflow, things of that nature?

Yeah, that's right. And outside of the 30% (I'm giving their report a little bit of a hard time), I think one interesting observation they made was that a lot of the AI benefits are productivity benefits.
That's really hard to capture in terms of increased profits. So there is this interesting breakdown, where the technology can legitimately be producing a lot of benefit, but from a dollars-and-cents, or at the very least a bottom-line, standpoint, "is it improving my profits" may be a very hard connection to draw.

I think that's why those measurements become more important. As the technology improves and as people start driving these projects, we're starting to see them. One of my clients now has a maniacal focus on saying: I'm going to see whether I'm impacting this particular subtask and sub-flow, figure out which metrics I'm solving for, and monitor those metrics. Those measurements are starting to be put in place, and once they come up more and more, you'll have more visibility. So I think it's mostly a question of whether you're getting the right level of measurements and metrics.

For our final segment: one of my favorite things going on in the world of large language model evaluations right now is that everybody has their own weird folk eval. We've got MMLU and all the official benchmarks, but really, where most of the action is, is that when someone sits down and starts talking to a chatbot for the first time, they have their own set of evals that they roll out. One that's been talked about a lot online is simply asking a model: is the number 9.11 bigger, or is the number 9.9 bigger? And it turns out models routinely fail on this.
For this final section I want to do a fun little thing with the experts we have here today, which is to get their off-the-cuff evals.

I do similar evals. I usually test on math problems; your standard multiplication and addition problems are usually a good indicator. Similar to the 9.11-versus-9.9 test, but with a different twist.

Right, and just to go a little further: it's basic arithmetic, like asking, what is this five-digit number plus this five-digit number?

Or maybe a little more complex: here are five numbers, sort them in a sequence; or multiply these, figure out the result, and then sort them; things of that nature. It becomes like a math problem I would give a third or fourth grader.

Yeah, for sure. Vagner, how about you?

There's one that I like that sometimes reveals a bit of bias, cultural bias: asking it to describe a breakfast. What does a breakfast look like? From the items the LLM spits out, you can get a grasp of where the data it draws on is coming from. Is it bacon and eggs? Is it bread with butter? Is it oatmeal, something different? Is it espresso, or is it Americano? That tells a lot about the biases and cultural biases inside the model.

That's awesome; I'm going to start using that one. All right, Kate, round this out, take us home here.

There are a couple of good ones, none of which I came up with on my own.
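As an aside on why the 9.11-versus-9.9 question trips models up: the answer flips depending on whether the strings are read as decimals or as version- or chapter-style numbers. And the arithmetic-and-sort folk eval is easy to score automatically, since the ground truth is computable. A small sketch:

```python
# Read as decimal numbers, 9.9 is the larger of the two...
assert 9.9 > 9.11

# ...but read as version- or chapter-style pairs (9.11 -> part 9, section 11),
# the order flips, which is one plausible source of model confusion.
assert (9, 11) > (9, 9)

# The "do arithmetic, then sort" folk eval is easy to grade automatically,
# because the expected answer is computable:
numbers = [48213, 905, 77, 13027, 5]
expected = sorted(n * 3 for n in numbers)  # e.g. for a "triple these, then sort" prompt
```

Comparing a model's reply against `expected` turns the panel's off-the-cuff test into a repeatable check.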
The advantage of sitting within research is that you get some really creative minds. A couple of my favorites: what type of animal is a chicken? You'd be surprised what the model comes back with. There are a couple around safety that I like to do, such as asking about two people from different origins, which one is the criminal, and seeing what the model replies with, just to try to feel out some of the basic levels. We've got a long list of fun things we like to try along those lines.

Those are great. I'd love to talk more about that as I collect this little library; they're often very funny, too, and people find really counterintuitive ways at some of these problems. Well, look: Vagner, Kate, Ami, thank you for joining us today. Ami, I hope you had a good time, and hopefully you'll join us again at some point in the future. And to all you listeners, thanks for joining us. If you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and podcast platforms everywhere, and we'll see you next week on Mixture of Experts.