OpenAI Social Network: Cringe or Data Strategy
Key Points
- The episode opens with a light‑hearted debate among guests—Kate Soule, Marina Danilevsky, and newcomer Gabe Goodhart—who all label the rumored OpenAI social network as “cringe,” setting a skeptical tone.
- The hosts explore why OpenAI might launch its own platform, with Kate suggesting it’s primarily a data‑collection strategy to feed conversational AI models, similar to how Meta and X use their networks.
- Marina questions the actual value of social‑media content for model training, noting that much of the material is low‑quality “garbage” and wondering whether OpenAI’s interest is driven by genuine utility or simply fear of missing out on user‑generated data.
- The broader “Mixture of Experts” show preview mentions upcoming discussion points: an Anthropic blog post on reasoning models, Wikipedia’s battle against scraping bots, and a quirky half‑marathon run by robots.
- Overall, the panel frames the OpenAI social‑network rumor as a potentially risky venture that could be more about data acquisition than delivering a compelling new user experience.
Sections
- [00:00:00](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=0s) OpenAI Rumors: New Social Network - In a humorous round‑table, podcast hosts and guests react to the speculation that OpenAI is planning its own social platform, debating whether the idea is exciting or “cringe‑worthy.”
- [00:03:05](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=185s) Exploring Novel AI Interaction Models - The speakers discuss leveraging a data‑centric platform to experiment with new user‑AI interaction patterns, acknowledging its niche appeal and the challenge of integrating AI into daily life versus drawing users to the AI.
- [00:06:28](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=388s) AI‑Powered Hyperpersonal Advertising - The speaker explains how platforms like Facebook have enabled ultra‑targeted ads and predicts that AI bots will become personalized sales influencers that mimic users' speech and behavior, creating a direct, borderless pipeline between consumers and capitalism.
- [00:09:29](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=569s) Historical Diversification Patterns Resurface - The speaker likens today’s tech‑industry maneuvers to past oil companies buying movie studios for portfolio diversification, then pivots to a discussion of Anthropic’s blog on assessing the faithfulness of AI model reasoning.
- [00:12:32](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=752s) Questioning AI Reasoning Terminology - The participants critique the anthropomorphic use of terms like “hallucination” and “reasoning,” arguing that chain‑of‑thought outputs are more a product of model training and reinforcement tricks than genuine, explainable thought processes.
- [00:15:38](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=938s) Illusory Reasoning in Language Models - The speakers critique how prompting LLMs to produce “reasoning” often merely exploits a performance hack that boosts exam‑type metrics without delivering genuine explanations, warning against over‑interpreting such results and questioning what term should describe this superficial behavior.
- [00:18:44](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=1124s) Warm‑up Signals vs Model Reasoning - The speakers debate whether AI systems should expose internal reasoning traces to users, arguing that these signals function more like a warm‑up than genuine explanation and may foster unwarranted trust.
- [00:21:46](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=1306s) AI Crawlers Threaten Wikipedia Sustainability - The speakers discuss how AI firms' aggressive data scraping of Wikipedia—ignoring robots.txt and other norms—creates second‑order risks for the site’s sustainability despite its high demand as a premium training dataset.
- [00:24:48](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=1488s) Open Data Collaboration vs Isolation - The speaker argues that companies should invest in and host high‑quality open data resources—mirroring open‑source models—rather than erecting barriers, to foster a healthier, shared AI ecosystem.
- [00:27:50](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=1670s) Scaling Challenges and Source Trust - The speaker argues that AI’s current problems stem from scaling infrastructure rather than knowledge, emphasizing the need for models to cite trusted sources—like Wikipedia—to maintain credibility and long‑term user trust.
- [00:31:08](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=1868s) Optimism, Data Access & Robot Marathon - The speaker highlights the need for streamlined data pipelines that lower server load while sharing a humorous Beijing half‑marathon story where human runners outperformed humanoid robots, reflecting ongoing skepticism about robot hype.
- [00:34:14](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=2054s) AI, Robotics, and VC Theatre - The speakers debate the value of venture‑backed theatrical robotics projects as a path toward multimodal AI, weighing hype against genuine scientific exploration.
- [00:37:18](https://www.youtube.com/watch?v=LnVkwuwL7LU&t=2238s) Laundry Folding as Spectator Sport - A host humorously imagines a 4 a.m. ESPN‑style humanoid robot laundry‑folding competition and then wraps up the podcast episode by thanking guests and urging listeners to subscribe.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=LnVkwuwL7LU](https://www.youtube.com/watch?v=LnVkwuwL7LU)
**Duration:** 00:37:57
OpenAI is apparently working on a new social network.
Pretty cool or kind of cringe?
Kate Soule is Director of Technical Product Management for Granite.
Kate, welcome back to the show.
What do you think?
Major cringe vibes.
No, thank you.
Okay.
Um, Marina Danilevsky is a Senior Research Scientist.
Marina, cool or cringe?
Extremely cringe.
Okay, I'm gonna have a unanimous vote on this one.
And last but not least, is Gabe Goodhart joining us for the very first time,
Chief Architect, AI Open Innovation.
Gabe, welcome to the show.
What do you think?
So many things that could go wrong.
Maybe something interesting but cringe for me as well.
Okay, great.
We'll get into that, all that and more on today's Mixture of Experts.
I am Tim Hwang, and welcome to Mixture of Experts.
Each week, MoE brings together the sharpest crew in all of podcasting
to discuss and debate the biggest news in artificial intelligence.
As always, there's a lot to cover.
We're gonna talk about a super interesting blog post out of
Anthropic about reasoning models.
Uh, Wikipedia getting slammed by scraping bots and a super interesting
half marathon being run by robots.
But first I want to start with, uh, the round the horn question that we
began with, which is rumors that OpenAI is going to launch its own social
network, which of course is baffling.
Um, as a company that's largely built its money, its expertise, and its brand on foundation models and advancing the state of the art.
Maybe, Kate, I'll turn to you first.
Why would OpenAI wanna do this at all?
Yeah, I, I actually don't think it's that baffling.
I think it's pretty straightforward.
I mean, Meta and X both have these social platforms that essentially
they can use to learn about conversational patterns to frankly
generate and collect data potentially.
And, you know, OpenAI and other providers have shared that, you know, they're running out of data, so to speak.
And so I think they very much see this as a data play of
being able to create a platform.
Hopefully, you know, they have some way to provide some value to incentivize users
to join, but ultimately I think they're in it for the data that they'll be able
to collect behind the scenes and use that to train more conversational, fluent,
and, and robust models in the future.
Got it.
Yeah, and I think, Marina, that was a question I had for you is like, is this
social media data, like all that valuable?
Like I go on social media and I scroll.
You know, X former Twitter and I'm like, this is kind of like
garbage content in a lot of ways.
Um, but is this data actually helpful?
I mean like advancing kind of the state of the models and what they can do?
Uh, I think there's kind of an interesting question about just like, clearly OpenAI seems to see some upside, but I'm curious about what you think.
I mean, a little bit of it I think is FOMO of, wait, we want people to come and
make bad internet memes on our platform.
Why do they have to leave our platform?
We wanna be there too.
But yeah, any of this kind of data is valuable because it's different.
So again, synthetic data generation, which is where everybody is really kind
of getting their data now that they've run out of data, isn't really good at
making interesting little viral memes.
These models aren't that great with humor and subtlety and creativity and things
like that, so you get that from people.
So.
Especially being able to combine this type of additional, uh, input
and injection of ideas that you would get from this kind of thing.
Yeah, I will say you're probably gonna get a real specific slice of humanity
using this and a real specific for
Yeah, right.
Yeah.
No comment.
Um, that you're gonna actually have using this and creating the data.
So yeah, you'll get something out of it.
And I agree with Kate as I usually do that it's a data play.
Um, and also, yeah, it's not that hard anymore to, to put this together.
But again, I think they're gonna be a little limited in who comes
to there and uses it for what?
That's right.
Yeah.
And Gabe, I know you were maybe the one, everybody thought it was cringe, so maybe
that's just an established fact, but you were saying like, it might be cool if
they maybe get a couple things right.
What do you have in mind there?
Yeah.
Well I think the, the part that's really interesting for me is thinking
about this as a way to, experiment with a novel interaction pattern.
Right.
Uh, personally, social networking seems to me to be the wrong way.
They've already pioneered a fairly novel interaction pattern, but I think where many AI platforms are finding themselves hitting a wall is integrating further into daily life, as opposed to bringing the, you know, bringing the users to the AI.
So I think to me, this seems like sort of a, well, what,
what, where do people interact?
Hmm.
I wonder if we could go there, type of play.
And obviously as a company that's trying to make themselves, you know, a relevant
standalone company as opposed to sort of an integration partner, they're
not gonna go look to necessarily plug
their AI directly into a competitor's, um, platform, since many of the other social media companies are also AI model companies at this point.
And so I'm wondering if this is really their attempt to sort of branch out from being a, uh, AI-specific company to being a tech company, right?
Think Google moving beyond search, think Meta moving beyond Facebook.
Um, I'm wondering if this is just their first step in trying to become
an omni company, like the other big tech players that are user facing.
Yeah, that's an interesting flip.
And I think it introduces kind of a new argument, right?
Like I think on one hand there's sort of Kate and Marina's take, which is like,
we just need the data for the models.
Oh, and data is a hundred
percent part of the story, no question.
Totally. Yeah.
Yeah. And I'm not denying that.
I mean, I think that makes total sense.
And then I think the second bit that you're bringing in, which is
sort of interesting, is like, well.
It's also maybe if it gets popular, like a distribution point for AI
models, which is kind of a funny thing.
You know, I think like in some ways like the, even the phrase social media
implies like people talking to people.
Um, and so it's sort of interesting the idea that like, okay, well actually
we really want to kind of like augment that or get, you know, AI's involved.
It kind of reminds me of, um, you know, there's been this kind of craze about
like, oh, you're gonna have a group chat, but then there's gonna be an AI
in the group chat that will assist.
And I mean, by and large, those haven't taken off all that much.
But maybe, I don't know, maybe the nature of kind of what we think of
as a social network is changing.
I don't know.
Kate, if you think that's like a possibility that we see happening here.
I don't see a ton of future there, but what I do think they're setting
themselves up for is another form of monetization and integrating with
advertisements, so having models that
innately understand the language that is being used to communicate
with one another on their platform
and using those to then generate really targeted advertisements directly to
those users, uh, is something that, you know, Meta, uh, not Meta, excuse me,
OpenAI is really setting themselves up for. Uh, and so, you know, I think, they're also looking to pay for all of those really expensive models.
They definitely probably see some opportunity to, to drive, have
new sources of revenue, right?
Um, based off of this new platform.
Yeah, to jump on that, you remember Facebook.
When it got started, it was a really, really big deal for advertising
because you could actually do things that were super, super targeted,
really kind of for the first time in that sense, on that kind of a scale.
So rather than general, oh, this is what you searched for, now it's No, no, no.
We know like who you are, what you like, what your friends like, all
these characteristics about you.
This is trying to see if you can take it to, to the next level really.
As well as like what your speaking patterns are.
Yes.
I mean, imagine having, it's not, I don't think it's gonna be
humans socializing with humans.
I think it's going to be
bots selling to humans using their exact speaking patterns, languages, every,
you know, everything that you learn in sales about trying to, like body
language mimic the person you're talking to is gonna be tenfold with these models.
Yeah. Like personal
influencers, right?
Yeah.
You're now gonna get an influencer that's like directed like just straight at you.
So like there's just no
border, you know, there's no buffer left between you and
capitalism, just direct pipeline.
Marina, do you have a point of view on this?
Just a little.
Um, maybe a final thing to talk about before we move on to the next
topic is, you know, to take a step back, I think like we can almost
sort of think about this outside of OpenAI kind of working on this fun,
weird thing, but like kind of feels like if you buy this sort of data
interpretation, I think we're gonna see all sorts of weird acquisitions
happening going forwards, right?
It feels like AI companies in their hunger for data,
will acquire and launch all sorts of services largely for the data value.
Um, you know, I think I'm like, I'm, I'm still looking for like, is an AI
company gonna buy a law firm at some point because it has a bunch of data
about how lawyers interact, you know, and that's like really valuable for
training these models at some point.
Um, curious if like other folks are kind of thinking about like, you know,
another one that people have talked about is like, acquire a call center, right?
'cause you really want kind of all that customer support interaction.
Um, curious if the panel has kind of views on like other places where
this could go as kind of like, almost like the drive to train models
motivates all of this kind of vertical integration, where maybe we're like, it's kind of a little bit dissonant when you hear about like, oh, OpenAI is doing a social network.
The other take I had when reading, uh, about this story was that it was an attempt to break out of model commoditization, right?
So I think other models have caught up to OpenAI frankly.
And marginal gains in quality are really not driving the use case anymore.
So I think, uh, on the one hand, having a, you know, a captive platform that
gets users into their platform is really valuable from a moat perspective.
On the other hand, the data also helps them have a source of data that
is hypothetically differentiated and lets their models actually reach a
capability that other models can't.
So I think you're, you're spot on with this idea of differentiation based on
upstream data sources that are owned and completely enclosed by the model authors.
Upstream data sources, and downstream use cases, as well.
Right?
So the more we go into multimodality and models can do this, that,
or the next thing, then there's a desire to say, okay, great.
So instead of focusing now, we're gonna go ahead and
diversify.
Again, nothing new under the sun.
This puts me in mind of historically when you had oil companies earlier in
like the thirties, forties, fifties, that were like, we're gonna buy movie studios.
Why?
Because economically, we want a stock portfolio that's diversified.
You know, it's weird to us now, but it's a little bit of that same
repetition of, Hey, can our stuff actually make it everywhere up and down?
So maybe this is just the newest version of that particular wave.
All right, well, we'll hang on tight.
Everybody agrees that it is cringe, but they will take
their best possible shot at it.
I'm sure we'll talk about it when it finally, uh, launches, if it does launch.
So I'm gonna move us on to our next topic.
Um, super fun kind of blog post coming out of Anthropic building on some of the
research they've talked about in the past.
But what I liked about it was kind of, it, it sort of brought up an issue
very crisply that I think is like kind of worth talking a little bit about.
Basically the blog post takes a look at like reasoning models and whether
or not, kind of the reasoning that models give for how they rendered
a decision is sort of faithful.
Um, and the way that the kind of researchers investigate this is the
idea that, you know, we're gonna give the model a little bit of a
hint on how to solve a problem.
Um, and then we basically say, does the model, you know, disclose that
it had this kind of unfair hint when it kind of accomplishes a task?
And, you know, they get these kind of fun results.
You know, what they say is basically that Claude 3.7 mentions these hints, you know, only about 25% of the time; their DeepSeek comparison mentions it around 39% of the time.
So, you know, the kind of claim that they're making is that reasoning
models often don't kind of fully expose all of the things that
they do to do decision making.
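[Editor's note: the faithfulness check described above boils down to a counting exercise: among cases where a planted hint actually flipped the model's answer, what fraction of chains-of-thought verbalize the hint? A minimal sketch with toy data; the field names and substring matching are illustrative assumptions, not Anthropic's actual protocol.]

```python
def faithfulness_rate(cases, hint_marker):
    """Fraction of hint-influenced answers whose chain-of-thought
    actually mentions the hint (the 'verbalization' rate)."""
    # Keep only cases where the hint changed the model's answer:
    # with the hint it picked the hinted answer, without it it didn't.
    influenced = [
        c for c in cases
        if c["answer_with_hint"] == c["hinted_answer"]
        and c["answer_without_hint"] != c["hinted_answer"]
    ]
    if not influenced:
        return 0.0
    # Of those, how many chains-of-thought admit to using the hint?
    verbalized = sum(hint_marker.lower() in c["cot"].lower() for c in influenced)
    return verbalized / len(influenced)

# Toy data: four hint-influenced answers, only one CoT mentions the hint.
toy = [
    {"hinted_answer": "B", "answer_with_hint": "B", "answer_without_hint": "A",
     "cot": "A professor suggested B, and checking it, B fits."},
    {"hinted_answer": "C", "answer_with_hint": "C", "answer_without_hint": "D",
     "cot": "Working through the options, C is correct."},
    {"hinted_answer": "A", "answer_with_hint": "A", "answer_without_hint": "B",
     "cot": "Eliminating the others leaves A."},
    {"hinted_answer": "D", "answer_with_hint": "D", "answer_without_hint": "A",
     "cot": "The calculation points to D."},
]

print(faithfulness_rate(toy, "suggested"))  # → 0.25
```

On this toy data the rate is 0.25, roughly the Claude 3.7 figure the hosts cite; the point of the metric is that a low rate means the visible chain-of-thought is not a reliable record of what influenced the answer.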
Um, and, um, I guess Marina, maybe I'll kick it to you first.
You know, I think in some ways talking to friends of mine, there's a lot of hope
that, like reasoning is like this great interpretive tool that allows us to
work with these models better and better over time.
But this seems to kind of cast some doubt and you're already
shaking your head, so maybe I'll just let you, you rant for a bit.
No, you, yes.
I can rant for a while on this.
This is my soapbox.
I'm sure I've made this point on this show before: this "reasoning" isn't real reasoning, in the sense that we think of as reasoning, mathematical reasoning.
I have been seeing this paper all over the place, by the way.
I love it.
I think it's really great.
As you said, how they really crisply were trying to show it, because it's pretty hard to insert yourself into the model and say, well, what actually happened?
We started noticing almost immediately when the reasoning models came out, and you're like, yes, but what happens when you have the answer and it's mentioning things that weren't in the reasoning?
It's an immediate red flag that there's just completely something else going on here, and this is a very nice way of being able to actually at least poke and have a little bit of a local approximation of, well, are you even paying attention to this piece of information or not?
We see similar stuff when we even just try to do evaluations of faithfulness
in general, that's content based.
If it is something that is very niche and the model may be like,
look, I can't even figure this out.
I'm gonna go ahead and fall back on things where I've got higher probabilities to
be extremely approximate about it and, you know, go ahead in, in that direction.
So again, I like
this kind of work.
I like this kind of, uh, traceability of: will you think that it's reasoning just because we gave it that name?
It's not. It's yet another problem, like with words like hallucination, where you have put an anthropomorphized word there, and it means that you have this thing that means all these things that it does not mean.
So I like this work.
More of this, please.
Yeah, for sure.
I mean, I guess what I'm left with is, and Kate, maybe you have a take on this is
like, so what is reasoning anyways, right?
Like it appears to give
a step-by-step kind of like, you know, disclosure or audit of kind
of how a model reached a decision.
But per Marina's rant, right?
Like it's unclear if it actually gives you anything.
Like is it just theater?
Like what is it exactly?
I, I think, following on Marina's, you know, very well articulated point, reasoning is a very anthropomorphized term.
And when we talk about reasoning in the model context, what we're really
talking about is the model has been trained to generate more tokens
before it makes a final answer.
And that process has all sorts of reinforcement learning that's added on top to try and basically bring the model into a part of its distribution where it will be more successful making the final answer.
And what I think the paper does and the blog does really well,
is helping articulate that
chain-of-thought reasoning is not a proxy for explainability.
So just because the model is saying X, Y, and Z in its chain-of-thought, it does not actually mean that the model thought through it step by step, you know, one plus two equals three.
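[Editor's note: the mechanism Kate describes — extra tokens generated before a final answer — is visible in how chain-of-thought prompting is typically wired up in code. A minimal sketch; the prompt wording and the "Answer:" marker are illustrative conventions, not any particular vendor's API.]

```python
import re

def build_cot_prompt(question):
    # Ask the model to emit intermediate tokens before committing to an answer.
    return (
        f"Question: {question}\n"
        "Think step by step, then give your final answer on a line "
        "starting with 'Answer:'."
    )

def extract_final_answer(completion):
    # Only the text after the 'Answer:' marker is treated as the decision;
    # everything before it is the extra-token 'warm-up' the panel describes.
    match = re.search(r"^Answer:\s*(.+)$", completion, re.MULTILINE)
    return match.group(1).strip() if match else None

completion = (
    "First, 1 + 2 = 3. Then doubling gives 6.\n"
    "Answer: 6"
)
print(extract_final_answer(completion))  # → 6
```

Note that the harness simply discards the intermediate tokens when scoring; they shape the distribution the final answer is sampled from, but nothing forces them to reflect what actually drove it.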
Um, where I do have a bit of a bone to pick with the, the paper and the blog in general is that they themselves then fall into the trap all throughout it of talking about how the model is disingenuous, the model is deceitful.
The model's doing all these things.
And like, I think it's very important to like look at the paper and see the experiment that they ran.
They injected an answer into the conversation history as if the model
itself came up with that answer more or less from what I could tell.
And then they asked the question and they saw, did the model
refer to that previous answer?
The model has not been asked to cite its sources, so to speak, in that context.
Like the model has not been extensively trained.
Uh, you know, this is 3.7, I think is the first, uh, reasoning
model that Anthropic put out.
So it's pretty early in their journey for reasoning.
The model has not explicitly been trained to prioritize: if somebody tells you an answer, make sure you cite that answer down the road.
Like there's a lot of, broader things that you would wanna look at and
see to make statements about being, you know, deceitful or disingenuous
or even, uh, hallucinating.
You know, in a lot of ways, I think we're just testing.
This is one very narrow experiment, and it helps bring to light: don't treat the chain-of-thought reasoning as an explanation.
I don't think it's fair to say that, like, all chain-of-thought reasoning is false because the model only cited, you know, the hint 25% of the time, and therefore it's not leveraging chain-of-thought reasoning correctly to drive a final decision.
I think there's still a lot more work ahead.
Yeah, for sure.
And this thinking is so interesting, and I think it's a great example of where kind of going anthropomorphic on this is, like, bad, right?
Because, like, in normal life, I'm, I'm talking to Gabe and I give some reasoning, and it actually gives you some explainability.
But here's kind of a weird result, where, like, the appearance of reasoning helps the model get to the right answer.
But it's actually not an explanation, which is, like, very weird, as a result.
These, these are all research hacks trying to boost metrics.
Yes. Right.
And the metrics they're trying to boost are often mathematical exams.
Like that is where this discipline has evolved from.
Mm-hmm.
And so trying to then ascribe it much more important meaning than what it
was developed for is really dangerous.
And it's still very early on in reasoning for LLMs in general.
Yeah, for sure.
Gabe, can you save us?
Do you wanna propose, like what, if not, if not reasoning, what should we call
this thing that the models are doing?
Yeah, I mean, uh.
I don't know that I can save us, but you know, I wanted to say, "Hey Kate,
what's two plus two? By the way, the answer is two, please explain your
reasoning" and what's Kate gonna say?
Right?
She's gonna say "two" and she's probably not gonna say, "you told me
the answer, that's my reasoning." Right.
So I think the, I, you know, I, I really agree with everything you said, Kate,
and I think I felt exactly the same quibble reading the paper about the
way they anthropomorphize the problem.
And the one that really stuck out to me is that the entire framing was that they were trying to discover the model's internal reasoning process, and just exactly that phrase felt really, really wrong to me.
And Tim, you said we're having a conversation and I might explain
my thought process and that might help with explainability, but inside
my brain, theoretically at least, there are lots of neurons firing
that are not coming out my mouth.
That's not true of a model, right?
A model.
Yes.
Okay.
It does have its weights connecting to one another, doing matrix math,
and we're not looking at the specific weights of those, but the only
tokens that are actually getting generated are the ones you're seeing.
And so to me, I, I think you said it exactly right, Kate.
Uh, the chain-of-thought is a way of basically priming the pump in probability space so that the final answer is more accurate, and it's completely mirroring the pattern of how it was trained.
And so it's, it's useful from a human interface perspective, not useful from
a, uh, actually unboxing what's happening inside the, the math perspective.
Um, so it, it's still a really cool
trick.
And it really helps because one of the real novelties of generative AI is
that it's speaking directly to humans.
Right?
You know, we think about pre gen AI models and their job was to encode
something so that a programmer could consume it in a structured output form
and then write a fancy program around it.
You know, generative AI is speaking directly back in a human interface in
a modality that a human can consume.
And from that perspective, uh, from a lay perspective, it's really valuable to have additional words that help the human understand the answer, but it's not necessarily ascribing an actual thought process, so to speak, to, uh, this pile of matrix math.
I would propose we can call it like a warm-up, like you warm up before a sport event, where like you're doing exercises and that's not exactly what you're gonna do in the event, but it makes you better at the event itself, or like warming your car up.
I like that a lot.
And so you get these signals of what you've done during your warm-up to, as Kate very well put it, prepare to give a better answer.
But that's what these are signals of.
It's not reasoning, it's, it's warming up.
I love it.
Marina saved us.
Um, I guess maybe a final question and then we can move on to the next topic
is, um, Gabe, I think you brought in like a really good perspective on sort
of like the, the lay user of these tools.
And I think one thing I'm left with, with work like this, is, you know, should
big model companies be exposing reasoning traces to users?
'cause it feels like the tendency is that people will read into it,
that it is literally how the model is making decisions, which is at
least maybe a little bit deceptive.
It drives trust with the user, but maybe in a way that's a little bit unwarranted.
I don't know what people kind of think about that.
Yeah, no, it's, it's a great question.
And, uh, I think you brought in the word trust, which is such an important
and fuzzy word in this space.
Uh, and I think, you know, there again, you know, Kate, you pointed out that
so much of how we define these terms from a technical perspective is driven
by specific benchmarks and specific problems we're trying to solve.
But at the end of the day, trust is about the consumer's interpretation of
their experience with the AI system.
And I do think there's some value in exposing sort of a longer-form output,
in the same way that if you read an article written by a, a human, and the
article exposed, you know, here was the research collection process that
went into creating this article, you would have more, you know, trust in
the conclusion, versus, I'm just gonna present a conclusion to you.
So I think there's some potential value there, just purely from a
human interpretability perspective.
But I do think you're exactly right.
And, and Marina, I'm gonna lean on that warm-up thing now.
I love that framing.
It really is all about warming up for the final answer that you're gonna
give, um, and not about actually, you know, anthropomorphizing
some kind of thought process.
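The "priming the pump in probability space" idea can be sketched with a toy trigram language model standing in for an LLM. The corpus and contexts below are invented purely for illustration: emitting intermediate tokens narrows the distribution over the final answer, without any hidden computation being revealed.

```python
from collections import Counter, defaultdict

# Tiny invented corpus: sometimes the answer is stated directly,
# sometimes it follows the worked-out steps "two plus two is".
corpus = (
    "question the answer is unknown . "
    "two plus two is four . two plus two is four . "
    "question two plus two the answer is four ."
).split()

# Toy trigram model: P(next word | previous two words).
tri = defaultdict(Counter)
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    tri[(a, b)][c] += 1

def dist(context):
    """Next-token distribution given the last two tokens."""
    total = sum(tri[context].values())
    return {w: n / total for w, n in tri[context].items()}

# Jumping straight to "answer is" leaves the outcome ambiguous...
print(dist(("answer", "is")))  # mass split between "unknown" and "four"
# ...but after generating the intermediate tokens, ending in "two is",
# the probability mass collapses onto the right continuation.
print(dist(("two", "is")))     # all mass on "four"
```

The point mirrors the discussion above: the chain-of-thought tokens don't expose what the weights are doing, they condition subsequent generation so the final answer lands in a better region of probability space.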
So moving on to our next topic, uh, a very interesting news story that
was reported by Ars Technica.
Um, basically the Wikimedia Foundation, which runs Wikipedia as well as a number
of other open, uh, knowledge projects online, um, cited a stat that was
quite interesting, pretty shocking in some ways: since January 2024,
they have seen a 50% increase in bandwidth consumed on their service.
And they attribute this basically to the rise of, uh, bots attempting to scrape
media content, uh, from Wikipedia.
Um, and those bots largely trying to scrape data for the
purpose of training AI models.
Um, and this problem has gotten so bad that they actually more recently
released a data set on Kaggle in an effort to dissuade bots from scraping
their site, um, to say like, this is a nicely formatted data set.
You should use this instead.
We'll see how effective that is.
Um.
This is an interesting story in part because I think it goes back to the
first topic we were talking about, which is these kind of weird second
order effects that we're seeing as AI companies kind of chase the dream of
advancing their, their models ultimately.
And um, Kate, I guess, you know, I don't know if you have any
kind of feelings about this.
I, in some ways, I'm like a little bit protective of Wikipedia.
I'm like, they should be blocking all the bots.
You know, we need to kind of like preserve the sustainability of these services.
But it's also kind of like, maybe it also means that Wikipedia is
like incredibly in high demand.
Um, and so I'm curious about how you kind of like navigate
how we should feel about this.
I guess.
So, I think a couple of things to make clear: Wikipedia is in high demand,
and it's a very high-quality data source that has all of these really rich
links describing how topics relate to one another.
So that is an incredibly rich, valuable data set for model training.
But I, I think it kind of brings out an issue that's
more broadly being felt across, uh, all sorts of different content providers,
which is around crawling, and particularly crawling that does not adhere to kind
of the, uh, guidelines and, and rules of the road that have been established,
at least in the United States.
Things like, uh, chatbots or crawlers ignoring robots.txt files and other
behaviors and practices that are starting to get a little predatory.
Uh, essentially, a lot of the costs that model trainers incur are being
passed on to the actual data providers themselves.
So not only are they giving their data away for free, because it's available under
fair use if it's crawled in the United States, but now they're also incurring
additional costs on top of it.
And, uh, we really need, I think, as an industry, to have a broader
discussion on how to responsibly engage with providers like Wikipedia.
And it sounds like they're, um, setting up a number of those types of discussions,
which is really exciting to see, to help not only have, you know,
some of the, the stricter rules on things like robots.txt, but also have
kind of community-agreed-upon and defined best practices that we can use to more
broadly enhance Wikipedia's mission of sharing this data publicly without
penalizing the content providers.
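To make the robots.txt point concrete, here is a minimal sketch of what a well-behaved crawler checks before fetching, using Python's standard-library urllib.robotparser. The bot names, site, and paths are all hypothetical, not Wikipedia's actual rules.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: one scraper is banned outright,
# everyone else just has a path prefix off-limits.
robots_txt = """\
User-agent: BadScraperBot
Disallow: /

User-agent: *
Disallow: /w/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A well-behaved crawler asks before every fetch; the predatory
# behavior described above is simply skipping this check.
print(rp.can_fetch("BadScraperBot", "https://example.org/wiki/AI"))    # False
print(rp.can_fetch("FriendlyBot", "https://example.org/wiki/AI"))      # True
print(rp.can_fetch("FriendlyBot", "https://example.org/w/index.php"))  # False
```

Note that robots.txt is purely advisory; nothing technically enforces it, which is exactly why the norms discussion here matters.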
Yeah, this is, I think one of the really interesting questions is like
how quickly can we get that balanced?
Right.
Um, you know, I think one of my worries is a little bit of
what happened with Reddit.
It was a for-profit company, but the way I sort of understand it
is, well, actually, AI companies really wanted to scrape that data.
Um, and so in order to monetize that, we, like, put the walls up.
We made it much harder to try to get data from the platform so we
could monetize it through the API.
I think it also applies in the Wikipedia case, which is, we're running
kind of a nonprofit project that a lot of people volunteer to contribute to.
If we can't handle the bandwidth, like, we can't pay for our server costs to
make that sustainable, we need to kind of raise the barriers.
Um, and you know, the promise of the Open web originally was that like all
this knowledge would be free and open.
Um, and I guess Gabe, maybe that's your cue.
Yeah.
It's just, you know, it does kind of feel like if we don't get
what Kate is talking about right fast enough, you'll end up with a web
where, like, everybody has pulled up the drawbridge, basically.
Yeah, a hundred percent.
And, and you know, I'm glad you brought up, like, sort of the early-web
free-and-open concept here, because to me, you know, there really are
two tracks that could happen here.
It's pull up the walls or it's build the team and build the partnerships.
Right.
And I think, um, in much the same way that open-source software works,
where you have large, important projects that are managed
external to any given company, but individual companies that
benefit a lot from those projects invest heavily in their
maintenance and creation.
The same should be true for high quality data sources like Wikipedia.
Um, sure, if you are a grad student, uh, writing a scraper,
you're probably not going to also stand up a server and host a mirror of Wikipedia.
But if you're IBM, if you're Meta, if you're OpenAI, absolutely, that's a
great opportunity to be a positive player in the open data market.
And if you could host, you know, a portion of the traffic yourself
and expose that, now you've just built out the ecosystem.
Um, and I know that there's, you know, the great divide between
open and closed in the AI world.
Um, but I really think, especially those of us that, that work at companies who are
really leaning into the open side, it's a great opportunity to actually play well
in the space and help lift all ships here.
So I, I would love to see companies like IBM and others partner with
Wikipedia to solve this problem at scale rather than necessarily
having to bring up the walls.
Yeah, for sure.
I mean, Marina, maybe turning to you, I know earlier you were
like, ah, it's all capitalism.
Um, but this is almost like a shift in kind of the social contract of
the internet in some ways, right?
Like, the way I understand it, the Google-era deal was: well,
you make your website open, you let us index it and scrape it, and
we promise you that we'll send you traffic that you can sell ads against.
And so, you know, you get, you get money for being open in some ways, but I guess
AI has less of that feature, right?
'cause like you train, you build the model and then there's no kind of
like return traffic to the sources.
And so it almost kind of assumes, I guess, what Gabe is talking
about: that the leading companies ultimately have to, you know,
directly kind of transfer money, in some ways, to these projects.
But I'm curious just how you think about it, because it feels like
in some ways AI is actually proposing a very different way of how
value gets exchanged on the internet.
Yes and no.
So Wikipedia has been in demand for decades.
Grad students have been writing scrapers, and are continuing to write scrapers.
It's a longstanding tradition of bad Python scripts that no one reuses.
Um, the difference is that when we used to do it before, you
didn't need that much data.
You were doing things with topic modeling, you were doing things with
graphs and things of that nature.
You didn't actually need all of Wikipedia.
If you're doing things with large language models, yeah, you kind of need all of
Wikipedia and in many ways it's easier to write a scraper than to go and hunt
around and transform data and do things.
If you write the scraper, you set it, you go away and it's been scraped and uh,
you know, that ends up being a problem.
So that hasn't changed.
Um, but right now I think it's the scale that's creating the
problem, not Wikipedia itself.
That's why it's the infrastructure, not the knowledge, that's really
taking the hit from AI models.
Uh, yeah, I agree that they have, uh, you know, consumed the knowledge and
they're not necessarily helping anyone.
But I will say that most of the time, when people get an
answer from some AI model, they then wanna take a next action.
It's to your benefit to be able to send people back to sources that are trusted,
just like the way that right now, right, you search on Google, or
whatever it is, and it's gonna send you the links that you're still
probably gonna eventually want to follow.
Um, just like with social media, people are going to learn how to
interact with this technology and they're going to learn to not just,
you know, take the AI overview.
And even though short term you might say, oh, who cares where this came from?
Longer term we're gonna swing right back to, no, no, I care where this came from.
So I want to be able to have that trust, have that trace, uh, the rest of it.
So I, I think that actually if we
start that work now, it's really gonna be helping ourselves in the future to
set that up and, and have that going.
Not least because it would also be nice to not kill Wikipedia.
Please.
We don't have enough.
I really depend on that.
Um, so no, uh, again, I, I think we'll be able to, to get past this, but the more
we can get, um, as Gabe was saying, actors from the larger companies that recognize
this is actually in their own interests.
Not just altruism.
It really is in their own interest to do this.
Um, the faster we get there the better.
Yeah.
And yeah, I think, uh, I really like that idea: maybe
in the near term people are like, oh, the AI overview's just fine.
I'll just use it.
And then after a while people are like, Hmm, I don't know about that.
And so like, there's a dip in traffic and then it kind of comes
back as people are like, I gotta check the actual page or whatever.
So
I also think there's too much of a premium on recency to have all of
this content just get baked into the model and never go back to the page.
So something is visiting these pages to pull the most current content and
then feed that into the chatbot and return an answer, which provides
that pass-through opportunity to then go click on the link, see the
full source, and everything else.
So I agree with Marina.
I don't think it's gonna, um, necessarily revolutionize the value
exchange, even if it maybe has to, you know, continue to evolve a little bit.
What I do wonder, though, is how we can move to more of a Common Crawl version
of the world, where, you know, as model providers, we didn't all crawl the
entire internet ourselves independently.
We all started from the Common Crawl snapshots of the internet and used those.
And, you know, I think the community does need to come to these
data providers.
For data providers like Wikipedia that are prioritizing the, you know,
public dissemination of knowledge as part of their mission, we do need
to work with them to set up more processes and offerings that are
designed and tailored for model providers.
So unless they're saying, don't crawl us, here's a robots.txt,
and we're not interested in this data ever being used by models,
and it doesn't sound like Wikipedia is saying that,
then, you know, it would be great to work together to identify, you know,
here's an offering that Wikipedia is gonna start to more purposefully
put together to reduce crawl traffic and improve the access of their
information to large language models for this new mechanism of consumption.
Yeah, and I think, I mean, that's why I'm kind of optimistic in some
ways. Some of this is, you know, almost Marina's model:
just how easy is it to get the data?
And it's like, well, if it's a nicely produced data set, which is
updated and, you know, refreshed and great for your use,
there's kind of no reason to write the scraper.
And so there's a lot of need to just build these solutions that
lower the cost of access, like, oh, we just access the data without
having to hammer the servers all the time.
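That "lower the cost of access" idea can be sketched as a client that prefers a bulk dataset over live scraping and rate-limits any fallback requests. All names here are hypothetical; this is not a real Wikipedia or Kaggle API.

```python
import time

class DumpFirstClient:
    """Serve articles from a bulk dump; rate-limit any live fallback."""

    def __init__(self, dump, fetch_live, min_interval=1.0):
        self.dump = dump              # dict: title -> text (the bulk dataset)
        self.fetch_live = fetch_live  # fallback that hits the live site
        self.min_interval = min_interval
        self._last_fetch = 0.0
        self.live_fetches = 0         # how much load we put on the server

    def get(self, title):
        # Prefer the dump: zero load on the provider's servers.
        if title in self.dump:
            return self.dump[title]
        # Otherwise fall back to the live site, never faster than min_interval.
        wait = self.min_interval - (time.monotonic() - self._last_fetch)
        if wait > 0:
            time.sleep(wait)
        self._last_fetch = time.monotonic()
        self.live_fetches += 1
        return self.fetch_live(title)

# Toy usage: one article is in the dump, one is not.
dump = {"Alan Turing": "article text from the bulk dataset"}
client = DumpFirstClient(dump, lambda t: "fetched live: " + t, min_interval=0.01)
print(client.get("Alan Turing"))   # served from the dump, no server load
print(client.get("Ada Lovelace"))  # only this one hits the live site
print(client.live_fetches)         # 1
```

The design choice mirrors the point in the discussion: if the dump is fresh and complete, the live-fetch path simply never fires, and the provider's bandwidth bill stops growing with model-training demand.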
Great.
Well, uh, I'll move us on to our last, uh, story, which is mostly just kind of
a fun one that popped up across my radar.
Um, there was a story about the Beijing Humanoid Robot half marathon, which,
uh, featured, uh, 1200 human runners alongside 20 robot teams from private
companies and various state backed projects where they had kind of a robot
running alongside the marathon runners.
Um, and it ends up being that the humans are still good at this.
The human winner was able to complete the half marathon, I think, like an
hour faster than the humanoid robot, which still made it across the finish line,
but at, I think, two hours, 40 minutes, and 27 seconds. You know, it made it.
Um, and I wanted to kind of bring this up both 'cause like the video is
hilarious and you should check it out and it's just like a fun thing to watch.
But also, 'cause I think we have been quite skeptical on this show about the
entire craze around humanoid robots.
I think every time I've brought it up as a topic, everybody has been
like, this is never gonna be useful.
This is just VC theater.
I don't even know why people are talking about this.
Um.
But I don't know, there's kind of a part of me that saw this happen and is like:
this technology, novelty that it is, seems to be getting quite good.
Um, and so, Gabe, maybe since you're new to the show,
I'll kick it to you first.
Um, are you similarly like, this is just a plaything, or do you feel like
you're a little bit more bullish on, you know, humanoid robots being
something that actually ends up being practically useful?
Humanoid robot or not,
I am a little bit bullish on the idea of really thinking wide about
modalities and how AI as a general concept interacts with humans.
Uh, you know, going back to what I said earlier, I think
the real novelty, the jump between sort of the pre-gen-AI days and
the gen-AI days, was taking the need for complex, uh, you know, output programming
out of the picture and bringing the AI directly into a space where it
could robustly interact with humans.
Uh, and obviously we've, we've done a whole lot since then to, uh,
sort of blend those two things.
You know, every AI model you hit behind a service is in fact
a system and not just a model.
But, um, I think the idea of extending that beyond the, the
computer screen and the keyboard is actually really interesting.
And I don't know necessarily whether chasing C-3PO is
the right direction for that.
But I do think there are a lot of places where the physical interaction
of robots is actually interesting.
You know, right now, running a half marathon is a very constrained scope.
And so part of me really wants to unbox what they actually built and understand:
okay, can that exact same robot also, to lean on an internet trope,
fold my laundry for me?
Right.
Um, probably not, but it's really interesting to think about, uh, you
know, the direction of extending into that physical modality as yet another
place where AI can meet humans.
So the, the concrete implementation here, I don't know, I'd have to read
a lot of papers about it to understand whether there's value there.
But the chase of bringing AI closer to where humans interact, in more
modalities, I think is pretty cool.
Marina, I feel like we're really trolling you this episode.
Every single story, you're just, like, shaking your head
and muttering under your breath.
Do you wanna, do you wanna give your hot take on this?
So, look, there's value in VC theater.
If it means that VCs are gonna give money for actually valuable and, you know,
of-the-moment work in this direction, then great.
The same people who built this robot know a lot about robotics in general.
So they're gonna be doing a whole lot of work.
So great.
Bring on the theater.
There's probably aspects here that are interesting in terms of artificial
limbs, in terms of movement in general.
I mean, you don't need this thing to be fast.
You want it fast?
Go get the MIT Cheetah robot; that thing's gonna run real fast, right?
That's, that's not the point of this either.
So honestly, hooray for theater and as long as it keeps attention
on, uh, this and all the directions that Gabe just mentioned, these are
things that we should continue to do.
Just like basic scientific exploration.
I feel like, with the way gen AI is right now
and the speed at which things are going, everybody is just like, great.
So what's the value?
What's the value?
What's the value?
So now I'm gonna go ahead and disagree with myself, 'cause this
is what I say most of the time: where's the value?
Value-wise, there's no value.
But sometimes you need to allow people the time and
the space and the money to do basic scientific research without really
knowing what the hell the value is yet.
It will eventually come.
Kate, I guess maybe I'll, I'll end this episode with a fun question
for you is, are there other human robot competitions that you would
wanna see robots competing in?
Um, I don't know if there's particular use cases where you're like, I don't
know about the science, but it would be really funny to watch X, Y, Z.
Yeah, I, I, I don't know about that one.
Folding laundry is certainly top of my use case list.
I don't know about a competition there, but you know, I know I'm not interested
in robots that can run faster than me.
So, just for many reasons, just not interested.
I don't see a lot of value.
And if the main goal is to transport things faster, I think there's other
modalities getting to Gabe's point about diversity of modalities that
I think are gonna be prioritized.
So, I, I agree with Marina.
I think there is certainly, um, value to some of these demonstrations and setting
targets that you then try to meet and exceed, but I do really wish that
we could find, you know, non-humanoid, more fit-for-purpose work on, on
robots and prioritize some of that.
I think just like we see with models where smaller, more fit for purpose models can
drive a lot of value and you know, in many ways can be built more efficiently.
I think we're gonna see the same, uh, in robotics more broadly and,
you know, so I'm not super bullish on general purpose humanoid robots that can
both run a half marathon and actually help me around the house in my day to day.
Um, well that's a great note to end on.
And I, I for one, as someone who's spending a lot of time folding
tiny child laundry right now, I, I actually think that would be
an incredible spectator sport.
I would be very excited about the, uh, humanoid-robot laundry folding
on ESPN 4, the 4:00 AM, uh, televised competition.
Um, well that's all the time we, we have, uh, for today.
Uh, Kate, Marina, as always, great having you on the show.
You're a dynamic duo every time you come on.
I feel like, I just, like, there's all these comments where I'm like, oh yeah.
It's like I never really thought about it that way.
Um, and Gabe, welcome to the show for the first time, hopefully we'll have
you on, uh, at some point in the future.
Thanks to all of you listeners.
If you enjoyed what you heard, you can get us on Apple Podcasts, Spotify,
and podcast platforms everywhere, and we will see you
next week on Mixture of Experts.