
The Memory Problem in LLMs

Key Points

  • Large language models, despite their intelligence, have extremely limited short‑term memory (only a few minutes or ~200 k tokens), which hampers their usefulness for longer, contextual tasks.
  • Scaling memory to meet current user volumes (≈125 M daily active users of ChatGPT) would cost on the order of half a trillion dollars, making affordable long‑term memory (months or years) a major technical and economic challenge (see the back‑of‑envelope sketch after this list).
  • This memory constraint forces us to reconsider which problems are feasible for LLMs and highlights the need for breakthroughs in memory architectures as a next critical research focus.
  • The reliance on highly capable but memory‑limited AI partners may already be influencing how people think, communicate, and remember information, raising important societal and educational implications that deserve more discussion.
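
For intuition on the scale of that cost claim, dividing the video's two headline numbers gives the implied bill per user. This back-of-envelope sketch uses only those two figures; the per-user cost is simply their quotient, not a number stated in the video:

```python
# Back-of-envelope check of the video's cost claim.
# Both input figures come from the video; the per-user
# figure is just their quotient.

total_cost_usd = 0.5e12      # "over half a trillion dollars"
daily_active_users = 125e6   # ~125 M ChatGPT daily active users

cost_per_user = total_cost_usd / daily_active_users
print(f"Implied memory cost per user: ${cost_per_user:,.0f}")
# -> Implied memory cost per user: $4,000
```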

Source: https://www.youtube.com/watch?v=BhjtZP4T0oA
Duration: 00:03:07

Sections

  • 00:00:00 – LLM Memory Limitations Crisis: The speaker warns that the extremely short token memory of large language models is a dominant and costly issue, potentially requiring over half a trillion dollars to provide multi‑month memory for hundreds of millions of users, and that it casts doubt on their practical usefulness despite their growing intelligence.

Full Transcript

Today I want to talk very briefly about the memory problem for large language models. I believe this is going to be one of the dominant issues we need to discuss in 2025.

At the end of the day, large language models are becoming very intelligent, but they still have atrociously bad memory: 100,000 to 200,000 tokens. It's like having a PhD in your pocket that forgets a conversation from ten minutes ago. It's not working well, and the problem is that if you do the math on the cost of memory, there is no easy solve given our current solution architectures. I did the math; I wrote a Substack on this. It would take over half a trillion dollars to solve this problem just at the current daily active user count for ChatGPT, which is roughly 125 million, and that's growing all the time. The problem is getting worse all the time. And that's assuming you don't want human-level memory, which lasts years, and would be happy with long-term memory that lasts several months. I don't even know if we would be happy with that, to be honest with you, but it would still be vastly better than just a few minutes of memory, which is really what we have today. If the chat is going well, I burn through chats with Claude in about ten minutes. Claude's great while Claude lasts, and then we're done.

I think one of the things we need to ask ourselves is: if you have this much intelligence but only this much short-term memory, what kinds of problems are useful to solve with that kind of intelligence, and what kinds of problems are we inherently limited from solving because memory is an issue? Even with o1 Pro, at 200,000 tokens you have very limited memory. Now, people are doing incredible things with it, so I'm not saying you can't find really cool problems to solve; you absolutely can. But it is making me wonder: is memory the next breakthrough we need to be looking for? And if it is, I don't see anything on the horizon that helps us solve it right now. I think that's probably worth talking about a lot more than we currently do, especially if we're using LLMs all the time and they have very short-term memories.

Is that going to affect the way we remember things? Does it change and shape the way we remember things? There are anecdotal stories coming out now of people changing their vocabulary and their thinking because of the way they interact with LLMs, especially early on, at formative stages in education. If that continues, and we get used to working with these thinking partners that are very smart but have very, very limited short-term memory (it has read everything in the world, but it cannot remember your conversation from 20 minutes ago), does that shape the way we remember things too? Maybe for good, maybe for ill; maybe we're the ones who have to get better at remembering, because our thinking partner can't. I don't know, but to me it's one of the most interesting dynamics in large language models right now, and I think it deserves to be talked about more than it is. So I wrote a Substack on that if you're interested; otherwise, enjoy the YouTube.
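
The "few minutes of memory" described above falls out of the fixed context window: once a conversation exceeds the token budget, the oldest turns simply stop being sent to the model. As a rough illustration (this is not any vendor's actual API; the window size, the characters-per-token heuristic, and the trim_history helper are all assumptions for the sketch), a sliding-window chat history might be trimmed like this:

```python
# Minimal sketch of why chat "memory" runs out: the model only sees
# what fits in a fixed context window. All numbers and the token
# counter here are illustrative assumptions, not a real tokenizer.

CONTEXT_WINDOW = 200_000  # tokens, roughly the limit cited in the video

def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (~4 characters per token).
    return max(1, len(text) // 4)

def trim_history(history: list[str], budget: int = CONTEXT_WINDOW) -> list[str]:
    """Keep only the most recent turns that fit in the token budget.
    Everything older is simply forgotten: this is the 'memory loss'."""
    kept, used = [], 0
    for turn in reversed(history):  # walk from the newest turn backward
        cost = count_tokens(turn)
        if used + cost > budget:
            break                   # older turns fall out of memory
        kept.append(turn)
        used += cost
    return list(reversed(kept))

# A long conversation: early turns silently disappear from the prompt.
history = [f"turn {i}: " + "x" * 8_000 for i in range(200)]
visible = trim_history(history)
print(f"{len(history)} turns total, {len(visible)} still visible to the model")
```

Summarizing dropped turns or retrieving them from an external store can stretch this budget, but, as the video argues, nothing currently on the horizon turns it into months of memory.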