Learning Library

← Back to Library

LLM Limits and Seven Key Use Cases

Key Points

  • LLMs struggle with breaking news because they’re trained on static, large‑scale corpora and can’t readily incorporate tiny, fresh pieces of information without a dedicated, up‑to‑date data pipeline.
  • Their core design as next‑token predictors makes them ill‑suited for real‑time fact‑checking or staying current with daily events, highlighting a need for systematic, frequent model updates.
  • LLMs are poor decision‑makers; they can be easily swayed and often provide advice that reflects statistical patterns rather than sound reasoning.
  • Understanding these architectural trade‑offs is essential—treating LLMs as a “magic wand” ignores their genuine strengths (e.g., language generation) and the contexts where they reliably fail.
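The “next‑token predictor” framing above can be made concrete with a toy model. This is a minimal sketch, not how a real transformer works: it uses simple bigram counts, and the function names (`train_bigram`, `predict_next`) are invented for illustration. It shows why a single fresh fact barely shifts a model dominated by a huge static corpus:

```python
from collections import Counter, defaultdict

def train_bigram(corpus_tokens):
    """Count, for each token, which token follows it and how often."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        follows[prev][nxt] += 1
    return follows

def predict_next(follows, token):
    """Return the statistically most likely next token, if any."""
    if token not in follows:
        return None
    return follows[token].most_common(1)[0][0]

# A large "old" corpus overwhelmingly pairs "is" with "paris".
corpus = "the capital is paris . ".split() * 1000
# One fresh "breaking news" sentence says otherwise.
corpus += "breaking news the capital is moved".split()

model = train_bigram(corpus)
print(predict_next(model, "is"))  # prints "paris": the lone fresh fact is outvoted 1000 to 1
```

The same dynamic, at vastly larger scale, is why a model trained on “all of the written works of humanity” will not over-index on one new authoritative sentence without a separate update mechanism.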

Full Transcript

**Source:** [https://www.youtube.com/watch?v=uS_YX_LGCAY](https://www.youtube.com/watch?v=uS_YX_LGCAY)
**Duration:** 00:15:32

## Sections

- [00:00:00](https://www.youtube.com/watch?v=uS_YX_LGCAY&t=0s) **LLMs Aren't Magic: Breaking News** - The speaker warns that large language models cannot reliably handle real‑time breaking news and must be viewed as specialized token‑prediction tools with distinct strengths and limitations.

## Full Transcript
We are going to talk about three things that AI is not, that large language models in particular are not, and then we're going to cover seven use cases that large language models are really good at, and the reason why. If you've followed my TikTok at all, you know I get really insistent that we not treat a large language model as if it were a magic wand, where you could just sort of wave the magic wand and it will do whatever you want. I think it's important to understand architecturally what they're designed to be good at and what they're designed not to be good at, because any design makes trade-offs, and in this case the trade-offs actually show us the kinds of jobs that large language models are not very good at, as well as the kinds that they are. So let's get into it.

This is the number one thing they're not good at, and it's the thing that made me think of this. I don't know about you, but it feels like there's been an enormous amount of news that has broken over the last week or so, in the middle of July. Large language models are absolutely terrible at handling breaking news. Unless you put them into some kind of tool chain that's only looking at a select body of text that is authoritative, that is considered breaking news, they're going to do badly. And they're going to do badly because, fundamentally, large language models are designed to handle large bodies of text. They are designed to look across the entire internet and say: what is the next token prediction for a given query, based on my understanding of the entire body of text on the internet, all of the written works of humanity, et cetera? That makes them good at certain things, and it's kind of amazing that we built a machine that can do that. But if you look across all of that, and you have this tiny sliver of text that comes in
that's brand new, that's supposed to be a new authoritative fact, LLMs just don't know how to handle it. That actually suggests an inherent weakness for LLMs going forward. We are continually adding new facts to the body of knowledge for humanity; every newspaper that comes out contains a whole series of new facts and new events. We need a more deliberate way of upgrading our LLMs on a cadence so that they are up to date on the facts. Right now we don't even have a concept of a world where you could open up ChatGPT in the morning and say, "Oh good, I see that it's read all of the newspapers from yesterday; it's up to date on the facts," let alone same-day news. Inherently, if your job is to predict the next token from a large body of tokens, you're going to look at that large body of tokens; you're not really going to over-index on this one tiny piece of breaking news. And that's what makes large language models really, really terrible at handling news right now. They're just not good at it.

Another thing that large language models are really bad at is making decisions. I say this because even when they give you advice, if you ask them to, they're incredibly persuadable. They're actually designed almost to be mirrors; they're designed to respond to you in a way that you will find appealing, and that makes them really bad at making hard decisions. If you ask them to list pros and cons and to make a decision, they tend to be over-optimistic, in my experience, and if you ask them to reverse the decision, they tend to be easily persuaded and reverse it. Because they're token predictors, they're not thinking in symbolic logic; they don't really have any idea of what a decision is. They're just predicting the next token, and they're predicting a token that they
think is going to be able to keep the conversation going, and they're predicting a token that they think will match the query. That second piece is actually more important. I'm not sure that we know for certain that anyone is juicing LLMs so that you keep talking with them; I certainly wouldn't be surprised, given our history with social algorithms as a tech industry, but for now what we really know is that they are designed to respond to a query. Your query reveals your own biases and your own opinions, and anyone who has designed surveys can tell you: you can write a survey that will get anyone to tell you anything; it just depends on the kind of question you ask. Similarly, when a large language model is asked a question, it reads all the nuances and detail in that question, the unique human utterance that you make, and then responds with a string of tokens designed to match exactly that question. It becomes an intensely suggestible conversation, and that makes it very, very bad at decision-making.

And this brings me to the last thing that large language models should not be asked to do: do not ask a large language model to make a management decision. Yes, I am deliberately tipping my hat here to Asimov, because Asimov wrote in the Three Laws of Robotics a number of things that robots should not be asked to do, and that got expanded over the years into the idea that an LLM should not be asked to make a management decision. So, while we're doing this live, I am not going to chat with ChatGPT, but I am going to find out for you where the rule came from, the rule that a robot should not make a management decision, because I think that's super interesting. Let me see here. It's an IBM presentation from the 1970s. See, this is why we
need to find our sources. I thought it was Asimov, and then I remembered, live, that Asimov's Three Laws of Robotics are not about management; they're about ethics. As I was doing that, I realized I needed better sourcing, and it turns out this comes from an IBM presentation. IBM, of course, built some of the first AI tooling, and I think the idea is a sticky one: we should not have large language models making management decisions if they're bad at decision-making, and I think we sometimes expect that. You can't get the kind of business judgment you need from an LLM; it's just not going to work. They're designed to predict from what you give them, and so they will be far too suggestible to make good choices, and we will probably over-index on the choices they do make. By the way, you should check your facts. That was a nice little moment there, right? I'm modeling that: go check your facts. Don't assume the LLM is going to give you the truth.

Okay, we're going to go to part two. Those were all things that large language models are not good at. What are large language models actually really good at? What are they designed to be good at? Number one: they're phenomenal at synthesizing. I can give a large language model a 50-page document, and it will take a couple of seconds to read it. That is so much faster than a human being; I think we lack mental categories for it. It still feels like magic to me when I stick a 50-page document into an LLM and it just digests it like that, and then it synthesizes it. Increasingly, it acts as a precision recall device for it. That part has gotten better; it used to be much, much more hallucinatory, and because of the background updates to our core models, LLMs are, as of this time in July
2024, much, much better at precisely pinpointing where in the document something happened and recalling it accurately. I think that's credit to the model builders, credit to OpenAI and others who have worked to make sure that's true. So synthesis is something they do well.

They will summarize something for you, and they will describe something for you really effectively. For instance, if you upload an image of a chart, or an image of an equation, or an image of a house, or a motorcycle, or even a fairly complex crowd scene, large language models are at the point where they can understand that scene and describe it really well. That is partly because of all the work, which I haven't talked a ton about, on the image generation side, where we have had similar advances driven by this idea of being able to reliably predict patterns in images and generate reliable images from a body of images. It's a similar macro motion to what we've done with text, just on the image side, and large language models have sort of put that together. I know when ChatGPT-4o came out, one of the big points was that it had native image ingest. More work is going to be done there, but even now, describing images is something that LLMs do really, really well. You might think that's not super relevant, but it is, because a lot of what humans have to do at work involves describing images. If you're looking at a chart to gain insights, you're describing an image. If you are conducting analysis of a diagram, you're describing an image. So there are a lot of pieces of work we do that amount to description. If you're writing a how-to guide for a piece of software, you're describing an image; it's just a series of images.

Okay, so they're good at
synthesizing, and they're good at describing. They're also eerily good at pattern recognition. I would argue that large language models probably know more about grammar than any living linguist; there is something phenomenally effective about the way they've mastered human language. That pattern recognition extends to asking them to understand patterns in documents you upload. It's just something they're really good at, and I think we're only scratching the surface of what we can do with that amazing pattern recognition capability.

Number four: they're good companions. Whether or not we like that, regardless of how we feel about it, they're really, really good at engaging people in conversation and at making it feel like there's someone else on the other end of the line, and that is something people are responding to. There is a reason that AI companion apps do so well in the App Store, like it or not.

And that gets me to the next point: they're excellent conversationalists. I separated those out because a companion is sort of an emotional function that an AI provides help with, whereas a conversationalist is someone who can help you debate or understand something better yourself. So if you want to debate and understand a particular known corner of study, like if you're trying to understand an advanced piece of physics or chemistry and the concepts aren't making sense, you can have a conversation with a large language model and it will help you understand it. Now, if you only depend on the large language model (we've actually done some work there), it turns out that you don't learn as well, because it ends up becoming something you lean
on when you really should be doing your own critical thinking. But if you need to wrap your head around something the first time, it can be very effective.

All right, we're getting to the second-to-last thing they're good at. I said seven, right? So far we have a good synthesizer, a good describer, a good pattern recognizer, a good companion, and a good conversationalist. Number six: they're good business writers. They're absolutely phenomenal at business writing, and I differentiate business writing from literature because literature requires a degree of attunement to lived experience that LLMs just aren't good at. They aren't embodied creatures; they don't understand how to write literature, and I have seen people try, and it just falls apart. But they're absolutely amazing at business writing. If you need to write a quick update for the boss, they're very good at it. If you need to write even a one-pager, and you have a solid idea and can critique the output, they're good for drafting. So they're good business writers.

And the last thing, number seven: they're really, really good analysts. That means that if you set them up and prompt them properly, they will analyze and assess what is in front of them very, very carefully, and I think it's that attention to detail that is really helpful. They look through the entire body of text with unfailing attention. If you asked a financial analyst to look with the same degree of attention that an LLM applies, across the same amount of text, either it would take forever or they just wouldn't do it; they'd skip out. That is something we're still getting used to: the idea that you can have an analyst that is always there, always at your fingertips, and always thinking things through. That's a new one for us;
it's like we have a personal analyst at our fingertips.

And I think maybe that's where I'll wrap this up with a conclusion. The things that LLMs are not good at, we seem to expect they'll be good at; and the things that LLMs are good at, we have a lot of feelings about as humans. We think these are things that humans should traditionally be good at ourselves, and we worry that having an LLM be good at them means that somehow we are less. I would argue it just means we have more time to do other things, and more options to do more interesting versions of those same tasks. I don't miss anything when an LLM writes my business updates and I can finish quickly from the LLM's draft, done in half the time; that's not work I wish I could do more of by hand. I don't regret having a first pass at chart analysis; it's great to have a first pass. So I think one of the things we're going to need to get used to is the idea that maybe we've thought of software as something that is very much for everyone, and what LLMs are reminding us is that software can be personal now. We can have, effectively, a personal assistant at our fingertips that can help us with a lot of the functions we're asked to perform, because if we work professionally, we're asked to describe, we're asked to pattern-recognize, we're asked to be conversationalists, we're asked to be analysts. It can help to have someone in the background who's very good at those things, as long as we don't ask it to do the things it's not good at, like make decisions or break news. All right, I will leave it there. I hope this has been a helpful breakdown: three things that LLMs are terrible at, and seven
things that they're actually pretty good at. I'd be curious to know what you think I missed in the comments.
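As a footnote to the "tool chain that's only looking at a select body of text that is authoritative" idea from part one: that pattern is commonly called retrieval-augmented generation. Below is a minimal sketch under stated assumptions, not a production design: the keyword-overlap scoring and the helper names (`score`, `build_grounded_prompt`, `news_feed`) are invented for illustration, and a real pipeline would use embedding search plus an actual model call.

```python
def score(query, doc):
    """Crude relevance score: count of shared lowercase words.
    Real systems use embeddings, not keyword overlap."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_grounded_prompt(query, authoritative_docs, top_k=2):
    """Rank the fresh, authoritative documents by relevance and prepend
    the best ones to the query, so the model answers from the supplied
    text instead of its stale training data."""
    ranked = sorted(authoritative_docs, key=lambda d: score(query, d), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only the sources below.\n\nSources:\n{context}\n\nQuestion: {query}"

news_feed = [
    "Storm closes the coastal highway until Friday.",
    "City council approves the new transit budget.",
]
print(build_grounded_prompt("Is the highway closed?", news_feed))
```

The point of the sketch is the division of labor: the retrieval step, not the model, is responsible for knowing what happened today; the model only synthesizes the text it is handed.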