From Keywords to AI Search

Key Points

  • Traditional search relied on keyword matching, TF‑IDF weighting, and PageRank link analysis, which struggled with context, synonyms, and user intent.
  • The introduction of transformer‑based models like BERT (2019) and MUM brought deep natural‑language understanding to search, enabling more accurate interpretation of queries.
  • Modern AI search pipelines begin with natural‑language query processing, using an LLM’s NLU capabilities to infer the user’s intent and nuances.
  • Retrieval now leverages vector embeddings and semantic (vector) search, matching query vectors with document vectors to find conceptually related content rather than exact keyword matches.
  • Large language models generate direct answers from the retrieved information, shifting search from simply providing links to delivering concise, context‑aware responses.

Full Transcript

# From Keywords to AI Search

**Source:** [https://www.youtube.com/watch?v=iVUMuC7OzUI](https://www.youtube.com/watch?v=iVUMuC7OzUI)
**Duration:** 00:12:01

## Sections

- [00:00:00](https://www.youtube.com/watch?v=iVUMuC7OzUI&t=0s) **Evolution of AI-Powered Search** - The speaker explains how search has progressed from basic keyword matching and TF‑IDF to link‑based ranking with PageRank, and finally to transformer models like BERT, MUM, and modern large language models that grasp context, intent, and generate direct answers.
- [00:03:15](https://www.youtube.com/watch?v=iVUMuC7OzUI&t=195s) **Embedding-Based Retrieval Augmented Generation** - The speaker outlines how text is converted into semantic vectors, matched in a vector database, and combined with retrieved snippets in an LLM to produce cited, trustworthy answers, concluding with a feedback loop.
- [00:06:27](https://www.youtube.com/watch?v=iVUMuC7OzUI&t=387s) **AI Search vs Traditional Search** - The speaker contrasts traditional search’s limited memory and list-based results with AI search’s contextual, multi‑turn conversation and synthesized answers, highlighting the resulting challenges for SEO.
- [00:09:37](https://www.youtube.com/watch?v=iVUMuC7OzUI&t=577s) **EEAT Optimization and Formatting Relevance** - The speaker explains applying Google's E‑E‑A‑T principles to make content machine‑readable while noting that traditional HTML formatting like H1s still helps SEO but is less critical for AI‑driven retrieval.

## Full Transcript
0:00 AI search is transforming how we locate and consume information online,
0:05 but how?
0:06 Well, back in the day, search engines were pretty simple because they were based more or less just on keyword search.
0:16 They matched words in a user's query to words in documents.
0:20 There were several methods that they would use for that, including things like Boolean keyword matching.
0:27 That was one method to do it.
0:30 Now keyword search has moved on since then, with algorithms such as TF-IDF.
0:38 They rank documents by term frequency and inverse document frequency, and that helps improve relevance by assigning more weight to important terms.
0:48 And Google's breakthrough in the late 1990s was called PageRank, and that added link analysis to judge a page's authority,
0:58 but traditional keyword search
1:01 has some clear limitations.
1:03 It can't truly understand context and synonyms and user intent.
1:08 So when my search string includes the word Apple, am I referring to the fruit or the tech company?
1:17 Well, enter machine learning and the world of AI search.
1:24 So technologies like BERT, from Google in 2019,
1:31 introduced a transformer-based language model into search, helping better understand the context of natural language queries.
1:38 And that was followed two years later by MUM, that's Multitask Unified Model, a much more powerful model than BERT to both understand and generate language,
1:49 and then today we have large language models,
1:53 where the AI generates an answer rather than just retrieving links.
1:58 So how does AI search powered by large language models actually work?
2:03 Well, we can think of it in four stages, and at the top here, first of all, we've got the natural language that's coming in.
2:10 Specifically, we're gonna perform natural language query processing.
2:16 So when a user asks a question in plain language, the system uses an LLM to interpret the query.
2:24 That uses the LLM's natural language understanding capabilities, its NLU, to parse the query's intent and nuances.
2:33 So if I ask what's the best way to peel an orange,
2:37 well, the system recognizes I'm probably looking for a method or a tutorial, even though the query doesn't explicitly contain those words.
2:46 We've moved far beyond the old days of keyword matching here.
2:50 Now, with intent established...
2:53 we move to the next stage, which is retrieval.
2:58 Now, instead of relying solely on keyword matching, although that does still play a part, AI search often uses vectors.
3:08 Specifically, it uses vector search to find relevant documents semantically.
3:15 Now, how does that work?
3:16 Well, text, both search queries and documents, is encoded into numerical vectors.
3:22 Those are called embeddings.
3:24 And vectors capture semantic meaning.
3:27 The user's query vector is then matched with vectors of documents in a vector database to find the content that is conceptually related.
3:36 This allows, for instance, a query about puppy playthings
3:40 to retrieve an article that talks about dog toys, even though the wording differs, because these terms are semantically similar.
3:50 Who's a good boy?
3:52 Now the next stage is answer generation.
3:57 So this is where we have now gone to retrieval
4:01 and we've retrieved some relevant documents, or actually, more likely, not entire documents but really snippets of those documents.
4:11 And now an LLM is given the query along with those retrieved snippets and it generates a cohesive answer in natural language,
4:20 using those sources of information.
4:23 Now regular viewers of this channel probably recognize what this is.
4:28 It's our old friend RAG, or retrieval augmented generation, where the LLM's knowledge is augmented with up-to-date retrieved data.
4:39 By grounding its answer in retrieved facts, the AI search system can provide current and accurate information.
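
The retrieval and answer-generation stages described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine works. The three-dimensional vectors are hand-written stand-ins for real embeddings, and `DOCUMENTS`, `retrieve`, and `build_prompt` are hypothetical names chosen for this example. The sketch stops at the prompt that would be handed to an LLM and does not call a model.

```python
import math

# Hypothetical, hand-written 3-dimensional "embeddings" for illustration only.
# A real system would get much longer vectors from an embedding model.
DOCUMENTS = {
    "dog toys": ([0.9, 0.1, 0.0], "Durable chew toys keep dogs busy."),
    "peeling oranges": ([0.0, 0.2, 0.9], "Score the peel, then pull it away in sections."),
    "laptop reviews": ([0.1, 0.9, 0.1], "This laptop lasts a full day on battery."),
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def retrieve(query_vector, k=1):
    """Return the k (title, snippet) pairs whose vectors are closest to the query."""
    ranked = sorted(
        DOCUMENTS.items(),
        key=lambda item: cosine_similarity(query_vector, item[1][0]),
        reverse=True,
    )
    return [(title, snippet) for title, (_vector, snippet) in ranked[:k]]


def build_prompt(question, results):
    """Assemble the text an LLM would receive: numbered snippets plus the question."""
    sources = "\n".join(
        f"[{i}] {title}: {snippet}"
        for i, (title, snippet) in enumerate(results, start=1)
    )
    return (
        "Answer using only these sources and cite them by number.\n"
        f"{sources}\n\nQuestion: {question}"
    )


# Pretend this vector is the embedding of the query "puppy playthings".
query_vector = [0.8, 0.2, 0.1]
results = retrieve(query_vector)
prompt = build_prompt("What are good puppy playthings?", results)
print(prompt)
```

The query shares no words with the "dog toys" entry, yet that entry ranks first because its vector points in nearly the same direction as the query vector. Numbering the snippets in the prompt is one simple way to let a generated answer cite its sources.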
4:47 The generated answer can include citations
4:50 linking back to the original sources,
4:52 which is a level of transparency that's important for gaining a user's trust, showing that this answer is not just hallucinated by the model.
5:01 Now, the final stage in all of this is the feedback stage.
5:08 Many AI search implementations learn from feedback to improve.
5:13 So users might give a thumbs up or a thumbs down, or the system observes follow-up queries to figure out if the answer was helpful.
5:21 This data can be used to fine-tune the LLM and the retrieval component over time.
5:28 So how do traditional search and AI search powered by large language models compare?
5:35 Well, in response format, traditional search, what does that return?
5:40 It typically returns a list of links for a user to click through,
5:47 but AI search doesn't provide a list of links,
5:51 it provides a direct answer to whatever it is you were searching for, to your query, in natural language.
6:01 It generates original content on the fly. Now, as for query understanding, traditional search, as we've already mentioned, is kind of primarily keyword based,
6:16 whereas AI search is based on NLU, or natural language understanding, to derive context and intent.
6:28 And speaking of context, when it comes to contextual awareness, traditional search is pretty independent.
6:38 What I mean by that is it has a limited memory of a user's previous interactions.
6:45 Whereas AI search really keeps the context in mind.
6:52 It maintains context, allowing a multi-turn conversation, allowing follow-up questions that understand references to earlier parts of a dialog.
7:03 And when it comes to information synthesis, well, traditional search really separates results out.
7:11 So, different sources,
7:14 different lists, whereas AI search really combines information.
7:21 It takes information from multiple sources and puts it into one coherent answer.
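
To make the keyword-based side of this comparison concrete, here is a rough sketch of TF-IDF scoring as described earlier in the video. It assumes the textbook formulation (term frequency times the log of inverse document frequency) and a tiny made-up corpus. `DOCS`, `tokenize`, and `tf_idf_scores` are hypothetical names, and production engines typically use smoothed variants such as BM25 rather than this bare form.

```python
import math
from collections import Counter

# A tiny made-up corpus, for illustration only.
DOCS = {
    "dog-toys": "durable dog toys for every dog",
    "fruit": "how to peel an orange or an apple",
    "laptops": "apple laptop reviews and laptop deals",
}


def tokenize(text):
    """Lowercase and split on whitespace; real systems do far more than this."""
    return text.lower().split()


def tf_idf_scores(query, docs):
    """Score each document by summing tf * idf over the query terms."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            # Document frequency: how many documents contain the term at all.
            df = sum(1 for other in tokenized.values() if term in other)
            if df == 0:
                continue  # A term that appears nowhere cannot match anything.
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / df)
            score += tf * idf
        scores[name] = score
    return scores


print(tf_idf_scores("laptop", DOCS))
print(tf_idf_scores("apple", DOCS))
print(tf_idf_scores("puppy playthings", DOCS))
```

In this sketch the query "apple" gives a positive score to both the fruit page and the laptop page, which is the ambiguity raised earlier, and "puppy playthings" scores zero everywhere because no document contains those exact words.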
7:28 Now AI search isn't just changing how results are displayed,
7:31 it's really challenging how the entire web has been built, because for years websites have been optimized for traditional search engines using a practice called SEO,
7:40 search engine optimization, to rank as high as possible in results pages, but
7:45 what happens now when the result of an AI search isn't a list of links, but instead is written text incorporating snippets from multiple web page sources?
7:56 Well, that's a good question for Donna Bedford.
8:00 Donna Bedford, Global SEO at Lenovo.
8:03 Now, for years, Donna has been making sure that her company's web pages rank as highly as possible in search engine results.
8:10 So Donna, if somebody wanted to make their content a bit more AI-friendly today, where would they start?
8:17 Well, the great news is they don't have to start afresh.
8:21 What they're already doing for traditional search is gonna work for them.
8:26 What they need to do is, like, up the game, but have a narrow focus.
8:30 So what you're gonna do is focus on two real things.
8:33 One, think human.
8:36 And two, think like the machine.
8:39 So what do I mean by that?
8:42 So AI is still a machine.
8:43 It still has to come and find your information.
8:45 It still has to work it out.
8:47 So you want to make it as easy as possible: bite-sized chunks, good structure, good navigation, so it understands.
8:54 And you want a complete journey.
8:55 You want everything in there so it understands.
8:58 But you also have to tackle the human element. Whereas traditional search tends to be single words, a couple of words,
9:05 this is more conversational.
9:07 It is definitely more a personal journey.
9:10 So you need to start writing like a human might ask.
9:13 Okay, so addressing both sides of it. Now, that makes me think about keyword counts,
9:17 because I know in the old days you would kind of want to stuff a webpage with as many keywords as possible to get as high a count as possible.
9:24 Is that old news now?
9:26 It's kind of old news, but there is a variable in it that works.
9:32 So what you're talking about is like keyword density, how many times can you write the exact match word on the page.
9:38 What we're now extending out is using an algorithm update that Google came out with a number of years ago
9:43 and is commonly used, which is called E-E-A-T.
9:47 E-E-A-T, and it's actually two E's in here; originally it was just the one. So we're talking about experience, expertise, authority, and trust, right?
10:00 So as you mentioned, traditionally, the site change ensures a number of links and all sorts of things.
10:07 Here, what you're trying to do is give the full experience to the machines, to the AI,
10:13 to tell them that you have the expertise, the authority, the experience, the trust, and you're a trusted source for this information.
10:24 So you write like a human, but you give the information that a machine needs to logically make the response.
10:32 Okay, gotcha.
10:33 So one more question for you.
10:34 Okay.
10:35 I want to know how much formatting matters.
10:38 So formatting, like making sure you're using H1s and stuff like that.
10:42 When we think now that AI is pulling information from all sorts of different web pages rather than just a single page, does formatting stuff still matter?
10:51 So it does, but not in the same way.
10:54 So with traditional search engines, you'll use, like, an H1 to tell the search engine how important an element is or what your page is about.
11:05 In most cases, whatever you do for the AI is gonna benefit your traditional search, and traditional search is not going away, right?
11:13 But there's a gotcha in here that you have to watch out for.
11:17 I'm saying you make it better every time, but there's one particular element that you're actually gonna have to step back on.
11:23 And that's JavaScript.
11:25 Traditional search engines at the beginning had a problem with JavaScript.
11:28 They've managed to solve that.
11:31 The AI models haven't.
11:32 So they have an issue with the JavaScript.
11:35 So you just wanna make sure that, again, going back to the very first question, it's crawlable, navigable, and that they can find the information.
11:43 Because if they can't find the information about you, they can't have a story about you.
11:48 That makes a lot of sense.
11:49 Well, thank you, Donna.
11:51 So that's AI search.
11:52 It's changing both how users locate and consume information online, and even how that information is represented online in the first place.
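
Donna's JavaScript caution can be checked roughly with Python's standard `html.parser` module. The sketch below assumes a crawler that reads the raw HTML and never executes scripts, which is a simplification and not a description of any specific AI crawler. `PAGE`, `StaticTextExtractor`, and `static_text` are hypothetical names invented for this example.

```python
from html.parser import HTMLParser

# A made-up page: the heading and paragraph are in the HTML itself,
# while the spec line only appears after the script runs in a browser.
PAGE = """
<html><body>
<h1>Laptop buying guide</h1>
<p>Battery life matters most for travel.</p>
<div id="specs"></div>
<script>document.getElementById("specs").textContent = "16 GB RAM";</script>
</body></html>
"""


class StaticTextExtractor(HTMLParser):
    """Collect the text a non-JavaScript crawler could read from raw HTML."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # Greater than zero while inside <script> or <style>.
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style source.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def static_text(html):
    """Return the visible text present in the HTML without running any scripts."""
    parser = StaticTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


text = static_text(PAGE)
print(text)
print("16 GB RAM" in text)
```

The heading and paragraph survive, but the spec line that the script would write into the page never appears in the extracted text, so a reader of the raw HTML has nothing to retrieve for it.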