
Build a Retrieval‑Augmented Chat App

Key Points

  • The video demonstrates building a chat app that uses Retrieval‑Augmented Generation to answer questions based on your own data, which is a low‑cost way to apply LLMs in a business context.
  • Streamlit is used for the UI, with chat input and message components, and a session‑state variable is created to store and display the full conversation history.
  • LangChain provides the core LLM integration; an IBM Watsonx LLM (Llama‑2‑70B‑Chat) is accessed via an API key and project ID, with decoding parameters set for the model.
  • The app sends user prompts to the LLM, captures the assistant’s responses, and renders both sides of the dialogue as markdown within the Streamlit chat interface.
  • A PDF loader is added in a later step, converting the document into chunks, embedding them, and indexing them in a LangChain vector store so the model can retrieve and use custom data during chat.

Full Transcript

# Build a Retrieval‑Augmented Chat App

**Source:** [https://www.youtube.com/watch?v=XctooiH0moI](https://www.youtube.com/watch?v=XctooiH0moI)
**Duration:** 00:02:45

## Sections

- [00:00:00](https://www.youtube.com/watch?v=XctooiH0moI&t=0s) **Building a Retrieval-Augmented Chat App** - The speaker demonstrates how to create a Streamlit‑based interface using LangChain that employs retrieval‑augmented generation to let users chat with their own data, covering dependency setup, UI components, and handling message history through session state.

## Full Transcript
In this video I'm going to show you how to build a large language model app to chat with your own data. This is arguably the cheapest and most efficient way to get started with LLMs for your own business. But before we do that, I want to back up a little. The technique that makes this work is called retrieval-augmented generation, a fancy way of saying we chuck chunks of your data into a prompt and get the LLM to answer based on that context.

The first thing that we need to do is build an app to chat to. There's a bunch of dependencies that I need to import. They're mainly from LangChain, but there's a little Streamlit and Watsonx thrown in for good measure. I'll explain these as I use them, so don't stress for now. Streamlit has a great set of chat components, so I'm going to make the most of them: add in a chat input component to hold the prompt, then display the user message using the chat message component via markdown. This means I can now see the messages showing up in the app, but it's only displaying the last message posted, not all of them. Easy fix: create a Streamlit session-state variable, I'll call it messages, and append all of the user prompts into it. While I'm at it, I'll save the role type, in this case "user", into the dictionary, and then I can test it out. Hmm, but the history doesn't show up. Well, turns out I haven't printed out the historical messages yet. Loop through all the messages in the session-state messages variable and use the chat message component to display them. And... wait, did I save the app? Of course not. I'd never make a mistake like that. Let's just try that again, and look at this: historical prompts that have been passed through. Whoop-dee-doo, Nick, where's the LLM? Well, let's do it. I'm going to use the LangChain interface to Watsonx.
Why Watsonx? Well, it uses state-of-the-art large language models, doesn't use your data to train, and it's built for business, but that's just scratching the surface. To do that, I'll create a credentials dictionary with an API key and the ML service URL. You can create an API key from the IBM Cloud IAM menu; URLs for the different regions are shown on the screen right now. Then the LLM: I'm using Llama 2 70B Chat, because I'm pretty fond of those furry buggers. Pass through some decoding parameters and specify the project ID from Watsonx. Now send the prompt through to the LLM and boom... wait. It looks like it's running, but I need to show the LLM responses as well. Easy enough with the Streamlit chat message component. Note the chat role for the LLM response is "assistant" rather than "user"; this helps to differentiate the responses. I'll render the response as markdown and save the message to the session state as well. That way the history is displayed in the app, and now it works.

Yeah, yeah, that's great, but where does the custom data come into play? Enter phase 3. I'll add in a load-PDF function and specify the name of the PDF here, then pass that to the LangChain VectorstoreIndexCreator and choose the embeddings function to use. This basically chunks up the PDF and loads it into a vector database, Chroma DB in this case. Wrapping it in the st.cache_resource function means that Streamlit doesn't need to reload it each time, which makes it a whole heap faster. Anyway, I can then pass that index to the LangChain RetrievalQA chain and swap out the base LLM for the Q&A chain using chain.run, and we can now chat with our PDF, in this case a PDF to do with generative AI. Meta, I know.
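The pipeline described above can be outlined as below. This is a hedged sketch, not the video's exact code: import paths for the Watsonx wrapper and loaders vary across LangChain versions (here I assume the `langchain_ibm`, `langchain_community`, and `langchain` packages), and the model ID, decoding parameters, URL, and file name are illustrative. Imports are deferred into the functions so the module can be read and loaded without the heavyweight dependencies installed.

```python
# Sketch of the retrieval-augmented pipeline: load a PDF, index it in a
# vector store, and answer questions with a Watsonx-hosted Llama 2 model.

# Decoding parameters for the model; the values here are illustrative.
DECODING_PARAMS = {
    "decoding_method": "sample",
    "max_new_tokens": 200,
    "temperature": 0.5,
}


def build_llm(api_key, project_id, url="https://us-south.ml.cloud.ibm.com"):
    """Create the Watsonx LLM via LangChain (assumes the langchain_ibm package)."""
    from langchain_ibm import WatsonxLLM

    return WatsonxLLM(
        model_id="meta-llama/llama-2-70b-chat",
        url=url,                 # region-specific ML service URL
        apikey=api_key,          # created from the IBM Cloud IAM menu
        project_id=project_id,   # taken from your Watsonx project
        params=DECODING_PARAMS,
    )


def load_pdf(path="generative_ai.pdf"):
    """Chunk, embed, and index a PDF in a Chroma vector store.

    In the Streamlit app this would be wrapped in @st.cache_resource so the
    index is built once rather than on every script rerun.
    """
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain.indexes import VectorstoreIndexCreator

    loaders = [PyPDFLoader(path)]
    # VectorstoreIndexCreator defaults to Chroma as the vector store.
    return VectorstoreIndexCreator(
        embedding=HuggingFaceEmbeddings()
    ).from_loaders(loaders)


def build_chain(llm, index):
    """Swap the bare LLM for a RetrievalQA chain over the indexed PDF."""
    from langchain.chains import RetrievalQA

    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # stuff retrieved chunks straight into the prompt
        retriever=index.vectorstore.as_retriever(),
    )
```

In the chat handler, `chain.run(prompt)` (or `chain.invoke` in newer LangChain versions) then replaces the direct LLM call, so answers are grounded in the retrieved PDF chunks.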