Build a Retrieval‑Augmented Chat App
Key Points
- The video demonstrates building a chat app that uses Retrieval‑Augmented Generation to answer questions based on your own data, which is a low‑cost way to apply LLMs in a business context.
- Streamlit is used for the UI, with chat input and message components, and a session‑state variable is created to store and display the full conversation history.
- LangChain provides the core LLM integration; an IBM Watsonx LLM (Llama‑2‑70B‑Chat) is accessed via an API key and project ID, with decoding parameters set for the model.
- The app sends user prompts to the LLM, captures the assistant’s responses, and renders both sides of the dialogue as markdown within the Streamlit chat interface.
- A PDF loader is added in a later step, converting the document into chunks, embedding them, and indexing them in a LangChain vector store so the model can retrieve and use custom data during chat.
**Source:** [https://www.youtube.com/watch?v=XctooiH0moI](https://www.youtube.com/watch?v=XctooiH0moI)
**Duration:** 00:02:45

Sections
- [00:00:00](https://www.youtube.com/watch?v=XctooiH0moI&t=0s) **Building a Retrieval-Augmented Chat App** — The speaker demonstrates how to create a Streamlit-based interface using LangChain that employs retrieval-augmented generation to let users chat with their own data, covering dependency setup, UI components, and handling message history through session state.

Full Transcript
In this video I'm going to show you how to build a large language model app to chat with your own data. This is arguably the cheapest and most efficient way to get started with LLMs for your own business. But before we do that, I want to back up a little: the technique that makes this work is called retrieval-augmented generation, a fancy way of saying we chuck chunks of your data into a prompt and get the LLM to answer based on that context.

The first thing that we need to do is build an app to chat to. There's a bunch of dependencies that I need to import. They're mainly from LangChain, but there's a little Streamlit and watsonx thrown in for good measure. I'll explain these as I use them, so don't stress for now. Streamlit has a great set of chat components, so I'm going to make the most of them: add in a chat input component to hold the prompt, and then display the user message using the chat message component via markdown. This means I can now see the messages showing up in the app, but it's only displaying the last message posted, not all of them. Easy fix: create a Streamlit session state variable, I'll call it messages, and append all of the user prompts into it. While I'm at it, I'll save the role type, in this case user, into the dictionary, and then I can test it out. Hmm, but the history doesn't show up. Well, turns out I haven't printed out the historical messages yet. Loop through all the messages in the session state messages variable and use the chat message component to display them.
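The pattern here — a list of role/content dictionaries kept in session state and replayed on every rerun — can be sketched in plain Python, no Streamlit required (the `session_state` dict and helper names below are illustrative stand-ins, not the real Streamlit API):

```python
# Sketch of the Streamlit chat-history pattern in plain Python.
# Session state is modelled as a dict that survives "reruns";
# each message is stored as a {"role": ..., "content": ...} dict.

session_state = {}

def handle_prompt(prompt):
    # Create the history list on first use, like initialising st.session_state.
    session_state.setdefault("messages", [])
    # Append the user message together with its role, as in the video.
    session_state["messages"].append({"role": "user", "content": prompt})

def render_history():
    # Loop through all stored messages and "display" each one,
    # standing in for st.chat_message(...).markdown(...).
    return [f'{m["role"]}: {m["content"]}' for m in session_state["messages"]]

handle_prompt("hello")
handle_prompt("what is RAG?")
print(render_history())
```

Because the whole history is replayed on every run, saving the app (and rerunning) is what makes earlier prompts reappear.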
And wait, did I save the app? Of course not, I'd never make a mistake like that. Let's just try that again and, look at this, historical prompts being passed through. Whoop-dee-doo, Nick, where's the LLM? Well, let's do it. I'm going to use the LangChain interface to watsonx.ai. Why? Well, it uses state-of-the-art large language models, doesn't use your data to train, and it's built for business, but that's just scratching the surface. To do that, I'll create a credentials dictionary with an API key and the ML service URL. You can create an API key from the IBM Cloud IAM menu; URLs for the different regions are shown on the screen right now. Then the LLM: I'm using Llama-2-70B-Chat because I'm pretty fond of those furry buggers. Pass through some decoding parameters and specify the project ID from watsonx. Now send the prompt through to the LLM and, boom.
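A sketch of the two dictionaries involved, credentials and decoding parameters. The key names follow common watsonx examples but should be checked against the library version you use; the API key, project ID, and parameter values here are placeholders, not taken from the video:

```python
# Hedged sketch: credentials and decoding parameters for a watsonx LLM.
# All values are illustrative placeholders, not real secrets or settings.

credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",  # region-specific ML service URL
    "apikey": "YOUR_IBM_CLOUD_API_KEY",          # created in the IBM Cloud IAM menu
}

# Decoding parameters control how the model generates text.
params = {
    "decoding_method": "sample",  # or "greedy" for deterministic output
    "max_new_tokens": 200,        # cap on the length of each response
    "temperature": 0.5,           # randomness of sampling
}

project_id = "YOUR_WATSONX_PROJECT_ID"  # ties requests to a watsonx project

# These would then be passed to LangChain's watsonx LLM wrapper along with
# the model name (Llama-2-70B-Chat in the video) before sending prompts.
print(sorted(params))
```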
Wait, it looks like it's running, but I need to show the LLM responses as well. Easy enough with the Streamlit chat message component. Note the chat role for the LLM response is assistant rather than user, which helps to differentiate the responses. I'll render the response as markdown and save the message to the session state as well; that way the history is displayed in the app. And now it works. Yeah, yeah, that's great, but where does the custom data come into play?
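Putting the pieces together, the full request/response loop can be modelled in plain Python, with a stub standing in for the watsonx call (`stub_llm` is a placeholder, not the real API):

```python
# Plain-Python model of the chat loop: user prompt in, assistant reply out,
# both saved to session state with their role so the history renders correctly.

session_state = {"messages": []}

def stub_llm(prompt):
    # Placeholder for the real LLM call in the video.
    return f"(model reply to: {prompt})"

def chat_turn(prompt):
    # Save the user side of the turn under the "user" role.
    session_state["messages"].append({"role": "user", "content": prompt})
    # Get the model response and save it under the "assistant" role,
    # which is what lets the UI style the two sides differently.
    response = stub_llm(prompt)
    session_state["messages"].append({"role": "assistant", "content": response})
    return response

chat_turn("What is retrieval-augmented generation?")
print([m["role"] for m in session_state["messages"]])
```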
Entering phase three: I'll add in a load-PDF function and specify the name of the PDF here, then pass that to the LangChain VectorstoreIndexCreator and choose the embeddings function to use. This basically chunks up the PDF and loads it into a vector database, Chroma DB in this case. Wrapping it in the st.cache_resource function means that Streamlit doesn't need to reload it each time, which makes it a whole heap faster.
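The caching trick can be illustrated with the standard library: like st.cache_resource, `functools.lru_cache` runs the expensive loader once and hands back the same result on every later call (`load_index` here is a stand-in for the real PDF-loading and indexing work):

```python
import functools

calls = {"count": 0}

@functools.lru_cache(maxsize=None)
def load_index(pdf_name):
    # Stand-in for the expensive work: loading a PDF, chunking it,
    # embedding the chunks, and building the vector index.
    calls["count"] += 1
    return f"index-for-{pdf_name}"

# The first call does the work; repeated calls reuse the cached result,
# which is what st.cache_resource gives you across Streamlit reruns.
load_index("report.pdf")
load_index("report.pdf")
print(calls["count"])  # the loader body ran only once
```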
Anyway, I can then pass that index to the LangChain RetrievalQA chain and swap out the base LLM for the Q&A chain using chain.run, and we can now chat with our PDF, in this case a PDF to do with generative AI. Meta, I know.
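Stripped of the libraries, the retrieval-augmented step the chain performs looks roughly like this. Word-overlap scoring stands in for real embeddings, which is a big simplification; the document text and helper names are made up for the example:

```python
# Minimal sketch of retrieval-augmented generation with no dependencies.
# Real systems use embeddings and a vector store (Chroma DB in the video)
# instead of the naive word-overlap scoring used here.

def chunk(text, size=12):
    # Split the document into fixed-size word chunks.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(question, chunks):
    # Score each chunk by how many question words it shares; take the best.
    q = set(question.lower().split())
    return max(chunks, key=lambda c: len(q & set(c.lower().split())))

def build_prompt(question, context):
    # Chuck the retrieved chunk into the prompt, as the video puts it,
    # so the LLM answers based on that context.
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

doc = ("Generative AI models produce new content. "
       "Retrieval augmented generation grounds model answers in your own documents. "
       "Streamlit builds simple web interfaces in Python.")
context = retrieve("What is retrieval augmented generation?", chunk(doc))
print(build_prompt("What is retrieval augmented generation?", context))
```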