Learning Library


Customize LLMs Locally with InstructLab

Key Points

  • Fine‑tuning an open‑source LLM on a laptop lets you turn it into a domain‑specific expert without needing developer or data‑science expertise.
  • By curating a small set of example Q&A pairs and then using a locally run LLM to generate synthetic data, you can overcome the large data requirements of traditional fine‑tuning.
  • InstructLab provides a CLI‑driven workflow (configuration, taxonomy‑based data organization, and LoRA multi‑phase tuning) that makes the entire process accessible and repeatable.
  • The taxonomy system lets anyone contribute YAML‑formatted “skill” documents, enabling community‑driven expansions of the model’s knowledge base.
  • Embedding domain knowledge directly into the model yields more accurate, concise responses, reduces prompt length, speeds inference, and lowers compute costs.

Full Transcript

# Customize LLMs Locally with InstructLab

**Source:** [https://www.youtube.com/watch?v=pu3-PeBG0YU](https://www.youtube.com/watch?v=pu3-PeBG0YU)
**Duration:** 00:08:00

## Sections

- [00:00:00](https://www.youtube.com/watch?v=pu3-PeBG0YU&t=0s) **Fine‑Tuning LLMs on a Laptop** - The speaker explains how anyone can specialize an open‑source large language model for a specific domain by curating data and using InstructLab to fine‑tune it on a personal laptop, without needing developer or data‑science expertise.
- [00:03:08](https://www.youtube.com/watch?v=pu3-PeBG0YU&t=188s) **Synthesizing Data to Correct a Model's Answer** - The speaker demonstrates using a locally hosted Merlinite‑7B model, identifies its wrong response about the film with the most Oscar nominations, and outlines creating synthetic training data from a markdown seed to fine‑tune the model so it correctly answers "Oppenheimer."
- [00:06:18](https://www.youtube.com/watch?v=pu3-PeBG0YU&t=378s) **Open‑Source Fine‑Tuning for Enterprise** - The speaker explains how to fine‑tune open‑source large language models with tools like InstructLab, discusses retrieval‑augmented generation and automated updates, and illustrates domain‑specific use cases for industries such as insurance and law, emphasizing community‑driven, locally managed AI.

## Full Transcript
[0:00] Generative AI models are great, but have you ever wondered how to specialize them for a specific use case, to be a subject matter expert in whatever your field might be? Well, in the next few minutes you'll learn how you can take an open-source large language model and fine-tune it from your laptop, and the best part is you don't have to be a developer or data scientist at all to do this.

[0:20] What you might have noticed after using LLMs is that they're great for general purposes, but for truly useful answers they need to know the domain in which they're working, and the data that's useful for your work is likely useful for an AI model too. So instead of needing to provide examples of behavior to a model (for example, "respond back as an insurance claim adjuster with a professional tone and this knowledge of common policies"), you can actually bake this intuition into the model itself. This means better responses with smaller prompts, potentially faster inference, lower compute cost, and a model that truly understands your domain.

[0:55] So let's start this fine-tuning process with the open-source project InstructLab. InstructLab is a research-based approach to democratize and enable community-based contributions to AI models, and it lets us do this in an accessible way on our laptops, just as we'll be doing today.

[1:09] There are three steps I want to show you in today's video. First is the curation of data for whatever you want your model to do or to know. Second, fine-tuning a model takes a lot of data, more than we have time or resources to create today, so we're going to use a large language model running locally to help us create synthetic data from the initial examples we've curated. And finally, we're going to bake this back into the model itself using a multi-phase tuning technique called LoRA.
[1:39] Now, I mentioned the first step is the curation of data, so let me explain how this works in my IDE. I've already got InstructLab installed; the CLI is `ilab`, and we'll run `ilab config init` to set up our working directory. We'll set some defaults for the parameters of how we want to use this project, point to a taxonomy repository (this is how we're going to structure and organize our data), and also point to a local model that we can serve locally to help us generate more examples. And just like that, we're ready to start using InstructLab.

[2:10] Now we've got this taxonomy open. On the left you can see a hierarchical structure of folders that organizes the information we want to provide to the model into skills and knowledge. So let's check out a skill. This is a YAML-formatted question-and-answer document in plain text, so anybody can contribute to this model; you don't have to be a data scientist or ML engineer. I can provide this to essentially teach the model new things. For example, this one teaches it how to read markdown-formatted tables. We have context, a question ("which breed has the most energy?"), and an answer: as you can see, we've got a five out of five for the Labrador. We can use this as sample data to generate more examples like it and teach the model something new.

[2:57] Now, this is really cool, but I also want to show you teaching the model about new subjects as well. The 2024 Oscars have happened, but the model we're using today doesn't know that, and we need to fix it. So we're going to ask this model a specific question from the training data we want to provide: specifically, which film had the most Oscar nominations? I can run `ilab model chat` to talk to a model that I have running locally.
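The skill file the speaker opens (the dog-breed table example) might look roughly like the sketch below. The exact schema varies by InstructLab taxonomy version; the field names here (`version`, `task_description`, `created_by`, `seed_examples` with `context`/`question`/`answer`) follow the convention used in the upstream taxonomy repository, and the table values are illustrative rather than the exact file from the video.

```yaml
# Hypothetical compositional skill: reading markdown-formatted tables.
# Field names follow the InstructLab taxonomy qna.yaml convention;
# values are illustrative, not the exact file shown in the video.
version: 2
task_description: Teach the model to read markdown-formatted tables.
created_by: your-github-username
seed_examples:
  - context: |
      | Breed    | Energy (1-5) |
      |----------|--------------|
      | Labrador | 5            |
      | Bulldog  | 2            |
    question: Which breed has the most energy?
    answer: The Labrador has the most energy, with a rating of five out of five.
```

Because the file is plain YAML, contributing a new skill is a matter of editing text and opening a pull request against the taxonomy repository, which is what makes the community-contribution model work.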
[3:21] This is Merlinite, 7 billion parameters, based off the open-source model Mistral. We'll ask the question: which film had the most Oscar nominations? Unfortunately, "The Irishman" is incorrect. The answer is Oppenheimer, and it's our job to make this correct. So what we're going to do is use this local training information and curation from our machine to create more synthetic training data, and also point to this seed document here at the bottom. This is markdown-formatted information that we'll pull in during the data generation process; it provides more context and information about the specific subject we're going to teach the model. So let's get started.

[4:02] Now it's time for the magic to happen. These large language models, as you might know, have been trained extensively on terabytes of data. What we're going to do is use a teacher model that we've already served locally to generate hundreds or potentially thousands of additional examples based on the key data we provided. So let's kick this off. First we'll run `ilab taxonomy diff` to make sure everything is formatted as it should be, and we get back that smiley face: we're good to go. Now we'll start generating data. I'll run `ilab data generate`; specifically, we're going to generate three instructions here using that locally served model (or we could point to one running remotely). It will pick up the Oscars question-answer pair we provided and generate more similar examples, so we have enough training data to fully train this model. This is really cool because it's creating different variations of our initial training data to be able to train this model in the end. And as you see here, we've generated three examples.
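The generation step above uses a locally served teacher LLM to paraphrase and expand the curated seed pairs. As a toy illustration of just the expansion idea (one curated Q&A pair becoming several training variants, with duplicates filtered out), here is a minimal Python sketch; the hard-coded templates stand in for teacher-model output and are entirely hypothetical, and the real pipeline also runs a quality-filtering pass over the generated data.

```python
# Toy sketch of the synthetic-data idea: expand one curated seed Q&A pair
# into several phrasing variants. InstructLab actually uses a locally
# served teacher LLM for this; the templates below are hypothetical
# stand-ins for model-generated paraphrases.

SEED = {
    "question": "Which film had the most Oscar nominations in 2024?",
    "answer": "Oppenheimer",
}

TEMPLATES = [
    "Which film had the most Oscar nominations in 2024?",
    "What movie received the most Academy Award nominations in 2024?",
    "In 2024, which film led the Oscar nominations?",
]

def generate_variants(seed: dict, templates: list[str]) -> list[dict]:
    """Pair each question variant with the seed answer, dropping duplicates."""
    seen, out = set(), []
    for q in templates:
        if q not in seen:
            seen.add(q)
            out.append({"question": q, "answer": seed["answer"]})
    return out

variants = generate_variants(SEED, TEMPLATES)
print(len(variants))  # 3 variants from one curated seed pair
```

The point of the real teacher-model version is the same: a handful of curated examples fan out into enough varied training data to fine-tune on.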
[5:08] You can see "who was nominated for the Best Actor award?" and we're getting back this answer, these different actors. What's great is that there's a filtration process, because not all data is good data. With that newly generated data, it's time for what's known as parameter-efficient fine-tuning with InstructLab. I'll run `ilab model train`, which integrates this new knowledge and these skills back into the model, updating only a subset of its parameters, which is why we're able to do this on a consumer laptop like mine. We've done some cooking-show magic here to speed things up, since that process might take a few hours depending on your hardware, but finally our result is a newly fine-tuned model specialized with the knowledge we gave it.

[5:53] So let's see it in action. In a new terminal window I'm going to serve the quantized version of this model so it can run locally on my machine, and we're going to ask the question, which I think you know what it might be: what film had the most Oscar nominations in 2024? So let's open up a new window, run `ilab model chat`, and ask the model this question. What film had the most Oscar nominations? Oppenheimer. It's really incredible to see the before and after of this fine-tuning process. And the coolest part: we're not AI/ML experts. We're just using open-source projects like InstructLab, among others out there, that can aid with fine-tuning large language models.

[6:38] Now, with this fine-tuned model, a popular way to provide external and up-to-date information would be to use RAG, or retrieval-augmented generation, but you can also imagine doing automated or regular builds with this fine-tuned model when the static resources change. So, one more thing.
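"Updating only a subset of parameters" is the heart of LoRA (low-rank adaptation), the technique named earlier in the video: instead of retraining a full d×d weight matrix, training learns two small matrices, B (d×r) and A (r×d), whose product is added to the frozen base weight. A minimal NumPy sketch of the arithmetic follows; the dimensions and rank are illustrative, and InstructLab's actual multi-phase training is considerably more involved.

```python
import numpy as np

# LoRA idea: freeze the base weight W and learn only a low-rank update B @ A.
d, r = 512, 8            # model dimension and LoRA rank (illustrative sizes)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable, small random init
B = np.zeros((d, r))                 # trainable, initialized to zero

x = rng.normal(size=(d,))

# Forward pass with the adapter: base output plus the low-rank correction.
y = W @ x + B @ (A @ x)

# Because B starts at zero, the adapted model initially matches the base model.
assert np.allclose(y, W @ x)

# Trainable parameters shrink from d*d to 2*d*r.
full, lora = d * d, d * r + r * d
print(f"trainable params: {lora} vs {full} ({lora / full:.1%})")
```

With d = 512 and rank 8, the adapter trains 8,192 values instead of 262,144, roughly 3% of the layer, which is why this kind of tuning fits on a consumer laptop.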
[6:55] We believe that the future of AI is open, but what does that really mean? Well, the InstructLab project is all about building a community of AI contributors, being able to share your contributions upstream, and collaborating on domain-specific models. What does that look like? Imagine you're working at an insurance company: you could fine-tune a model on your company's past claims and best practices for handling accidents to help agents in the field and make their lives better. Or maybe you're a law firm specializing in entertainment contracts: you could train a model on your past contracts to help review and process new ones more quickly. The possibilities are endless, and you're in control. With what we've done today, you've effectively taken an open-source large language model, locally trained it on specific data without using a third party, and now have a model with that knowledge baked in that you could use on premises, in the cloud, or share with others.

[7:47] Now, what are you interested in creating? Let us know in the comments below. As always, thank you so much for watching. Please be sure to like this video if you learned something today, and make sure you're subscribed for more content around AI and more.