# Getting Started with Generative AI Apps

**Source:** [https://www.youtube.com/watch?v=dFSnam97YbQ](https://www.youtube.com/watch?v=dFSnam97YbQ)
**Duration:** 00:07:01

## Summary

- Gartner predicts 80% of enterprises will use generative AI via models or APIs by 2026, prompting developers to learn how to build AI‑powered applications.
- The AI development journey consists of three stages: ideation/experimentation (proof‑of‑concept), building, and deployment/operations.
- Selecting the right model involves researching repositories (e.g., Hugging Face), evaluating size, performance, and benchmarking, with self‑hosting often cheaper than cloud services.
- Prompting strategies such as zero‑shot, few‑shot, and chain‑of‑thought are essential techniques for shaping model behavior.
- Open‑source tools and frameworks simplify the inner‑loop development, testing, and scaling of generative AI applications, enabling developers to add “AI engineer” to their skill set.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=dFSnam97YbQ&t=0s) **Getting Started with Generative AI Development** - The speaker outlines a three‑step developer journey—ideation, building, and operations—showing how to quickly prototype and deploy AI‑powered applications using open‑source tools, APIs, and easy‑to‑use platforms.
- [00:03:07](https://www.youtube.com/watch?v=dFSnam97YbQ&t=187s) **Local AI, RAG, and Fine‑Tuning** - The speaker describes running AI models on‑premise for privacy, enriching them with domain data via Retrieval‑Augmented Generation or fine‑tuning, and using frameworks like LangChain to streamline building applications such as chatbots and automation.
- [00:06:14](https://www.youtube.com/watch?v=dFSnam97YbQ&t=374s) **MLOps and AI Toolchain Overview** - The speaker explains how MLOps mirrors DevOps, framing AI as another developer tool that can be applied through a step‑by‑step workflow—from ideation to building and deployment—using GenAI to drive real‑world impact.
## Full Transcript
Just the other day I noticed that Gartner reported 80% of enterprises will have used some type of
generative AI, via either models or APIs, by 2026.
And I can't lie as a developer, this had me a little bit worried because, yeah, I've used AI
through different copilots in my IDE
and also I've used popular large language models online before,
but I have no experience actually building applications that use AI.
Well, this is before I found out how easy it is to get started with AI as a developer.
So today we're focused on building applications that use GenAI.
We're going to talk about where to get started,
how to build these AI powered applications and where to run them.
You'll also learn today about some of the different open source tools and technologies
that can help in your inner-loop development of building, running, and testing applications,
now that we have the power of AI at our fingertips.
So while there are plenty of options out there for locally running a large language model on a laptop,
today we're actually going to talk about the three main steps
of the AI journey that developers are going to experience when going from
a simple proof of concept to a production application.
Mainly, these are the ideation and experimentation phase, the building phase, and the deployment and operations side of things.
So I have a question for you.
How do you get started building an application that uses generative AI?
Well, the first step is ideating around exploration and proof of concepts, which I can break down into a few simple steps.
So firstly, remember that your use case is specialized, so you need a specialized model that can do the job as well.
You'll start by researching and evaluating models from
popular repositories like Hugging Face or the open source community.
And that's a great start.
But you also need to think about different factors, such as
the model's size and its performance,
and understand the benchmarking through popular benchmark tools that are out there as well.
For example, there are a couple of different ground rules that you need to understand.
Generally, self-hosting a large language model will be cheaper than a cloud-based service.
And small language models (SLMs) will generally perform better than large language models
(LLMs), with lower latency, when they're specialized for a specific task.
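That first ground rule is easy to sanity-check with some back-of-envelope arithmetic. The sketch below compares pay-per-token API pricing against a flat self-hosting cost; the dollar figures are hypothetical placeholders, not real quotes from any provider:

```python
def monthly_api_cost(tokens_per_month: float, price_per_million_tokens: float) -> float:
    """Cost of a pay-per-token cloud API at a given monthly volume."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens

def breakeven_tokens(server_cost_per_month: float, price_per_million_tokens: float) -> float:
    """Token volume above which a flat-rate self-hosted server becomes cheaper."""
    return server_cost_per_month / price_per_million_tokens * 1_000_000

# Hypothetical numbers: $15 per million tokens vs. a $600/month GPU server.
api_cost = monthly_api_cost(100_000_000, 15.0)   # 100M tokens a month
flat_cost = 600.0
print(api_cost, flat_cost)                       # 1500.0 vs 600.0
print(breakeven_tokens(600.0, 15.0))             # 40,000,000 tokens/month
```

At this (made-up) price point, self-hosting wins once you pass roughly 40M tokens a month; the point is the shape of the comparison, not the specific numbers.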
Now, you should also understand about various prompting techniques when you're actually working with the model.
For example, zero shot prompting.
Now, what this is, is essentially asking the model a question without any examples of how to respond.
Now, we can also do this a little bit differently with what's known as few-shot prompting,
where we actually give a few examples of how to respond:
the behavior we want the LLM to have as we work with the AI.
And there's also chain-of-thought prompting,
which asks the model to explain its thinking process step by step.
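The three prompting styles described above boil down to different prompt templates. Here is a minimal sketch; the exact wording of each template is illustrative, not canonical:

```python
def zero_shot(question: str) -> str:
    # No examples: the model answers from its pretraining alone.
    return f"Answer the question.\nQ: {question}\nA:"

def few_shot(question: str, examples: list[tuple[str, str]]) -> str:
    # A handful of (question, answer) pairs demonstrate the desired behavior.
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {question}\nA:"

def chain_of_thought(question: str) -> str:
    # Ask the model to show its reasoning step by step before answering.
    return f"Q: {question}\nA: Let's think step by step."

print(few_shot("What is 2+3?", [("What is 1+1?", "2"), ("What is 2+2?", "4")]))
```

The same question gets very different answers depending on which template you feed to the model, which is why it pays to experiment with all three early on.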
So congratulations.
You can now put AI engineer on your resume.
But in all seriousness,
you need to understand the different capabilities and limitations of the models that you're working with.
And you can do this by experimenting with your data
early on, so that you understand any potential challenges that might come up as you go through the journey.
Now that we've evaluated models for our use case, it's time to build our application.
Now, just as we can locally run databases and different services on our machine,
we can actually do the same with AI:
serve a model locally from our machine and make requests to its API from localhost.
Plus you also get the added benefit of knowing that your data is secure and private on premise.
So that's really important nowadays.
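Calling a locally served model usually looks like a plain HTTP request to localhost. The sketch below assumes an OpenAI-compatible chat-completions endpoint, which runtimes like vLLM and Ollama expose; the model name and port here are placeholders:

```python
import json
import urllib.request

def build_chat_request(model: str, user_message: str) -> dict:
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

def ask_local_model(payload: dict,
                    url: str = "http://localhost:8000/v1/chat/completions") -> str:
    # The request goes to localhost only, so data never leaves the machine.
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    payload = build_chat_request("my-local-model", "Summarize RAG in one sentence.")
    # print(ask_local_model(payload))  # requires a model server running locally
```

The actual network call is commented out since it needs a server running; the payload shape is the part that carries over to whichever runtime you pick.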
But what if you want to use that data with your large language model as well?
Well, there are a few different methods to do so, starting with what's known as retrieval
augmented generation, or RAG,
where you actually take a large language model, a pre-trained foundational model,
and supplement it with relevant and accurate data.
And this can help provide better and more accurate responses.
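A minimal RAG loop is just retrieval plus prompt assembly. This sketch uses word-overlap scoring as a stand-in for a real embedding model, so every function and document here is purely illustrative:

```python
import re

def tokens(text: str) -> set[str]:
    # Lowercase word set; a toy stand-in for real embeddings.
    return set(re.findall(r"\w+", text.lower()))

def score(query: str, doc: str) -> int:
    # Toy relevance: count of words shared between query and document.
    return len(tokens(query) & tokens(doc))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Keep the k documents most relevant to the query.
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_rag_prompt(query: str, docs: list[str]) -> str:
    # Supplement the pre-trained model with retrieved, accurate context.
    context = "\n".join(retrieve(query, docs))
    return f"Use only this context to answer.\nContext:\n{context}\nQuestion: {query}"

docs = [
    "Our refund policy allows a refund within 30 days.",
    "The cafeteria opens at 8am on weekdays.",
    "A refund is issued to the original payment method.",
]
print(build_rag_prompt("When can I get a refund?", docs))
```

In a real system the overlap score would be replaced by vector similarity over embeddings, but the flow (retrieve relevant data, then inject it into the prompt) is the same.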
But what you can also do is fine-tune the model:
take the large language model and train it further with your data.
So we're actually baking this information,
the behavior, styles, and intuition we want it to react with, into the model itself.
Then we can run inference and have that domain-specific data available every time
we're working with the AI model.
Now, these are just two approaches.
There are many more.
But I also want to mention that having the right tools and frameworks, such as LangChain, is going to simplify your life.
They're going to let you focus on building out new features,
such as popular general use cases like chatbots, IT process automation,
data management, and much, much more, by simplifying the different calls you make to the model.
Now this can be done through sequences of prompts and model calls to actually accomplish more complex tasks.
So it means you're going to need to break problems down into smaller, more manageable steps,
and during this process, evaluate the flows across these model calls.
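Breaking a task into a sequence of prompts can be sketched without any framework at all. Below, `call_model` is a stub standing in for a real LLM call (the kind of plumbing LangChain would manage for you), and the ticket-handling task is a made-up example:

```python
def call_model(prompt: str) -> str:
    # Stub LLM: returns a canned response per step; swap in a real API call.
    canned = {
        "extract": "order_id=42, issue=late delivery",
        "draft": "We're sorry your order 42 arrived late.",
    }
    for key, reply in canned.items():
        if key in prompt:
            return reply
    return ""

def handle_ticket(ticket: str) -> str:
    # Step 1: extract structured facts from the raw ticket text.
    facts = call_model(f"extract the order id and issue from: {ticket}")
    # Step 2: feed step 1's output into a second prompt to draft a reply.
    return call_model(f"draft a polite reply for: {facts}")

print(handle_ticket("My package (order 42) showed up a week late!"))
```

The key idea is that each model call's output becomes the next call's input, which is exactly the "sequence of prompts and model calls" pattern for tackling a task that is too complex for a single prompt.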
But you also need this in a production environment, which
brings us to the final step of operationalizing these AI-powered applications.
So finally, you've got that application powered by AI or a large language model,
and you want to deploy it to production to be able to scale things up.
And this actually falls under the umbrella of something known as machine learning operations, or MLOps.
But let me focus on the topics that are important for you as a developer.
So first off, your infrastructure needs to be able to handle efficient model deployment and scaling.
So using technologies such as containers and orchestrators like Kubernetes is going to help you do this,
Being able to auto scale and balance the traffic for your application.
And you can also use a production-ready runtime such as vLLM for model serving.
And what we're also seeing right now is that organizations are taking
a hybrid approach both with their models and their infrastructure.
So having this multi-model Swiss Army knife approach, with different models for different use cases,
as well as a combination of on-prem and cloud infrastructure, to make the most of your resources and your budget.
So with all of these new AI powered applications out there, say you have something in production, well, the job isn't done.
You still need to benchmark, monitor and be able to handle different exceptions that are coming from your application.
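Benchmarking in production can start as simply as recording per-request latencies and reading off percentiles. In this sketch, `fake_inference` is a stand-in for a real model call behind your serving runtime:

```python
import math
import time

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile (e.g. p50, p95) of recorded latencies."""
    ordered = sorted(samples)
    idx = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[idx]

def timed_call(fn, *args):
    # Wrap any model call and record its wall-clock latency in seconds.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

def fake_inference(prompt: str) -> str:
    # Stand-in for a real model call; replace with your serving client.
    return prompt.upper()

latencies = []
for _ in range(100):
    _, dt = timed_call(fake_inference, "hello")
    latencies.append(dt)
print(f"p50={percentile(latencies, 50):.6f}s  p95={percentile(latencies, 95):.6f}s")
```

Tail percentiles like p95 matter more than averages for user-facing AI apps, since a few slow generations dominate the perceived experience; production setups would feed these numbers into a proper monitoring stack rather than printing them.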
And similar to how we have DevOps, we also have MLOps for ensuring models go into production in a smooth fashion.
So let's take a step back because I think we've really discovered a lot today
and some of the recent innovations in the world of AI have made this topic
much more accessible for developers like you and me.
And you have plenty of tools out there to help you along the process.
But what I want to emphasize is that while AI is new, well, it's actually just another tool that you can add to your tool belt.
And so what you can do is use these tools and the different steps of the process
to go from ideation to building to deployment of your AI-powered applications,
and make a real impact with your work by using GenAI.