Learning Library

← Back to Library

Mitigating Generative AI Hallucinations and Bias

Key Points

  • Large language models excel at producing fluent text but lack true understanding, leading them to generate plausible‑sounding but factually incorrect “hallucinations” that can spread misinformation.
  • These hallucinations are statistical errors caused by predicting the next word rather than verifying facts, and they become especially dangerous when models cite fabricated sources or replace human roles like call‑center agents.
  • Beyond hallucinations, generative AI introduces risks of bias, consent violations, and security vulnerabilities such as hijacking by malicious actors.
  • Mitigation strategies include adding explainability layers—showing data lineage and provenance via knowledge graphs—and implementing robust validation, bias checks, consent management, and security controls.
  • Organizations must weigh the cost of misinformation and brand damage against AI benefits, and adopt a comprehensive risk‑reduction framework covering hallucinations, bias, consent, and security.

Full Transcript

# Mitigating Generative AI Hallucinations and Bias **Source:** [https://www.youtube.com/watch?v=r4kButlDLUc](https://www.youtube.com/watch?v=r4kButlDLUc) **Duration:** 00:08:19 ## Summary - Large language models excel at producing fluent text but lack true understanding, leading them to generate plausible‑sounding but factually incorrect “hallucinations” that can spread misinformation. - These hallucinations are statistical errors caused by predicting the next word rather than verifying facts, and they become especially dangerous when models cite fabricated sources or replace human roles like call‑center agents. - Beyond hallucinations, generative AI introduces risks of bias, consent violations, and security vulnerabilities such as hijacking by malicious actors. - Mitigation strategies include adding explainability layers—showing data lineage and provenance via knowledge graphs—and implementing robust validation, bias checks, consent management, and security controls. - Organizations must weigh the cost of misinformation and brand damage against AI benefits, and adopt a comprehensive risk‑reduction framework covering hallucinations, bias, consent, and security. ## Sections - [00:00:00](https://www.youtube.com/watch?v=r4kButlDLUc&t=0s) **Navigating Generative AI Risks** - The speaker explains how large language models can produce misinformation, bias, consent violations, and security vulnerabilities, and outlines mitigation strategies across the four risk areas of hallucinations, bias, consent, and security. - [00:03:02](https://www.youtube.com/watch?v=r4kButlDLUc&t=182s) **Explainability and Cultural Audits Mitigation** - The speaker proposes two risk‑reduction approaches for LLMs—providing transparent, knowledge‑graph‑backed explainability and combating bias through diverse, multidisciplinary teams and systematic pre‑ and post‑deployment audits. - [00:06:16](https://www.youtube.com/watch?v=r4kButlDLUc&t=376s) **Education as AI Risk Mitigation** - The speaker explains indirect prompt injection and data‑poisoning threats to large language models, highlights their environmental impact, and argues that comprehensive education on AI strengths, weaknesses, and responsible use is the essential strategy to safeguard against such attacks. ## Full Transcript
0:00With all the excitement around ChatGPT, it's easy to lose sight of the unique risks of generative AI. 0:06Large language models, a form of generative AI, are really good at helping people who struggle with writing English prose. 0:14It can help them unlock the written word at low cost and sound like a native speaker. 0:19But because they're so good at generating the next syntactically correct word, 0:24large language models may give a false impression that they possess actual understanding or meaning. 0:31The results can include a flagrantly false narrative directly as a result of its calculated predictions versus a true understanding. 0:41So ask yourself: What is the cost of using an AI that could spread misinformation? 0:45What is the cost to your brand, your business, individuals or society? 0:50Could your large language model be hijacked by a bad actor? 0:54Let me explain how you can reduce your risk. 0:56It falls into four areas: Hallucinations, Bias, Consent, and Security. 1:03As I present each risk, I'll also call out the strategies you can use to mitigate these risks. 1:08You ready? 1:10Let's start with the falsehoods, often referred to as "AI hallucinations". 1:15Quick sidebar -- I really don't like the word "hallucinations" because I fear it anthropomorphizes AI. 1:22I'll explain it a bit. 1:23Okay, you've probably heard the news reports of large language models claiming they're human, 1:28or claiming they have emotions, or just stating things that are factually wrong. 1:34What's actually going on here? 1:36Well, large language models predict the next best syntactically correct word, 1:40not accurate answers based on understanding of what the human is actually asking for. 1:46Which means it's going to sound great, but might be 100% wrong in its answer. 1:52This wrong answer is a statistical error. 1:55Let's take a simple example. 1:57Who authored the poems A, B, C? 2:03Let's say they were all authored by the poet X, but there's one source claiming it was the author Z. 2:12We have conflicting sources in the training data. 2:15Which one actually wins the argument? 2:17Even worse, there may not be a disagreement at all, but again, a statistical error. 2:23The response could very well be incorrect because again, 2:27the large language models do not understand meaning; these inaccuracies can be exceptionally dangerous. 2:35It's even more dangerous when you have large language models annotate its sources for totally bogus answers. 2:44Why? 2:45Because it gives the perception it has proof when it just doesn't have any. 2:51Imagine a call center that has replaced its personnel with a large language model, and it offers a factually wrong answer to a customer. 3:00Right, here's your factually wrong answer. 3:03Now, imagine how much angrier this customer will be when they can't actually offer a correction via a feedback loop. 3:12This brings us to our first mitigation strategy: Explainability. 3:19Now, you could offer inline explainability and pair a large language model 3:25with the system that offered real data and data lineage and provenance via a knowledge graph. 3:31Why did the model say what it just said? 3:35Where did it pull its data from? 3:36Which sources? 3:38The large language model could provide variations on the answer that was offered by the knowledge graph. 3:44Next risk: Bias. 3:47Do not be surprised if the output for your original query only lists white male Western European poets. 3:57Want a more representative answer? 3:59Your prompt would have to say something like, "Can you please give me a list of poets that include women and non-Western Europeans?" 4:06Don't expect the large language model to learn from your prompt. 4:10This brings us to the second mitigation strategy: Culture and Audits. 4:19Okay, culture is what people do when no one is looking. 4:24It starts with approaching this entire subject with humility, as there is so much that has to be learned and even, I would say, unlearned. 4:33You need teams that are truly diverse and multidisciplinary in nature working on AI because AI is a great mirror into our own biases. 4:43Let's take the results of our audits of AI models and make corrections to our own organizational culture when there are disparate outcomes. 4:53Audit pre-model deployment as well as post-model deployment. 4:58Okay, next risk is Consent. 5:01Is the data that you are curating representative? 5:04Was it gathered with consent? 5:06Are there copyright issues? 5:08Right! Here's a little copyright symbol. 5:10These are things we can and should ask for. 5:14This should be included in an easy to find, understandable fact sheet. 5:18Oftentimes we subjects, we have no idea where the heck the training data came from these large language models. 5:24Where we were that gathered from? Did the developers hoover the dark recesses of the Internet? 5:30To mitigate consent-related risk, we need combined efforts of auditing and accountability. 5:39Right! 5:40Accountability includes establishing AI governance processes, making sure you are compliant to existing laws and regulations, 5:48and you're offering ways for people to have their feedback incorporated. 5:53Now on to the final risk, Security. 5:57Large language models could be used for all sorts of malicious tasks, 6:01including leaking people's private information, helping criminals phish, spam, scam. 6:07Hackers have gotten AI models to change their original programming, endorsing things like racism, suggesting people do illegal things. 6:16It's called jailbreaking. 6:19Another attack is an indirect prompt injection. 6:23That's when a third party alters a website, adding hidden data to change the AI's behavior. 6:29The result? 6:30Automation relying on AI potentially sending out malicious instructions without you even being aware. 6:39This brings us to our final mitigation strategy, and the one that actually pulls all of this together, and that is education. 6:49All right, let me give you an example. 6:51Training a brand new large language model produces as much carbon as over a 100 roundtrip flights between New York and Beijing. 6:59I know, crazy, right? 7:01This means it's important that we know the strengths and weaknesses of this technology. 7:06It means educating our own people on principles for the responsible curation of AI, 7:10the risks, the environmental cost, the safeguard rails, as well as what the opportunities are. 7:16Let me give you another example of where education matters. 7:20Today, some tech companies are just trusting that large language models training data has not been maliciously tampered with. 7:28I can buy a domain myself now and fill it with bogus data. 7:32By poisoning the dataset with enough examples, you could influence a large language model's behavior and outputs forever. 7:40This tech isn't going anywhere. 7:42We need to think about the relationship that we ultimately want to have with AI. 7:48If we're going to use it to augment human intelligence, we have to ask ourselves the question: 7:53What is the experience like of a person who has been augmented? 7:57Are they indeed empowered? 8:00Help us make education about the subject of data and AI far more accessible and inclusive than it is today. 8:07We need more seats at the table for different kinds of people with varying skill sets working on this very, very important topic. 8:16Thank you for your time.