UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection.
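For orientation, here is a minimal sketch of the black-box, consistency-based flavor of uncertainty quantification that such packages build on: sample several responses to the same prompt and treat low agreement as a hallucination-risk signal. This is a generic illustration under that assumption, not the UQLM API.

```python
from difflib import SequenceMatcher
from statistics import mean

def consistency_score(responses: list[str]) -> float:
    """Average pairwise similarity of responses sampled for the same prompt.

    Low agreement across samples is a common black-box proxy for
    hallucination risk: the model is unstable about its own answer.
    """
    pairs = [
        SequenceMatcher(None, a, b).ratio()
        for i, a in enumerate(responses)
        for b in responses[i + 1:]
    ]
    return mean(pairs) if pairs else 1.0

# Example: three samples that disagree on a factual detail score low.
samples = [
    "The Eiffel Tower was completed in 1889.",
    "The Eiffel Tower was completed in 1887.",
    "Construction of the Eiffel Tower finished in 1930.",
]
print(consistency_score(samples))  # noticeably below 1.0 -> treat as risky
```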
LettuceDetect is a hallucination detection framework for RAG applications.
An up-to-date, curated list of state-of-the-art research, papers, and resources on hallucinations in large vision-language models.
[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks (UHGEval, HaluEval, HalluQA, etc.).
HaluMem is the first operation-level hallucination evaluation benchmark tailored to agent memory systems.
🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".
A benchmark for evaluating hallucinations in large visual language models
Unofficial implementation of Microsoft's Claimify paper: extracts specific, verifiable, decontextualized claims from LLM Q&A for use in hallucination, groundedness, relevancy, and truthfulness detection.
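A hedged sketch of the claim-extraction step such an approach relies on: prompt an LLM to decompose an answer into standalone, checkable claims, then parse the result. The prompt wording and the call_llm hook are illustrative assumptions, not the repository's actual code.

```python
import json

EXTRACTION_PROMPT = """Decompose the answer into a JSON list of claims.
Each claim must be specific, independently verifiable, and understandable
without the surrounding question ("decontextualized").

Question: {question}
Answer: {answer}

Return only a JSON array of strings."""

def extract_claims(question: str, answer: str, call_llm) -> list[str]:
    """call_llm: any callable mapping a prompt string to a model reply."""
    raw = call_llm(EXTRACTION_PROMPT.format(question=question, answer=answer))
    try:
        claims = json.loads(raw)
        return [c for c in claims if isinstance(c, str)]
    except json.JSONDecodeError:
        return []  # downstream checks can treat this as "no usable claims"
```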
TrustScoreEval: trust scores for AI/LLM responses. Detect hallucinations, flag misinformation, and validate outputs to build trustworthy AI.
Code release for THRONE, a CVPR 2024 paper on measuring object hallucinations in LVLM-generated text.
When AI makes $10M decisions, hallucinations aren't bugs—they're business risks. We built the verification infrastructure that makes AI agents accountable without slowing them down.
A comprehensive study on reducing hallucinations in large language models through strategic prompt engineering techniques (chain-of-verification, chain-of-thought, and a hybrid of the two).
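A rough sketch of the chain-of-verification loop such a study would compare: draft an answer, generate verification questions, answer them independently, then revise. The prompts and the ask callable are illustrative assumptions, not the repository's code.

```python
def chain_of_verification(question: str, ask) -> str:
    """ask: any callable that sends a prompt string to an LLM and returns text."""
    draft = ask(f"Answer concisely: {question}")
    checks = ask(
        "List 3 short questions that would verify the facts in this answer:\n"
        f"{draft}"
    )
    evidence = ask(f"Answer each verification question independently:\n{checks}")
    return ask(
        "Revise the draft answer so it is consistent with the verification "
        "answers, removing anything unsupported.\n"
        f"Draft: {draft}\nVerification: {evidence}"
    )
```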
HALLUCINATED BY CURSOR WITH CODEX PLUGIN (BEWARE): BaseX Coding Language - Revolutionary Base 5.10 Quantum Teleportation & Infinite Storage System by Joshua Hendricks Cole.
A robust hybrid pipeline for detecting hallucinated citations in academic papers and research documents. The system combines exact bibliographic lookup, fuzzy matching, and optional LLM verification to classify citations as valid, partially valid, or hallucinated.
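A minimal sketch of that hybrid classification logic, assuming a plain list of known reference strings and difflib-based fuzzy matching; the thresholds are illustrative, not the project's actual values.

```python
from difflib import SequenceMatcher

def classify_citation(citation: str, known_refs: list[str],
                      fuzzy_threshold: float = 0.85,
                      partial_threshold: float = 0.6) -> str:
    """Classify a citation as valid, partially valid, or hallucinated."""
    if citation in known_refs:                      # exact bibliographic hit
        return "valid"
    best = max(
        (SequenceMatcher(None, citation.lower(), ref.lower()).ratio()
         for ref in known_refs),
        default=0.0,
    )
    if best >= fuzzy_threshold:                     # near-match: minor formatting drift
        return "valid"
    if best >= partial_threshold:                   # plausible but mangled entry;
        return "partially valid"                    # candidate for optional LLM review
    return "hallucinated"
```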
Moving away from binary "hallucination" evals toward a more nuanced, context-dependent evaluation technique.
Dataset Generation and Pre-processing Scripts for the Research titled: Leveraging the Domain Adaptation of Retrieval Augmented Generation (RAG) Models in Conversational AI for Enhanced Customer Service
An interactive Python chatbot demonstrating real-time contextual hallucination detection in Large Language Models using the "Lookback Lens" method. This project implements the attention-based ratio feature extraction and a trained classifier to identify when an LLM deviates from the provided context during generation.
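A small sketch of the lookback-ratio feature that method is built on: for each generated token and attention head, the share of attention mass placed on the provided context versus the rest of the sequence. The array shapes and the downstream classifier choice are assumptions for illustration.

```python
import numpy as np

def lookback_ratios(attn: np.ndarray, context_len: int) -> np.ndarray:
    """attn: (num_heads, num_new_tokens, seq_len) attention weights over the
    generated span. Returns (num_new_tokens, num_heads) ratio features that a
    simple classifier (e.g. logistic regression) can be trained on."""
    on_context = attn[:, :, :context_len].sum(axis=-1)   # mass on the prompt/context
    total = attn.sum(axis=-1) + 1e-9                     # avoid divide-by-zero
    return (on_context / total).T

# Toy example: 2 heads, 3 generated tokens, 6-token sequence (4 context tokens).
rng = np.random.default_rng(0)
attn = rng.random((2, 3, 6))
attn /= attn.sum(axis=-1, keepdims=True)                 # normalize like softmax output
print(lookback_ratios(attn, context_len=4).shape)        # (3, 2)
```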
This project integrates a business rules management system (BRMS) with RAG to offer an automated text generation solution that is applicable in different contexts and significantly reduces LLM hallucinations. It is a complete architecture, available as a chatbot and fully scalable to your needs.
Legality-gated evaluation for LLMs, a structural fix for hallucinations that penalizes confident errors more than abstentions.
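A toy scoring rule in that spirit, shown only to make the incentive structure concrete: correct answers earn credit, abstentions are neutral, and wrong answers are penalized in proportion to the model's stated confidence. The weights are illustrative assumptions, not the project's actual metric.

```python
from typing import Optional

def grade(answer: Optional[str], gold: str, confidence: float) -> float:
    """Score one response; None means the model abstained."""
    if answer is None:
        return 0.0                    # abstention: no reward, no penalty
    if answer.strip().lower() == gold.strip().lower():
        return 1.0                    # correct answer, full credit
    return -confidence                # wrong answer: penalty scales with confidence

print(grade(None, "Paris", 0.9))      #  0.0  (abstention)
print(grade("Paris", "Paris", 0.9))   #  1.0  (correct)
print(grade("Lyon", "Paris", 0.9))    # -0.9  (confident error, heaviest penalty)
```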