LLM Hallucination Index

A ranking & evaluation framework for LLM hallucinations

Our Index evaluates how well 22 of the leading models adhere to the context they are given, helping developers make informed decisions when balancing price and performance. We rigorously tested these top LLMs with inputs ranging from 1,000 to 100,000 tokens to measure how well they perform across short, medium, and long context lengths. Let's dive into the insights.
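To make the context-length axis concrete, here is a minimal sketch of how prompts in the 1,000 to 100,000 token range might be bucketed into short, medium, and long groups before evaluation. The function name and the bucket boundaries are illustrative assumptions, not the Index's actual methodology.

```python
# Hypothetical sketch: bucket prompts by token count for
# short/medium/long context evaluation. The cutoffs below are
# illustrative assumptions, not the Index's actual thresholds.

def bucket_by_context_length(token_count: int) -> str:
    """Assign a prompt to a context-length bucket (illustrative cutoffs)."""
    if token_count < 5_000:
        return "short"
    if token_count < 25_000:
        return "medium"
    return "long"

# Example inputs spanning the tested 1,000-100,000 token range.
prompts = {"q1": 1_000, "q2": 20_000, "q3": 100_000}
buckets = {name: bucket_by_context_length(n) for name, n in prompts.items()}
print(buckets)  # {'q1': 'short', 'q2': 'medium', 'q3': 'long'}
```

Grouping inputs this way lets each model's adherence to context be scored per bucket, so a model that is reliable on short prompts but drifts on long ones is easy to spot.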