TruthfulQA
ai_benchmark
Overview
Open source✓ Open Source
Use casemeasuring whether LLMs generate truthful answers to questions that invite misconceptions
Also see
Alternative to
Knowledge graph stats
Claims6
Avg confidence97%
Avg freshness99%
Last updatedUpdated yesterday
Trust distribution
100% unverified
Governance
Not assessed
TruthfulQA
concept
Benchmark measuring whether language models generate truthful answers to questions humans would answer incorrectly
Compare with...alternative to
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| SimpleQA | ○Unverified | High | Fresh | 1 |
evaluates
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| truthfulness and resistance to common human misconceptions | ○Unverified | High | Fresh | 1 |
open source
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| true | ○Unverified | High | Fresh | 1 |
primary use case
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| measuring whether LLMs generate truthful answers to questions that invite misconceptions | ○Unverified | High | Fresh | 1 |
first released
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| 2021 | ○Unverified | High | Fresh | 1 |
created by
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| Stephanie Lin, Jacob Hilton, and Owain Evans | ○Unverified | High | Fresh | 1 |