Measuring How Well AI Represents Real Human Perspectives
Evaluate What Matters
Can AI faithfully represent the views of diverse communities? DTEF measures how accurately models predict survey response distributions across demographic groups—testing whether AI can serve as a reliable proxy for real human perspectives.
Explore the Results
Browse a public library of community-contributed benchmarks on domains like clinical advice, regional knowledge, legal reasoning, behavioural traits, and AI safety. Track model performance over time as tests re-run automatically.
Contribute Data
Have demographic survey data? Contribute it to the DTEF blueprint repository. Survey responses are transformed into evaluation blueprints that test whether AI models can accurately predict how different groups respond.
Demographic Prediction Accuracy
How well do AI models predict survey response distributions across demographic groups?
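To make the question concrete, here is a minimal sketch of how a predicted response distribution could be compared with an observed one for each demographic group. The metric (one minus total variation distance) is an assumption chosen for illustration; this page does not state which metric DTEF uses. The function names, group labels, and numbers are all hypothetical.

```python
from typing import Dict

# Maps each answer option to its share of responses (or a raw count).
Distribution = Dict[str, float]


def normalize(dist: Distribution) -> Distribution:
    """Scale the values so they sum to 1."""
    total = sum(dist.values())
    if total <= 0:
        raise ValueError("distribution must have a positive total")
    return {option: value / total for option, value in dist.items()}


def total_variation_distance(predicted: Distribution, observed: Distribution) -> float:
    """Half the summed absolute difference: 0 means identical, 1 means disjoint."""
    p, q = normalize(predicted), normalize(observed)
    options = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in options)


def score_by_group(
    predicted: Dict[str, Distribution],
    observed: Dict[str, Distribution],
) -> Dict[str, float]:
    """Per-group accuracy as 1 - TVD (illustrative metric, not DTEF's confirmed one).

    Every group in `observed` must also appear in `predicted`.
    """
    return {
        group: 1.0 - total_variation_distance(predicted[group], observed[group])
        for group in observed
    }


if __name__ == "__main__":
    # Hypothetical numbers, made up for this example only.
    observed = {"18-29": {"agree": 0.7, "disagree": 0.3}}
    predicted = {"18-29": {"agree": 0.5, "disagree": 0.5}}
    print(score_by_group(predicted, observed))
```

A score near 1 means the model's predicted distribution for that group closely matches the survey data. Scoring each group separately keeps a weak prediction for one group from being hidden by strong predictions for the others.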
No Evaluation Blueprints Found
It looks like you haven't run any evaluation blueprints yet. Use the CLI to generate results, and they will appear here. Explore example blueprints or contribute your own at the DTEF Blueprints repository.
pnpm cli run_config --config path/to/your_blueprint.yml --run-label "my-first-run"
DTEF is an open source project from the Collective Intelligence Project.