Ibrahim's Evals

Open-source evaluations for LLM safety, bias, and alignment.

Bias

Cross-Language Topic Bias

Measures whether models rate sensitive topics differently when prompted in different languages.
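
One way such a comparison could be scored is sketched below. This is a minimal illustration, not the repository's actual implementation: the `language_gaps` function, the nested `ratings` layout, the topic names, and the numeric scores are all hypothetical. It assumes each topic has been rated several times per language on a shared numeric scale, and reports the gap between the highest and lowest per-language mean for each topic.

```python
from statistics import mean


def language_gaps(ratings):
    """Return, per topic, the spread between per-language mean ratings.

    ratings: {topic: {language_code: [numeric scores]}}  (hypothetical layout)
    """
    gaps = {}
    for topic, by_language in ratings.items():
        # Average the repeated ratings collected in each language.
        means = {lang: mean(scores) for lang, scores in by_language.items()}
        # A large gap suggests the rating depends on the prompt language.
        gaps[topic] = max(means.values()) - min(means.values())
    return gaps


# Made-up scores, for illustration only.
example = {
    "topic_a": {"en": [4, 4, 5], "ar": [2, 3, 2], "zh": [4, 4, 4]},
    "topic_b": {"en": [3, 3], "ar": [3, 3], "zh": [3, 3]},
}

gaps = language_gaps(example)
for topic, gap in gaps.items():
    print(f"{topic}: gap between languages = {gap:.2f}")
```

A gap near zero means the topic was rated about the same in every language; a real evaluation would likely also need repeated sampling and a significance test before calling a gap a bias.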