GPT-5 (high) on Humanity's Last Exam: 21.4% accuracy

Author: The LayerLens Team

The LayerLens Team covers AI model evaluations, benchmark analysis, and the evolving landscape of AI performance. For the latest independent evaluation data, explore Stratix.

Summary

GPT-5 (high) from OpenAI scored 21.4% on Humanity's Last Exam, ranking 6th of 97 models (top 10) on this benchmark. Despite the high relative ranking, the absolute score places the model in the weak band for Humanity's Last Exam: it falls below the threshold for production reliance on this benchmark family, so consider it only for narrow, fully tested tasks.
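The gap between a strong relative ranking and a weak absolute score is easy to see with a little arithmetic. A minimal sketch (using only the rank and score figures stated above):

```python
# Relative standing vs. absolute performance on Humanity's Last Exam.
rank, total = 6, 97        # rank 6 of 97 models (1 = best)
accuracy = 21.4            # absolute benchmark accuracy, in percent

top_fraction = rank / total  # fraction of models at or above this rank
print(f"Top {top_fraction:.1%} of {total} models, "
      f"yet only {accuracy}% of questions answered correctly")
```

In other words, a model can sit in roughly the top 6% of the field while still missing nearly four out of five questions on a hard benchmark, which is why the band assessment is based on the absolute score rather than the rank.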

Model details

  • Provider: OpenAI

  • Model key: openai/gpt-5-high

  • Context length: 400,000 tokens

  • License: Proprietary

  • Open weights: no

Benchmark methodology

Secondary metrics

  • Readability score: 53.2

  • Toxicity score: 0.001

  • Ethics score: 0.000

Run this evaluation yourself

Stratix evaluates GPT-5 (high) continuously across 11+ benchmarks. To replicate this Humanity's Last Exam evaluation with your own model, your own traces, or a different benchmark configuration, open the model in Stratix.

Source: Stratix evaluation 68ffb622d90dfc27d9963dbd. Updated 2025-10-30.