Llama 4 Scout on LiveCodeBench: 33.2% accuracy

Author: The LayerLens Team

Last updated: 2025-10-23

The LayerLens Team covers AI model evaluations, benchmark analysis, and the evolving landscape of AI performance. For the latest independent evaluation data, explore Stratix.

Summary

Llama 4 Scout from Meta scored 33.2% on LiveCodeBench, ranking 31st of 43 models evaluated. This places the model in the weak band for LiveCodeBench: it falls below the threshold for production reliance on this benchmark family, so consider it only for narrow, fully tested tasks.

Model details

  • Provider: Meta

  • Model key: meta-llama/llama-4-scout

  • Context length: 172,000 tokens

  • License: Llama 4

  • Open weights: yes

Benchmark methodology

LiveCodeBench is a contamination-aware code benchmark: it continuously collects new problems from competitive-programming platforms (LeetCode, AtCoder, and Codeforces) and scores models on problems released after their training cutoff. Each generated solution is executed against the problem's test cases, and the headline accuracy is the share of problems whose solution passes all tests.
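
When multiple samples are drawn per problem, benchmarks in this family typically report the unbiased pass@k estimator of Chen et al. (2021); with a single sample per problem it reduces to plain solved-over-total accuracy. A minimal sketch of that estimator (illustrative only, not Stratix's evaluation code):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: samples generated per problem
    c: samples that passed all tests
    k: budget; pass@k is the probability that at least one of
       k randomly drawn samples passes.
    """
    if n - c < k:  # every draw of k samples must contain a passing one
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With one sample per problem (n = k = 1), pass@1 is simply c/n:
assert pass_at_k(1, 1, 1) == 1.0
assert pass_at_k(1, 0, 1) == 0.0
```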

Secondary metrics

  • Readability score: 0.0

  • Toxicity score: 0.000

  • Ethics score: 0.000

Run this evaluation yourself

Stratix evaluates Llama 4 Scout continuously across 11+ benchmarks. To replicate this LiveCodeBench evaluation with your own model, your own traces, or a different benchmark configuration, open the model in Stratix; the overall shape of the evaluation loop is sketched below.
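
For readers reproducing the run outside Stratix, the following is a minimal sketch of a LiveCodeBench-style pass@1 loop. The problem record format, the stubbed generate_solution model call, and the subprocess runner are all assumptions for illustration (none of them are Stratix's or LiveCodeBench's actual code); a real harness would sandbox execution and pull the dated LiveCodeBench problem set.

```python
import subprocess
import sys
import tempfile

# Hypothetical problem records for illustration; real LiveCodeBench problems
# come from LeetCode, AtCoder, and Codeforces, tagged by release date.
PROBLEMS = [
    {
        "prompt": "Write a function add(a, b) that returns a + b.",
        "tests": "assert add(2, 3) == 5\nassert add(-1, 1) == 0",
    },
]

def generate_solution(prompt: str) -> str:
    """Stand-in for the model call; a real harness queries the model API."""
    return "def add(a, b):\n    return a + b\n"

def passes_tests(solution: str, tests: str) -> bool:
    """Execute the candidate plus its asserts in a subprocess.
    A real harness would sandbox this and cap memory as well as time."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(solution + "\n" + tests + "\n")
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, timeout=10
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0

def pass_at_1(problems) -> float:
    """One sample per problem: the fraction solved on the first try."""
    solved = sum(
        passes_tests(generate_solution(p["prompt"]), p["tests"])
        for p in problems
    )
    return solved / len(problems)

if __name__ == "__main__":
    print(f"pass@1: {pass_at_1(PROBLEMS):.1%}")
```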

Source: Stratix evaluation 68fa0477f82ed9ed12809c18. Updated 2025-10-23.