Leanstral: Open-Source Foundation for Trustworthy AI Code Agents

2026-03-17T18:10:25.000Z·2 min read

Mistral AI releases Leanstral, the first open-source code agent for Lean 4 — a proof assistant capable of expressing complex mathematical objects and software specifications. With just 6B active parameters using a highly sparse architecture, Leanstral outperforms much larger models on formal proof engineering tasks.

The Problem: Human Review Bottleneck

AI agents have proven highly capable at code generation. Yet as these models are pushed into high-stakes domains, from frontier research mathematics to mission-critical software, human review becomes the primary bottleneck. The time and specialized expertise required to manually verify AI-generated code limit engineering velocity.

Enter Leanstral

Mistral AI's answer is Leanstral, a code agent designed specifically for Lean 4 that can both carry out tasks and formally prove that its implementations meet strict specifications. Instead of debugging machine-generated logic, humans simply specify what they want.
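
To make "proving an implementation against a specification" concrete, here is a minimal Lean 4 sketch. It is not taken from the Leanstral release: the function `double` and the theorem `double_spec` are hypothetical names chosen for illustration, and the `omega` tactic assumes a reasonably recent Lean 4 toolchain.

```lean
-- Hypothetical implementation (illustrative only, not from the source).
def double (n : Nat) : Nat := n + n

-- Specification: for every natural number n, `double n` equals `2 * n`.
-- If this theorem compiles, Lean has checked the proof; no human has to
-- re-read the implementation to trust the stated property.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

The human's job in this workflow is to write or review the theorem statement; producing the implementation and the proof is the part an agent would be asked to do.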

Performance: Beating Giants at a Fraction of the Cost

Leanstral-120B-A6B demonstrates significant efficiency advantages over much larger open-source models:

Model       Active Params   FLTEval Score   Passes Needed
Leanstral   6B              38.9+           1
GLM5        40B             16.6            Multiple
Kimi-K2.5   32B             20.1            Multiple
Qwen3.5     17B             ~38             4

Against closed-source competitors like Claude Opus 4.6 and Sonnet 4.6, Leanstral remains competitive while being dramatically more cost-efficient through parallel inference with Lean as a perfect verifier.
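
The idea of parallel inference with a verifier can be sketched as follows. This is a generic illustration in Python, not Mistral's implementation: `first_verified` is a hypothetical helper, and the toy verifier below stands in for a call to the Lean checker, which is assumed to accept a candidate only if it is actually correct.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def first_verified(
    candidates: Sequence[T],
    verify: Callable[[T], bool],
    max_workers: int = 4,
) -> Optional[T]:
    """Check all candidates in parallel and return the first one that passes.

    `verify` plays the role of the proof checker: because it never accepts
    a wrong candidate, any accepted candidate can be used without review.
    Returns None when no candidate passes.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with candidates.
        results = list(pool.map(verify, candidates))
    for candidate, ok in zip(candidates, results):
        if ok:
            return candidate
    return None


# Toy stand-in for a verifier: accept only an exact integer square root of 49.
def is_root_of_49(x: int) -> bool:
    return x * x == 49


if __name__ == "__main__":
    print(first_verified([3, 5, 7, 9], is_root_of_49))
```

With a sound verifier, sampling more candidates can only raise the chance of success and never lets an incorrect result through, which is why several cheap attempts from a small model can be traded against one attempt from a larger one.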

New Evaluation: FLTEval

Rather than testing isolated competition math problems, Mistral introduces FLTEval — an evaluation suite that tests completion of all formal proofs and correct definition of new mathematical concepts in real pull requests to the FLT (Fermat's Last Theorem) project. This reflects actual proof engineering scenarios rather than toy problems.

What This Means

Leanstral represents a paradigm shift: AI agents that don't just generate code, but can formally verify it against mathematical specifications. This could fundamentally change how we think about AI-assisted software development — moving from "trust but verify" to "prove it works."


Source: Mistral AI Blog | HN: 695 points
