Use when working with Lean 4 (.lean files), writing mathematical proofs, seeing "failed to synthesize instance" errors, managing sorry/axiom elimination, or searching mathlib for lemmas - provides build-first workflow, haveI/letI patterns, compiler-guided repair, and LSP integration
Installation
Details
Usage
After installing, this skill will be available to your AI coding assistant.
Verify installation:
skills listSkill Instructions
name: lean4-theorem-proving description: Use when working with Lean 4 (.lean files), writing mathematical proofs, seeing "failed to synthesize instance" errors, managing sorry/axiom elimination, or searching mathlib for lemmas - provides build-first workflow, haveI/letI patterns, compiler-guided repair, and LSP integration
Lean 4 Theorem Proving
Core Principle
Build incrementally, structure before solving, trust the type checker. Lean's type checker is your test suite.
Success = lake build passes + zero sorries + zero custom axioms. Theorems with sorries/axioms are scaffolding, not results.
Quick Reference
| Resource | What You Get | Where to Find |
|---|---|---|
| Interactive Commands | 10 slash commands for search, analysis, optimization, repair | Type /lean in Claude Code (full guide) |
| Automation Scripts | 19 tools for search, verification, refactoring, repair | Plugin scripts/ directory (scripts/README.md) |
| Subagents | 4 specialized agents for batch tasks (optional) | subagent-workflows.md |
| LSP Server | 30x faster feedback with instant proof state (optional) | lean-lsp-server.md |
| Reference Files | 18 detailed guides (phrasebook, tactics, patterns, errors, repair, performance) | List below |
When to Use
Use for ANY Lean 4 development: pure/applied math, program verification, mathlib contributions.
Critical for: Type class synthesis errors, sorry/axiom management, mathlib search, measure theory/probability work.
Tools & Workflows
7 slash commands for search, analysis, and optimization - type /lean in Claude Code. See COMMANDS.md for full guide with examples and workflows.
16 automation scripts for search, verification, and refactoring. See scripts/README.md for complete documentation.
Lean LSP Server (optional) provides 30x faster feedback with instant proof state and parallel tactic testing. See lean-lsp-server.md for setup and workflows.
Subagent delegation (optional, Claude Code users) enables batch automation. See subagent-workflows.md for patterns.
Build-First Principle
ALWAYS compile before committing. Run lake build to verify. "Compiles" ≠ "Complete" - files can compile with sorries/axioms but aren't done until those are eliminated.
The 4-Phase Workflow
- Structure Before Solving - Outline proof strategy with
havestatements and documented sorries before writing tactics - Helper Lemmas First - Build infrastructure bottom-up, extract reusable components as separate lemmas
- Incremental Filling - Fill ONE sorry at a time, compile after each, commit working code
- Type Class Management - Add explicit instances with
haveI/letIwhen synthesis fails, respect binder order for sub-structures
Finding and Using Mathlib Lemmas
Philosophy: Search before prove. Mathlib has 100,000+ theorems.
Use /search-mathlib slash command, LSP server search tools, or automation scripts. See mathlib-guide.md for detailed search techniques, naming conventions, and import organization.
Essential Tactics
Key tactics: simp only, rw, apply, exact, refine, by_cases, rcases, ext/funext. See tactics-reference.md for comprehensive guide with examples and decision trees.
Domain-Specific Patterns
Analysis & Topology: Integrability, continuity, compactness patterns. Tactics: continuity, fun_prop.
Algebra: Instance building, quotient constructions. Tactics: ring, field_simp, group.
Measure Theory & Probability (emphasis in this skill): Conditional expectation, sub-σ-algebras, a.e. properties. Tactics: measurability, positivity. See measure-theory.md for detailed patterns.
Complete domain guide: domain-patterns.md
Managing Incomplete Proofs
Standard mathlib axioms (acceptable): Classical.choice, propext, quot.sound. Check with #print axioms theorem_name or /check-axioms.
CRITICAL: Sorries/axioms are NOT complete work. A theorem that compiles with sorries is scaffolding, not a result. Document every sorry with concrete strategy and dependencies. Search mathlib exhaustively before adding custom axioms.
When sorries are acceptable: (1) Active work in progress with documented plan, (2) User explicitly approves temporary axioms with elimination strategy.
Not acceptable: "Should be in mathlib", "infrastructure lemma", "will prove later" without concrete plan.
Compiler-Guided Proof Repair
When proofs fail to compile, use iterative compiler-guided repair instead of blind resampling.
Quick repair: /lean4-theorem-proving:repair-file FILE.lean
How it works:
- Compile → extract structured error (type, location, goal, context)
- Try automated solver cascade first (many simple cases handled mechanically, zero LLM cost)
- Order:
rfl → simp → ring → linarith → nlinarith → omega → exact? → apply? → aesop
- Order:
- If solvers fail → call
lean4-proof-repairagent:- Stage 1: Haiku (fast, most common cases) - 6 attempts
- Stage 2: Sonnet (precise, complex cases) - 18 attempts
- Apply minimal patch (1-5 lines), recompile, repeat (max 24 attempts)
Key benefits:
- Low sampling budget (K=1 per attempt, not K=100)
- Error-driven action selection (specific fix per error type, not random guessing)
- Fast model first (Haiku), escalate only when needed (Sonnet)
- Solver cascade handles simple cases mechanically (zero LLM cost)
- Early stopping prevents runaway costs (bail after 3 identical errors)
Expected outcomes: Success improves over time as structured logging enables learning from attempts. Cost optimized through solver cascade (free) and multi-stage escalation.
Commands:
/repair-file FILE.lean- Full file repair/repair-goal FILE.lean LINE- Specific goal repair/repair-interactive FILE.lean- Interactive with confirmations
Detailed guide: compiler-guided-repair.md
Inspired by: APOLLO (https://arxiv.org/abs/2505.05758) - compiler-guided repair with multi-stage models and low sampling budgets.
Common Compilation Errors
| Error | Fix |
|---|---|
| "failed to synthesize instance" | Add haveI : Instance := ... |
| "maximum recursion depth" | Provide manually: letI := ... |
| "type mismatch" | Use coercion: (x : ℝ) or ↑x |
| "unknown identifier" | Add import |
See compilation-errors.md for detailed debugging workflows.
Documentation Conventions
- Write timeless documentation (describe what code is, not development history)
- Don't highlight "axiom-free" status after proofs are complete
- Mark internal helpers as
privateor in dedicated sections - Use
examplefor educational code, notlemma/theorem
Quality Checklist
Before commit:
-
lake buildsucceeds on full project - All sorries documented with concrete strategy
- No new axioms without elimination plan
- Imports minimal
Doing it right: Sorries/axioms decrease over time, each commit completes one lemma, proofs build on mathlib.
Red flags: Sorries multiply, claiming "complete" with sorries/axioms, fighting type checker for hours, monolithic proofs (>100 lines), long have blocks (>30 lines should be extracted as lemmas - see proof-refactoring.md).
Reference Files
Core references: lean-phrasebook.md, mathlib-guide.md, tactics-reference.md, compilation-errors.md
Domain-specific: domain-patterns.md, measure-theory.md, instance-pollution.md, calc-patterns.md
Incomplete proofs: sorry-filling.md, axiom-elimination.md
Optimization & refactoring: performance-optimization.md, proof-golfing.md, proof-refactoring.md, mathlib-style.md
Automation: compiler-guided-repair.md, lean-lsp-server.md, lean-lsp-tools-api.md, subagent-workflows.md
More by benchflow-ai
View allRepair an (often imperfect) Flexible Job Shop Scheduling baseline into a downtime-feasible, precedence-correct schedule while staying within policy budgets and matching the evaluator’s exact metrics and “local minimal right-shift” checks.
Test Temporal workflows with pytest, time-skipping, and mocking strategies. Covers unit testing, integration testing, replay testing, and local development setup. Use when implementing Temporal workflow tests or debugging test failures.
Extract locational marginal prices (LMPs) from DC-OPF solutions using dual values. Use when computing nodal electricity prices, reserve clearing prices, or performing price impact analysis.
This skill should be used when the user asks to "design package structure", "create managed package", "configure 2GP", "set up namespace", "version management", or mentions managed package topics like "LMA", "subscriber orgs", or "package versioning". Provides comprehensive guidance for second-generation managed package (2GP) architecture, ISV development patterns, and package lifecycle management.
