Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX

Chess puzzle explanation model optimized for Apple Silicon 🍎

This is an MLX-format merged model fine-tuned to generate educational explanations for chess puzzles from the Lichess database. The LoRA adapter has been fused with the base model weights for easy deployment.

🎯 Model Overview

  • Base Model: Qwen/Qwen3-4B-Instruct-2507 (4-bit quantized)
  • Fine-tuning: LoRA adapter trained on 5,020 Lichess puzzles (fused)
  • Format: MLX (Apple Silicon optimized)
  • Best Checkpoint: Iteration 3900 (validation loss: 0.596)
  • Framework: MLX + mlx-lm

📊 Training Details

  • Training Data: 5,020 high-quality Lichess chess puzzles with Claude-generated explanations
  • LoRA Config: rank=32, alpha=64
  • Training Iterations: 6,000 (best @ 3900)
  • Quality Metrics: 96% completeness, avg 659 chars/explanation
  • Coverage: 1000-2500 rated puzzles, 20+ tactical themes

🚀 Quick Start

Installation

pip install mlx mlx-lm

Usage

from mlx_lm import load, generate

# Load the fused model (adapter already merged)
model, tokenizer = load("felixmanojh/Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX")

# Example puzzle
prompt = """Explain this chess puzzle:

Position (FEN): r1bqkb1r/pppp1ppp/2n2n2/4p3/2B1P3/5N2/PPPP1PPP/RNBQK2R w KQkq - 4 4
Solution: Nxe5 Nxe5 d4
Themes: fork pin
Rating: 1500"""

# Generate explanation
response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,
    verbose=True
)

print(response)

Chat Format

from mlx_lm import load, generate

model, tokenizer = load("felixmanojh/Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX")

messages = [
    {
        "role": "user", 
        "content": "Explain this chess puzzle:\n\nPosition (FEN): r1bqkb1r/pppp1ppp/2n2n2/4p3/2B1P3/5N2/PPPP1PPP/RNBQK2R w KQkq - 4 4\nSolution: Nxe5 Nxe5 d4\nThemes: fork pin\nRating: 1500"
    }
]

# Apply chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)

💡 Why Use the MLX Version?

✅ Apple Silicon optimized - runs efficiently on M1/M2/M3 Macs
✅ Fused adapter - no need to load base + adapter separately
✅ Fast inference - optimized for Metal GPU acceleration
✅ Low memory - 4-bit quantization keeps memory usage low
✅ Local deployment - perfect for Mac-based applications

📚 Training Data

Data Source

  • Puzzles: Lichess puzzle database (CC0 1.0 Universal)
  • Explanations: Generated using Claude API (Anthropic)
  • Size: 5,020 training + 502 validation puzzles

Data Quality

Puzzles filtered for:

  • Popularity ≥ 90th percentile
  • Rating deviation ≤ 80 (consistent difficulty)
  • Minimum plays ≥ 500
  • Balanced across tactical themes (fork, pin, skewer, discovered attack, mate patterns, sacrifice, deflection, etc.)
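As a rough sketch (not the released training code), the thresholds above can be applied to the public Lichess puzzle CSV, whose columns include `Popularity`, `RatingDeviation`, and `NbPlays`; `load_puzzles` and `filter_puzzles` are hypothetical helper names:

```python
import csv

def load_puzzles(path):
    """Read the Lichess puzzle CSV (lichess_db_puzzle.csv) into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def filter_puzzles(rows, min_plays=500, max_rating_dev=80, popularity_pct=0.9):
    """Keep puzzles meeting the quality thresholds listed above."""
    # Popularity cutoff: the 90th-percentile value over the loaded rows.
    pops = sorted(int(r["Popularity"]) for r in rows)
    cutoff = pops[int(popularity_pct * (len(pops) - 1))]
    return [
        r for r in rows
        if int(r["Popularity"]) >= cutoff
        and int(r["RatingDeviation"]) <= max_rating_dev
        and int(r["NbPlays"]) >= min_plays
    ]
```

Theme balancing (the last filter) is a separate sampling step over the `Themes` column and is omitted here.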

Explanations were generated with the Claude API using consistent prompting for educational quality, focusing on:

  • Clear explanation of the tactical pattern
  • Step-by-step move analysis
  • Why alternatives don't work
  • Key learning points

Coverage

  • Rating Range: 1000-2500
  • Themes: 20+ tactical patterns
  • Format: FEN position + solution moves + themes + rating → educational explanation
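The input format can be assembled with a small helper whose template mirrors the example prompt in Quick Start (`build_prompt` is a hypothetical name, not part of the released code):

```python
def build_prompt(fen, solution_moves, themes, rating):
    """Format a puzzle record into the prompt layout the model was trained on."""
    return (
        "Explain this chess puzzle:\n\n"
        f"Position (FEN): {fen}\n"
        f"Solution: {' '.join(solution_moves)}\n"
        f"Themes: {' '.join(themes)}\n"
        f"Rating: {rating}"
    )
```

The resulting string can be passed directly as `prompt=` to `generate`, or wrapped in a chat message as shown above.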

🎓 Intended Use

✅ Recommended

  • Educational chess puzzle explanations
  • Learning tactical patterns
  • Automated puzzle commentary
  • Interactive chess tutoring systems
  • Mac-based chess applications

❌ Not Recommended

  • Full game analysis (puzzle-focused only)
  • Opening theory (not in training data)
  • Endgame tablebase analysis

βš™οΈ System Requirements

  • Hardware: Apple Silicon (M1/M2/M3/M4) Mac
  • RAM: 8GB minimum, 16GB recommended
  • Storage: ~4GB for model weights
  • OS: macOS 13.5 or later (required by MLX)

🔄 Model Variants

Model                                      | Format             | Size   | Use Case
Qwen3-4B-Lichess-Chess-Puzzle-Tutor        | LoRA adapter       | ~150MB | Training, fine-tuning
Qwen3-4B-Lichess-Chess-Puzzle-Tutor-Merged | HuggingFace merged | ~8GB   | Deployment, Spaces, GPU
This model                                 | MLX merged (4-bit) | ~4GB   | Apple Silicon, local inference

🌐 Live Demo

Try it live: Chess Puzzle Tutor Space

πŸ“ Citation

@software{qwen3_chess_tutor_mlx_2025,
  author = {Felix Manojh},
  title = {Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX},
  year = {2025},
  url = {https://huggingface.co/felixmanojh/Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX},
  note = {MLX-optimized merged model for chess puzzle explanations with Claude-generated training data}
}

📄 License

  • Model: Apache 2.0
  • Puzzle Data: Lichess puzzle database (CC0 1.0 Universal)
  • Explanations: Generated using Claude API for training purposes

πŸ™ Acknowledgments

  • Lichess for the comprehensive puzzle database
  • Anthropic for Claude API used to generate training explanations
  • Qwen Team for the excellent Qwen3-4B base model
  • Apple MLX Team for the MLX framework

Built by Felix Manojh | Optimized for Apple Silicon 🍎
