Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX

Chess puzzle explanation model optimized for Apple Silicon 🍎

This is an MLX-format merged model fine-tuned to generate educational explanations for chess puzzles from the Lichess database. The LoRA adapter has been fused with the base model weights for easy deployment.

🎯 Model Overview

  • Base Model: Qwen/Qwen3-4B-Instruct-2507 (4-bit quantized)
  • Fine-tuning: LoRA adapter trained on 5,020 Lichess puzzles (fused)
  • Format: MLX (Apple Silicon optimized)
  • Best Checkpoint: Iteration 3900 (validation loss: 0.596)
  • Framework: MLX + mlx-lm

📊 Training Details

  • Training Data: 5,020 high-quality Lichess chess puzzles with Claude-generated explanations
  • LoRA Config: rank=32, alpha=64
  • Training Iterations: 6,000 (best @ 3900)
  • Quality Metrics: 96% completeness, avg 659 chars/explanation
  • Coverage: 1000-2500 rated puzzles, 20+ tactical themes

🚀 Quick Start

Installation

pip install mlx mlx-lm

Usage

from mlx_lm import load, generate

# Load the fused model (adapter already merged)
model, tokenizer = load("felixmanojh/Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX")

# Example puzzle
prompt = """Explain this chess puzzle:

Position (FEN): r1bqkb1r/pppp1ppp/2n2n2/4p3/2B1P3/5N2/PPPP1PPP/RNBQK2R w KQkq - 4 4
Solution: Nxe5 Nxe5 d4
Themes: fork pin
Rating: 1500"""

# Generate explanation
response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,
    verbose=True
)

print(response)

Chat Format

from mlx_lm import load, generate

model, tokenizer = load("felixmanojh/Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX")

messages = [
    {
        "role": "user", 
        "content": "Explain this chess puzzle:\n\nPosition (FEN): r1bqkb1r/pppp1ppp/2n2n2/4p3/2B1P3/5N2/PPPP1PPP/RNBQK2R w KQkq - 4 4\nSolution: Nxe5 Nxe5 d4\nThemes: fork pin\nRating: 1500"
    }
]

# Apply chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)

💡 Why Use the MLX Version?

✅ Apple Silicon optimized - runs efficiently on M1/M2/M3 Macs
✅ Fused adapter - no need to load base + adapter separately
✅ Fast inference - optimized for Metal GPU acceleration
✅ Low memory - 4-bit quantization keeps memory usage low
✅ Local deployment - perfect for Mac-based applications

📚 Training Data

Data Source

  • Puzzles: Lichess puzzle database (CC0 1.0 Universal)
  • Explanations: Generated using Claude API (Anthropic)
  • Size: 5,020 training + 502 validation puzzles

Data Quality

Puzzles filtered for:

  • Popularity ≥ 90th percentile
  • Rating deviation ≤ 80 (consistent difficulty)
  • Minimum plays ≥ 500
  • Balanced across tactical themes (fork, pin, skewer, discovered attack, mate patterns, sacrifice, deflection, etc.)
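As a rough sketch (not the released training code), the thresholds above can be applied to the public Lichess puzzle CSV, whose columns include `Popularity`, `RatingDeviation`, and `NbPlays`; `load_puzzles` and `filter_puzzles` are hypothetical helper names:

```python
import csv

def load_puzzles(path):
    """Read the Lichess puzzle CSV (lichess_db_puzzle.csv) into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def filter_puzzles(rows, min_plays=500, max_rating_dev=80, popularity_pct=0.9):
    """Keep puzzles meeting the quality thresholds listed above."""
    # Popularity cutoff: the 90th-percentile value over the loaded rows.
    pops = sorted(int(r["Popularity"]) for r in rows)
    cutoff = pops[int(popularity_pct * (len(pops) - 1))]
    return [
        r for r in rows
        if int(r["Popularity"]) >= cutoff
        and int(r["RatingDeviation"]) <= max_rating_dev
        and int(r["NbPlays"]) >= min_plays
    ]
```

Theme balancing (the last filter) is a separate sampling step over the `Themes` column and is omitted here.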

Explanations were generated with the Claude API using consistent prompting for educational quality, focusing on:

  • Clear explanation of the tactical pattern
  • Step-by-step move analysis
  • Why alternatives don't work
  • Key learning points

Coverage

  • Rating Range: 1000-2500
  • Themes: 20+ tactical patterns
  • Format: FEN position + solution moves + themes + rating → educational explanation
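The input format can be assembled with a small helper whose template mirrors the example prompt in Quick Start (`build_prompt` is a hypothetical name, not part of the released code):

```python
def build_prompt(fen, solution_moves, themes, rating):
    """Format a puzzle record into the prompt layout the model was trained on."""
    return (
        "Explain this chess puzzle:\n\n"
        f"Position (FEN): {fen}\n"
        f"Solution: {' '.join(solution_moves)}\n"
        f"Themes: {' '.join(themes)}\n"
        f"Rating: {rating}"
    )
```

The resulting string can be passed directly as `prompt=` to `generate`, or wrapped in a chat message as shown above.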

🎓 Intended Use

✅ Recommended

  • Educational chess puzzle explanations
  • Learning tactical patterns
  • Automated puzzle commentary
  • Interactive chess tutoring systems
  • Mac-based chess applications

❌ Not Recommended

  • Full game analysis (puzzle-focused only)
  • Opening theory (not in training data)
  • Endgame tablebase analysis

βš™οΈ System Requirements

  • Hardware: Apple Silicon (M1/M2/M3/M4) Mac
  • RAM: 8GB minimum, 16GB recommended
  • Storage: ~4GB for model weights
  • OS: macOS 13.5 or later (required by MLX)

🔄 Model Variants

Model                                      | Format             | Size   | Use Case
Qwen3-4B-Lichess-Chess-Puzzle-Tutor        | LoRA adapter       | ~150MB | Training, fine-tuning
Qwen3-4B-Lichess-Chess-Puzzle-Tutor-Merged | HuggingFace merged | ~8GB   | Deployment, Spaces, GPU
This model                                 | MLX merged (4-bit) | ~4GB   | Apple Silicon, local inference

🌐 Live Demo

Try it live: Chess Puzzle Tutor Space

πŸ“ Citation

@software{qwen3_chess_tutor_mlx_2025,
  author = {Felix Manojh},
  title = {Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX},
  year = {2025},
  url = {https://huggingface.co/felixmanojh/Qwen3-4B-Chess-Puzzle-Tutor-Fused-MLX},
  note = {MLX-optimized merged model for chess puzzle explanations with Claude-generated training data}
}

📄 License

  • Model: Apache 2.0
  • Puzzle Data: Lichess puzzle database (CC0 1.0 Universal)
  • Explanations: Generated using Claude API for training purposes

πŸ™ Acknowledgments

  • Lichess for the comprehensive puzzle database
  • Anthropic for Claude API used to generate training explanations
  • Qwen Team for the excellent Qwen3-4B base model
  • Apple MLX Team for the MLX framework

Built by Felix Manojh | Optimized for Apple Silicon 🍎
