DistilQwen
H100 BF16. 30B→1.7B/0.6B TKD. Three teachers. 15 models + DISC paper. 10K+ downloads. DOIs: 10.57967/hf/8165 and 10.57967/hf/8194
Text Generation • 2B • 745
Note: 30B teacher, 1.7B student. Proof-weighted KD at 2.25× on reasoning.
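To make the "proof-weighted KD at 2.25×" note concrete, here is a minimal sketch of a token-weighted distillation loss in plain Python. The collection does not publish the loss code, so everything here is an assumption for illustration: the function names, the boolean `is_proof_token` mask, the use of per-position KL divergence, and the normalisation by total weight are hypothetical. Only the 2.25 factor comes from the note.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def weighted_kd_loss(teacher_logits, student_logits, is_proof_token,
                     proof_weight=2.25, temperature=1.0):
    """Weighted mean of per-position KL(teacher || student).

    Positions flagged in is_proof_token (a hypothetical mask marking
    reasoning tokens) count proof_weight times; all others count once.
    """
    total, weight_sum = 0.0, 0.0
    for t, s, flag in zip(teacher_logits, student_logits, is_proof_token):
        w = proof_weight if flag else 1.0
        p = softmax(t, temperature)
        q = softmax(s, temperature)
        total += w * kl_divergence(p, q)
        weight_sum += w
    return total / weight_sum

# Two positions: the student matches the teacher at the first one only.
teacher = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
student = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
plain = weighted_kd_loss(teacher, student, [False, False])
upweighted = weighted_kd_loss(teacher, student, [False, True])
print(plain, upweighted)
```

Flagging the mismatched position as a proof token raises the loss, so under this sketch the student is pushed harder to match the teacher on reasoning tokens than elsewhere.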
reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF
Text Generation • 2B • 1.77k
Note: Most downloaded GGUF in the collection. CPU-friendly.
reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B-SFT
2B • 223
Note: Second stage, distil → SFT on instruction-following data.
reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B
Text Generation • 0.8B • 762
Note: 0.6B student. Shows the methodology still works at an extreme compression ratio (30B → 0.6B).
reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT
Text Generation • 0.8B • 781 • 2
Note: Higher-entropy teacher distributions → richer student representations.
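As a rough illustration of what "higher-entropy teacher distribution" means, the sketch below compares the Shannon entropy of two made-up next-token distributions. The probabilities are hypothetical and are not taken from any model in this collection. The point is only that a flatter teacher distribution assigns meaningful probability to more alternatives, which gives the student more signal per token than a near one-hot target.

```python
import math

def entropy(probs):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical teacher distributions over a 4-token vocabulary.
peaked_teacher = [0.97, 0.01, 0.01, 0.01]   # almost one-hot
flatter_teacher = [0.40, 0.30, 0.20, 0.10]  # spreads mass over alternatives

h_peaked = entropy(peaked_teacher)
h_flatter = entropy(flatter_teacher)
print(h_peaked, h_flatter)
```

The flatter distribution has the larger entropy, bounded above by log(4) for a four-token vocabulary.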
reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF
Text Generation • 0.8B • 1.72k
Note: mradermacher also auto-quantized this one (420+ shadow downloads).
reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT
Text Generation • 2B • 796 • 1
Note: Coder teacher. Structured decomposition → STEM derivation.
reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT-GGUF
Text Generation • 2B • 2.39k • 1
Note: Coder pipeline quantized. F16/Q4/Q5/Q8.
reaperdoesntknow/DistilQwen3-1.7B-uncensored
Text Generation • 2B • 679
Note: Starting point for custom SFT pipelines.
reaperdoesntknow/TopologicalQwen
Text Generation • 2B • 778
Note: The model that proved ghost imprinting: literary from physics data.
reaperdoesntknow/DiStil-Qwen3-1.7B-uncensored
2B • 210 • 1
Note: Bridge between base distil and topology-aware models.
reaperdoesntknow/Disctil-Qwen3-1.7B
Text Generation • 2B • 668
Note: Key link in the chain: DiStil → Disctil → TopologicalQwen.
reaperdoesntknow/DistilQwen3-1.7B-uncensored-GGUF
2B • 1.93k • 1
Note: Edge deployment for research. No alignment filtering. Apache 2.0.
reaperdoesntknow/Qwen3-1.7B-Thinking-Distil
Text Generation • 2B • 925 • 1
Note: Extended deliberation from 30B-Thinking → 1.7B student.
reaperdoesntknow/LFM2.5-1.2B-Distilled-SFT
Text Generation • 1B • 625
Note: Proves TKD works across architecture families, not just within Qwen.
reaperdoesntknow/Discrepancy_Calculus
Note: Continuous Thought Dynamics, the mathematical backbone of DualMind.