Distilling 100B+ Models 40x Faster with TRL: TRL distillation for 100B+ teachers, 40x faster
Article: Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries (Mar 10)
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens: Explore synthetic data experiments on a virtual bookshelf
Article: 🪄 Interpreto: A Unified Toolkit for Interpretability of Transformer Models (Jan 20)