Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published Feb 11 • 192
📋 Twinkle Eval Logs Collection Benchmark logs generated with Twinkle Eval, recording the model's outputs for each prompt; see more at https://github.com/ai-twinkle/Eval • 21 items • Updated 4 days ago • 1
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Text Generation • 124B • Updated 4 days ago • 36.8k • 246
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 15 items • Updated 1 day ago • 215
Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries • 8 days ago • 63
🤏 Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated 16 days ago • 12
📝 The Synthetic Data Playbook: Generating Trillions of the Finest Tokens Explore synthetic data benchmarks in a visual bookshelf • Running on CPU Upgrade • 191
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI • 26 days ago • 487
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark Paper • 2409.02813 • Published Sep 4, 2024 • 33
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI Paper • 2404.16006 • Published Apr 24, 2024 • 2