Gecko: An Efficient Neural Architecture Inherently Processing Sequences with Arbitrary Lengths • Paper • 2601.06463 • Published 18 days ago • 2
LYNX: Learning Dynamic Exits for Confidence-Controlled Reasoning • Paper • 2512.05325 • Published Dec 5, 2025 • 3
Efficient Long-context Language Model Training by Core Attention Disaggregation • Paper • 2510.18121 • Published Oct 20, 2025 • 123
Simple Guidance Mechanisms for Discrete Diffusion Models • Paper • 2412.10193 • Published Dec 13, 2024 • 1
Remasking Discrete Diffusion Models with Inference-Time Scaling • Paper • 2503.00307 • Published Mar 1, 2025 • 11