Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 150
Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 890
What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time? Paper • 2603.19017 • Published Mar 19 • 3
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts Paper • 2602.13367 • Published Feb 13 • 35
Article RexRerankers: SOTA Rankers for Product Discovery and AI Assistants thebajajra • Jan 24 • 44
Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 378
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6, 2025 • 117
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models Paper • 2510.06107 • Published Oct 7, 2025 • 3
Article 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation moonshotai • Jun 21, 2025 • 77
NileChat Collection A collection of all the resources that we built for the NileChat LLM project. • 10 items • Updated 1 day ago • 4
Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning Paper • 2505.16088 • Published May 22, 2025 • 3
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning Paper • 2504.07128 • Published Apr 2, 2025 • 87
Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance Paper • 2502.18772 • Published Feb 26, 2025 • 32
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published Feb 25, 2025 • 75
Article DABStep: Data Agent Benchmark for Multi-step Reasoning +5 eggie5, martinigoyanes, frisokingma, andreumora, lvwerra, thomwolf, m-ric • Feb 4, 2025 • 130
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization Paper • 2410.09302 • Published Oct 11, 2024 • 1