CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 13 days ago • 94
MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants Paper • 2603.09652 • Published 29 days ago • 15
Nemotron-Terminal Collection We are releasing Nemotron-Terminal models and training datasets. • 5 items • Updated 1 day ago • 34
SkillOrchestra: Learning to Route Agents via Skill Transfer Paper • 2602.19672 • Published Feb 23 • 57
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference Paper • 2602.21548 • Published Feb 25 • 49
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published Jan 13 • 150
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published Jan 14 • 93
Towards Comprehensive Stage-wise Benchmarking of Large Language Models in Fact-Checking Paper • 2601.02669 • Published Jan 6 • 4
DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs Paper • 2601.03559 • Published Jan 7 • 14
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents Paper • 2512.22047 • Published Dec 26, 2025 • 30