OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data Paper • 2603.15594 • Published 4 days ago • 137
From Perception to Action: An Interactive Benchmark for Vision Reasoning Paper • 2602.21015 • Published 24 days ago • 23
AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines Paper • 2602.14296 • Published Feb 15 • 51
Evaluating Gemini Robotics Policies in a Veo World Simulator Paper • 2512.10675 • Published Dec 11, 2025 • 20
Dyna-Mind: Learning to Simulate from Experience for Better AI Agents Paper • 2510.09577 • Published Oct 10, 2025 • 8
SIMA 2: A Generalist Embodied Agent for Virtual Worlds Paper • 2512.04797 • Published Dec 4, 2025 • 25
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published Nov 20, 2025 • 110
LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions Paper • 2508.18321 • Published Aug 24, 2025 • 2
The Smol Training Playbook 📚: The secrets to building world-class LLMs • 3.05k
Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics Paper • 2510.05137 • Published Oct 1, 2025 • 6
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent Paper • 2508.05748 • Published Aug 7, 2025 • 142