Can LLMs Learn to Reason Robustly under Noisy Supervision? Paper • 2604.03993 • Published 3 days ago • 10
The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning Paper • 2603.29025 • Published 9 days ago • 12
Meta-Harness: End-to-End Optimization of Model Harnesses Paper • 2603.28052 • Published 9 days ago • 15
AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents Paper • 2604.02947 • Published 5 days ago • 16
Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory Paper • 2604.01007 • Published 6 days ago • 24
LightThinker++: From Reasoning Compression to Memory Management Paper • 2604.03679 • Published 4 days ago • 26
HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published 7 days ago • 27
Reasoning Shift: How Context Silently Shortens LLM Reasoning Paper • 2604.01161 • Published 7 days ago • 29
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published 7 days ago • 30
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery Paper • 2604.01658 • Published 6 days ago • 45
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published 6 days ago • 126
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 2 days ago • 77
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome Paper • 2603.28407 • Published 9 days ago • 68
InCoder-32B-Thinking: Industrial Code World Model for Thinking Paper • 2604.03144 • Published 5 days ago • 88
GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published 9 days ago • 84
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook Paper • 2604.02029 • Published 6 days ago • 132