Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models Paper • 2603.18002 • Published 5 days ago • 9
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors Paper • 2602.21778 • Published 26 days ago • 14
CoPE-VideoLM: Codec Primitives For Efficient Video Language Models Paper • 2602.13191 • Published Feb 13 • 30
Vision Arena (Testing VLMs side-by-side): Explore Vision Arena's computer-vision tools online • Running • Featured • 560
GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer Paper • 2510.16136 • Published Oct 17, 2025 • 5
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images Paper • 2504.08727 • Published Apr 11, 2025 • 12
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published Feb 12, 2025 • 59