video
Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation
Paper • 2512.24271 • Published • 62
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation
Paper • 2512.24724 • Published • 7
Pretraining Frame Preservation in Autoregressive Video Memory Compression
Paper • 2512.23851 • Published • 24
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation
Paper • 2512.24551 • Published • 19
JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation
Paper • 2512.22905 • Published • 20
Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems
Paper • 2512.24385 • Published • 8
Factorized Learning for Temporally Grounded Video-Language Models
Paper • 2512.24097 • Published • 7
SurgWorld: Learning Surgical Robot Policies from Videos via World Modeling
Paper • 2512.23162 • Published • 11
Video-BrowseComp: Benchmarking Agentic Video Research on Open Web
Paper • 2512.23044 • Published • 10
Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
Paper • 2512.21734 • Published • 5