World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
Paper • 2604.24764 • Published • 110
Mind's Eye: A Benchmark of Visual Abstraction, Transformation and Composition for Multimodal LLMs