-
DreamID-V:Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
Paper • 2601.01425 • Published • 52 -
DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment
Paper • 2601.20218 • Published • 15 -
FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space
Paper • 2602.02092 • Published • 17 -
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation
Paper • 2602.03796 • Published • 48
Collections
Collections including paper arxiv:2601.01425
-
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Paper • 2512.02835 • Published • 10 -
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Paper • 2512.05044 • Published • 17 -
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
Paper • 2512.05591 • Published • 17 -
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Paper • 2512.05343 • Published • 25
-
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation
Paper • 2404.15275 • Published -
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models
Paper • 2403.13535 • Published • 23 -
PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
Paper • 2309.05793 • Published • 50 -
GHOST 2.0: generative high-fidelity one shot transfer of heads
Paper • 2502.18417 • Published • 67
-
Textbooks Are All You Need
Paper • 2306.11644 • Published • 153 -
Self-Improving VLM Judges Without Human Annotations
Paper • 2512.05145 • Published • 20 -
FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing
Paper • 2601.01720 • Published • 6 -
MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique
Paper • 2511.09067 • Published • 2
-
Arrexel/pattern-diffusion
Text-to-Image • Updated • 434 • 110 -
flymy-ai/qwen-image-realism-lora
Text-to-Image • Updated • 2.43k • 128 -
QuantStack/Wan2.2-Fun-A14B-Control-GGUF
Text-to-Video • 14B • Updated • 5.33k • 32 -
DreamID-V:Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
Paper • 2601.01425 • Published • 52
-
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Paper • 2401.09985 • Published • 18 -
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Paper • 2401.09962 • Published • 9 -
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
Paper • 2401.10404 • Published • 10 -
ActAnywhere: Subject-Aware Video Background Generation
Paper • 2401.10822 • Published • 13