Collections
Collections including paper arxiv:2512.15745
-
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone
Paper • 2512.22615 • Published • 48 -
LLaDA2.0: Scaling Up Diffusion Language Models to 100B
Paper • 2512.15745 • Published • 84 -
On the Role of Discreteness in Diffusion LLMs
Paper • 2512.22630 • Published • 18 -
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
Paper • 2512.24165 • Published • 51
-
TiDAR: Think in Diffusion, Talk in Autoregression
Paper • 2511.08923 • Published • 126 -
Diffusion Language Models are Super Data Learners
Paper • 2511.03276 • Published • 129 -
What Makes Diffusion Language Models Super Data Learners?
Paper • 2510.04071 • Published -
LLaDA2.0: Scaling Up Diffusion Language Models to 100B
Paper • 2512.15745 • Published • 84
-
LLaDA2.0: Scaling Up Diffusion Language Models to 100B
Paper • 2512.15745 • Published • 84 -
inclusionAI/LLaDA2.0-flash
Text Generation • 103B • Updated • 711 • 63 -
inclusionAI/LLaDA2.0-mini
Text Generation • 16B • Updated • 33.5k • 53 -
inclusionAI/LLaDA2.0-flash-preview
Text Generation • 103B • Updated • 35 • 68
-
Large Language Diffusion Models
Paper • 2502.09992 • Published • 126 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 75 -
MMaDA: Multimodal Large Diffusion Language Models
Paper • 2505.15809 • Published • 97 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55
-
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 223 -
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models
Paper • 2511.23319 • Published • 24 -
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
Paper • 2511.22176 • Published • 5 -
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
Paper • 2511.22265 • Published • 2
-
Fast-dLLM v2: Efficient Block-Diffusion LLM
Paper • 2509.26328 • Published • 57 -
Attention Is All You Need for KV Cache in Diffusion LLMs
Paper • 2510.14973 • Published • 42 -
Attention Sinks in Diffusion Language Models
Paper • 2510.15731 • Published • 49 -
Diffusion Language Models are Super Data Learners
Paper • 2511.03276 • Published • 129
-
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
Paper • 2505.22618 • Published • 45 -
DINGO: Constrained Inference for Diffusion LLMs
Paper • 2505.23061 • Published • 31 -
Discrete Diffusion in Large Language and Multimodal Models: A Survey
Paper • 2506.13759 • Published • 43 -
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
Paper • 2506.14429 • Published • 44