ZhengQi Wan

Vanqi · 42111058

AI & ML interests

None yet

Recent Activity

updated a collection 3 days ago

From Vision to Motion

Organizations

None yet

upvoted a paper 3 days ago

Cosmos 3: Omnimodal World Models for Physical AI

Paper • 2606.02800 • Published 6 days ago • 87

upvoted a paper 5 days ago

Representation Forcing for Bottleneck-Free Unified Multimodal Models

Paper • 2605.31604 • Published 9 days ago • 57

upvoted 4 papers 10 days ago

Agent Explorative Policy Optimization for Multimodal Agentic Reasoning

Paper • 2605.28774 • Published 11 days ago • 89

Self-Improving Language Models with Bidirectional Evolutionary Search

Paper • 2605.28814 • Published 11 days ago • 59

LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence

Paper • 2605.25979 • Published 13 days ago • 27

Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini

Paper • 2605.27295 • Published 12 days ago • 23

upvoted a paper 11 days ago

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Paper • 2605.27365 • Published 12 days ago • 138

upvoted a paper 13 days ago

PiD: Fast and High-Resolution Latent Decoding with Pixel Diffusion

Paper • 2605.23902 • Published 16 days ago • 45

upvoted a paper 23 days ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published 25 days ago • 159

upvoted 3 papers 25 days ago

Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization

Paper • 2605.10780 • Published 26 days ago • 33

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 26 days ago • 191

δ-mem: Efficient Online Memory for Large Language Models

Paper • 2605.12357 • Published 26 days ago • 125

upvoted 4 papers about 1 month ago

Prox-E: Fine-Grained 3D Shape Editing via Primitive-Based Abstractions

Paper • 2604.23774 • Published Apr 29 • 17

Let ViT Speak: Generative Language-Image Pre-training

Paper • 2605.00809 • Published May 1 • 33

MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons

Paper • 2604.28130 • Published Apr 30 • 22

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 247

upvoted 4 papers about 2 months ago

CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation

Paper • 2604.19636 • Published Apr 21 • 87

OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Paper • 2604.18486 • Published Apr 20 • 95

Introspective Diffusion Language Models

Paper • 2604.11035 • Published Apr 13 • 25

The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping

Paper • 2604.11297 • Published Apr 13 • 144