HanSaem Kim

kensaem

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 23 minutes ago

SAMTok: Representing Any Mask with Two Words

Paper • 2601.16093 • Published about 14 hours ago • 22

upvoted 4 papers about 2 hours ago

Qwen3-TTS Technical Report

Paper • 2601.15621 • Published 1 day ago • 6

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Paper • 2601.15369 • Published 1 day ago • 6

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Paper • 2601.16208 • Published about 12 hours ago • 19

LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR

Paper • 2601.14251 • Published 3 days ago • 18

upvoted a paper about 22 hours ago

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Paper • 2601.14250 • Published 3 days ago • 35

upvoted a paper 3 days ago

CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation

Paper • 2601.11096 • Published 7 days ago • 8

upvoted 4 papers 4 days ago

Action100M: A Large-scale Video Action Dataset

Paper • 2601.10592 • Published 8 days ago • 26

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Paper • 2601.10611 • Published 8 days ago • 26

Transition Matching Distillation for Fast Video Generation

Paper • 2601.09881 • Published 8 days ago • 31

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published 9 days ago • 182

upvoted a collection 4 days ago

FLUX.2

Collection

Our second generation of FLUX • 17 items • Updated 4 days ago • 107

upvoted a collection 8 days ago

Z-Image

Collection

4 items • Updated Dec 1, 2025 • 109

upvoted a paper 8 days ago

Ministral 3

Paper • 2601.08584 • Published 10 days ago • 46

upvoted 5 papers 9 days ago

Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

Paper • 2601.04720 • Published 15 days ago • 47

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published 14 days ago • 160

SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices

Paper • 2601.08303 • Published 10 days ago • 16

Motion Attribution for Video Generation

Paper • 2601.08828 • Published 10 days ago • 68

Solar Open Technical Report

Paper • 2601.07022 • Published 12 days ago • 62

upvoted a paper 11 days ago

Yume-1.5: A Text-Controlled Interactive World Generation Model

Paper • 2512.22096 • Published 28 days ago • 60
