SAMBIT CHAKRABORTY

sambitchakhf03

Recent Activity

upvoted 2 papers 1 day ago

Self-Improving Pretraining: using post-trained models to pretrain better models

Paper • 2601.21343 • Published 3 days ago • 9

Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published 4 days ago • 19

upvoted 2 papers 5 days ago

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published 9 days ago • 16

VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents

Paper • 2601.16973 • Published 8 days ago • 40

upvoted 2 papers 6 days ago

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published 9 days ago • 169

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published 13 days ago • 186

liked a model 7 days ago

microsoft/VibeVoice-ASR

Automatic Speech Recognition • 9B • Updated 5 days ago • 125k • 756

upvoted a paper 16 days ago

Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

Paper • 2601.04582 • Published 24 days ago • 10

upvoted 2 papers 18 days ago

SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices

Paper • 2601.08303 • Published 19 days ago • 16

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published 23 days ago • 211

upvoted 3 papers 20 days ago

MiMo-V2-Flash Technical Report

Paper • 2601.02780 • Published 26 days ago • 33

Plenoptic Video Generation

Paper • 2601.05239 • Published 23 days ago • 12

The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

Paper • 2601.03425 • Published 25 days ago • 16

New activity in sambitchakhf03/SwiFTeDLM 30 days ago

Adding `safetensors` variant of this model

#1 opened 10 months ago by SFconvertbot

liked a model 30 days ago

sambitchakhf03/SwiFTeDLM

Text Generation • 7B • Updated 30 days ago • 15 • 1

updated a model about 1 month ago

sambitchakhf03/gemma-3-270m-classifier

0.3B • Updated Dec 31, 2025 • 1

published a model about 1 month ago

sambitchakhf03/gemma-3-270m-classifier

0.3B • Updated Dec 31, 2025 • 1

upvoted an article about 1 month ago

The Optimal Architecture for Small Language Models

Article • Dec 26, 2025 • 111

upvoted 2 papers about 1 month ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published Dec 22, 2025 • 64

GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation

Paper • 2512.17495 • Published Dec 19, 2025 • 20