shawnxzhu

AI & ML interests

None yet

Recent Activity

upvoted a paper 15 days ago

Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning

upvoted a paper 22 days ago

Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

authored a paper about 1 month ago

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL

Organizations

Collections 2

Papers 3

arxiv:2605.18703

arxiv:2602.17684

arxiv:2504.10045

Models 4

Datasets 10

shawnxzhu/DSAA6000Q-Mistral-7B-Instruct-v0.2-lima-dpo

Viewer • Updated May 11, 2025 • 1.03k • 8

shawnxzhu/CHARM-preference20K

Viewer • Updated Apr 12, 2025 • 20k • 5

shawnxzhu/CHARM-preference20K-Qwen2.5-72B-Instruct

Viewer • Updated Apr 12, 2025 • 20k • 10

shawnxzhu/CHARM-preference20K-Llama-3.1-70B-Instruct

Viewer • Updated Apr 12, 2025 • 20k • 8

shawnxzhu/CHARM-preference20K-Llama-3.1-8B-Instruct

Viewer • Updated Apr 12, 2025 • 20k • 6

shawnxzhu/CHARM-preference20K-GPT-4o-mini-2024-07-18

Viewer • Updated Apr 12, 2025 • 20k • 18

shawnxzhu/CHARM-preference20K-gemma-2-27b-it

Viewer • Updated Apr 12, 2025 • 20k • 8

shawnxzhu/CHARM-preference20K-gemma-2-9b-it

Viewer • Updated Apr 12, 2025 • 20k • 43

shawnxzhu/CHARM-preference20K-gemma-2-9b-it-SimPO

Viewer • Updated Apr 12, 2025 • 20k • 6

shawnxzhu/backward-curation

Preview • Updated Apr 8, 2025 • 1
