arxiv:2407.01231
Mingyu Derek Ma
derekma
AI & ML interests
Generative Language Model, Scientific LM, Clinical LM, Decoding
Recent Activity
liked a model about 2 hours ago: karina-zadorozhny/ume
upvoted an article about 2 hours ago: A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond
liked a model 12 months ago: deepseek-ai/DeepSeek-R1