Yassine Ennaour
Lyte
AI & ML interests
None yet
Recent Activity
upvoted a paper about 4 hours ago
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs
reacted to danielhanchen's post with ❤️ about 4 hours ago
You can now do reinforcement learning training with 7× longer context and no accuracy loss, via our new batching algorithms.
Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU.
Blog: https://unsloth.ai/docs/new/grpo-long-context
reacted to danielhanchen's post with 🚀 about 4 hours ago