AI & ML interests

None defined yet.

Recent Activity

AdinaY posted an update 4 days ago
What a week 🤯

Following DeepSeek, Kimi, Qwen, Baidu, and Ant Group, Unitree Robotics
has now released a VLA model on the hub too!

unitreerobotics/UnifoLM-VLA-Base
sergiopaniego posted an update 4 days ago
Meet the Post-Training Toolkit (PTT) by Aditya Challapally (@microsoft). It integrates with TRL via a single callback and:

🔍 Detects training issues early
🛠 Lets you intervene safely
📊 Keeps long training runs stable, auditable & efficient

Microsoft blog: https://devblogs.microsoft.com/engineering-at-microsoft/diagnosing-instability-in-production-scale-agent-rl/

Integration guide: https://huggingface.co/docs/trl/main/en/ptt_integration

Code: https://github.com/microsoft/post-training-toolkit
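
For orientation only (not the toolkit's documented API, which the integration guide above covers): TRL trainers accept standard transformers callbacks, so a single-callback hook-in presumably looks something like the sketch below. The `PTTCallback` name, its import path, and the stand-in body are assumptions.

```python
# Hedged sketch only: `PTTCallback` and its import path are assumed names,
# not the toolkit's documented API. The stand-in below just shows where a
# diagnostics callback would hook into a TRL training run.
from datasets import load_dataset
from transformers import TrainerCallback
from trl import SFTConfig, SFTTrainer

# from post_training_toolkit import PTTCallback  # assumed import path

class PTTCallback(TrainerCallback):
    """Stand-in callback: inspect trainer state at the end of every step."""
    def on_step_end(self, args, state, control, **kwargs):
        # A real diagnostics callback would check metrics here and could set
        # control.should_training_stop to intervene safely in a long run.
        return control

dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",                   # small demo model, swap as needed
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-with-ptt", max_steps=10),
    callbacks=[PTTCallback()],                   # the single-callback hook-in
)
trainer.train()
```

The appeal of the callback route is that monitoring and intervention ride along with an existing TRL run without changing the training loop itself.
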
victor posted an update 4 days ago
AdinaY posted an update 5 days ago
LongCat-Flash-Lite 🔥 a non-thinking MoE model released by the Meituan LongCat team.

meituan-longcat/LongCat-Flash-Lite

✨ Total 68.5B / 3B active - MIT license
✨ 256k context
✨ Faster inference with N-gram embeddings
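
A rough loading sketch, assuming the checkpoint exposes the usual transformers causal-LM and chat-template interface (check the model card for the actual instructions; at 68.5B total parameters you'll want multiple GPUs or offloading):

```python
# Hedged sketch: assumes the repo follows the standard transformers
# causal-LM interface; the model card is authoritative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meituan-longcat/LongCat-Flash-Lite"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",          # 68.5B total params: expect multi-GPU or offload
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Summarize N-gram embeddings in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
print(tokenizer.decode(model.generate(inputs, max_new_tokens=128)[0], skip_special_tokens=True))
```
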
sergiopaniego posted an update 5 days ago
AdinaY posted an update 5 days ago
Ant Group is going big on robotics 🤖

They just dropped their first VLA and depth-perception foundation models on Hugging Face.

✨ LingBot-VLA :
- Trained on 20k hours of real-world robot data
- 9 robot embodiments
- Clear no-saturation scaling laws
- Apache 2.0

Model: https://huggingface.co/collections/robbyant/lingbot-vla
Paper: A Pragmatic VLA Foundation Model (2601.18692)

✨ LingBot-Depth:
- Metric-accurate 3D from noisy, incomplete depth
- Masked Depth Modeling (self-supervised)
- RGB–depth alignment, works with <5% sparse depth
- Apache 2.0

Model: https://huggingface.co/collections/robbyant/lingbot-depth
Paper: Masked Depth Modeling for Spatial Perception (2601.17895)
AdinaY posted an update 6 days ago
AdinaY posted an update 6 days ago
AdinaY posted an update 6 days ago
sergiopaniego posted an update 7 days ago
AdinaY posted an update 12 days ago
AgentCPM-Report 🔥 a local DeepResearch agent released by OpenBMB

openbmb/AgentCPM-Report

✨ 8B - Apache 2.0
✨ Gemini-2.5-Pro level DeepResearch report generation
✨ Fully offline, privacy-first local deployment
✨ + GGUF version
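
For the fully-offline angle, one common route is llama-cpp-python pointed at a locally downloaded GGUF file. This is a minimal sketch, not the project's documented workflow; the file name and quantization below are placeholders, so grab the real GGUF from the release linked on the model card.

```python
# Hedged sketch of fully-offline inference with a locally downloaded GGUF file.
# The path and quant name are placeholders; use the actual GGUF release from
# the openbmb/AgentCPM-Report model card.
from llama_cpp import Llama

llm = Llama(
    model_path="./AgentCPM-Report-Q4_K_M.gguf",  # placeholder filename
    n_ctx=8192,       # context window; raise it if your report prompts are long
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Draft a short research report outline on MoE inference."}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```
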
AdinaY posted an update 13 days ago
AdinaY posted an update 14 days ago
Z.ai just released a powerful lightweight version of GLM-4.7

✨ 30B total/3B active - MoE

zai-org/GLM-4.7-Flash
sergiopaniego posted an update 14 days ago
FunctionGemma Tuning Lab is a new no-code tool by @google that lets you fine-tune a model directly in the browser, using TRL behind the scenes.

blog: https://developers.googleblog.com/a-guide-to-fine-tuning-functiongemma/

try it out: google/functiongemma-tuning-lab

It builds on a more advanced example for learning SFT fine-tuning with TRL: https://ai.google.dev/gemma/docs/functiongemma/finetuning-with-functiongemma
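
The Tuning Lab hides all of this, but for a sense of what a function-calling example looks like under the hood, here is a sketch using transformers' tool-use chat templating. The model id is a placeholder and it assumes FunctionGemma's tokenizer ships a chat template with tool support; the linked guides are the authoritative reference.

```python
# Hedged sketch: the model id is a placeholder and this assumes the tokenizer
# ships a chat template with tool support; see the linked guides for the real flow.
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny"

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma")  # placeholder id

messages = [{"role": "user", "content": "What's the weather in Madrid?"}]
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],          # transformers converts the signature to a JSON schema
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)  # the tool schema is rendered into the prompt format the model expects
```
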
AdinaY posted an update 14 days ago
Another Chinese model fully trained on domestic chips, released by China Telecom 👀

Tele-AI/TeleChat3-36B-Thinking

TeleChat3-36B-Thinking:
✨ Native support for the Ascend + MindSpore ecosystem
✨ Inspired by DeepSeek’s architecture design, bringing training stability and efficiency gains.