LLaVa-NeXT Collection LLaVa-NeXT (also known as LLaVa-1.6) improves upon the 1.5 series by incorporating higher image resolutions and more reasoning/OCR datasets. • 8 items • Updated Jul 19, 2024 • 34
Article Multimodal Embedding & Reranker Models with Sentence Transformers • 28 days ago • 57
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 10 items • Updated Mar 2 • 562
Article Making automatic speech recognition work on large files with Wav2Vec2 in 🤗 Transformers • Feb 1, 2022 • 15
Article Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies • Feb 17, 2025 • 29
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge • Feb 7, 2025 • 291
Article KV Caching Explained: Optimizing Transformer Inference Efficiency • Jan 30, 2025 • 318
Llama 3.2 Collection This collection hosts the transformers and original repos of Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 670
Article Introducing Command A Vision: Multimodal AI built for Business • Jul 31, 2025 • 64