AI & ML interests

None defined yet.

Recent Activity

csabakecskemeti 
posted an update 2 months ago
Just sharing a result of a homelab infrastructure experiment:

I've managed to set up a distributed inference infrastructure at home using a DGX Spark (128GB unified memory) and a Linux workstation with an RTX 6000 Pro (96GB GDDR7), connected via 100Gbps RoCEv2. The model I used (https://lnkd.in/gx6J7YuB) is about 140GB, so it could not fit on either GPU alone. Full setup and tutorial coming soon on devquasar.com.

Screen recording:
https://lnkd.in/gKM9H5GJ
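A quick back-of-the-envelope check of why the model has to be split across the two machines (all numbers taken from the post; the ~140GB figure is approximate):

```python
# Rough memory check for the two-node setup described above.
# All sizes in GB; the model size is the approximate figure from the post.
model_gb = 140
dgx_spark_gb = 128      # DGX Spark unified memory
rtx6000_pro_gb = 96     # RTX 6000 Pro VRAM

# Neither device can hold the weights alone...
assert model_gb > dgx_spark_gb and model_gb > rtx6000_pro_gb

# ...but together they can, with headroom left for KV cache and activations.
combined_gb = dgx_spark_gb + rtx6000_pro_gb
headroom_gb = combined_gb - model_gb
print(f"combined: {combined_gb} GB, headroom after weights: {headroom_gb} GB")
```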
Aurelien-Morgan 
posted an update 3 months ago
csabakecskemeti 
posted an update 3 months ago
csabakecskemeti 
posted an update 3 months ago
Looking for some help testing an INT8 DeepSeek 3.2:
SGLang supports channel-wise INT8 quants on CPUs with AMX instructions (Xeon Gen 5 and above, AFAIK):
https://lmsys.org/blog/2025-07-14-intel-xeon-optimization/

Currently uploading an INT8 version of DeepSeek 3.2 Speciale:
DevQuasar/deepseek-ai.DeepSeek-V3.2-Speciale-Channel-INT8

I cannot test this myself since I'm on AMD:
"AssertionError: W8A8Int8LinearMethod on CPU requires that CPU has AMX support"
(I assumed it could fall back to some non-optimized kernel, but apparently not.)

If anyone with the required resources (Intel Xeon Gen 5/6 + ~768GB-1TB RAM) can help test this, that would be awesome.

If you have hints on how to make this work on the AMD Threadripper 7000 Pro series, please guide me.

Thanks all!
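For anyone wanting to help: on Linux, AMX capability shows up as the `amx_tile`/`amx_int8`/`amx_bf16` feature flags in /proc/cpuinfo, so you can check your CPU before downloading ~700GB of weights. A minimal sketch (the helper below is my own, not part of SGLang):

```python
# Check whether the CPU advertises AMX before trying the W8A8 INT8 path.
# On Linux, AMX capability appears as "amx_tile"/"amx_int8"/"amx_bf16"
# in the "flags" line of /proc/cpuinfo.
AMX_FLAGS = {"amx_tile", "amx_int8", "amx_bf16"}

def has_amx(cpuinfo_text: str) -> bool:
    """Return True if any AMX feature flag appears in a /proc/cpuinfo dump."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return bool(AMX_FLAGS & set(line.split()))
    return False

# Example with a synthetic flags line (a Threadripper 7000 would lack these):
sample = "flags : fxsr sse sse2 avx2 avx512f amx_bf16 amx_tile amx_int8"
print(has_amx(sample))  # True
```

On a real machine you would call `has_amx(open("/proc/cpuinfo").read())`.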
csabakecskemeti 
posted an update 4 months ago
There has been so much recent activity around token-efficient formats that I've built a package of my own (inspired by TOON).

Deep-TOON

My goal was to handle JSON structures with complex embeddings in a token-efficient way.

So this is what I built over the weekend. Feel free to try it:

https://pypi.org/project/deep-toon/0.1.0/
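The core idea behind TOON-style formats can be sketched in a few lines: factor the repeated keys of a uniform array of objects out into a single header row, so each object costs only its values in tokens. This is a hypothetical illustration of the concept, not deep-toon's actual API:

```python
import json

def toonish_encode(rows: list[dict]) -> str:
    """Hypothetical TOON-style encoding: emit the shared keys once as a
    header line, then one comma-separated line of values per object,
    instead of repeating every key in every object the way JSON does."""
    keys = list(rows[0])
    lines = [",".join(keys)]
    for row in rows:
        lines.append(",".join(str(row[k]) for k in keys))
    return "\n".join(lines)

data = [
    {"id": 1, "name": "alpha", "score": 0.91},
    {"id": 2, "name": "beta", "score": 0.87},
]
compact = toonish_encode(data)
print(compact)
# The compact form is noticeably shorter than the equivalent JSON:
print(len(compact), len(json.dumps(data)))
```

The savings grow with the number of rows, since the per-object key overhead is paid only once.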

csabakecskemeti 
posted an update 5 months ago
Christmas came early this year
lysandre 
posted an update 6 months ago
We're kick-starting the process of Transformers v5 with @ArthurZ and @cyrilvallez!

v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025.

Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!
jeffboudier 
posted an update 6 months ago
Quick 30-second demo of the new Hub > Azure AI integration to deploy HF models in your own Azure account. Now with Python and CLI support!

GG @alvarobartt @kramp @pagezyhf