Hugging Face
5
21
18
Shwai He
shwai-he
7 followers · 11 following
https://shwai-he.github.io/
Shwai-He
AI & ML interests
Deep Learning, Machine Learning, Natural Language Processing.
Recent Activity
authored a paper 9 days ago
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
authored a paper 9 days ago
What Matters in Transformers? Not All Attention is Needed
authored a paper 9 days ago
Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts
Organizations
shwai-he's activity
liked 2 models 12 days ago
LLM-Drop/BAGEL-MoE-7B-GEN-16to8 • Text-to-Image • Updated 11 days ago • 39 • 2
LLM-Drop/BAGEL-MoE-7B-GEN-32to16 • Text-to-Image • Updated 11 days ago • 33 • 2
liked a dataset 7 months ago
haichaozhang/DenseVideoEvaluation • Preview • Updated Sep 18, 2025 • 10 • 3
liked a model about 1 year ago
Qwen/QwQ-32B • Text Generation • 33B • Updated Mar 11, 2025 • 71.5k • 2.91k
liked 14 models over 1 year ago
s1ghhh/Mistral-7B-v0.1-Drop4Attn • 7B • Updated Sep 8, 2024 • 5 • 2
s1ghhh/Llama-2-13b-Drop8Block • 13B • Updated Sep 8, 2024 • 26 • 2
s1ghhh/Llama-2-13b-Drop4Block • 13B • Updated Sep 8, 2024 • 4 • 2
s1ghhh/Llama-2-13b-Drop8Attn • 13B • Updated Sep 8, 2024 • 4 • 2
s1ghhh/Llama-2-13b-Drop4Attn • 13B • Updated Sep 8, 2024 • 5 • 2
s1ghhh/Llama-2-13b-Drop4MLP • 13B • Updated Sep 8, 2024 • 4 • 2
s1ghhh/Llama-2-13b-Drop8MLP • 13B • Updated Sep 8, 2024 • 3 • 2
s1ghhh/Mistral-7B-v0.1-Drop4Block • 7B • Updated Sep 8, 2024 • 4 • 2
s1ghhh/Mistral-7B-v0.1-Drop8Block • 7B • Updated Sep 8, 2024 • 3 • 2
s1ghhh/Mistral-7B-v0.1-Drop8Attn • 7B • Updated Sep 8, 2024 • 4 • 2
s1ghhh/Mistral-7B-v0.1-Drop4MLP • 7B • Updated Sep 8, 2024 • 2 • 2
s1ghhh/Mistral-7B-v0.1-Drop8MLP • 7B • Updated Sep 8, 2024 • 3 • 2
s1ghhh/Llama-3-70b-Drop • Text Generation • 71B • Updated Oct 23, 2024 • 17 • 4
s1ghhh/Llama-2-70b-Drop • Text Generation • Updated Oct 23, 2024 • 3 • 2