Model Card for Qwen3-0.6B MetaMathQA LoRA
Qwen3-0.6B fine-tuned with a LoRA adapter on the MetaMathQA dataset using Unsloth. Used to test ExecuTorch's LoRA capabilities.
Training Data
Dataset: https://huggingface.co/datasets/meta-math/MetaMathQA
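For reference, a minimal sketch of loading the dataset and capping it at the sample limit used below. The `query`/`response` field names follow the MetaMathQA dataset card; the prompt template here is only an illustration, not necessarily the one used for this checkpoint.

```python
from datasets import load_dataset

# Load MetaMathQA and keep a 50k-sample subset (see MAX_SAMPLES below).
dataset = load_dataset("meta-math/MetaMathQA", split="train")
dataset = dataset.shuffle(seed=42).select(range(50_000))

def format_example(example):
    # Each row carries a "query" (question) and a "response" (worked answer).
    return {"text": f"Question: {example['query']}\nAnswer: {example['response']}"}

dataset = dataset.map(format_example)
```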
Training Configuration
```python
OUTPUT_DIR = "./outputs"
BATCH_SIZE = 2                   # Smaller batch for longer sequences
GRADIENT_ACCUMULATION_STEPS = 8  # Effective batch = 16
LEARNING_RATE = 2e-4
NUM_EPOCHS = 1                   # MetaMathQA is large, 1 epoch is often enough
WARMUP_RATIO = 0.03
LOGGING_STEPS = 25
SAVE_STEPS = 500
MAX_SAMPLES = 50000              # Limit samples for faster training (set None for full dataset)
```
Training Hyperparameters
Training used bf16, matching the dtype of the original Qwen3-0.6B checkpoint.
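A hedged sketch of how the configuration above could drive an Unsloth + TRL fine-tune, reusing the constants and the formatted dataset from the snippets above. The base model id, LoRA rank/alpha, target modules, and max sequence length are assumptions not recorded in this card, and the argument names follow the older TRL `SFTTrainer` API used in many Unsloth examples (newer TRL releases move some of them into `SFTConfig`).

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

# Assumed base checkpoint and LoRA settings; adjust to match your setup.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B",
    max_seq_length=2048,
    dtype=None,  # let Unsloth pick bf16 on supported GPUs
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,          # formatted MetaMathQA from the Training Data sketch
    dataset_text_field="text",
    args=TrainingArguments(
        output_dir=OUTPUT_DIR,
        per_device_train_batch_size=BATCH_SIZE,
        gradient_accumulation_steps=GRADIENT_ACCUMULATION_STEPS,
        learning_rate=LEARNING_RATE,
        num_train_epochs=NUM_EPOCHS,
        warmup_ratio=WARMUP_RATIO,
        logging_steps=LOGGING_STEPS,
        save_steps=SAVE_STEPS,
        bf16=True,                  # match the original checkpoint's dtype
    ),
)
trainer.train()
```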
Framework versions
- PEFT 0.18.0
ExecuTorch Files
These are Qwen3 0.6B models lowered to XNNPACK and quantized with torchao 8da4w plus embedding quantization, following the export script in: https://github.com/meta-pytorch/executorch-examples/blob/main/program-data-separation/export_lora.sh
See the corresponding README in: https://github.com/meta-pytorch/executorch-examples/tree/main/program-data-separation/cpp/lora_example
- qwen3_06B_q.ptd: foundation (base model) weights
- qwen3_06B_q.pte: base model program
- qwen3_06B_lora_q.ptd: LoRA adapter weights
- qwen3_06B_lora_q.pte: LoRA model program
To run the model, please download the Qwen tokenizer from: https://huggingface.co/Qwen/Qwen-tokenizer/tree/main
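As a convenience, a sketch of fetching the ExecuTorch artifacts and the tokenizer with `huggingface_hub`; the repo id below is a placeholder to replace with this repository's actual id.

```python
from huggingface_hub import hf_hub_download, snapshot_download

# Placeholder: substitute the id of this model repository.
repo_id = "<this-repo-id>"

# Download the lowered programs (.pte) and weight files (.ptd) listed above.
for filename in ["qwen3_06B_q.pte", "qwen3_06B_q.ptd",
                 "qwen3_06B_lora_q.pte", "qwen3_06B_lora_q.ptd"]:
    hf_hub_download(repo_id=repo_id, filename=filename, local_dir="executorch_files")

# Qwen tokenizer, as referenced above.
snapshot_download(repo_id="Qwen/Qwen-tokenizer", local_dir="qwen_tokenizer")
```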