NekoQwen-9B

Qwen3.5-9B fine-tuned on the NekoQA-30K dataset

Model Details

  • Architecture: Qwen3_5ForConditionalGeneration
  • Processor: Qwen3VLProcessor
  • Precision: float16
  • Format: sharded safetensors
  • Parameter count: about 9.41B
  • Repository size: about 18 GB
  • Modalities: text, image, and video inputs with text generation output
  • Max position embeddings: 262144
  • Transformers version in config: 5.3.0
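The stated repository size follows from the parameter count and precision: float16 stores two bytes per parameter. A quick sanity check (the 9.41B figure is taken from the list above):

```python
# Sanity check: ~9.41B parameters at 2 bytes each (float16)
params = 9.41e9
bytes_total = params * 2      # float16 = 2 bytes per parameter
gib = bytes_total / 1024**3   # convert bytes to GiB
print(f"{gib:.1f} GiB")       # roughly 17.5 GiB, consistent with ~18 GB on disk
```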

Fine-Tuning Summary

  • Base model: Qwen/Qwen3.5-9B
  • Tuning method: LoRA, with the adapters merged into the full weights
  • Epochs: 1.0
  • Learning rate: 1e-4
  • Per-device batch size: 1
  • Gradient accumulation: 16
  • Sequence length: 768
  • Precision during training: fp16
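With a per-device batch size of 1 and 16 gradient-accumulation steps, each optimizer step effectively sees 16 sequences per device. A minimal illustration (the single-device count is an assumption, not stated above):

```python
# Effective batch size = per-device batch × gradient-accumulation steps × device count
per_device_batch = 1
grad_accum_steps = 16
num_devices = 1  # assumption: single-device training; the device count is not documented
effective_batch = per_device_batch * grad_accum_steps * num_devices
print(effective_batch)  # 16
```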

Usage

import torch
from transformers import AutoProcessor, Qwen3_5ForConditionalGeneration

model_id = "Sfever/NekoQwen-9B"

processor = AutoProcessor.from_pretrained(model_id)
model = Qwen3_5ForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the main characteristics of this model in one paragraph."},
        ],
    }
]

text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = processor(text=[text], padding=True, return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens so only the newly generated text is decoded.
generated_ids = generated_ids[:, inputs.input_ids.shape[1]:]

print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])

For image or video inputs, use the same chat-template message structure with Qwen3VLProcessor.
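As a sketch of that message structure (the image path is a placeholder, not a real asset; the surrounding processor calls mirror the text-only example above):

```python
# Hypothetical multimodal message: an image entry alongside the text prompt.
# "path/to/your/image.jpg" is a placeholder; replace it with a URL or local file.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/your/image.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
# Pass `messages` through processor.apply_chat_template(...) and then to the
# processor together with the image data, as in the text-only example above.
```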

Notes

This folder contains the merged checkpoint, tokenizer, processor configuration, and chat template needed to load the model with Transformers.

Training data provenance, evaluation results, and intended-use notes are not documented in this folder yet. Add those details before making the repository public if you want a complete public model card.
