gemma-3-4b-it-slipstream-sft

Gemma 3 4B IT fine-tuned on the Slipstream-TQT dataset to speak the Slipstream inter-agent protocol.

Training

  • Base model: google/gemma-3-4b-it
  • Method: SFT with LoRA (r=8, alpha=16)
  • Dataset: anthonym21/slipstream-tqt
  • Epochs: 1
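With LoRA at r=8, each adapted weight matrix W (shape d_out x d_in) is frozen and only two low-rank factors B (d_out x r) and A (r x d_in) are trained, with updates scaled by alpha/r = 2. A minimal sketch of the parameter arithmetic (the 2048x2048 layer shape is hypothetical, for illustration only, not Gemma's actual dimensions):

```python
# LoRA trains r * (d_in + d_out) parameters per adapted matrix,
# with the learned update B @ A scaled by alpha / r at merge time.
def lora_param_count(d_in: int, d_out: int, r: int = 8) -> int:
    # A has shape (r, d_in), B has shape (d_out, r)
    return r * d_in + d_out * r

# Hypothetical projection shape for illustration
full = 2048 * 2048
lora = lora_param_count(2048, 2048, r=8)
scaling = 16 / 8  # alpha / r
print(lora, full, f"{lora / full:.2%}")  # adapter is well under 1% of the full matrix
```

This is why the LoRA checkpoint stays tiny relative to the 4B base model: only the adapter factors are updated during SFT.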

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("anthonym21/gemma-3-4b-it-slipstream-sft")
tokenizer = AutoTokenizer.from_pretrained("anthonym21/gemma-3-4b-it-slipstream-sft")

# Generate a SLIP message via the chat template
messages = [{"role": "user", "content": "Request a code review for PR #42"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Next Steps

This model is stage 1 of a 3-stage pipeline:

  1. SFT (this model) - Learn protocol format
  2. GRPO - Reinforcement-learning alignment via slipstream-gov-env for safe usage
  3. Trim - Quantize/distill the aligned model
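The quantization half of the Trim stage can be illustrated with a symmetric int8 round-trip (a toy sketch of the general technique, not this pipeline's actual implementation, which would use a dedicated library):

```python
# Toy per-tensor symmetric int8 quantization: map floats into [-127, 127]
# with a single scale factor, then dequantize back.
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid scale=0 for all-zero input
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]

w = [0.42, -1.27, 0.0, 0.9]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Per-tensor reconstruction error is bounded by half a quantization step
assert all(abs(a - b) <= scale / 2 for a, b in zip(w, w_hat))
```

Real deployments would quantize per-channel and calibrate activations as well; this only shows why int8 storage roughly quarters the BF16 footprint.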