Gemma-3-270M Threat Classifier
Model Description
A fine-tuned version of google/gemma-3-270m for binary threat classification: it labels each input prompt as Safe or Unsafe.
Training Details
- Base Model: google/gemma-3-270m
- Task: Binary Text Classification
- Training Date: 2025-12-31
- Training Framework: Hugging Face Transformers
Hyperparameters
- Learning Rate: 2e-05
- Batch Size: 16
- Epochs: 10
- Max Length: 512
- Optimizer: adamw_torch
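As a rough sketch, the hyperparameters above could be expressed as keyword arguments for Hugging Face `TrainingArguments`. The mapping is an assumption: the card does not say whether the batch size is per device or global, and the names `training_kwargs` and `tokenizer_kwargs` are hypothetical.

```python
# Hyperparameters from the list above, keyed by the TrainingArguments
# fields they most plausibly correspond to (an assumption, not confirmed
# by the model card).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes the batch size is per device
    "num_train_epochs": 10,
    "optim": "adamw_torch",
}

# Max Length is applied by the tokenizer, not by TrainingArguments.
tokenizer_kwargs = {"max_length": 512, "truncation": True}

# Hypothetical usage (requires transformers; "out" is a placeholder path):
# from transformers import TrainingArguments
# args = TrainingArguments(output_dir="out", **training_kwargs)
```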
Performance (Test Set)
- Accuracy: 0.8363
- Precision: 0.8232
- Recall: 0.8882
- F1 Score: 0.8544
- AUC-ROC: 0.9101
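As a quick sanity check, the reported F1 score can be recomputed from the reported precision and recall. This minimal sketch assumes the standard definition of F1 as the harmonic mean of the two; small differences in the last digit come from rounding in the published figures.

```python
precision = 0.8232
recall = 0.8882

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Agrees with the reported 0.8544 to three decimal places.
print(round(f1, 3))  # 0.854
```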
Usage
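import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned classifier and its tokenizer
model = AutoModelForSequenceClassification.from_pretrained("path/to/model")
tokenizer = AutoTokenizer.from_pretrained("path/to/model")
model.eval()  # disable dropout for inference
text = "Your prompt here"
# Truncate at 512 tokens to match the training max length
inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)
with torch.no_grad():  # no gradients needed at inference time
    outputs = model(**inputs)
# Class 0 is Safe, class 1 is Unsafe
prediction = outputs.logits.argmax(-1).item()
label = "unsafe" if prediction == 1 else "safe"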
Labels
- 0: Safe
- 1: Unsafe (Threat/Jailbreak)
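Taking the argmax over two logits is equivalent to thresholding the Unsafe probability at 0.5. Where missed threats cost more than false alarms, a different threshold may be preferable. The following is a minimal sketch in plain Python; the helper names and the `threshold` parameter are hypothetical and not part of the model or the Transformers API.

```python
import math

LABELS = {0: "safe", 1: "unsafe"}

def unsafe_probability(logits):
    """Softmax probability of class 1 (Unsafe) from a pair of logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)

def classify(logits, threshold=0.5):
    """Return "unsafe" when the Unsafe probability reaches the threshold."""
    return LABELS[1] if unsafe_probability(logits) >= threshold else LABELS[0]
```

With a Transformers model, the logits for a single input could be obtained as `outputs.logits[0].tolist()` and passed to `classify`. Lowering `threshold` trades precision for recall, and raising it does the opposite.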