IntentGuard: Healthcare & Clinical
Production-ready vertical intent classifier for LLM chatbot guardrails. Classifies user messages as allow, deny, or abstain to keep healthcare chatbots on-topic and compliant.
IntentGuard Model Family
| Model | Vertical | Accuracy | Off-Topic Pass Rate | Link |
|---|---|---|---|---|
| intentguard-finance | Financial Services | 99.6% | 0.00% | perfecXion/intentguard-finance |
| intentguard-healthcare | Healthcare & Clinical | 98.9% | 0.98% | This model |
| intentguard-legal | Legal & Compliance | 97.9% | 0.50% | perfecXion/intentguard-legal |
Overview
The Problem
Healthcare chatbots face unique regulatory risks: HIPAA compliance, patient safety, and liability exposure demand that AI assistants stay strictly within clinical domains. Users asking about sports scores, celebrity news, or relationship advice should be blocked, not answered.
The Solution
IntentGuard uses a small fine-tuned DeBERTa-v3-xsmall model (22M parameters, 2.5MB quantized) to classify each user message in under 30ms (p99) on CPU into one of three classes:
- Allow: On-topic healthcare query, pass to the LLM
- Deny: Off-topic, block with a polite redirect
- Abstain: Ambiguous, escalate to a secondary classifier or human review
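The three outcomes map onto a small routing function in the calling application. The sketch below is illustrative only: `route_message`, `call_llm`, `escalate`, and the redirect text are hypothetical names chosen for this example and are not part of IntentGuard.

```python
from typing import Callable

# Hypothetical redirect text; adjust the wording for your deployment.
REDIRECT = "I can only help with healthcare questions."


def route_message(
    text: str,
    label: str,
    call_llm: Callable[[str], str],
    escalate: Callable[[str], str],
) -> str:
    """Dispatch a user message based on the classifier label."""
    if label == "allow":
        # On-topic: forward the message to the LLM.
        return call_llm(text)
    if label == "deny":
        # Off-topic: answer with a fixed redirect instead of calling the LLM.
        return REDIRECT
    if label == "abstain":
        # Ambiguous: hand off to a secondary classifier or a human reviewer.
        return escalate(text)
    raise ValueError(f"unexpected label: {label!r}")
```

Raising on an unknown label keeps a misconfigured label list from silently passing messages through.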
Performance
| Metric | Value |
|---|---|
| Overall Accuracy | 98.9% |
| Legitimate Block Rate | 0.00% |
| Off-Topic Pass Rate | 0.98% |
| p99 Latency (CPU) | <30ms |
| Model Size (ONNX INT8) | 2.5MB |
| Base Parameters | 22M (DeBERTa-v3-xsmall) |
| Expected Calibration Error | <0.03 |
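The model card does not say how the Expected Calibration Error was computed. As a point of reference, the common equal-width-bin definition can be sketched as follows; the function name and the choice of 10 bins are assumptions made for this example, not the authors' evaluation code.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Bins are (lo, hi]; the first bin also includes exactly 0.0.
        in_bin = (confidences > lo) & (confidences <= hi)
        if lo == 0.0:
            in_bin |= (confidences == 0.0)
        if not in_bin.any():
            continue
        gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
        # Weight each bin by the fraction of samples that fall in it.
        ece += in_bin.mean() * gap
    return float(ece)
```

Here `confidences` holds the top-class probability for each example and `correct` holds 1 where the prediction matched the gold label and 0 otherwise.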
Model Details
| Property | Value |
|---|---|
| Architecture | DeBERTa-v3-xsmall (fine-tuned for 3-way classification) |
| Format | ONNX (INT8 quantized) |
| Version | 1.0 |
| Vertical | Healthcare (Clinical & Wellness) |
| GPU Required | No (runs on CPU) |
| Publisher | perfecXion.ai |
Core Topics (Allow)
Symptoms, diagnosis, treatment, medications, preventive care, nutrition, mental health, fitness, chronic conditions, surgery, emergency care, health insurance, patient rights, telemedicine
Hard Exclusions (Deny)
Sports, entertainment, cooking, gaming, celebrity gossip, fashion, travel/leisure, fiction writing, relationship advice
Usage
Python (ONNX Runtime)
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# Load the tokenizer from the Hub and the quantized model from a local file.
tokenizer = AutoTokenizer.from_pretrained("perfecXion/intentguard-healthcare")
session = ort.InferenceSession("model.onnx")

text = "What are the symptoms of Type 2 diabetes?"
inputs = tokenizer(text, return_tensors="np", max_length=128, truncation=True, padding="max_length")

# Run inference; the first output holds the logits for the single input.
logits = session.run(None, {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"],
})[0].reshape(-1)

# Numerically stable softmax over the three classes.
exp = np.exp(logits - np.max(logits))
probs = exp / exp.sum()

labels = ["allow", "deny", "abstain"]
prediction = labels[int(np.argmax(probs))]
confidence = float(np.max(probs))
print(f"Intent: {prediction} (confidence: {confidence:.3f})")
# Output: Intent: allow (confidence: 0.997)
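Applications that prefer to escalate borderline cases can downgrade low-confidence predictions to abstain. The sketch below assumes the logits follow the label order used above; the `decide` helper and the 0.90 threshold are hypothetical placeholders, not values recommended by the model authors.

```python
import numpy as np

LABELS = ["allow", "deny", "abstain"]


def decide(logits, threshold=0.90):
    """Return (label, confidence), falling back to 'abstain' below threshold."""
    row = np.asarray(logits, dtype=np.float64).reshape(-1)
    # Subtract the max before exponentiating for a numerically stable softmax.
    exp = np.exp(row - row.max())
    probs = exp / exp.sum()
    idx = int(probs.argmax())
    confidence = float(probs[idx])
    # Keep the predicted label only when the model is confident enough.
    label = LABELS[idx] if confidence >= threshold else "abstain"
    return label, confidence
```

A threshold like this trades a higher escalation volume for fewer confident mistakes, so it is worth tuning on held-out traffic.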
Docker
docker pull ghcr.io/perfecxion/intentguard:healthcare-1.0
docker run -p 8080:8080 ghcr.io/perfecxion/intentguard:healthcare-1.0
# In a second terminal, once the container is running:
curl -X POST http://localhost:8080/v1/classify \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What medications interact with metformin?"}]}'
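The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call; the helper names are hypothetical, and because the response schema is not documented here, the decoded JSON is returned unchanged rather than parsed into fields.

```python
import json
import urllib.request

# Taken from the curl example: local container published on port 8080.
ENDPOINT = "http://localhost:8080/v1/classify"


def build_request(text: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a POST request with the same JSON body as the curl example."""
    body = json.dumps({"messages": [{"role": "user", "content": text}]})
    return urllib.request.Request(
        endpoint,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify_remote(text: str, timeout: float = 5.0) -> dict:
    """Send the request and return the decoded JSON response as-is."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, `classify_remote("What medications interact with metformin?")` returns whatever JSON the service sends back.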
pip
pip install intentguard
from intentguard import IntentGuard
guard = IntentGuard.load("healthcare")
result = guard.classify("What are the symptoms of Type 2 diabetes?")
print(result) # Intent(label='allow', confidence=0.997)
Example Classifications
| User Message | Predicted | Confidence | Correct? |
|---|---|---|---|
| "What are the symptoms of Type 2 diabetes?" | allow | 0.997 | Yes |
| "What medications interact with metformin?" | allow | 0.996 | Yes |
| "Who won the Super Bowl?" | deny | 0.999 | Yes |
| "Tell me a joke" | deny | 0.996 | Yes |
| "Is telemedicine covered by Medicare?" | allow | 0.981 | Yes |
| "What's the best recipe for pasta?" | deny | 0.998 | Yes |
Citation
@misc{thornton2025intentguard,
  title={IntentGuard: A Production-Grade Vertical Intent Classifier for LLM Guardrails},
  author={Thornton, Scott},
  year={2025},
  publisher={perfecXion.ai},
  url={https://perfecxion.ai/articles/intentguard-vertical-intent-classifier-llm-guardrails.html},
  note={Model: https://huggingface.co/perfecXion/intentguard-healthcare}
}
Quality Metrics
| Metric | Result |
|---|---|
| Accuracy (Healthcare vertical) | 98.9% |
| Legitimate Block Rate | 0.00% |
| Off-Topic Pass Rate | 0.98% |
| Expected Calibration Error | <0.03 |
| ONNX INT8 Quantization | Validated |
| CPU Inference (p99) | <30ms |
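The card does not define the two rate metrics. One plausible reading, assumed here, is that Legitimate Block Rate is the share of gold-allow messages predicted deny, and Off-Topic Pass Rate is the share of gold-deny messages predicted allow. Under that assumption they could be computed as below; `guardrail_rates` is a hypothetical helper, not the authors' evaluation code.

```python
def guardrail_rates(y_true, y_pred):
    """Return (legitimate_block_rate, off_topic_pass_rate) as fractions."""
    # Predictions for messages whose gold label is on-topic / off-topic.
    legit = [p for t, p in zip(y_true, y_pred) if t == "allow"]
    off_topic = [p for t, p in zip(y_true, y_pred) if t == "deny"]
    # Assumed definitions: abstain counts as neither a block nor a pass.
    block_rate = sum(p == "deny" for p in legit) / len(legit) if legit else 0.0
    pass_rate = (
        sum(p == "allow" for p in off_topic) / len(off_topic) if off_topic else 0.0
    )
    return block_rate, pass_rate
```

Multiply the returned fractions by 100 to compare them with the percentages in the table.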
License
Apache 2.0
Links
- Research Article: IntentGuard: A Production-Grade Vertical Intent Classifier
- Publisher: perfecXion.ai
- Finance Model: perfecXion/intentguard-finance
- Legal Model: perfecXion/intentguard-legal
- Docker Image: ghcr.io/perfecxion/intentguard:healthcare-1.0
Evaluation results (self-reported)
- Accuracy: 98.9%
- Legitimate Block Rate: 0.00%
- Off-Topic Pass Rate: 0.98%