IntentGuard – Healthcare & Clinical


Production-ready vertical intent classifier for LLM chatbot guardrails. Classifies user messages as allow, deny, or abstain to keep healthcare chatbots on-topic and compliant.

Research Article | perfecXion.ai | Finance Model | Healthcare Model | Legal Model


IntentGuard Model Family

| Model | Vertical | Accuracy | Off-Topic Pass Rate | Link |
|---|---|---|---|---|
| intentguard-finance | Financial Services | 99.6% | 0.00% | perfecXion/intentguard-finance |
| intentguard-healthcare | Healthcare & Clinical | 98.9% | 0.98% | This model |
| intentguard-legal | Legal & Compliance | 97.9% | 0.50% | perfecXion/intentguard-legal |

Overview

The Problem

Healthcare chatbots face unique regulatory risks: HIPAA compliance, patient safety, and liability exposure demand that AI assistants stay strictly within clinical domains. Users asking about sports scores, celebrity news, or relationship advice should be blocked, not answered.

The Solution

IntentGuard uses a tiny DeBERTa-v3-xsmall model (22M parameters, 2.5MB quantized) to classify user intent in <30ms on CPU with three-way classification:

  • Allow – on-topic healthcare query; pass to the LLM
  • Deny – off-topic; block with a polite redirect
  • Abstain – ambiguous; escalate to a secondary classifier or human review
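The three outcomes map directly onto a guardrail routing step. A minimal sketch of that routing (the handler strings are illustrative placeholders, not part of the model):

```python
def route(label: str, message: str) -> str:
    """Route a user message based on IntentGuard's three-way decision."""
    if label == "allow":
        return f"FORWARD_TO_LLM: {message}"  # on-topic: pass through
    if label == "deny":
        return "REDIRECT: I can only help with healthcare questions."
    return "ESCALATE: send to secondary classifier"  # abstain

print(route("deny", "Who won the Super Bowl?"))
# Output: REDIRECT: I can only help with healthcare questions.
```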

Performance

| Metric | Value |
|---|---|
| Overall Accuracy | 98.9% |
| Legitimate Block Rate | 0.00% |
| Off-Topic Pass Rate | 0.98% |
| p99 Latency (CPU) | <30ms |
| Model Size (ONNX INT8) | 2.5MB |
| Base Parameters | 22M (DeBERTa-v3-xsmall) |
| Expected Calibration Error | <0.03 |
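Expected Calibration Error summarizes how far predicted confidence drifts from empirical accuracy, averaged over confidence bins. A minimal sketch of the standard binned-ECE computation (the toy data is illustrative, not this model's evaluation set):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean |accuracy - confidence| over confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece

# Well-calibrated toy data: 95% accuracy at 0.95 confidence
conf = [0.95] * 20
corr = [1] * 19 + [0]
print(round(expected_calibration_error(conf, corr), 3))
# Output: 0.0
```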

Model Details

| Property | Value |
|---|---|
| Architecture | DeBERTa-v3-xsmall (fine-tuned for 3-way classification) |
| Format | ONNX (INT8 quantized) |
| Version | 1.0 |
| Vertical | Healthcare (Clinical & Wellness) |
| GPU Required | No – runs on CPU |
| Publisher | perfecXion.ai |

Core Topics (Allow)

Symptoms, diagnosis, treatment, medications, preventive care, nutrition, mental health, fitness, chronic conditions, surgery, emergency care, health insurance, patient rights, telemedicine

Hard Exclusions (Deny)

Sports, entertainment, cooking, gaming, celebrity gossip, fashion, travel/leisure, fiction writing, relationship advice


Usage

Python (ONNX Runtime)

import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("perfecXion/intentguard-healthcare")
session = ort.InferenceSession("model.onnx")

text = "What are the symptoms of Type 2 diabetes?"
inputs = tokenizer(text, return_tensors="np", max_length=128, truncation=True, padding="max_length")

logits = session.run(None, {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"]
})[0][0]  # shape (3,): one score per label

labels = ["allow", "deny", "abstain"]
probs = np.exp(logits - np.max(logits))  # numerically stable softmax
probs /= probs.sum()
prediction = labels[int(np.argmax(probs))]
confidence = float(np.max(probs))

print(f"Intent: {prediction} (confidence: {confidence:.3f})")
# Output: Intent: allow (confidence: 0.997)
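In production it is common to demote low-confidence predictions to abstain before routing. A sketch building on the snippet above (the 0.9 threshold is an illustrative choice, not a value from this model card):

```python
import numpy as np

def decide(logits, threshold=0.9):
    """Return (label, confidence), demoting low-confidence calls to 'abstain'."""
    labels = ["allow", "deny", "abstain"]
    probs = np.exp(logits - np.max(logits))  # numerically stable softmax
    probs /= probs.sum()
    idx = int(np.argmax(probs))
    label = labels[idx] if probs[idx] >= threshold else "abstain"
    return label, float(probs[idx])

print(decide(np.array([5.0, 0.1, 0.1])))  # confident allow
print(decide(np.array([1.0, 0.9, 0.8])))  # near-uniform: demoted to abstain
```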

Docker

docker pull ghcr.io/perfecxion/intentguard:healthcare-1.0
docker run -p 8080:8080 ghcr.io/perfecxion/intentguard:healthcare-1.0

curl -X POST http://localhost:8080/v1/classify \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What medications interact with metformin?"}]}'
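The same endpoint can be called from Python. A standard-library sketch assuming the request shape implied by the curl example (the response schema is not documented here, so the send step is left as a comment):

```python
import json
from urllib import request

def build_classify_request(message, base_url="http://localhost:8080"):
    """Build a POST to the /v1/classify endpoint shown in the curl example."""
    payload = {"messages": [{"role": "user", "content": message}]}
    return request.Request(
        f"{base_url}/v1/classify",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_classify_request("What medications interact with metformin?")
print(req.full_url)
# Output: http://localhost:8080/v1/classify
# With the container running: result = json.load(request.urlopen(req))
```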

pip

pip install intentguard

from intentguard import IntentGuard
guard = IntentGuard.load("healthcare")
result = guard.classify("What are the symptoms of Type 2 diabetes?")
print(result)  # Intent(label='allow', confidence=0.997)

Example Classifications

| User Message | Predicted | Confidence | Correct? |
|---|---|---|---|
| "What are the symptoms of Type 2 diabetes?" | allow | 0.997 | ✅ |
| "What medications interact with metformin?" | allow | 0.996 | ✅ |
| "Who won the Super Bowl?" | deny | 0.999 | ✅ |
| "Tell me a joke" | deny | 0.996 | ✅ |
| "Is telemedicine covered by Medicare?" | allow | 0.981 | ✅ |
| "What's the best recipe for pasta?" | deny | 0.998 | ✅ |

Citation

@misc{thornton2025intentguard,
  title={IntentGuard: A Production-Grade Vertical Intent Classifier for LLM Guardrails},
  author={Thornton, Scott},
  year={2025},
  publisher={perfecXion.ai},
  url={https://perfecxion.ai/articles/intentguard-vertical-intent-classifier-llm-guardrails.html},
  note={Model: https://huggingface.co/perfecXion/intentguard-healthcare}
}

Quality Metrics

| Metric | Result |
|---|---|
| Accuracy (Healthcare vertical) | 98.9% |
| Legitimate Block Rate | 0.00% |
| Off-Topic Pass Rate | 0.98% |
| Expected Calibration Error | <0.03 |
| ONNX INT8 Quantization | Validated |
| CPU Inference (p99) | <30ms |

License

Apache 2.0

