Fiddler Protect provides comprehensive AI safety through real-time guardrails, continuous monitoring, and intelligent alerting, all powered by the Fiddler Trust Service. Built on purpose-optimized evaluation models that are 10-100x faster than general-purpose LLMs, Fiddler Protect helps you prevent harmful outputs, detect privacy violations, ensure factual accuracy, and maintain compliance across your AI applications.
Protection Layers
Fiddler Protect operates through multiple complementary layers of defense:
Real-Time Guardrails
Fast, pre-deployment protection that evaluates and filters AI inputs and outputs before they reach users.
Safety Guardrails
Detect and prevent harmful content across 11 safety dimensions:
- Harmful Behaviors: Jailbreaking attempts, prompt injection, illegal content promotion
- Offensive Content: Hate speech, harassment, racism, sexism
- Inappropriate Content: Violence, explicit sexual content, unethical scenarios
- Risk Categories: Toxic language, dangerous information, inappropriate roleplaying
PII/PHI Detection
Protect user privacy by automatically detecting sensitive information in model inputs and outputs:
- Personal Identifiers: Names, dates of birth, email addresses, phone numbers
- Financial Data: Credit card numbers, bank accounts, tax IDs
- Government IDs: Social security numbers, passport numbers, driver’s licenses
- Healthcare Information: Medical record numbers, health insurance IDs (HIPAA compliance)
- Custom Entities: Organization-specific sensitive patterns (employee IDs, API keys, internal codes)
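As a rough illustration of what pattern-based entity detection and redaction can look like, here is a minimal Python sketch. It is not Fiddler's detection logic or API: the `PATTERNS` table, the `EMP-` employee ID format, and the `find_entities` and `redact` helpers are all hypothetical, and production detectors typically rely on trained models rather than regular expressions alone.

```python
import re

# Hypothetical patterns for illustration only; not Fiddler's detection models.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "employee_id": re.compile(r"\bEMP-\d{6}\b"),  # assumed internal ID format
}


def find_entities(text):
    """Return (label, matched_text, start, end) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


def redact(text):
    """Replace each detected span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for label, _, start, end in find_entities(text):
        if start < cursor:  # skip a match that overlaps one already redacted
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"[{label.upper()}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


print(redact("Contact jane@example.com, SSN 123-45-6789, badge EMP-004217."))
```

Custom entities in this sketch are just extra entries in the pattern table, which is why organization-specific formats such as internal codes are easy to add alongside the standard identifiers.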
Faithfulness & Accuracy
Prevent hallucinations and ensure AI responses stay grounded in source material:
- Hallucination Detection: Evaluate whether AI responses are factually consistent with provided context
- RAG Validation: Verify that generated content accurately reflects retrieved documents
- Source Grounding: Ensure answers don’t introduce information not present in reference materials
Performance Advantage
All guardrail models are 10-100x faster than general-purpose LLMs like GPT-4 for evaluation tasks, enabling:
- Real-time filtering without noticeable latency
- High-volume production deployment
- Cost-effective safety at scale
- No external API dependencies for enhanced security
Continuous Monitoring
Post-deployment protection through ongoing analysis of production traffic.
Safety Enrichments
Monitor your production AI systems for safety and quality issues:
- Toxicity Detection: Identify toxic language patterns using advanced classification models
- Profanity Filtering: Detect offensive language in both inputs and outputs
- PII Monitoring: Continuously scan for privacy violations in production data
- Sentiment Analysis: Track emotional tone and user experience signals
- Custom Classification: Apply organization-specific categorization rules
Data Integrity & Drift
Protect against data quality issues and distribution changes:
- Missing Value Detection: Identify incomplete inputs that may cause unpredictable behavior
- Type Validation: Catch data type mismatches (e.g., strings where numbers are expected)
- Range Monitoring: Detect out-of-range values that violate expected constraints
- Distribution Drift: Track when production data diverges from training or baseline data
- Embedding Visualization: Use 3D UMAP plots to visually identify anomalies in high-dimensional data
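The integrity and drift checks listed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Fiddler's implementation: the `SCHEMA` layout and the `check_row`, `histogram`, and `js_divergence` helpers are hypothetical, and Jensen-Shannon divergence is used here only as one common way to compare a production distribution against a baseline.

```python
import math

# Hypothetical schema: column -> (expected type, min, max). Not a Fiddler format.
SCHEMA = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 1_000_000.0),
}


def check_row(row):
    """Return (column, violation) pairs for one input record."""
    violations = []
    for column, (expected_type, low, high) in SCHEMA.items():
        value = row.get(column)
        if value is None:
            violations.append((column, "missing"))
        elif isinstance(value, bool) or not isinstance(value, expected_type):
            # bool is a subclass of int in Python, so reject it explicitly
            violations.append((column, "type"))
        elif not (low <= value <= high):
            violations.append((column, "range"))
    return violations


def histogram(values, edges):
    """Fraction of values per bin [edges[i], edges[i+1]); the last bin is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin edges")
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mid = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


edges = [0, 30, 60, 120]
baseline = histogram([22, 25, 41, 47, 58, 63], edges)
production = histogram([61, 66, 70, 72, 75, 80], edges)
print(check_row({"age": "34", "income": None}))
print(js_divergence(baseline, production))
```

Binning both datasets with the same edges keeps the two histograms comparable; a drift score near 0 means production looks like the baseline, and a score near 1 means the two barely overlap.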
Alerting & Response
Automated notification system for proactive risk management:
- Drift Alerts: Detect when production data or model behavior changes significantly
- Data Integrity Alerts: Flag missing values, type mismatches, or range violations
- Performance Alerts: Monitor for model accuracy degradation over time
- Custom Metric Alerts: Define formula-based alerts for business-specific KPIs
- Traffic Alerts: Track system volume for capacity planning and anomaly detection
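To make the idea of a formula-based alert concrete, the following Python sketch evaluates a custom metric over a window of events and compares it with warning and critical thresholds. It is illustrative only and does not use the Fiddler alerts API: the `AlertRule` class, the `flagged_rate` formula, the `flagged` event field, and the threshold values are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class AlertRule:
    """A threshold rule over a metric computed from a window of events."""
    name: str
    metric: Callable[[Sequence[dict]], float]  # formula applied to one window
    warning: float
    critical: float

    def evaluate(self, window: Sequence[dict]) -> Optional[str]:
        """Return 'critical', 'warning', or None for one time window."""
        value = self.metric(window)
        if value >= self.critical:
            return "critical"
        if value >= self.warning:
            return "warning"
        return None


def flagged_rate(window: Sequence[dict]) -> float:
    """Share of events where a safety check fired (hypothetical 'flagged' field)."""
    if not window:
        return 0.0
    return sum(1 for event in window if event["flagged"]) / len(window)


# Assumed thresholds: warn at 5% flagged traffic, escalate at 20%.
rule = AlertRule("flagged_rate", flagged_rate, warning=0.05, critical=0.20)

window = [{"flagged": i < 1} for i in range(10)]  # 1 of 10 events flagged
print(rule.evaluate(window))
```

Separating the metric formula from the thresholds lets the same rule shape cover drift scores, integrity violation counts, or traffic volume by swapping in a different function.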
Fiddler Trust Service
All protection capabilities are powered by the Fiddler Trust Service, a platform of purpose-built evaluation models optimized for safety, quality, and accuracy assessment. Unlike general-purpose LLMs repurposed for evaluation, Trust Service models are specifically designed for these tasks, delivering:
- Speed: 10-100x faster evaluation than GPT-4
- Security: Air-gapped deployment options with no external API dependencies
- Privacy: Full data sovereignty for GDPR, HIPAA, and CCPA compliance
- Reliability: Consistent, deterministic evaluation at scale
Key Use Cases
Content Safety
Prevent your AI applications from generating harmful, offensive, or inappropriate content:
- Filter toxic language and hate speech in real-time
- Block jailbreaking attempts and prompt injection attacks
- Detect violent, sexual, or otherwise inappropriate outputs before they reach users
- Maintain brand reputation by ensuring responsible AI behavior
Privacy Protection
Safeguard user privacy and maintain compliance with data protection regulations:
- Automatically detect and redact PII in both inputs and outputs
- Support HIPAA compliance through PHI detection
- Configure custom entity detection for organization-specific sensitive data
- Monitor for privacy violations in production traffic
Accuracy & Truthfulness
Ensure your AI systems provide accurate, grounded information:
- Detect hallucinations in RAG applications before responses are presented to users
- Validate that generated content reflects source documents accurately
- Monitor for factual consistency across your AI responses
- Maintain trust by preventing fabricated or misleading information
Regulatory Compliance
Meet compliance requirements while maintaining comprehensive audit trails:
- GDPR compliance through PII detection and data sovereignty options
- HIPAA compliance with PHI detection and air-gapped deployment
- Complete audit logging of all safety events and policy enforcement
- Bias and fairness monitoring for regulatory reporting
Getting Started
Quick Start Guides
Get up and running with Fiddler Protect in minutes:
- Guardrails Quick Start - Set up real-time protection
- Safety Guardrails Quick Start - Implement content safety filters
- PII Detection Quick Start - Protect user privacy
- Faithfulness Quick Start - Prevent hallucinations
Documentation & References
Dive deeper into Fiddler Protect capabilities:
- Guardrails API Reference - Complete API documentation
- LLM-Based Metrics - Quality and safety metrics
- Enrichments Guide - Continuous monitoring enrichments
- Alerts Platform - Configure alerting and notifications
- Guardrails FAQ - Common questions and answers
Additional Resources
Learn more about the underlying technology:
- Trust Service Overview - Learn about the evaluation platform
- Guardrails Glossary - Key concepts and terminology
Ready to get started? Try the Guardrails Quick Start to implement your first safety guardrail in minutes.