Fiddler provides a comprehensive set of enrichments for monitoring LLM applications in production. Enrichments augment your application data with automatically generated trust, safety, and quality metrics during model onboarding. These metrics integrate directly with Fiddler’s monitoring dashboards, alerting systems, and analytics tools. Configure enrichments using the fdl.Enrichment() class in the Python Client SDK. For detailed configuration examples, see the Enrichments Guide. For help choosing the right enrichment, see Selecting Enrichments.
For ML model metrics (performance, drift, data integrity), see the ML Metrics Reference.

Safety metrics

Safety enrichments detect and flag unsafe, harmful, or policy-violating content in your LLM application’s inputs and outputs.

| Metric | Enrichment Key | LLM Required? | Output Type | Description |
| --- | --- | --- | --- | --- |
| Fast Safety | ftl_prompt_safety | Yes (Fiddler FTL) | bool + float per dimension | Evaluates text safety across 11 dimensions using Fiddler’s Fast Trust Model |
| PII Detection | pii | No | bool + matches + entities | Detects personally identifiable information using Presidio |
| Profanity | profanity | No | bool | Flags offensive or inappropriate language |
| Banned Keywords | banned_keywords | No | bool | Detects user-defined restricted terms |
| Regex Match | regex_match | No | category | Matches text against a user-defined regular expression |
| Language Detection | language_detection | No | string + float | Identifies the language of the source text |
| Topic Classification | topic_model | No | list[float] + string | Classifies text into user-defined topics using zero-shot classification |

Fast Safety

The Fast Safety enrichment evaluates text safety across 11 dimensions using Fiddler’s proprietary Fast Trust Model. Each dimension produces a boolean flag and a confidence probability score.
Enrichment key: ftl_prompt_safety

| Dimension | Output Columns | Score Range | Description |
| --- | --- | --- | --- |
| illegal | illegal, illegal score | 0.0 to 1.0 | Content promoting illegal activities |
| hateful | hateful, hateful score | 0.0 to 1.0 | Hateful or discriminatory content |
| harassing | harassing, harassing score | 0.0 to 1.0 | Harassing or bullying content |
| racist | racist, racist score | 0.0 to 1.0 | Racist content |
| sexist | sexist, sexist score | 0.0 to 1.0 | Sexist content |
| violent | violent, violent score | 0.0 to 1.0 | Content promoting violence |
| sexual | sexual, sexual score | 0.0 to 1.0 | Sexually explicit content |
| harmful | harmful, harmful score | 0.0 to 1.0 | Generally harmful content |
| unethical | unethical, unethical score | 0.0 to 1.0 | Unethical content |
| jailbreaking | jailbreaking, jailbreaking score | 0.0 to 1.0 | Jailbreaking or prompt injection attempts |
| roleplaying | roleplaying, roleplaying score | 0.0 to 1.0 | Roleplaying attempts to bypass safety |

An aggregate max_risk_prob output is also generated, representing the maximum probability across all 11 dimensions.
For configuration details, see Enrichments: Fast Safety.
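As a rough illustration of how the aggregate relates to the per-dimension scores, the sketch below takes one row of score values and returns their maximum. The score column names follow the table above; the `max_risk_prob` helper function and the sample values are hypothetical and are not Fiddler's implementation.

```python
# Illustrative only: mirrors the documented definition of max_risk_prob
# (the maximum probability across the 11 Fast Safety dimensions).
FAST_SAFETY_DIMENSIONS = [
    "illegal", "hateful", "harassing", "racist", "sexist", "violent",
    "sexual", "harmful", "unethical", "jailbreaking", "roleplaying",
]


def max_risk_prob(scores: dict[str, float]) -> float:
    """Return the highest per-dimension score in one row of Fast Safety output."""
    return max(scores[f"{dim} score"] for dim in FAST_SAFETY_DIMENSIONS)


# Hypothetical scores for a single event.
row = {f"{dim} score": 0.01 for dim in FAST_SAFETY_DIMENSIONS}
row["jailbreaking score"] = 0.87
print(max_risk_prob(row))
```

A single aggregate like this can be convenient for one alert rule that covers every dimension, while the per-dimension columns remain available for diagnosis.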

PII Detection

Detects and flags personally identifiable information using Presidio. Generates a boolean flag, matched text spans, and detected entity types.
Enrichment key: pii
Commonly used entity types: CREDIT_CARD, CRYPTO, DATE_TIME, EMAIL_ADDRESS, IBAN_CODE, IP_ADDRESS, LOCATION, PERSON, PHONE_NUMBER, URL, US_SSN, US_DRIVER_LICENSE, US_ITIN, US_PASSPORT
Fiddler supports 32 entity types in total, including international identifiers for Australia, India, Singapore, and the UK. For the full list, see the Presidio supported entities.
For configuration details, see Enrichments: PII.

Profanity

Flags offensive or inappropriate language using curated word lists from SurgeAI and Google.
Enrichment key: profanity
For configuration details, see Enrichments: Profanity.

Banned Keywords

Detects user-defined restricted terms in text inputs. The list of banned keywords is specified in the enrichment configuration.
Enrichment key: banned_keywords
For configuration details, see Enrichments: Banned Keywords.
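To make the boolean output concrete, here is a minimal sketch of a keyword check. It assumes whole-word, case-insensitive matching; the matching rules Fiddler actually applies are not specified here and may differ. The function name and sample keywords are hypothetical.

```python
import re


def contains_banned_keyword(text: str, banned: list[str]) -> bool:
    """Return True if any banned keyword appears as a whole word (case-insensitive).

    Assumption: whole-word matching via \\b, which works for keywords that
    start and end with a letter, digit, or underscore.
    """
    for keyword in banned:
        if re.search(rf"\b{re.escape(keyword)}\b", text, flags=re.IGNORECASE):
            return True
    return False


print(contains_banned_keyword("Our rival AcmeCorp is cheaper", ["acmecorp"]))
```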

Regex Match

Matches text against a user-defined regular expression pattern. Produces a categorical output of “Match” or “No Match”.
Enrichment key: regex_match
For configuration details, see Enrichments: Regex Match.
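The categorical output can be pictured with Python's standard `re` module. This sketch assumes the pattern may match anywhere in the text (`re.search`); whether Fiddler searches or requires a full match is not stated here. The `regex_match` function and the sample pattern are hypothetical.

```python
import re


def regex_match(text: str, pattern: str) -> str:
    """Return the category "Match" or "No Match" for a text and pattern."""
    # Assumption: a match anywhere in the text counts.
    return "Match" if re.search(pattern, text) else "No Match"


order_id_pattern = r"[A-Z]{2}-\d{4}"  # hypothetical order-ID format
print(regex_match("Your order AB-1234 has shipped", order_id_pattern))
print(regex_match("No identifier in this reply", order_id_pattern))
```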

Language Detection

Identifies the language of the source text using fasttext models. Produces the detected language and a confidence probability.
Enrichment key: language_detection
For configuration details, see Enrichments: Language Detection.

Topic Classification

Classifies text into user-defined topics using a zero-shot classification model. Produces per-topic probability scores and the top-scoring topic.
Enrichment key: topic_model
For configuration details, see Enrichments: Topic.
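The relationship between the two outputs (a list of per-topic scores and a single top topic) can be sketched as an argmax over the scores. The topic names, scores, and `top_topic` helper below are hypothetical; the classifier that produces the scores is not shown.

```python
def top_topic(topics: list[str], scores: list[float]) -> str:
    """Return the topic whose score is highest (first one wins on ties)."""
    if not topics or len(topics) != len(scores):
        raise ValueError("topics and scores must be non-empty and the same length")
    best_index = max(range(len(topics)), key=lambda i: scores[i])
    return topics[best_index]


topics = ["billing", "shipping", "returns"]  # hypothetical user-defined topics
scores = [0.12, 0.71, 0.17]                  # hypothetical per-topic scores
print(top_topic(topics, scores))
```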

Quality and hallucination metrics

Quality enrichments assess the accuracy, groundedness, and relevance of LLM-generated responses.

| Metric | Enrichment Key | LLM Required? | Output Type | Description |
| --- | --- | --- | --- | --- |
| Fast Faithfulness | ftl_response_faithfulness | Yes (Fiddler FTL) | bool + float | Evaluates factual groundedness using Fiddler’s Fast Trust Model |
| RAG Faithfulness | faithfulness | Yes (OpenAI) | bool | Evaluates factual accuracy of responses against provided context |
| Answer Relevance | answer_relevance | Yes (OpenAI) | bool | Evaluates whether responses address the input prompt |
| Coherence | coherence | Yes (OpenAI) | bool | Assesses logical flow and clarity of responses |
| Conciseness | conciseness | Yes (OpenAI) | bool | Evaluates brevity and clarity of responses |

Fast Faithfulness

Evaluates the factual groundedness of AI-generated responses against provided context using Fiddler’s proprietary Fast Trust Model. Produces a boolean faithfulness flag and a confidence probability score.
Enrichment key: ftl_response_faithfulness
The faithfulness threshold defaults to 0.5 and can be adjusted in the configuration to control scoring sensitivity. Higher thresholds result in stricter faithfulness detection (fewer responses labeled as faithful).
For configuration details, see Enrichments: Fast Faithfulness.
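The effect of the threshold can be sketched as a simple cutoff on the probability score. This assumes the score is the probability that a response is faithful and that the flag is set when the score meets or exceeds the threshold; the comparison Fiddler actually uses is not specified here. The `is_faithful` function and the sample scores are hypothetical.

```python
def is_faithful(probability: float, threshold: float = 0.5) -> bool:
    """Label a response faithful when its score reaches the threshold."""
    # Assumption: inclusive comparison against the configured threshold.
    return probability >= threshold


scores = [0.35, 0.55, 0.72, 0.91]  # hypothetical faithfulness scores

faithful_at_default = sum(is_faithful(s) for s in scores)
faithful_at_strict = sum(is_faithful(s, threshold=0.8) for s in scores)
print(faithful_at_default, faithful_at_strict)
```

Raising the threshold from 0.5 to 0.8 in this example leaves fewer responses labeled faithful, which is the stricter behavior described above.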

RAG Faithfulness

Evaluates the accuracy and reliability of facts presented in AI-generated responses by checking whether the information aligns with the provided context documents. Uses an OpenAI LLM for evaluation.
Enrichment key: faithfulness
RAG Faithfulness vs Fast Faithfulness: This enrichment uses OpenAI for evaluation. Fast Faithfulness uses Fiddler’s Fast Trust Model for lower latency. See LLM-Based Metrics for a detailed comparison.
For configuration details, see Enrichments: Faithfulness.

Answer Relevance

Evaluates whether AI-generated responses address the input prompt. Produces a binary relevant/not-relevant result.
Enrichment key: answer_relevance
For configuration details, see Enrichments: Answer Relevance.

Coherence

Assesses the logical flow and clarity of AI-generated responses, checking whether the content maintains a consistent theme and argument structure.
Enrichment key: coherence
For configuration details, see Enrichments: Coherence.

Conciseness

Evaluates whether AI-generated responses communicate their message efficiently without unnecessary elaboration or redundancy.
Enrichment key: conciseness
For configuration details, see Enrichments: Conciseness.

Text statistics metrics

Text statistics enrichments provide quantitative analysis of text properties, including readability, length, and n-gram-based evaluation scores.

| Metric | Enrichment Key | LLM Required? | Output Type | Description |
| --- | --- | --- | --- | --- |
| Textstat | textstat | No | float | Generates up to 19 text readability and complexity statistics |
| Evaluate | evaluate | No | float | Computes n-gram-based evaluation scores (BLEU, ROUGE, METEOR) |
| Sentiment | sentiment | No | float + string | Provides sentiment analysis using VADER |
| Token Count | token_count | No | int | Counts the number of tokens in a string |

Textstat

Generates text readability and complexity statistics using the textstat library. You can select specific statistics or use all 19 available metrics.
Enrichment key: textstat

| Sub-metric | Range | Description |
| --- | --- | --- |
| char_count | 0 to 64,000 | Character count |
| letter_count | 0 to 64,000 | Letter count (alphabetical characters) |
| miniword_count | 0 to 64,000 | Count of short words |
| words_per_sentence | 0 to 1,000 | Average words per sentence |
| polysyllabcount | 0 to 64,000 | Polysyllabic word count |
| lexicon_count | 0 to 64,000 | Word count |
| syllable_count | 0 to 96,000 | Total syllable count |
| sentence_count | 0 to 32,000 | Sentence count |
| flesch_reading_ease | -100 to 100 | Flesch Reading Ease score (higher = easier to read) |
| smog_index | 0 to 30 | SMOG readability index |
| flesch_kincaid_grade | -3.4 to 100 | Flesch-Kincaid Grade Level |
| coleman_liau_index | 0 to 20 | Coleman-Liau readability index |
| automated_readability_index | -3.4 to 100 | Automated Readability Index |
| dale_chall_readability_score | 0 to 10 | Dale-Chall readability score |
| difficult_words | 0 to 64,000 | Count of difficult words |
| linsear_write_formula | 0 to 20 | Linsear Write readability formula |
| gunning_fog | 0 to 20 | Gunning Fog readability index |
| long_word_count | 0 to 64,000 | Count of long words |
| monosyllabcount | 0 to 64,000 | Monosyllabic word count |

If no statistics are specified in the configuration, the default statistic is flesch_kincaid_grade.
For configuration details, see Enrichments: Textstat.
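For readers unfamiliar with the default statistic, the sketch below applies the standard Flesch-Kincaid Grade Level formula to word, sentence, and syllable counts (the quantities reported as lexicon_count, sentence_count, and syllable_count). The function name is hypothetical, and the counts are supplied by hand; the textstat library's own tokenization and syllable counting are not reproduced here.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula computed from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage: 100 words, 5 sentences, 150 syllables.
print(round(flesch_kincaid_grade(words=100, sentences=5, syllables=150), 2))

# One single-syllable word in one sentence gives the formula's minimum.
print(round(flesch_kincaid_grade(words=1, sentences=1, syllables=1), 2))
```

The second call evaluates to -3.4, which lines up with the lower bound listed for flesch_kincaid_grade in the table.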

Evaluate

Computes n-gram-based evaluation metrics for comparing two text passages, such as an AI-generated response and a reference answer. These metrics score highest when the reference and generated texts contain overlapping sequences.
Enrichment key: evaluate

| Sub-metric | Output Column | Score Range | Description |
| --- | --- | --- | --- |
| BLEU | bleu | 0.0 to 1.0 | Precision of word n-grams between generated and reference text |
| ROUGE-1 | rouge1 | 0.0 to 1.0 | Unigram recall between generated and reference text |
| ROUGE-2 | rouge2 | 0.0 to 1.0 | Bigram recall between generated and reference text |
| ROUGE-L | rougeL | 0.0 to 1.0 | Longest common subsequence between generated and reference text |
| ROUGE-Lsum | rougeLsum | 0.0 to 1.0 | ROUGE-L applied at the summary level |
| METEOR | meteor | 0.0 to 1.0 | Combines precision, recall, and semantic matching |

For configuration details, see Enrichments: Evaluate.
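To show what "overlapping sequences" means in practice, here is a simplified sketch of unigram recall in the spirit of ROUGE-1. It uses lowercase whitespace tokenization and clipped counts; library implementations typically add their own tokenization rules and may report an F-measure instead, so values will not necessarily match the rouge1 column. The function name and sample sentences are hypothetical.

```python
from collections import Counter


def unigram_recall(generated: str, reference: str) -> float:
    """Fraction of reference unigrams that also appear in the generated text."""
    ref_counts = Counter(reference.lower().split())
    gen_counts = Counter(generated.lower().split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clip each word's credit at the number of times the reference uses it.
    overlap = sum(min(count, gen_counts[word]) for word, count in ref_counts.items())
    return overlap / total


reference = "the cat sat on the mat"
generated = "the cat is on the mat"
print(unigram_recall(generated, reference))
```

Five of the six reference words are recovered in this example, so the score is 5/6.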

Sentiment

Provides sentiment analysis using NLTK’s VADER lexicon. Produces a compound score and a categorical sentiment label.
Enrichment key: sentiment

| Output Column | Type | Description |
| --- | --- | --- |
| compound | float | Raw compound sentiment score |
| sentiment | string | One of positive, negative, or neutral |

For configuration details, see Enrichments: Sentiment.
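One way to picture how a compound score becomes a label is the cutoff convention commonly used with VADER: 0.05 and above is positive, -0.05 and below is negative, and anything in between is neutral. Whether Fiddler applies these exact cutoffs is an assumption; the `sentiment_label` function is hypothetical and takes a precomputed compound score rather than calling NLTK.

```python
def sentiment_label(compound: float) -> str:
    """Map a VADER-style compound score to a categorical label."""
    # Assumption: the conventional VADER cutoffs of +/-0.05.
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"


for score in (0.62, 0.0, -0.41):  # hypothetical compound scores
    print(score, sentiment_label(score))
```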

Token Count

Counts the number of tokens in a string using the tiktoken library.
Enrichment key: token_count
For configuration details, see Enrichments: Token Count.

Text validation metrics

Text validation enrichments verify the structural correctness of generated text outputs such as SQL queries and JSON payloads.

| Metric | Enrichment Key | LLM Required? | Output Type | Description |
| --- | --- | --- | --- | --- |
| SQL Validation | sql_validation | No | bool + string | Validates SQL syntax for a specified dialect |
| JSON Validation | json_validation | No | bool + string | Validates JSON syntax and optionally against a schema |

SQL Validation

Validates SQL query syntax for a specified dialect. Supports 25+ SQL dialects including MySQL, PostgreSQL, Snowflake, BigQuery, and others.
Enrichment key: sql_validation
Query validation is syntax-based and does not check against any existing schema or databases for validity.
For configuration details, see Enrichments: SQL Validation.

JSON Validation

Validates JSON for correctness and optionally against a user-defined JSON Schema.
Enrichment key: json_validation
For configuration details, see Enrichments: JSON Validation.
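The bool + string output shape can be sketched with Python's standard `json` module: a validity flag plus an error message when parsing fails. This covers syntax checking only; schema validation would need a separate JSON Schema library and is left out. The `validate_json` function and its message format are hypothetical, not Fiddler's output format.

```python
import json


def validate_json(text: str) -> tuple[bool, str]:
    """Return (is_valid, error_message); the message is empty when valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"{exc.msg} (line {exc.lineno}, column {exc.colno})"
    return True, ""


print(validate_json('{"status": "ok", "items": [1, 2, 3]}'))
print(validate_json('{"status": }'))
```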

Embedding metrics

Embedding enrichments convert text into vector representations for drift detection and visualization.

| Metric | Enrichment Key | LLM Required? | Output Type | Description |
| --- | --- | --- | --- | --- |
| Text Embedding | TextEmbedding | No | vector + float | Generates text embeddings for UMAP visualization and drift detection |
| Centroid Distance | (auto-generated) | No | float | Distance from the nearest cluster centroid |

Text Embedding

Converts unstructured text into high-dimensional vector representations for semantic analysis. Enables Fiddler’s 3D UMAP visualizations and embedding-based drift detection.
Class: fdl.TextEmbedding()
TextEmbedding is configured using fdl.TextEmbedding() rather than fdl.Enrichment(). See the Enrichments Guide for usage examples.

Centroid Distance

Measures the distance between a data point’s embedding and the nearest cluster centroid. This metric is automatically generated when a TextEmbedding enrichment is created. For configuration details, see Enrichments: Centroid Distance.
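The idea of a nearest-centroid distance can be sketched with the standard library. This example assumes Euclidean distance and uses tiny two-dimensional vectors for readability; the distance measure and clustering Fiddler uses are not specified here. The function name, embedding, and centroids are hypothetical.

```python
import math


def nearest_centroid_distance(embedding: list[float], centroids: list[list[float]]) -> float:
    """Return the Euclidean distance from an embedding to its closest centroid."""
    if not centroids:
        raise ValueError("at least one centroid is required")
    return min(math.dist(embedding, centroid) for centroid in centroids)


embedding = [0.0, 0.0]                 # hypothetical 2-D embedding
centroids = [[3.0, 4.0], [1.0, 0.0]]   # hypothetical cluster centroids
print(nearest_centroid_distance(embedding, centroids))
```

A point far from every centroid gets a large value, which is why a metric of this kind is useful for spotting inputs unlike anything in the baseline clusters.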