Fiddler provides 35 built-in metrics for monitoring ML models in production. These metrics cover model performance, data drift, data integrity, traffic, and statistics. You can also define custom metrics using the Fiddler Query Language.
For LLM and GenAI application metrics, see the LLM Observability Metrics Reference.

Performance metrics

Performance metrics measure how well a model performs on its task. The available metrics depend on the model task type. For more details on performance monitoring workflows, see Performance Tracking.

Binary classification

| Metric | API ID | Score Range | Description |
| --- | --- | --- | --- |
| Accuracy | `accuracy` | 0 to 1 | (TP + TN) / (TP + TN + FP + FN) |
| Log Loss | `log_loss` | 0 to infinity | Measures the difference between the predicted probability distribution and the true distribution |
| Precision | `precision` | 0 to 1 | TP / (TP + FP). Requires a decision threshold. |
| Recall / True Positive Rate | `recall` | 0 to 1 | TP / (TP + FN). Requires a decision threshold. |
| F1 Score | `f1_score` | 0 to 1 | 2 * (Precision * Recall) / (Precision + Recall). Requires a decision threshold. |
| False Positive Rate | `fpr` | 0 to 1 | FP / (FP + TN). Requires a decision threshold. |
| AUC | `auc` | 0 to 1 | Area Under the ROC Curve (histogram-based calculation). See also AUROC. |
| AUROC | `auroc` | 0 to 1 | Area Under the Receiver Operating Characteristic curve, plotting true positive rate against false positive rate |
| Expected Calibration Error | `expected_calibration_error` | 0 to 1 | Measures the difference between predicted probabilities and empirical probabilities |
| Geometric Mean | `geometric_mean` | 0 to 1 | Square root of (Precision * Recall). Requires a decision threshold. |
| Calibrated Threshold | `calibrated_threshold` | 0 to 1 | A threshold that balances precision and recall at a particular operating point |
| Data Count | `data_count` | 0 to infinity | The number of events where target and output are both not NULL. Used as the denominator for accuracy calculations. |
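
The threshold-dependent formulas in the table can be checked by hand from confusion-matrix counts. The following is a minimal sketch in plain Python, not Fiddler's implementation. The function name `binary_metrics`, the default threshold of 0.5, and the choice to return 0.0 when a denominator is zero are all assumptions made for illustration.

```python
import math

def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute threshold-dependent metrics from 0/1 labels and predicted probabilities."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Assumption: report 0.0 instead of raising when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1_score": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),
        # Geometric mean as defined in the table: sqrt(Precision * Recall).
        "geometric_mean": math.sqrt(precision * recall),
    }

# Illustrative data only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.9, 0.4, 0.7, 0.3, 0.6, 0.1, 0.8, 0.2]
metrics = binary_metrics(labels, probs)
```

Moving `threshold` changes every value in the returned dictionary, which is why these metrics are marked as requiring a decision threshold.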

Multi-class classification

| Metric | API ID | Score Range | Description |
| --- | --- | --- | --- |
| Accuracy | `accuracy` | 0 to 1 | (Number of correctly classified samples) / Data Count |
| Log Loss | `log_loss` | 0 to infinity | Measures the difference between the predicted probability distribution and the true distribution, on a logarithmic scale |
| Log Loss Count | `log_loss_count` | 0 to infinity | Count of events used in the Log Loss calculation |
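
As a rough illustration of what "on a logarithmic scale" means, log loss is commonly computed as the average negative log of the probability assigned to the true class. The sketch below follows that common definition and is not taken from Fiddler's code. The function name `multiclass_log_loss`, the dictionary input format, and the clipping constant `eps` are assumptions.

```python
import math

def multiclass_log_loss(true_labels, predicted_probs, eps=1e-15):
    """Average negative log-probability assigned to the true class.

    predicted_probs is a list of {class_name: probability} dicts (hypothetical format).
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # Assumption: clip probabilities so log(0) never occurs.
        p = min(max(probs[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(true_labels)

# Illustrative data only.
true_labels = ["cat", "dog"]
predicted_probs = [
    {"cat": 0.5, "dog": 0.3, "bird": 0.2},
    {"cat": 0.6, "dog": 0.25, "bird": 0.15},
]
loss = multiclass_log_loss(true_labels, predicted_probs)
```

In this sketch, the number of events averaged over is `len(true_labels)`, which plays the role that Log Loss Count describes in the table.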

Regression

| Metric | API ID | Score Range | Description |
| --- | --- | --- | --- |
| Mean Absolute Error (MAE) | `mae` | 0 to infinity | Average of the absolute differences between predicted and true values |
| Mean Squared Error (MSE) | `mse` | 0 to infinity | Average of the squared differences between predicted and true values |
| Mean Absolute Percentage Error (MAPE) | `mape` | 0 to infinity | Average of the absolute percentage differences between predicted and true values |
| Weighted Mean Absolute Percentage Error (WMAPE) | `wmape` | 0 to infinity | Weighted average of the absolute percentage differences between predicted and true values |
| R-squared (R²) | `r2` | -infinity to 1 | Proportion of variance in the dependent variable explained by the independent variables |
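
The regression metrics above can be sketched with textbook formulas. This example is illustrative, not Fiddler's implementation, and rests on several assumptions: MAPE and WMAPE are returned as fractions instead of percentages, WMAPE weights each error by the magnitude of the true value (total absolute error divided by total absolute true value), and no true value is zero. The function name `regression_metrics` is hypothetical.

```python
def regression_metrics(y_true, y_pred):
    """Textbook regression metrics; assumes no true value is zero and len(y_true) > 1."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE as a fraction (multiply by 100 for a percentage).
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    # Assumed weighting: total absolute error over total absolute true value.
    wmape = sum(abs(e) for e in errors) / sum(abs(t) for t in y_true)

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot  # can be negative for a model worse than the mean

    return {"mae": mae, "mse": mse, "mape": mape, "wmape": wmape, "r2": r2}

# Illustrative data only.
result = regression_metrics([100, 200, 400], [110, 180, 400])
```

Because R² subtracts the ratio of residual to total variance from 1, a model that predicts worse than the mean of the true values yields a negative score, which is why the range has no lower bound.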

Ranking

| Metric | API ID | Score Range | Description |
| --- | --- | --- | --- |
| Mean Average Precision (MAP) | `map` | 0 to 1 | Average precision of relevant items in the top-k results. For binary relevance ranking only. Supports configurable top_k. |
| Normalized Discounted Cumulative Gain (NDCG) | `ndcg_mean` | 0 to 1 | Quality of the ranking by discounting relevance scores at lower ranks. Supports configurable top_k. |
| Query Count | `query_count` | 0 to infinity | Number of ranking queries in the time period |
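
To make the top-k behavior concrete, here is a small sketch of per-query average precision and NDCG. It uses common textbook variants, which may differ from Fiddler's exact calculation: the DCG gain is the raw relevance score with a log2 rank discount, and average precision is normalized by the number of relevant items found in the top k. All function names are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal else 0.0

def average_precision_at_k(relevances, k):
    """Average of precision values at each relevant position in the top k (binary relevance)."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevances[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

# One query's results in ranked order: 1 = relevant, 0 = not relevant (illustrative).
ranked = [1, 0, 1, 0]
ap = average_precision_at_k(ranked, k=3)
ndcg = ndcg_at_k(ranked, k=3)
```

Averaging `average_precision_at_k` over all queries in a time window gives a mean average precision, and the number of queries averaged corresponds to what Query Count reports.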

Drift metrics

Drift metrics measure distributional changes between your baseline dataset and production data. High drift can indicate data pipeline issues or genuine shifts in the data distribution. Both metrics require a baseline dataset. For more details, see Data Drift.

| Metric | API ID | Score Range | Description |
| --- | --- | --- | --- |
| Jensen-Shannon Distance (JSD) | `jsd` | 0 to 1 | Distance between the baseline distribution and the production distribution for a given field, using configurable bins for numerical columns |
| Population Stability Index (PSI) | `psi` | 0 to infinity | Drift metric based on multinomial classification of a variable into configurable bins, comparing baseline and production distributions |

The drift analytics table also provides Feature Impact, Feature Drift, and Prediction Drift Impact as derived values to help identify which features contribute most to prediction drift.
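
Both JSD and PSI compare binned baseline counts against binned production counts. The sketch below uses the standard definitions and is not Fiddler's implementation: JSD is the square root of the Jensen-Shannon divergence with base-2 logarithms (which bounds it between 0 and 1), and PSI sums (production - baseline) * ln(production / baseline) over bins. The function names and the `eps` floor applied to empty bins in PSI are assumptions.

```python
import math

def to_proportions(counts):
    """Convert per-bin counts into proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def js_distance(baseline_counts, production_counts):
    """Jensen-Shannon distance with base-2 logs, so the result lies in [0, 1]."""
    p = to_proportions(baseline_counts)
    q = to_proportions(production_counts)
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a zero proportion contribute nothing to the divergence.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding error

def psi(baseline_counts, production_counts, eps=1e-6):
    """Population Stability Index; eps is an assumed floor that avoids log(0) on empty bins."""
    p = [max(x, eps) for x in to_proportions(baseline_counts)]
    q = [max(x, eps) for x in to_proportions(production_counts)]
    return sum((b - a) * math.log(b / a) for a, b in zip(p, q))

# Illustrative bin counts for one feature.
baseline = [50, 50]
production = [80, 20]
jsd_value = js_distance(baseline, production)
psi_value = psi(baseline, production)
```

Identical distributions give 0 for both metrics. Completely disjoint distributions give a JSD of 1, while PSI keeps growing as the bin proportions diverge.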

Data integrity metrics

Data integrity metrics detect violations in production data compared to the schema established during model onboarding. Fiddler tracks three violation types: missing values, type mismatches, and range violations. Both raw counts and percentages are available. For more details, see Data Integrity.

Count-based

| Metric | API ID | Description |
| --- | --- | --- |
| Any Violation | `any_violation_count` | Count of any data integrity violation across all features |
| Missing Value Violation | `null_violation_count` | Count of missing value violations across all features |
| Range Violation | `range_violation_count` | Count of range violations across all features |
| Type Violation | `type_violation_count` | Count of data type violations across all features |

Percentage-based

| Metric | API ID | Description |
| --- | --- | --- |
| % Any Violation | `any_violation_percentage` | Percentage of events with any data integrity violation |
| % Missing Value Violation | `null_violation_percentage` | Percentage of events with missing value violations |
| % Range Violation | `range_violation_percentage` | Percentage of events with range violations |
| % Type Violation | `type_violation_percentage` | Percentage of events with data type violations |
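
The difference between count-based and percentage-based metrics can be illustrated with a small checker. This is a sketch under stated assumptions, not Fiddler's logic: the `schema` dictionary format and the function name `integrity_summary` are hypothetical, counts are tallied per feature value, percentages are computed per event, and each value is assigned at most one violation type (missing, then type, then range).

```python
def integrity_summary(events, schema):
    """Tally missing-value, type, and range violations against a hypothetical schema."""
    counts = {"null": 0, "type": 0, "range": 0}
    events_with_any = 0

    for event in events:
        violated = False
        for column, rule in schema.items():
            value = event.get(column)
            if value is None:
                counts["null"] += 1
            elif not isinstance(value, rule["type"]):
                counts["type"] += 1
            elif "min" in rule and not (rule["min"] <= value <= rule["max"]):
                counts["range"] += 1
            else:
                continue  # value is valid, check the next column
            violated = True
        if violated:
            events_with_any += 1

    total = len(events)
    return {
        "null_violation_count": counts["null"],
        "type_violation_count": counts["type"],
        "range_violation_count": counts["range"],
        "any_violation_count": sum(counts.values()),
        "any_violation_percentage": 100.0 * events_with_any / total if total else 0.0,
    }

# Illustrative schema and events.
schema = {
    "age": {"type": (int, float), "min": 0, "max": 120},
    "country": {"type": str},
}
events = [
    {"age": 34, "country": "US"},
    {"age": None, "country": "DE"},
    {"age": 150, "country": "FR"},
    {"age": "n/a", "country": None},
]
summary = integrity_summary(events, schema)
```

In this example the last event contains two violations, so the any-violation count (4) exceeds the number of violating events (3), while the percentage stays at 75% of events.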

Traffic metrics

Traffic metrics provide visibility into the operational health of your model service. For more details, see Traffic.

| Metric | API ID | Description |
| --- | --- | --- |
| Traffic | `traffic` | Volume of inference requests received by the model over time |

Statistics metrics

Statistics metrics provide basic aggregations over columns. These are useful for monitoring custom metadata fields over time. For more details, see Statistics.

| Metric | API ID | Applies To | Description |
| --- | --- | --- | --- |
| Average | `average` | Numeric columns | Arithmetic mean of a numeric column |
| Sum | `sum` | Numeric columns | Sum of a numeric column |
| Frequency | `frequency` | Categorical / Boolean columns | Count of occurrences for each value |
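
For reference, the three aggregations amount to the following in plain Python. This is only an illustrative sketch: the function name `column_statistics`, the row format, and the choice to skip `None` values when averaging are assumptions, not documented Fiddler behavior.

```python
from collections import Counter

def column_statistics(rows, numeric_column, categorical_column):
    """Compute sum and average of one column and value frequencies of another."""
    # Assumption: missing (None) numeric values are excluded from sum and average.
    values = [row[numeric_column] for row in rows if row[numeric_column] is not None]
    total = sum(values)
    return {
        "sum": total,
        "average": total / len(values) if values else None,
        "frequency": dict(Counter(row[categorical_column] for row in rows)),
    }

# Illustrative metadata rows.
rows = [
    {"latency_ms": 120, "region": "us"},
    {"latency_ms": 80, "region": "eu"},
    {"latency_ms": 100, "region": "us"},
]
stats = column_statistics(rows, "latency_ms", "region")
```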

Custom metrics

In addition to the built-in metrics above, you can define custom metrics using the Fiddler Query Language (FQL). Custom metrics support aggregations, operators, and metric functions to create business-specific KPIs. For details on creating and managing custom metrics, see: