Fiddler provides 35 built-in metrics for monitoring ML models in production. These metrics cover model performance, data drift, data integrity, traffic, and statistics. You can also define custom metrics using the Fiddler Query Language.
Performance metrics
Performance metrics measure how well a model performs on its task. The available metrics depend on the model task type. For more details on performance monitoring workflows, see Performance Tracking.
Binary classification
| Metric | API ID | Score Range | Description |
|---|---|---|---|
| Accuracy | accuracy | 0 — 1 | (TP + TN) / (TP + TN + FP + FN) |
| Log Loss | log_loss | 0 — infinity | Measures the difference between the predicted probability distribution and the true distribution |
| Precision | precision | 0 — 1 | TP / (TP + FP). Requires a decision threshold. |
| Recall / True Positive Rate | recall | 0 — 1 | TP / (TP + FN). Requires a decision threshold. |
| F1 Score | f1_score | 0 — 1 | 2 * (Precision * Recall) / (Precision + Recall). Requires a decision threshold. |
| False Positive Rate | fpr | 0 — 1 | FP / (FP + TN). Requires a decision threshold. |
| AUC | auc | 0 — 1 | Area Under the ROC Curve (histogram-based calculation). See also AUROC. |
| AUROC | auroc | 0 — 1 | Area Under the Receiver Operating Characteristic curve, plotting true positive rate against false positive rate |
| Expected Calibration Error | expected_calibration_error | 0 — 1 | Measures the difference between predicted probabilities and empirical probabilities |
| Geometric Mean | geometric_mean | 0 — 1 | Square root of (Precision * Recall). Requires a decision threshold. |
| Calibrated Threshold | calibrated_threshold | 0 — 1 | A threshold that balances precision and recall at a particular operating point |
| Data Count | data_count | 0 — infinity | The number of events where target and output are both not NULL. Used as the denominator for accuracy calculations. |
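
The threshold-dependent formulas in the table above can be sketched in plain Python. This is an illustrative sketch, not Fiddler's implementation: the function names, the `>=` threshold comparison, and the choice to return 0.0 when a denominator is zero are all assumptions made for the example.

```python
import math


def confusion_counts(scores, labels, threshold=0.5):
    """Count TP, FP, TN, FN by thresholding predicted probabilities.

    Assumption: a score equal to the threshold counts as a positive prediction.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Apply the table's formulas to confusion-matrix counts."""

    def safe_div(numerator, denominator):
        # Assumption: an undefined ratio is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1_score": safe_div(2 * precision * recall, precision + recall),
        "fpr": safe_div(fp, fp + tn),
        # Geometric mean as defined in the table: sqrt(Precision * Recall).
        "geometric_mean": math.sqrt(precision * recall),
    }
```

Changing `threshold` changes the confusion counts, which is why precision, recall, F1, false positive rate, and geometric mean all require a decision threshold while log loss does not.
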
Multi-class classification
| Metric | API ID | Score Range | Description |
|---|---|---|---|
| Accuracy | accuracy | 0 — 1 | (Number of correctly classified samples) / Data Count |
| Log Loss | log_loss | 0 — infinity | Measures the difference between the predicted probability distribution and the true distribution, on a logarithmic scale |
| Log Loss Count | log_loss_count | 0 — infinity | Count of events used in the Log Loss calculation |
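
As a rough illustration of the multi-class definitions above, the sketch below computes accuracy and log loss from per-class probabilities. The input layout (one dict of class probabilities per event), the function names, and the clipping constant `eps` are assumptions for the example, not part of the Fiddler API.

```python
import math


def multiclass_accuracy(probabilities, targets):
    """Fraction of events whose highest-probability class matches the target."""
    correct = sum(
        1
        for probs, target in zip(probabilities, targets)
        if max(probs, key=probs.get) == target
    )
    return correct / len(targets)


def multiclass_log_loss(probabilities, targets, eps=1e-15):
    """Mean negative log of the probability assigned to the true class.

    Probabilities are clipped to [eps, 1 - eps] so that a predicted
    probability of exactly 0 does not produce an infinite loss.
    """
    total = 0.0
    for probs, target in zip(probabilities, targets):
        p = min(max(probs[target], eps), 1 - eps)
        total -= math.log(p)
    return total / len(targets)
```

Both functions divide by the number of events, so they expect at least one event. A confident wrong prediction contributes a large term to log loss, which is why the metric has no upper bound.
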
Regression
| Metric | API ID | Score Range | Description |
|---|---|---|---|
| Mean Absolute Error (MAE) | mae | 0 — infinity | Average of the absolute differences between predicted and true values |
| Mean Squared Error (MSE) | mse | 0 — infinity | Average of the squared differences between predicted and true values |
| Mean Absolute Percentage Error (MAPE) | mape | 0 — infinity | Average of the absolute percentage differences between predicted and true values |
| Weighted Mean Absolute Percentage Error (WMAPE) | wmape | 0 — infinity | Weighted average of the absolute percentage differences between predicted and true values |
| R-squared (R²) | r2 | -infinity — 1 | Proportion of variance in the dependent variable explained by the independent variables |
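
The regression metrics above can be written out as follows. This sketch uses common textbook definitions, which may differ in detail from Fiddler's calculation: MAPE is returned as a fraction rather than multiplied by 100, and WMAPE is taken as the sum of absolute errors divided by the sum of absolute true values. The function name is hypothetical.

```python
def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, MAPE, WMAPE, and R-squared for paired values.

    Assumes non-empty inputs, non-zero true values (for MAPE and WMAPE),
    and true values that are not all identical (for R-squared).
    """
    n = len(y_true)
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    sq_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    total_variation = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs_errors) / n,
        "mse": sum(sq_errors) / n,
        # Each error is scaled by its own true value, then averaged.
        "mape": sum(e / abs(t) for e, t in zip(abs_errors, y_true)) / n,
        # Errors are pooled first, so large true values carry more weight.
        "wmape": sum(abs_errors) / sum(abs(t) for t in y_true),
        # 1 minus residual variation over total variation; negative when
        # the model is worse than always predicting the mean.
        "r2": 1 - sum(sq_errors) / total_variation,
    }
```

For example, predicting 3 for every event when the true values are 1, 2, and 3 yields an R² of -1.5, which illustrates why the score range for R² has no lower bound.
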
Ranking
| Metric | API ID | Score Range | Description |
|---|---|---|---|
| Mean Average Precision (MAP) | map | 0 — 1 | Average precision of relevant items in the top-k results. For binary relevance ranking only. Supports configurable top_k. |
| Normalized Discounted Cumulative Gain (NDCG) | ndcg_mean | 0 — 1 | Quality of the ranking by discounting relevance scores at lower ranks. Supports configurable top_k. |
| Query Count | query_count | 0 — infinity | Number of ranking queries in the time period |
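
To make the effect of `top_k` concrete, here is a minimal sketch of per-query average precision and NDCG, averaged over queries. Conventions for both metrics vary, and the ones used here are assumptions rather than Fiddler's documented behavior: linear gain with a log2 position discount for NDCG, and average precision normalized by the number of relevant items found in the top k. All function names are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k positions (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def average_precision_at_k(relevances, k):
    """Average of precision@i over relevant positions i within the top k.

    Expects binary relevance (0 or 1) in ranked order.
    """
    hits = 0
    precision_sum = 0.0
    for position, rel in enumerate(relevances[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def mean_over_queries(metric, queries, k):
    """Average a per-query metric across all ranking queries."""
    return sum(metric(query, k) for query in queries) / len(queries)
```

Each query is a list of relevance labels in the order the model ranked the items. Calling `mean_over_queries(average_precision_at_k, queries, k)` gives a MAP-style value, and passing `ndcg_at_k` gives a mean NDCG.
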
Drift metrics
Drift metrics measure distributional changes between your baseline dataset and production data. High drift can indicate data pipeline issues or genuine shifts in the data distribution. Both metrics require a baseline dataset. For more details, see Data Drift.
| Metric | API ID | Score Range | Description |
|---|---|---|---|
| Jensen-Shannon Distance (JSD) | jsd | 0 — 1 | Distance between the baseline distribution and the production distribution for a given field, using configurable bins for numerical columns |
| Population Stability Index (PSI) | psi | 0 — infinity | Drift metric based on multinomial classification of a variable into configurable bins, comparing baseline and production distributions |
The drift analytics table also provides Feature Impact, Feature Drift, and Prediction Drift Impact as derived values to help identify which features contribute most to prediction drift.
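
Both drift metrics compare binned distributions, which the sketch below illustrates for a single field. It assumes the baseline and production values have already been counted into the same bins. The small smoothing constant for empty bins, the base-2 logarithm for JSD (which keeps the distance within 0 to 1), and the function names are assumptions for the example, not Fiddler's implementation.

```python
import math


def to_distribution(counts, eps=1e-6):
    """Normalize bin counts to probabilities, smoothing empty bins."""
    smoothed = [count + eps for count in counts]
    total = sum(smoothed)
    return [count / total for count in smoothed]


def kl_divergence(p, q, base=2):
    """Kullback-Leibler divergence of p from q over matching bins."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(baseline_counts, production_counts):
    """Square root of the Jensen-Shannon divergence (base 2)."""
    p = to_distribution(baseline_counts)
    q = to_distribution(production_counts)
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
    return math.sqrt(max(divergence, 0.0))


def population_stability_index(baseline_counts, production_counts):
    """Sum over bins of (production - baseline) * ln(production / baseline)."""
    p = to_distribution(baseline_counts)
    q = to_distribution(production_counts)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

Identical bin counts give a value of 0 for both metrics. As the distributions separate, JSD approaches 1 while PSI keeps growing, which matches the score ranges in the table.
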
Data integrity metrics
Data integrity metrics detect violations in production data compared to the schema established during model onboarding. Fiddler tracks three violation types: missing values, type mismatches, and range violations. Both raw counts and percentages are available. For more details, see Data Integrity.
Count-based
| Metric | API ID | Description |
|---|---|---|
| Any Violation | any_violation_count | Count of any data integrity violation across all features |
| Missing Value Violation | null_violation_count | Count of missing value violations across all features |
| Range Violation | range_violation_count | Count of range violations across all features |
| Type Violation | type_violation_count | Count of data type violations across all features |
Percentage-based
| Metric | API ID | Description |
|---|---|---|
| % Any Violation | any_violation_percentage | Percentage of events with any data integrity violation |
| % Missing Value Violation | null_violation_percentage | Percentage of events with missing value violations |
| % Range Violation | range_violation_percentage | Percentage of events with range violations |
| % Type Violation | type_violation_percentage | Percentage of events with data type violations |
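
The difference between count-based and percentage-based integrity metrics can be sketched as follows. The schema format, the function name, and the result keys are hypothetical and chosen only for illustration. The sketch assumes counts are tallied per feature value while the percentage is taken over events, and it classifies each value under at most one violation type.

```python
def integrity_report(events, schema):
    """Tally missing, type, and range violations for a batch of events.

    `schema` maps a feature name to {"type": <Python type>} with an
    optional "range": (low, high). This format is hypothetical.
    """
    counts = {"null": 0, "type": 0, "range": 0}
    events_with_violation = 0
    for event in events:
        violated = False
        for name, spec in schema.items():
            value = event.get(name)
            if value is None:
                kind = "null"
            elif not isinstance(value, spec["type"]):
                kind = "type"
            elif "range" in spec and not (
                spec["range"][0] <= value <= spec["range"][1]
            ):
                kind = "range"
            else:
                continue  # value conforms to the schema
            counts[kind] += 1
            violated = True
        if violated:
            events_with_violation += 1
    percentage = 100.0 * events_with_violation / len(events) if events else 0.0
    return {
        "violation_counts": counts,
        "events_with_violation": events_with_violation,
        "percent_events_with_violation": percentage,
    }
```

A single event with two bad features adds two to the violation counts but only one to the number of affected events, so the counts and the percentage can move independently.
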
Traffic metrics
Traffic metrics provide visibility into the operational health of your model service. For more details, see Traffic.
| Metric | API ID | Description |
|---|---|---|
| Traffic | traffic | Volume of inference requests received by the model over time |
Statistics metrics
Statistics metrics provide basic aggregations over columns. These are useful for monitoring custom metadata fields over time. For more details, see Statistics.
| Metric | API ID | Applies To | Description |
|---|---|---|---|
| Average | average | Numeric columns | Arithmetic mean of a numeric column |
| Sum | sum | Numeric columns | Sum of a numeric column |
| Frequency | frequency | Categorical / Boolean columns | Count of occurrences for each value |
Custom metrics
In addition to the built-in metrics above, you can define custom metrics using the Fiddler Query Language (FQL). Custom metrics support aggregations, operators, and metric functions to create business-specific KPIs.
For details on creating and managing custom metrics, see: