F-beta Score
Created | |
---|---|
Tags | Metrics |
The F-beta score is a metric used to evaluate the performance of a binary classification model, taking into account both precision and recall. It's a generalization of the F1 score, allowing you to give more or less emphasis to precision or recall depending on the value of the beta parameter.
Formula:
The F-beta score is calculated using the following formula:
\[
F_\beta = (1 + \beta^2) \cdot \frac{\text{Precision} \cdot \text{Recall}}{\beta^2 \cdot \text{Precision} + \text{Recall}}
\]
where:
- Precision = \(\frac{TP}{TP + FP}\)
- Recall = \(\frac{TP}{TP + FN}\)
- \(\beta\) is a parameter that determines the weight of recall in the calculation.
- If \(\beta = 1\), it is the same as the F1 score.
- If \(\beta > 1\), recall is given more emphasis (favoring sensitivity).
- If \(0 < \beta < 1\), precision is given more emphasis.
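Following the definitions above, the score can be computed directly from confusion-matrix counts. A minimal from-scratch sketch (the counts here are illustrative, not from a real model):

```python
# Illustrative confusion-matrix counts (assumed for this sketch).
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)  # 8 / 10 = 0.8
recall = tp / (tp + fn)     # 8 / 12 ~ 0.667

# F-beta with beta = 2, so recall counts four times as much as precision.
beta = 2
b2 = beta ** 2
fbeta = (1 + b2) * precision * recall / (b2 * precision + recall)
print(fbeta)
```

Because beta squared multiplies precision in the denominator, raising beta pulls the score toward recall.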
Interpretation:
- When \(\beta = 1\), it's the same as the F1 score, which balances precision and recall equally.
- When \(\beta > 1\), the emphasis is on recall, which is useful when false negatives are more costly than false positives.
- When \(0 < \beta < 1\), the emphasis is on precision, which is useful when false positives are more costly than false negatives.
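The effect of \(\beta\) is easiest to see when precision and recall differ. A small sketch with fixed precision/recall values (0.9 and 0.5, chosen for illustration):

```python
# Assumed precision/recall values, chosen so the F-beta variants diverge.
precision, recall = 0.9, 0.5

def fbeta(precision, recall, beta):
    """F-beta from precision and recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

f1 = fbeta(precision, recall, 1)     # balances the two
f2 = fbeta(precision, recall, 2)     # dragged down toward the weaker recall
f05 = fbeta(precision, recall, 0.5)  # pulled up toward the stronger precision
print(f05, f1, f2)
```

With precision above recall, F0.5 > F1 > F2: each score leans toward the quantity its \(\beta\) emphasizes.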
Python Implementation (using scikit-learn):
```python
from sklearn.metrics import fbeta_score, precision_score, recall_score

# Example predictions and ground truth labels
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 1, 1]

# Calculate F-beta with beta=1 (equivalent to the F1 score)
f1 = fbeta_score(y_true, y_pred, beta=1)
print("F1 Score:", f1)

# Calculate F2 score (favoring recall)
f2 = fbeta_score(y_true, y_pred, beta=2)
print("F2 Score (favoring recall):", f2)

# Calculate F0.5 score (favoring precision)
f05 = fbeta_score(y_true, y_pred, beta=0.5)
print("F0.5 Score (favoring precision):", f05)

# Calculate precision and recall separately
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
print("Precision:", precision)
print("Recall:", recall)
```
Output:

```
F1 Score: 0.6666666666666666
F2 Score (favoring recall): 0.6666666666666666
F0.5 Score (favoring precision): 0.6666666666666666
Precision: 0.6666666666666666
Recall: 0.6666666666666666
```

All three F-beta scores coincide here because precision equals recall (both 2/3): when precision = recall = \(x\), the formula reduces to \(F_\beta = x\) for every \(\beta\).
Use Cases:
- Medical Diagnosis: In medical diagnosis, false negatives (missing a disease) may be more costly than false positives (incorrectly diagnosing a healthy patient). Thus, a higher weight on recall (higher \(\beta\) value) may be appropriate.
- Spam Detection: In spam detection, false positives (marking a legitimate email as spam) are more costly than false negatives (missing a spam email). Thus, a higher weight on precision (lower \(\beta\) value) may be appropriate.
The choice of \(\beta\) depends on the specific context and the relative importance of precision and recall in the problem domain.
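One practical consequence of this choice: when picking a decision threshold for a probabilistic classifier, optimizing F2 and optimizing F0.5 can select very different operating points. A sketch with hypothetical scores and labels (all values here are assumed for illustration):

```python
# Hypothetical predicted probabilities and true labels.
y_true = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]
y_prob = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

def fbeta_at(threshold, beta):
    """F-beta of the hard predictions obtained at a given threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

thresholds = [0.05, 0.25, 0.45, 0.65, 0.85]
best_f2 = max(thresholds, key=lambda t: fbeta_at(t, beta=2))     # recall-leaning
best_f05 = max(thresholds, key=lambda t: fbeta_at(t, beta=0.5))  # precision-leaning
print(best_f2, best_f05)
```

On this data the F2-optimal threshold is the lowest one (flag almost everything, miss nothing), while the F0.5-optimal threshold is the highest (flag only the most confident cases), mirroring the medical-diagnosis and spam-detection trade-offs above.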