F1 score
Created | |
---|---|
Tags | Metrics |
The F1 score is a commonly used metric for evaluating the performance of classification models. It combines precision and recall into a single score that balances the two: the F1 score is the harmonic mean of precision and recall, calculated with the following formula:

$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
where:
- Precision is the ratio of true positive predictions to all positive predictions, TP / (TP + FP); it measures how many of the predicted positives are actually positive.
- Recall (also known as sensitivity) is the ratio of true positive predictions to all actual positives, TP / (TP + FN); it measures how many of the actual positives were correctly identified.
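As a concrete check of the formula, here is a minimal sketch that computes precision, recall, and the F1 score directly from raw counts. The helper `f1_from_counts` is purely illustrative, not a library function:

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Compute the F1 score from true positive, false positive,
    and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# 2 true positives, 0 false positives, 1 false negative:
# precision = 1.0, recall = 2/3, F1 = 0.8
print(f1_from_counts(tp=2, fp=0, fn=1))
```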
Interpretation:
- The F1 score ranges from 0 to 1, where a higher score indicates better model performance.
- A perfect F1 score of 1 means perfect precision and recall, indicating that all positive predictions are correct, and all actual positives are correctly identified.
- A lower F1 score indicates that either precision or recall (or both) is low, suggesting that the model's performance needs improvement.
- The F1 score provides a balance between precision and recall, making it useful when there is an uneven class distribution or when both false positives and false negatives are equally important.
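The last point is easiest to see on imbalanced data. Here is a small sketch, using a hypothetical 9-to-1 class split, where accuracy looks strong while the F1 score exposes a model that never finds the positive class:

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical imbalanced toy data: 9 negatives, 1 positive
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# A degenerate model that always predicts the majority class
y_pred = [0] * 10

print("Accuracy:", accuracy_score(y_true, y_pred))             # 0.9
# zero_division=0 returns 0.0 instead of warning when TP + FP == 0
print("F1 Score:", f1_score(y_true, y_pred, zero_division=0))  # 0.0
```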
Python Implementation (using scikit-learn):
```python
from sklearn.metrics import f1_score

# Example ground truth and predicted labels
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Calculate the F1 score
f1 = f1_score(y_true, y_pred)
print("F1 Score:", f1)
```
In this example, `y_true` contains the true labels of the samples and `y_pred` contains the predicted labels; the F1 score is computed with the `f1_score` function from scikit-learn's metrics module. For these labels there are 2 true positives, 0 false positives, and 1 false negative, so precision is 1.0, recall is 2/3, and the printed F1 score is 0.8.
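The example above is binary, where `f1_score`'s default `average='binary'` applies. For multiclass labels an averaging strategy has to be chosen explicitly; a brief sketch with arbitrary example labels:

```python
from sklearn.metrics import f1_score

# Arbitrary multiclass example labels
y_true = [0, 0, 0, 1, 2, 2]
y_pred = [0, 0, 1, 1, 2, 1]

# 'macro': unweighted mean of the per-class F1 scores
print("Macro F1:   ", f1_score(y_true, y_pred, average='macro'))
# 'weighted': per-class F1 scores weighted by class support
print("Weighted F1:", f1_score(y_true, y_pred, average='weighted'))
```

Macro averaging treats every class equally, while weighted averaging favors the majority classes; which is appropriate depends on how much the minority classes matter.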