F1 score
Created | |
---|---|
Tags | Metrics |
The F1 score is a commonly used metric for evaluating the performance of classification models. It combines precision and recall into a single score that balances the two: the F1 score is the harmonic mean of precision and recall, calculated with the following formula:

$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
where:
- Precision is the ratio of true positive predictions to all positive predictions, TP / (TP + FP); it measures how many of the predicted positives are actually positive.
- Recall (also known as sensitivity) is the ratio of true positive predictions to all actual positives, TP / (TP + FN); it measures how many of the actual positives were correctly identified.
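As a concrete check of the formula, here is a minimal sketch that computes precision, recall, and the F1 score directly from raw counts. The helper `f1_from_counts` is purely illustrative, not a library function:

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Compute the F1 score from true positive, false positive,
    and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# 2 true positives, 0 false positives, 1 false negative:
# precision = 1.0, recall = 2/3, F1 = 0.8
print(f1_from_counts(tp=2, fp=0, fn=1))
```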
Interpretation:
- The F1 score ranges from 0 to 1, where a higher score indicates better model performance.
- A perfect F1 score of 1 means perfect precision and recall, indicating that all positive predictions are correct, and all actual positives are correctly identified.
- A lower F1 score indicates that either precision or recall (or both) is low, suggesting that the model's performance needs improvement.
- The F1 score provides a balance between precision and recall, making it useful when there is an uneven class distribution or when both false positives and false negatives are equally important.
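The last point is easiest to see on imbalanced data. Here is a small sketch, using a hypothetical 9-to-1 class split, where accuracy looks strong while the F1 score exposes a model that never finds the positive class:

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical imbalanced toy data: 9 negatives, 1 positive
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# A degenerate model that always predicts the majority class
y_pred = [0] * 10

print("Accuracy:", accuracy_score(y_true, y_pred))             # 0.9
# zero_division=0 returns 0.0 instead of warning when TP + FP == 0
print("F1 Score:", f1_score(y_true, y_pred, zero_division=0))  # 0.0
```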
Python Implementation (using scikit-learn):
```python
from sklearn.metrics import f1_score

# Example ground truth and predicted labels
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Calculate the F1 score
f1 = f1_score(y_true, y_pred)
print("F1 Score:", f1)
```
In this example, `y_true` contains the true labels of the samples and `y_pred` contains the predicted labels; the F1 score is computed with the `f1_score` function from scikit-learn's metrics module. For these labels there are 2 true positives, 0 false positives, and 1 false negative, so precision is 1.0, recall is 2/3, and the printed F1 score is 0.8.
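The example above is binary, where `f1_score`'s default `average='binary'` applies. For multiclass labels an averaging strategy has to be chosen explicitly; a brief sketch with arbitrary example labels:

```python
from sklearn.metrics import f1_score

# Arbitrary multiclass example labels
y_true = [0, 0, 0, 1, 2, 2]
y_pred = [0, 0, 1, 1, 2, 1]

# 'macro': unweighted mean of the per-class F1 scores
print("Macro F1:   ", f1_score(y_true, y_pred, average='macro'))
# 'weighted': per-class F1 scores weighted by class support
print("Weighted F1:", f1_score(y_true, y_pred, average='weighted'))
```

Macro averaging treats every class equally, while weighted averaging favors the majority classes; which is appropriate depends on how much the minority classes matter.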