Confusion matrix

A confusion matrix is a table layout that allows visualization of the performance of a classification algorithm: each row represents the instances of an actual class and each column represents the instances of a predicted class, or vice versa.

A confusion matrix is often used to describe the performance of a classification model on a set of test data for which the true values are known. It is most commonly presented for binary classification, where predictions fall into two categories (e.g., positive and negative), though it generalizes to any number of classes.

Components of a Confusion Matrix:

A confusion matrix is composed of four different combinations of predicted and actual classes:

  1. True Positive (TP): The number of samples that were correctly predicted as positive.
  2. True Negative (TN): The number of samples that were correctly predicted as negative.
  3. False Positive (FP): Also known as a Type I error, the number of samples that were incorrectly predicted as positive (a false alarm).
  4. False Negative (FN): Also known as a Type II error, the number of samples that were incorrectly predicted as negative (a miss).
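
The four counts above can be tallied directly from the label lists. A minimal sketch, assuming binary labels where 1 is the positive class (the helper name `confusion_counts` is illustrative, not a library function):

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # Type I error
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # Type II error
    return tp, tn, fp, fn

print(confusion_counts([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]))  # (2, 2, 0, 1)
```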

Interpretation:

From these four counts, the standard evaluation metrics can be derived: accuracy = (TP + TN) / (TP + TN + FP + FN), precision = TP / (TP + FP), recall = TP / (TP + FN), and the F1 score, the harmonic mean of precision and recall. A model that raises many false alarms has low precision; a model that misses many positives has low recall.

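These derived metrics can be sketched as plain arithmetic on the four cells (the counts below are illustrative, not from any particular dataset):

```python
# Illustrative cell counts of a binary confusion matrix.
tp, tn, fp, fn = 2, 2, 0, 1

accuracy = (tp + tn) / (tp + tn + fp + fn)  # fraction of all predictions that are correct
precision = tp / (tp + fp)                  # of predicted positives, how many are truly positive
recall = tp / (tp + fn)                     # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(accuracy, precision, recall, f1)
```
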
Python Implementation (using scikit-learn):

from sklearn.metrics import confusion_matrix

# Example ground truth and predicted labels
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Calculate confusion matrix
conf_matrix = confusion_matrix(y_true, y_pred)

print("Confusion Matrix:")
print(conf_matrix)

In this example, y_true contains the true labels of the samples and y_pred contains the predicted labels. We calculate the confusion matrix using the confusion_matrix function from scikit-learn's metrics module. For binary labels the result is a 2x2 array ordered as [[TN, FP], [FN, TP]], so here it is [[2, 0], [1, 2]]: two true negatives, zero false positives, one false negative, and two true positives.
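
When the individual counts are needed, the 2x2 result can be unpacked in one step with NumPy's ravel, which flattens the array in the [[TN, FP], [FN, TP]] order scikit-learn uses:

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Flatten [[TN, FP], [FN, TP]] into the four individual counts.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 2 0 1 2
```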