Confusion matrix, precision, and recall

Intermediate Python for Data Science
Created by Best · 24.06.2026 at 14:03 UTC

A model is only as trustworthy as your ability to judge it. Every prediction falls into one of four boxes. Fix one class as the positive class — the one you care about, say "spam" or "fraud" — and lay out the confusion matrix:

                     Predicted positive                     Predicted negative
Actually positive    TP (true positive)                     FN (false negative -- a miss)
Actually negative    FP (false positive -- a false alarm)   TN (true negative)
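
The four cells can be counted directly from paired lists of actual and predicted labels. The following is a minimal sketch, not a reference implementation: the function name `confusion_counts` and the tiny "spam"/"ham" dataset are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fn, fp, tn) for the chosen positive class.

    Hypothetical helper for illustration; any label that is not
    `positive` is treated as negative.
    """
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            if predicted == positive:
                tp += 1  # real positive, caught
            else:
                fn += 1  # real positive, missed
        else:
            if predicted == positive:
                fp += 1  # real negative, flagged anyway (false alarm)
            else:
                tn += 1  # real negative, correctly left alone
    return tp, fn, fp, tn


# Made-up labels, positive class = "spam"
y_true = ["spam", "spam", "ham", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "ham", "spam", "spam", "ham"]

tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive="spam")
print(tp, fn, fp, tn)  # 2 1 1 2
```

The four counts always sum to the number of predictions, which is a quick sanity check on any confusion matrix.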

Naming these precisely is what lets you ask sharp questions instead of trusting a single headline number. A false negative is a real positive you failed to catch; a false positive is a real negative you flagged by mistake.

From the four counts come the two metrics that matter most:

  • Precision = TP / (TP + FP) — of everything we flagged, how much was right? Low precision means crying wolf.
  • Recall = TP / (TP + FN) — of all the real positives, how many did we catch? Low recall means missing the thing you cared about.
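
Both formulas translate to one line each once the counts are known. The sketch below assumes plain integer counts; the function names and the example numbers are illustrative, and returning `None` when a denominator is zero is one possible convention, not the only one.

```python
def precision(tp, fp):
    """TP / (TP + FP): of everything flagged, the share that was right."""
    flagged = tp + fp
    # Undefined when nothing was flagged; None is an illustrative choice.
    return tp / flagged if flagged else None


def recall(tp, fn):
    """TP / (TP + FN): of all real positives, the share that was caught."""
    real_positives = tp + fn
    # Undefined when there are no real positives; None is an illustrative choice.
    return tp / real_positives if real_positives else None


# Made-up counts: 8 caught, 2 false alarms, 8 missed
print(precision(tp=8, fp=2))  # 0.8
print(recall(tp=8, fn=8))     # 0.5
```

Here the flags are mostly right (precision 0.8), yet half of the real positives slip through (recall 0.5), which is why neither number is enough on its own.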

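They tend to pull in opposite directions: flagging more aggressively catches more real positives (higher recall) but also raises more false alarms (lower precision). Which one matters more depends on the cost of each kind of error: a spam filter that hides real email (a false positive, which lowers precision) is worse than one that lets a little spam through, while a cancer screen wants high recall even at the cost of false alarms.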
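
One way to see the tension is to score the same made-up data with a cautious and an eager flagging rule. This sketch assumes the model outputs a score per item and flags everything at or above a cutoff. That assumption, the function name `precision_recall_at`, and all numbers are illustrative only.

```python
def precision_recall_at(scores, is_positive, cutoff):
    """Flag every item with score >= cutoff, then return (precision, recall).

    Hypothetical helper; assumes at least one item is flagged and at
    least one real positive exists, so neither denominator is zero.
    """
    tp = fp = fn = 0
    for score, actual in zip(scores, is_positive):
        flagged = score >= cutoff
        if flagged and actual:
            tp += 1
        elif flagged:
            fp += 1
        elif actual:
            fn += 1
    return tp / (tp + fp), tp / (tp + fn)


# Made-up scores and ground truth (True = really positive)
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]
is_positive = [True, True, False, True, False, True, False, False]

cautious = precision_recall_at(scores, is_positive, cutoff=0.85)
eager = precision_recall_at(scores, is_positive, cutoff=0.35)

print(cautious)  # (1.0, 0.5): every flag is right, half the positives are missed
print(eager)     # precision drops to 4/6, recall rises to 1.0
```

The cautious rule would suit the spam filter, the eager one the cancer screen.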

Related cards
Builds on Handling missing data and validating input · Python for Data Science
Next Thresholds, accuracy's lie, and computing metrics · Python for Data Science
Tasks
Question 1

In a spam classifier (positive = spam), what is a false negative?

Question 2

Recall = TP / (TP + FN) measures...

Question 3

Select the quantities the confusion matrix counts.

Select all that apply.
Card Info
  • Topic: Python for Data Science
  • Difficulty: Intermediate