Broadcasting and boolean masks

Intermediate Python for Data Science
Created by Best · 24.06.2026 at 14:03 UTC

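Broadcasting is NumPy's rule for combining arrays of different shapes. Two shapes are compatible when, compared from the last dimension backwards, each pair of sizes is either equal or one of them is 1 (a missing leading dimension counts as 1). NumPy then "stretches" the smaller array to match the larger, without copying its data. The everyday use is centring a table by subtracting a per-column mean: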

import numpy as np
X = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])
col_mean = X.mean(axis=0)     # shape (2,) -> one mean per column
centered = X - col_mean       # (3,2) minus (2,) broadcasts down every row

The (2,) vector of column means is applied to each of the three rows automatically. Once you see it, you stop writing the double loop you'd have written a couple of topics ago.
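To see where the rule bites, here is a minimal sketch that reuses X from above. It assumes you also want per-row means, which the card itself does not cover; the names row_mean, row_mean_2d and stretched are illustrative only.

```python
import numpy as np

X = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])

# Shapes are compared from the right: (3, 2) against (2,) is compatible.
print(np.broadcast_shapes(X.shape, (2,)))    # (3, 2)

# Per-row means have shape (3,); the trailing sizes 2 and 3 do not match.
row_mean = X.mean(axis=1)
try:
    X - row_mean
except ValueError as err:
    print("incompatible:", err)

# keepdims=True keeps the reduced axis with size 1, giving shape (3, 1),
# which broadcasts across the two columns.
row_mean_2d = X.mean(axis=1, keepdims=True)
row_centered = X - row_mean_2d

# The "stretch" is a view, not a copy: the repeated axis has stride 0.
stretched = np.broadcast_to(X.mean(axis=0), X.shape)
print(stretched.strides)                     # (0, 8) for float64
```

The zero stride is what "without copying" means in practice: the same two numbers are read again for every row.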

A boolean mask is how you express a filter. Compare an array to something and you get back an array of True/False of the same shape, which you can then count, select with, or assign through:

scores = np.array([0.9, 0.6, 0.2, 0.1])
pred = scores >= 0.5          # array([ True,  True, False, False])
scores[pred]                  # array([0.9, 0.6]) -- the selected values
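The snippet above shows selection only. The other two uses mentioned, counting and assigning through a mask, can be sketched as follows; the 0.5 threshold is carried over from above, and clipped is just an illustrative name.

```python
import numpy as np

scores = np.array([0.9, 0.6, 0.2, 0.1])
pred = scores >= 0.5

# Count: True behaves as 1 and False as 0 in arithmetic.
n_selected = pred.sum()     # 2
frac = pred.mean()          # 0.5, the share of True values

# Assign through a mask: only positions where the mask is True change.
clipped = scores.copy()     # copy so the original scores stay intact
clipped[scores < 0.5] = 0.0
print(clipped)              # [0.9 0.6 0.  0. ]
```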

This single idea, turning a question into a True/False array, is the core analyst pattern, and it is also exactly how prediction metrics (precision, recall, and the rest) are computed later in this course.

This card builds on “Reductions, statistics, and views vs copies” and leads into “Counting matches with masks; &, |, ~ and np.where”.
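As a rough preview of how a metric falls out of masks, here is a sketch under stated assumptions: the labels in y_true are made up for illustration and are not from this card, and the & operator is only introduced properly in the next card.

```python
import numpy as np

scores = np.array([0.9, 0.6, 0.2, 0.1])
y_true = np.array([True, False, True, False])   # hypothetical ground-truth labels
pred = scores >= 0.5                            # predicted positives

# True positives: predicted positive AND actually positive.
tp = np.sum(pred & y_true)

precision = tp / pred.sum()     # share of predicted positives that are correct
recall = tp / y_true.sum()      # share of actual positives that were found
print(precision, recall)        # 0.5 0.5
```

Each metric here is just a count of one mask divided by a count of another.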

Related cards
Builds on Reductions, statistics, and views vs copies · Python for Data Science
Next Counting matches with masks; &, |, ~ and np.where · Python for Data Science
Tasks
Question 1

What does broadcasting let you do in NumPy?

Question 2

What is the result of scores >= 0.5 when scores is a NumPy array?

Question 3

Select every true statement about NumPy boolean masks.

Select all that apply.
Card Info
  • Topic: Python for Data Science
  • Difficulty: Intermediate