DataFrames: reading, filtering, adding columns

Intermediate Python for Data Science
Created by Best · 24.06.2026 at 14:03 UTC

pandas unifies everything so far into one object: the DataFrame, a table with named columns and a row index. If you know SQL, much of it maps directly. You usually start by reading a file:

import pandas as pd
df = pd.read_csv("events.csv")
df.head()        # peek at the first rows
df.dtypes        # what type did each column get?
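
If you do not have events.csv at hand, the same calls can be tried on an in-memory string. This is only a sketch: the column names and values below are made up for illustration and are not the real contents of events.csv.

```python
import io
import pandas as pd

# Hypothetical stand-in for events.csv; the columns and rows are assumptions.
csv_text = """user,status,score
ana,ok,10
ben,error,7
cy,ok,3
"""

# read_csv accepts a file-like object as well as a path
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))   # first two rows
print(df.dtypes)    # inferred type of each column
print(df.shape)     # (number of rows, number of columns)
```

Checking dtypes right after reading is a cheap way to notice a numeric column that was read as text.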

A single column is a Series; the whole table is a DataFrame. Think of a DataFrame as a dict of equal-length Series that share one index.
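
The dict-of-Series picture can be made concrete with a small sketch. The column names and values here are invented for illustration.

```python
import pandas as pd

# Two Series of equal length; both get the default index 0, 1, 2
status = pd.Series(["ok", "error", "ok"])
score = pd.Series([10, 7, 3])

# Build the table from a dict that maps column name to Series
df = pd.DataFrame({"status": status, "score": score})

col = df["score"]                   # selecting one column gives back a Series
print(type(df).__name__)            # DataFrame
print(type(col).__name__)           # Series
print(col.index.equals(df.index))   # True: the column shares the table's index
```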

You select rows with a boolean condition (the mask idea again, now with labels), and you derive new columns with assign, which returns a new DataFrame and leaves the original unchanged:

ok = df[df["status"] == "ok"]            # filter   (WHERE status = 'ok')
df2 = df.assign(score2=df["score"] * 2)  # add a column, original untouched

df["status"] == "ok" builds a boolean Series, and indexing the DataFrame with it keeps only the rows where it is True. This is the pandas version of the loop-and-filter you wrote in the very first topic.
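
To see each step separately, here is a minimal sketch on a tiny hand-made table. The data is invented; only the status and score column names come from the example above.

```python
import pandas as pd

# Small made-up table standing in for the events data
df = pd.DataFrame({"status": ["ok", "error", "ok"], "score": [10, 7, 3]})

mask = df["status"] == "ok"               # boolean Series, one value per row
ok = df[mask]                             # keeps rows where the mask is True
df2 = df.assign(score2=df["score"] * 2)   # new DataFrame with an extra column

print(mask.tolist())           # [True, False, True]
print(ok.index.tolist())       # [0, 2]  (original row labels are kept)
print(list(df.columns))        # ['status', 'score']  (df itself is unchanged)
print(df2["score2"].tolist())  # [20, 14, 6]
```

Note that the filtered result keeps the original row labels 0 and 2 rather than renumbering from 0.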

This card builds on “Counting matches with masks; &, |, ~ and np.where” and leads into “groupby, merge, and counting carefully”.

Tasks
Question 1

In pandas, a single column of a DataFrame is a...

Question 2

What does df[df["status"] == "ok"] return?

Question 3

Select every true statement about pandas DataFrames.

Select all that apply.