Basic types, records, and reading a CSV
Behind everything are Python's basic types, the vocabulary the rest of the course is built from:
- str: text, like "RND"
- int: whole numbers, like 7
- float: decimals, like 95000.0
- bool: True or False
- None: a special value meaning missing / no value
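One quick way to see these types for yourself is the built-in type() function. The sketch below reuses the example values listed above; the variable names are only illustrative and not part of the course data.

```python
# Example values from the list of basic types; the variable names are illustrative.
dept = "RND"         # str
count = 7            # int
salary = 95000.0     # float
is_fulltime = True   # bool
bonus = None         # None: missing / no value

values = (dept, count, salary, is_fulltime, bonus)

# Print each value next to the name of its type.
for value in values:
    print(repr(value), type(value).__name__)

# Collect the type names so they can be inspected as a list.
type_names = [type(value).__name__ for value in values]
```

Note that the type of None is reported as NoneType: None is the single value of its own type.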
That last one matters more than it looks. In real data a value is often simply absent — an unrecorded salary, a blank cell. Represent that with None, never with 0 or an empty string: a 0 would drag down an average, and an empty string pretends the value is a (blank) label. None says honestly that we don't know the value.
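To make the point about averages concrete, here is a minimal sketch with three hypothetical salaries, one of them unrecorded. It compares the mean of the known values with the mean you would get if the missing value had been stored as 0.

```python
# Hypothetical salaries; None marks the one that was never recorded.
salaries = [95000.0, None, 81000.0]

# Honest version: skip the missing value and average what we actually know.
known = [s for s in salaries if s is not None]
mean_known = sum(known) / len(known)

# Misleading version: pretend the missing salary was 0.
as_zero = [0 if s is None else s for s in salaries]
mean_zero = sum(as_zero) / len(as_zero)

print(mean_known)           # 88000.0
print(round(mean_zero, 2))  # 58666.67
```

The second figure is far lower only because a value we do not know was counted as if it were zero.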
It helps to picture a small table before any library is involved. Each row is one record; each column holds one type, with None standing in for a missing entry:
| name | age | salary | is_fulltime | bonus |
|---|---|---|---|---|
| Ada | 31 | 95000.0 | True | 5000.0 |
| Lin | 27 | 81000.0 | False | None |
| Sam | 44 | 120000.0 | True | None |
Reading down a column, every value shares a type (str names, int ages, float salaries, bool flags). Reading across a row gives one person's record. That is exactly the shape the next section builds with dictionaries and lists.
Once you have values you can describe a whole record. A row of a table — one employee, one sale, one day — is naturally a dictionary mapping column names to values:
row = {"id": 1, "dept": "RND", "salary": 95000}
print(row["dept"]) # RND
A whole table, then, is just a list of dictionaries — and that is exactly how you'll hold tabular data until pandas arrives much later. This is the mental model the entire course rests on: a table is rows of records, and each value has a type.
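As a sketch of that mental model, the small table of Ada, Lin, and Sam shown earlier can be written as a list of dictionaries. The variable names (employees, ages, missing_bonus) are illustrative choices, not names the course prescribes.

```python
# The example table as a list of records: one dict per row.
employees = [
    {"name": "Ada", "age": 31, "salary": 95000.0, "is_fulltime": True, "bonus": 5000.0},
    {"name": "Lin", "age": 27, "salary": 81000.0, "is_fulltime": False, "bonus": None},
    {"name": "Sam", "age": 44, "salary": 120000.0, "is_fulltime": True, "bonus": None},
]

# Reading across a row: one person's record.
print(employees[1]["name"])  # Lin

# Reading down a column: the same key from every record.
ages = []
for row in employees:
    ages.append(row["age"])
print(ages)  # [31, 27, 44]

# None makes missing entries easy to find.
missing_bonus = []
for row in employees:
    if row["bonus"] is None:
        missing_bonus.append(row["name"])
print(missing_bonus)  # ['Lin', 'Sam']
```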
Tables usually live in files, most commonly as CSV (comma-separated values). Python's csv module reads one straight into that list-of-dicts shape:
import csv
with open("data/employees.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))  # one dict per data row, keyed by the header
print(len(rows))  # how many records did we read?
Every piece earns its place: open(...) opens the file; newline="" leaves line-ending handling to the csv module, as its documentation recommends; encoding="utf-8" says how the text is stored; the with block guarantees the file is closed afterwards; and csv.DictReader uses the header line to turn each row into a dict keyed by column name. One catch: every value comes back as a string ("95000", not 95000) because a text file has no idea what a number is, so convert a value before doing arithmetic on it. With the table in memory you can already answer questions with a for loop and an if. (One practical trap: a relative path is resolved against the current working directory, so make sure your editor's working directory is the project folder, or open won't find data/....)
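The sketch below walks through that catch and the for-and-if idea in one place. So that it runs without any file on disk, it reads a small in-memory sample through io.StringIO instead of opening data/employees.csv; the sample rows and the id, dept, and salary columns are assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample standing in for a CSV file; the columns and values
# are assumptions. The second record has a blank salary cell.
sample = "id,dept,salary\n1,RND,95000\n2,OPS,\n3,RND,81000\n"

rows = list(csv.DictReader(io.StringIO(sample)))
print(repr(rows[0]["salary"]))  # '95000' (a string, not a number)

# Convert each column to the type it should have.
# A blank cell arrives as an empty string, so store None for it instead.
for row in rows:
    row["id"] = int(row["id"])
    if row["salary"] == "":
        row["salary"] = None
    else:
        row["salary"] = float(row["salary"])

# Answer a question with a for loop and an if:
# what is the average recorded salary in the RND department?
rnd_salaries = []
for row in rows:
    if row["dept"] == "RND" and row["salary"] is not None:
        rnd_salaries.append(row["salary"])

average_rnd = sum(rnd_salaries) / len(rnd_salaries)
print(average_rnd)  # 88000.0
```

Converting right after reading keeps the rest of the code free of string-versus-number surprises, and turning blank cells into None preserves the distinction between an unknown value and a real one.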
This card leads into “Measurement scales and your first function”.
Card Info
- Topic: Python for Data Science
- Difficulty: Beginner