Reading JSON and the many faces of missing

Intermediate Python for Data Science

Created by Best · 24.06.2026 at 14:03 UTC

Most web data arrives as JSON, a text format of nested objects and arrays that maps cleanly onto Python's dicts and lists:

import json
with open("data/weather.json", encoding="utf-8") as f:
    doc = json.load(f)          # a tree of dicts and lists
temp = doc.get("temp_c")        # None if the key is absent

Notice .get(key) rather than doc[key]: .get returns None for a missing key instead of raising a KeyError, which is the safer way to read data you didn't create. JSON's null becomes Python's None, an array becomes a list, and an object becomes a dict, so reading nested JSON is really just walking dicts and lists.
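As a minimal sketch of that walk, the snippet below parses an inline string instead of a file. The payload shape (a station object holding a readings array) and the names station, name, and readings are assumptions invented for illustration; only temp_c comes from the example above.

```python
import json

# Hypothetical payload: the shape and most field names are made up for this sketch.
raw = '{"station": {"name": "A1", "readings": [{"temp_c": 21.5}, {"temp_c": null}, {}]}}'
doc = json.loads(raw)

# Objects became dicts and the array became a list, so chained .get calls
# with empty defaults walk the tree without raising on a missing level.
readings = doc.get("station", {}).get("readings", [])

# .get returns None for the explicit null and for the absent key alike.
temps = [r.get("temp_c") for r in readings]

# A default passed to .get is used only when the key is absent,
# not when the key is present with a null value.
with_default = [r.get("temp_c", "absent") for r in readings]

print(temps)
print(with_default)
```

The second list is one way to tell an absent key from an explicit null when that difference matters downstream.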

The subtle part of real data is that the token for "missing" changes as a value moves through your stack, and each form has its own trap:

Where you are	"Missing" looks like	The trap
plain Python	`None`	`None + 1` raises a TypeError
a NumPy float array	`np.nan`	`nan != nan`, and it poisons `mean` -- use `np.nanmean`
pandas	`NaN` / `pd.NA`	`value_counts()` drops it silently
JSON	`null` -> `None`	an absent key and an explicit null differ
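Three of the traps in the table can be reproduced with the standard library alone. The sketch below is illustrative rather than exhaustive: math.nan stands in for np.nan (both are the same ordinary float NaN), and the two-record JSON string is an invented example.

```python
import json
import math

# Plain Python: arithmetic on None raises instead of quietly propagating.
value = None
try:
    total = value + 1
    none_raised = False
except TypeError:
    none_raised = True

# NaN: it is not equal to itself, so "x == nan" can never detect it.
nan = math.nan
nan_equals_itself = (nan == nan)   # False
nan_detected = math.isnan(nan)     # True: use isnan, not ==

# JSON: an explicit null and an absent key both read as None through .get,
# but only the explicit null has the key present in the dict.
explicit_null, absent = json.loads('[{"score": null}, {}]')
both_read_as_none = explicit_null.get("score") is None and absent.get("score") is None
key_present = ("score" in explicit_null, "score" in absent)

print(none_raised, nan_equals_itself, nan_detected, both_read_as_none, key_present)
```

Checking membership with in is the usual way to separate "the sender said null" from "the sender said nothing".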

The headline trap: a single NaN turns an ordinary mean into nan. Use np.nanmean (or pandas' skip-na behaviour) when you mean "average the values that are present".
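
A short sketch of the headline trap, assuming NumPy is installed. The three-element array is an arbitrary example; the masked version is shown only to make explicit what "average the values that are present" means.

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0])

# The ordinary mean propagates the missing value.
plain_mean = np.mean(values)

# nanmean ignores NaN entries and averages the rest.
present_mean = np.nanmean(values)

# The same idea by hand: keep the non-NaN entries, then average them.
masked_mean = values[~np.isnan(values)].mean()

print(plain_mean, present_mean, masked_mean)
```

If every entry is NaN there is nothing left to average: np.nanmean then returns nan and emits a RuntimeWarning, so an all-missing column still needs its own check.
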
This card leads into “Handling missing data and validating input”.

Related cards

Builds on Choosing charts and headless plotting · Python for Data Science

Next Handling missing data and validating input · Python for Data Science

Tasks

Question 1

The input on stdin is one line of JSON: an array of objects, each of which may have a score key whose value is a number or null, or may lack the key entirely. Parse it and print the mean of the numeric scores that are present, rounded to 2 decimals. If none are present, print None.

Example input:

[{"score": 10}, {"score": null}, {"x": 1}, {"score": 20}]

Expected output:

15.0

3 test cases will be used for grading

Question 2

What does np.mean([1.0, np.nan, 3.0]) return, and what should you use instead to skip the missing value?

2.0; nothing else needed

nan; use np.nanmean to skip NaN

It raises an error; use try/except

0.0; use np.sum

Card Info

Topic: Python for Data Science
Difficulty: Intermediate