Reading JSON and the many faces of missing

Intermediate Python for Data Science
Created by Best · 24.06.2026 at 14:03 UTC

Most web data arrives as JSON, a text format of nested objects and arrays that maps cleanly onto Python's dicts and lists:

import json
with open("data/weather.json", encoding="utf-8") as f:
    doc = json.load(f)          # a tree of dicts and lists
temp = doc.get("temp_c")        # None if the key is absent

Notice .get(key) rather than doc["key"]: .get returns None for a missing key instead of raising a KeyError, which is the safer way to read data you didn't create. JSON's null becomes Python's None, an array becomes a list, and an object becomes a dict, so reading nested JSON is really just walking dicts and lists.
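
A minimal sketch of that walk. It uses an inline JSON string in place of data/weather.json, and the field names (station, wind, gust, readings) are invented for illustration:

```python
import json

# Hypothetical payload standing in for data/weather.json.
raw = '{"station": {"name": "Oslo", "wind": null}, "readings": [{"temp_c": 4.5}, {}]}'
doc = json.loads(raw)

station = doc.get("station") or {}   # falls back to {} if the key is absent or null
name = station.get("name")           # present key
wind = station.get("wind")           # JSON null became Python None
gust = station.get("gust", 0.0)      # default is used only when the key is absent

# Arrays are plain lists, so a comprehension walks them.
temps = [r.get("temp_c") for r in doc.get("readings", [])]

print(name, wind, gust, temps)       # Oslo None 0.0 [4.5, None]
```

Note that the default passed to .get only covers an absent key: station.get("wind", 0.0) would still return None, because the key is present with a null value.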

The subtle part of real data is that the token for "missing" changes as a value moves through your stack, and each form has its own trap:

Where you are        | "Missing" looks like | The trap
plain Python         | None                 | None + 1 raises a TypeError
a NumPy float array  | np.nan               | nan != nan, and it poisons mean; use np.nanmean
pandas               | NaN / pd.NA          | value_counts() drops it silently
JSON                 | null -> None         | an absent key and an explicit null differ
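
Three of these traps can be reproduced with the standard library alone. The sketch below uses float("nan") as a stand-in for np.nan and leaves the pandas row out; the sample rows are made up:

```python
import json
import math

# Plain Python: None does not take part in arithmetic.
try:
    None + 1
except TypeError as exc:
    none_error = type(exc).__name__

# NaN is not equal to itself, so == cannot detect it; math.isnan can.
nan = float("nan")
nan_equals_itself = (nan == nan)
nan_detected = math.isnan(nan)

# JSON: .get() returns None for both an explicit null and an absent key,
# so use `in` when the difference matters.
rows = json.loads('[{"score": null}, {"x": 1}]')
explicit_null = "score" in rows[0] and rows[0]["score"] is None
absent_key = "score" not in rows[1]

print(none_error, nan_equals_itself, nan_detected, explicit_null, absent_key)
# TypeError False True True True
```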

The headline trap: a single NaN turns an ordinary mean into nan. Use np.nanmean (or pandas' skip-na behaviour) when you mean "average the values that are present".
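
A short illustration of the difference, assuming NumPy is installed; the array values are arbitrary:

```python
import numpy as np

values = np.array([2.0, np.nan, 4.0])

plain = np.mean(values)        # the single NaN propagates into the result
skipped = np.nanmean(values)   # NaN entries are ignored: (2.0 + 4.0) / 2

print(plain, skipped)          # nan 3.0
```

If every entry is NaN, np.nanmean has nothing to average: it returns nan and emits a RuntimeWarning, so the "nothing present" case still needs its own check.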
This card leads into “Handling missing data and validating input”.

Related cards
Builds on Choosing charts and headless plotting · Python for Data Science
Next Handling missing data and validating input · Python for Data Science
Tasks
Question 1

stdin is one line of JSON: an array of objects, each with key score that may be missing or null. Parse it and print the mean of the present numeric scores, rounded to 2 decimals. If none are present, print None.

Example input:

[{"score": 10}, {"score": null}, {"x": 1}, {"score": 20}]

Expected output:

15.0
Question 2

What does np.mean([1.0, np.nan, 3.0]) return, and what should you use instead to skip the missing value?

Card Info
  • Topic: Python for Data Science
  • Difficulty: Intermediate