What you hand in and what good looks like

Advanced Python for Data Science
Created by Best · 24.06.2026 at 14:03 UTC

You hand in four things:

  • the data (checked in or a pinned snapshot, with its source and licence documented — no live-only API without an offline mirror);
  • the environment (a pinned requirements.txt, ideally a Dockerfile, and a fixed seed);
  • the code (a readable pipeline of typed helpers, no copy-paste, sensible structure);
  • the report that interprets the numbers instead of merely printing them.

The report is where honesty shows: if 96% of your cases are the negative class, reporting "97% accuracy" without precision, recall, and a baseline is misleading, because 96% is reachable by always predicting the majority.

The grading reflects what actually makes data science trustworthy. Does it run from a clean checkout? Are the data's types and missing values handled on purpose? Is the evaluation honest — held-out data, the right metric, uncertainty acknowledged? Is the code readable? Does the report interpret rather than just report?

Notice what's not on that list: the sophistication of the model. A capstone that uses a simple model but is reproducible and honestly evaluated beats a fancy one nobody can re-run or believe. That's the real lesson of the whole course: fluency is choosing the right phrasing for the question in front of you, and showing your work so others can trust it.

University approvals: 0
Related cards
Builds on The capstone and the full pipeline · Python for Data Science
Tasks
Question 1

Your capstone reports 'accuracy 97%' on a dataset where 96% of cases are the negative class. What should the report add to be honest?

Question 2

What does the capstone grade depend on most?

Card Info
  • Topic: Python for Data Science
  • Difficulty: Advanced
  • Completed: 0 users
Creator
Best
Best
BestBuddy