What you hand in and what good looks like
You hand in four things:
- the data (checked in or a pinned snapshot, with its source and licence documented — no live-only API without an offline mirror);
- the environment (a pinned
requirements.txt, ideally a Dockerfile, and a fixed seed); - the code (a readable pipeline of typed helpers, no copy-paste, sensible structure);
- the report that interprets the numbers instead of merely printing them.
The report is where honesty shows: if 96% of your cases are the negative class, reporting "97% accuracy" without precision, recall, and a baseline is misleading, because 96% is reachable by always predicting the majority.
The grading reflects what actually makes data science trustworthy. Does it run from a clean checkout? Are the data's types and missing values handled on purpose? Is the evaluation honest — held-out data, the right metric, uncertainty acknowledged? Is the code readable? Does the report interpret rather than just report?
Notice what's not on that list: the sophistication of the model. A capstone that uses a simple model but is reproducible and honestly evaluated beats a fancy one nobody can re-run or believe. That's the real lesson of the whole course: fluency is choosing the right phrasing for the question in front of you, and showing your work so others can trust it.
Related cards
Tasks
Card Info
- Topic: Python for Data Science
- Difficulty: Advanced
- Completed: 0 users