A config class that validates itself
One of the highest-value uses of a class in data work is a validated configuration object. A pipeline has settings (paths, a threshold, a seed), and scattering them as loose variables invites silent errors. A class gathers them into one object that refuses to be created in an invalid state.
class PipelineConfig:
    def __init__(self, threshold, seed=0):
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        self.threshold = threshold
        self.seed = seed
__init__ is the initializer, commonly called the constructor: it runs when you create an instance, which makes it the natural place to validate, so a bad value fails immediately rather than ten steps later.
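A minimal sketch of how that validation behaves at the call site, reusing the class from above. The specific values (0.5, 1.5, seed 42) are made up for illustration.

```python
class PipelineConfig:
    def __init__(self, threshold, seed=0):
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        self.threshold = threshold
        self.seed = seed


# A valid config is created normally.
config = PipelineConfig(threshold=0.5, seed=42)

# An out-of-range threshold never produces an object: __init__ raises first.
try:
    PipelineConfig(threshold=1.5)
except ValueError as exc:
    error_message = str(exc)
```

Note that the check runs only at construction. In this version, assigning config.threshold = 1.5 afterwards would not be caught.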
Two additions make a config object pleasant to use. The @property decorator exposes a computed, read-only value that you access like an attribute, with no parentheses, even though a calculation runs behind it:
    @property
    def is_strict(self):
        return self.threshold >= 0.9
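To see the property in use, here is a short sketch with the full class written out so it runs on its own. The thresholds 0.95 and 0.5 are illustrative values, not recommendations.

```python
class PipelineConfig:
    def __init__(self, threshold, seed=0):
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        self.threshold = threshold
        self.seed = seed

    @property
    def is_strict(self):
        # Recomputed from self.threshold on every access.
        return self.threshold >= 0.9


strict = PipelineConfig(threshold=0.95)
lenient = PipelineConfig(threshold=0.5)

strict_flag = strict.is_strict    # attribute-style access, no parentheses
lenient_flag = lenient.is_strict

# No setter is defined, so assigning to the property is rejected.
try:
    strict.is_strict = False
    assignment_allowed = True
except AttributeError:
    assignment_allowed = False
```

Because no setter is defined, is_strict can only change when threshold changes, so the two cannot drift apart.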
And __repr__ gives the object a readable string form, so printing it while debugging actually tells you something:
    def __repr__(self):
        return f"PipelineConfig(threshold={self.threshold}, seed={self.seed})"
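A small sketch of what that looks like in practice, again with the class written out in full so it runs as-is. The values 0.75 and 42 are arbitrary.

```python
class PipelineConfig:
    def __init__(self, threshold, seed=0):
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        self.threshold = threshold
        self.seed = seed

    def __repr__(self):
        return f"PipelineConfig(threshold={self.threshold}, seed={self.seed})"


config = PipelineConfig(threshold=0.75, seed=42)

# repr() calls __repr__ directly.
text = repr(config)

# print() uses str(), which falls back to __repr__ when no __str__ is defined.
print(config)
```

Without __repr__, the same print call would show only the default form with the class name and a memory address, which says nothing about the settings.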
This card leads into “slots, the walrus, and when to use dataclasses”.
Card Info
- Topic: Python for Data Science
- Difficulty: Advanced