__slots__, the walrus, and when to use dataclasses
__slots__ fixes the exact set of attributes an instance may have:
class CompactFeature:
__slots__ = ("name", "value")
It drops the per-instance dictionary Python normally keeps (saving memory across many instances) and, just as usefully, turns a typo like obj.nmae = 1 into an immediate AttributeError instead of silently creating a junk attribute. It's a small optimisation — reach for it only when memory or typo-safety measurably help.
The walrus operator := assigns a value inside a larger expression, which occasionally makes a loop or comprehension clearer:
while (line := f.readline()): # assign and test in one step
process(line)
Here line is both assigned and tested in the same expression. Use it where it genuinely improves clarity, not just to be clever.
Object orientation is overkill for a one-off script, and Python's dataclasses give you most of this — attributes, a constructor, a __repr__ — with far less ceremony:
from dataclasses import dataclass
@dataclass
class PipelineConfig:
threshold: float
seed: int = 0
The principle that generalises is the important part: a configuration object that validates on construction fails loudly and early, and that loud early failure is exactly what makes a pipeline reproducible and trustworthy.
leads into “Graphs: relationships as data and algorithms”.*
Related cards
Tasks
Card Info
- Topic: Python for Data Science
- Difficulty: Advanced
- Completed: 0 users