Numba and better algorithms
If a genuinely loop-bound numeric kernel resists vectorisation, Numba compiles a Python function to machine code when you add a decorator:
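from numba import njit

@njit  # compiled to machine code the first time the function is called
def pairwise_sum(a):
    # Plain left-to-right accumulation over the array
    total = 0.0
    for x in a:
        total += x
    return total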
Numba shines on tight numeric loops over arrays, but it understands only a subset of Python and NumPy; it won't speed up pandas operations or arbitrary objects. It's a scalpel, not a hammer.
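To make the "tight numeric loop over an array" case concrete, here is a minimal sketch of calling a compiled function and timing it. The function name `running_total` and the array size are illustrative choices, not from the original. The `try`/`except` fallback is an assumption added so the snippet still runs, just uncompiled, where Numba is not installed.

```python
import time

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, fall back to a no-op
    # decorator so the code still runs as ordinary (slow) Python.
    def njit(func):
        return func


@njit
def running_total(a):
    # Explicit loop over a float64 array: the kind of code Numba handles well.
    total = 0.0
    for x in a:
        total += x
    return total


data = np.arange(1_000_000, dtype=np.float64)

# Warm-up call: with Numba, the first call for a given argument type
# triggers compilation, so it is kept out of the timed region.
running_total(data[:10])

start = time.perf_counter()
result = running_total(data)
elapsed = time.perf_counter() - start

print(f"sum={result:.1f} in {elapsed * 1e3:.2f} ms")
```

Timing the first call would measure compilation as well as execution, which is why the warm-up call sits outside the timed region.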
Remember that a better algorithm — turning an O(n^2) approach into O(n log n) — can dwarf any micro-optimisation. No amount of compiling saves a fundamentally wasteful method.
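As an illustration of that point, the sketch below solves one small problem two ways: finding the smallest gap between any two numbers in a list. The problem and the function names are hypothetical examples chosen for this card. The first version compares every pair, O(n^2); the second sorts once and compares neighbours only, O(n log n).

```python
import random


def min_gap_quadratic(values):
    """Compare every pair of values: O(n^2) comparisons."""
    best = float("inf")
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            best = min(best, abs(values[i] - values[j]))
    return best


def min_gap_sorted(values):
    """Sort once, then compare adjacent values only: O(n log n)."""
    ordered = sorted(values)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


random.seed(0)
sample = random.sample(range(10**9), 500)  # 500 distinct integers

# Both approaches agree; only the amount of work differs.
assert min_gap_quadratic(sample) == min_gap_sorted(sample)
```

The sorted version works because the closest pair of values must end up next to each other once the list is ordered, so the other comparisons are unnecessary.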
The deeper warning is against premature optimisation: rewriting code that isn't actually hot wastes effort and adds bugs and risk for no real gain. That's the whole reason you profile first. A pipeline that runs fine on ten thousand rows can be unusable on ten million; knowing how to find the real bottleneck, and then reaching for vectorisation before anything exotic, is what lets you scale up sensibly.
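One way to profile first is the standard library's `cProfile`. The sketch below is illustrative only: `slow_step`, `fast_step` and `pipeline` are hypothetical stand-ins for real pipeline stages, and the input size is arbitrary.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical hot spot: quadratic amount of work.
    return sum(i * j for i in range(n) for j in range(n))


def fast_step(n):
    # Hypothetical cheap stage: linear amount of work.
    return sum(range(n))


def pipeline(n):
    return slow_step(n) + fast_step(n)


profiler = cProfile.Profile()
profiler.enable()
pipeline(300)
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In the printed report, the rows with the largest cumulative time show where the run actually spends its time, and those are the only places worth optimising.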