Numba and better algorithms

Advanced Python for Data Science

Created by Best · 24.06.2026 at 14:03 UTC

If a genuinely loop-bound numeric kernel resists vectorisation, Numba can help: add a decorator and it compiles the Python function to machine code the first time the function is called, so the first call pays a one-off compilation cost:

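from numba import njit

@njit  # compiled to machine code on the first call
def array_sum(a):
    # Plain left-to-right running total over a 1-D numeric array
    total = 0.0
    for x in a:
        total += x
    return total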

Numba shines on tight numeric loops over arrays, but it understands only a subset of Python and NumPy; it won't speed up pandas operations or arbitrary objects. It's a scalpel, not a hammer.
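As a rough illustration of a loop that is awkward to vectorise, here is a sketch of an exponentially weighted moving average, where each output depends on the previous output. The function name `ewma`, the smoothing parameter `alpha` and the sample data are illustrative assumptions, not part of the original card. The `try`/`except` fallback is there only so the sketch still runs (uncompiled) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: fall back to plain Python if Numba is missing.
    def njit(func):
        return func


@njit
def ewma(values, alpha):
    """Exponentially weighted moving average of a 1-D float array."""
    out = np.empty_like(values)
    out[0] = values[0]
    # Each step needs the previous result, so the loop carries a dependency.
    for i in range(1, values.shape[0]):
        out[i] = alpha * values[i] + (1.0 - alpha) * out[i - 1]
    return out


prices = np.array([1.0, 2.0, 3.0])
print(ewma(prices, 0.5))
```

The function body sticks to floats, NumPy arrays and a simple `for` loop, which is the kind of code Numba's supported subset handles well.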

Remember that a better algorithm — turning an O(n^2) approach into O(n log n) — can dwarf any micro-optimisation. No amount of compiling saves a fundamentally wasteful method.
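To make the algorithmic point concrete, here is a small sketch that finds the smallest gap between any two values in a list, first by comparing every pair and then by sorting once and comparing neighbours only. The task, the function names and the sample data are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def min_gap_quadratic(values):
    """Compare every pair of values: O(n^2) comparisons."""
    return min(abs(a - b) for a, b in combinations(values, 2))


def min_gap_sorted(values):
    """Sort once, then compare adjacent values only: O(n log n) overall."""
    ordered = sorted(values)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


data = [17, 3, 42, 8, 29, 11]
# Both approaches give the same answer; only the amount of work differs.
assert min_gap_quadratic(data) == min_gap_sorted(data)
print(min_gap_sorted(data))  # prints 3
```

The sorted version works because the two closest values must end up next to each other after sorting, so checking neighbours is enough.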

And the deeper warning is against premature optimisation: rewriting code that isn't actually hot wastes effort and adds bugs and risk for no real gain. That's the whole reason you profile first. A pipeline fine on ten thousand rows can be unusable on ten million, and knowing how to find the real bottleneck — then reaching for vectorisation before anything exotic — is what lets you scale up sensibly.
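A minimal sketch of profiling first, using the standard library's `cProfile` and `pstats`. The functions `slow_step`, `fast_step` and `pipeline` are hypothetical stand-ins for real pipeline stages.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Deliberately heavy pure-Python loop: the likely hotspot.
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_step(n):
    # Cheap step: not worth optimising.
    return n * 2


def pipeline(n):
    return slow_step(n) + fast_step(n)


profiler = cProfile.Profile()
profiler.enable()
result = pipeline(200_000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In a report like this, the function that accounts for most of the cumulative time is the one to vectorise or rewrite first; the rest can usually be left alone.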


Related cards

Builds on Measure first, then vectorise · Python for Data Science

Next Lazy evaluation and yield · Python for Data Science

Tasks

Question 1

What kind of code is Numba's @njit best suited to speed up?

pandas DataFrame operations

tight numeric loops over arrays that resist vectorisation

reading and writing files

string formatting

Question 2

Why is 'premature optimisation' considered a problem?

Optimising always makes code slower

You spend effort and add risk rewriting code that isn't actually the bottleneck

Profilers are unreliable

Vectorisation only works after optimisation

Card Info

Topic: Python for Data Science
Difficulty: Advanced
