Normalising calculators and tokens

Beginner Defensive APIs: validation, sanitization & exceptions

Created by Pavel · 29.04.2026 at 19:10 UTC

Human-entered expressions mix Unicode spaces, the letter x for multiplication, and locale-specific decimal separators. Sanitisation canonicalises the representation: strip the ends, fold case where appropriate, replace x with *, and collapse whitespace. Validation then checks the grammatical rules (for example, that operands and operators alternate) on the cleaned string.
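The two stages can be sketched as separate functions. This is a minimal illustration, not a full parser: the names sanitise, validate, is_number and OPERATORS are hypothetical, the input is assumed to contain only numbers and operators (so folding case and replacing every x is safe), and a comma is assumed to be a decimal separator.

```python
OPERATORS = {"+", "-", "*", "/"}


def sanitise(expr: str) -> str:
    """Canonicalise a hand-typed expression (illustrative rules only)."""
    # str.split() with no arguments splits on any Unicode whitespace and
    # drops empty pieces, so joining strips the ends and collapses runs.
    cleaned = " ".join(expr.split())
    # Fold case so 'X' and 'x' are treated alike, then map x to *.
    cleaned = cleaned.lower().replace("x", "*")
    # Assumption: a comma is a locale-specific decimal separator.
    return cleaned.replace(",", ".")


def is_number(token: str) -> bool:
    """Accept ASCII digits with at most one decimal point."""
    digits = token.replace(".", "", 1)
    return digits != "" and all(ch in "0123456789" for ch in digits)


def validate(cleaned: str) -> bool:
    """Check that tokens alternate: operand, operator, operand, ..."""
    tokens = cleaned.split(" ")
    if len(tokens) % 2 == 0:
        # An alternating sequence that starts and ends with an operand
        # always has an odd number of tokens.
        return False
    for index, token in enumerate(tokens):
        if index % 2 == 0:
            if not is_number(token):
                return False
        elif token not in OPERATORS:
            return False
    return True
```

With these definitions, sanitise("  2\u00a0X 3,5 ") returns "2 * 3.5", which validate accepts, while "2 x x 3" is cleaned to "2 * * 3" and rejected. Keeping the stages separate means validation only ever has to reason about one canonical spelling of each token.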

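Order matters. If a length rule runs before stripping, padding counts towards the length: a minimum-length check can wrongly accept ' ok ', and a maximum-length check can wrongly reject a short valid token that arrives padded with spaces. Apply trimming first, then the rules, as in the V06 boundary-first principle.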
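A small sketch makes the ordering problem concrete. The limits MIN_LEN and MAX_LEN and both function names are hypothetical, chosen only to show the two failure modes.

```python
MIN_LEN = 3  # hypothetical rule: at least 3 meaningful characters
MAX_LEN = 8  # hypothetical rule: at most 8 meaningful characters


def check_raw(raw: str) -> bool:
    # Wrong order: the rule sees the padding as part of the token.
    return MIN_LEN <= len(raw) <= MAX_LEN


def check_trimmed(raw: str) -> bool:
    # Right order: trim first, then apply the rule to the intended token.
    return MIN_LEN <= len(raw.strip()) <= MAX_LEN
```

Here check_raw(" ok ") is True even though the token "ok" is too short, and check_raw("   total   ") is False even though "total" is within the limits. check_trimmed gives the intended answer in both cases.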

For ML feature engineering, the same pattern applies to messy categorical strings before hashing or embedding.
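As a rough illustration of the same pattern for categorical features, the sketch below normalises a label before hashing it into a bucket. The function names, the choice of NFKC normalisation with casefold, and the bucket count are all assumptions made for the example, not a prescribed recipe.

```python
import hashlib
import unicodedata


def normalise_category(raw: str) -> str:
    """Sanitise a messy categorical label into one canonical spelling."""
    # NFKC folds compatibility characters, e.g. a no-break space
    # becomes an ordinary space.
    text = unicodedata.normalize("NFKC", raw)
    # Strip the ends, collapse whitespace runs, and fold case.
    return " ".join(text.split()).casefold()


def bucket(raw: str, n_buckets: int = 1024) -> int:
    """Hash the normalised label into a stable bucket index."""
    key = normalise_category(raw).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer.
    return int.from_bytes(digest[:8], "big") % n_buckets
```

Because normalisation happens before hashing, " New  York" and "new york" land in the same bucket instead of being treated as two different categories.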

str.strip / str.replace docs: [1].

Sources

[1] https://docs.python.org/3/library/stdtypes.html#str.strip

Tasks

Question 1

You must validate that a hand-typed formula alternates operands and operators. Should you strip whitespace before or after checking token length rules?

Hint

Sanitise then validate.

After—raw whitespace is part of the semantic token

Before—sanitise first so validation sees the intended tokens

Only if the string contains digits

Whitespace does not affect validation

Question 2

Replacing the multiplication symbol x with * before tokenising is primarily:

Hint

Transform, not reject.

Validation—it rejects bad formulas

Sanitisation—it repairs notation into a canonical form

Serialization—it writes JSON

Optimisation—it compiles Numba kernels

Question 3

Write canon_calc(s: str) -> str: replace each lowercase x with *, strip both ends, and collapse internal runs of spaces to a single space (no regex).

Hint

Manual space collapse after replace.

def canon_calc(s: str) -> str:
    # TODO: replace x with *, strip the ends, collapse runs of spaces.
    pass

