Bookkeeping partials on a small lattice of variables

Intermediate · Machine learning
Created by Best · 01.06.2026 at 06:20 UTC

Backprop at scale is still the chain rule; the hard part is bookkeeping. On a small graph, assign a symbol to every intermediate value and label each edge with the local partial derivative of the child with respect to the parent. To find $\partial C/\partial x$, multiply the factors along each path from $x$ to $C$, then sum the products over all paths. Multiple paths appear whenever a node fans out to more than one consumer, and their contributions are added where the paths reunite.
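
As a minimal sketch of the path rule, consider a hypothetical toy graph that is not taken from the card: $a = 2x$, $b = x^2$, $C = ab$. The names `edges` and `path_sum` are illustrative. Each edge stores the local partial evaluated at the current values, and the recursion multiplies along a path and sums across paths.

```python
# Hypothetical toy graph (an assumption for illustration):
#   a = 2x,  b = x**2,  C = a * b   (so C = 2x**3 and dC/dx = 6x**2)
x = 3.0
a = 2.0 * x
b = x ** 2
C = a * b

# Edge labels: local partial of child w.r.t. parent at the current values.
edges = {
    ("x", "a"): 2.0,      # da/dx
    ("x", "b"): 2.0 * x,  # db/dx
    ("a", "C"): b,        # dC/da
    ("b", "C"): a,        # dC/db
}

def path_sum(src, dst):
    """Sum, over all paths src -> dst, of the product of edge labels."""
    if src == dst:
        return 1.0
    total = 0.0
    for (parent, child), local in edges.items():
        if parent == src:
            total += local * path_sum(child, dst)
    return total

dC_dx = path_sum("x", "C")
print(dC_dx, 6.0 * x ** 2)  # path sum vs. closed form
```

Enumerating paths is only practical on a lattice this small; it is meant to make the bookkeeping visible, not to replace reverse-mode accumulation.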

If $z = f(x,y)$ and $C = C(z)$, then along that single edge $\partial C/\partial x = (dC/dz)(\partial z/\partial x)$, which is the univariate chain rule with $y$ held fixed. When $x$ feeds two consumers, $\partial C/\partial x$ is the sum of the contributions from both branches.
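
The two-consumer case can be sketched as follows, assuming a hypothetical function $C = xy + \sin^2 x$ in which $x$ feeds both $u = xy$ and $v = \sin x$. The function names are illustrative. Each branch adds its own term into the same accumulator, and a central finite difference serves as an independent check.

```python
import math

def forward(x, y):
    u = x * y          # first consumer of x
    v = math.sin(x)    # second consumer of x
    C = u + v * v
    return u, v, C

def grad_x(x, y):
    u, v, C = forward(x, y)
    dC_du = 1.0
    dC_dv = 2.0 * v
    dC_dx = 0.0
    dC_dx += dC_du * y             # branch through u: du/dx = y
    dC_dx += dC_dv * math.cos(x)   # branch through v: dv/dx = cos(x)
    return dC_dx

x0, y0 = 0.7, 1.5
eps = 1e-6
numeric = (forward(x0 + eps, y0)[2] - forward(x0 - eps, y0)[2]) / (2 * eps)
analytic = grad_x(x0, y0)
print(abs(numeric - analytic))  # should be tiny if both branches were counted
```

Dropping either `+=` line leaves the code runnable but makes the two numbers disagree, which is the failure the bookkeeping is meant to prevent.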

Matrix-valued nodes introduce layout conventions: is the gradient a row vector or a column vector? Does the Jacobian use numerator or denominator layout? Textbooks warn about this because a silent transpose in code changes every downstream shape.
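
To make the two layouts concrete, here is a small sketch for a hypothetical linear map $y = Ax$ with $x \in \mathbb{R}^3$ and $y \in \mathbb{R}^2$. The matrix `A`, the upstream gradient `g`, and the helper names are assumptions for illustration. Both layouts hold the same partials $\partial y_i/\partial x_j$; only the array shape and the side on which the upstream gradient multiplies differ.

```python
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]   # y = A x, with x in R^3 and y in R^2

def transpose(M):
    return [list(row) for row in zip(*M)]

def shape(M):
    return (len(M), len(M[0]))

J_num = A               # numerator layout:   J_num[i][j] = dy_i/dx_j, shape (2, 3)
J_den = transpose(A)    # denominator layout: J_den[j][i] = dy_i/dx_j, shape (3, 2)

g = [0.5, -1.0]         # upstream gradient dC/dy (hypothetical values)

# Numerator layout: row vector g times the Jacobian.
grad_row = [sum(g[i] * J_num[i][j] for i in range(2)) for j in range(3)]
# Denominator layout: transposed Jacobian times column vector g.
grad_col = [sum(J_den[j][i] * g[i] for i in range(2)) for j in range(3)]

print(shape(J_num), shape(J_den))
print(grad_row == grad_col)  # same numbers, different bookkeeping
```

Mixing the two conventions within one derivation is what produces the silent transpose: the entries are all correct, but they sit at swapped indices.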

Disciplined subscripts prevent off-by-one index errors that run cleanly yet train the wrong network.

When you write $\partial C/\partial w_{jk}^{(\ell)}$, specify which layer $\ell$ is meant and which edge the subscripts name, including which of neuron $j$ and neuron $k$ is the sender and which is the receiver. The lattice diagrams are small precisely so you can read every label without scrolling.
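
One way to pin the subscripts down in code is sketched below. It assumes a convention the card does not fix: `W[j][k]` connects sending neuron $k$ in layer $\ell-1$ to receiving neuron $j$ in layer $\ell$. The linear layer, the quadratic cost, and all values are hypothetical. Under that convention $\partial C/\partial w_{jk} = \delta_j a_k$ with $\delta_j = \partial C/\partial z_j$, and a finite difference on a single entry confirms that the indices line up.

```python
# Assumed convention (not fixed by the card): W[j][k] goes from sending
# neuron k in layer l-1 to receiving neuron j in layer l.
a_prev = [1.0, -2.0, 0.5]          # activations of layer l-1, indexed by k
W = [[0.1, 0.2, 0.3],
     [-0.4, 0.5, -0.6]]            # shape (n_out, n_in) = (2, 3)
target = [1.0, 0.0]                # hypothetical targets for a quadratic cost

def pre_activations(W):
    return [sum(W[j][k] * a_prev[k] for k in range(3)) for j in range(2)]

def cost(W):
    z = pre_activations(W)
    return 0.5 * sum((z[j] - target[j]) ** 2 for j in range(2))

z = pre_activations(W)
delta = [z[j] - target[j] for j in range(2)]                        # dC/dz_j
dW = [[delta[j] * a_prev[k] for k in range(3)] for j in range(2)]   # dC/dw_jk

def perturbed(W, j, k, eps):
    W2 = [row[:] for row in W]     # copy so the original W is untouched
    W2[j][k] += eps
    return W2

eps = 1e-6
numeric = (cost(perturbed(W, 1, 2, eps)) - cost(perturbed(W, 1, 2, -eps))) / (2 * eps)
print(dW[1][2], numeric)           # entry j=1, k=2 checked against finite differences
```

If the opposite convention is used, the same check applies with the roles of `j` and `k` exchanged; what matters is that the written symbol and the array index agree.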

When translating to code, match tensor shapes to these symbols before trusting autograd: a silent transpose in a manual implementation is equivalent to swapping indices in the lattice.
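
A shape check of this kind might look like the sketch below. The helper names `outer`, `shape`, and `check_shape` are illustrative, and the values are hypothetical. The correct weight gradient is indexed `[j][k]`, the swapped one is indexed `[k][j]`, and comparing against the parameter's shape catches the swap.

```python
def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def shape(M):
    return (len(M), len(M[0]))

def check_shape(grad, param_shape):
    """True when the gradient has the same shape as its parameter."""
    return shape(grad) == param_shape

a_prev = [1.0, -2.0, 0.5]   # layer l-1 activations, indexed by k
delta = [0.3, -0.7]         # dC/dz for layer l, indexed by j
W_shape = (2, 3)            # parameter shape (n_out, n_in)

dW_ok = outer(delta, a_prev)        # indexed [j][k], shape (2, 3)
dW_swapped = outer(a_prev, delta)   # indexed [k][j], shape (3, 2)

print(check_shape(dW_ok, W_shape))       # gradient matches the parameter
print(check_shape(dW_swapped, W_shape))  # swapped indices are caught here
```

When a layer is square the swapped gradient has the same shape as the correct one, so a shape check alone passes; a numerical comparison such as a finite-difference check is then needed as well.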

Tasks
Question 1

If $z = f(x, y)$ feeds into $C(z)$, then along that edge $\partial C/\partial x$ equals:

Hint

Reread the paragraph on $z = f(x, y)$ feeding into $C(z)$ in "Bookkeeping partials on a small lattice of variables" before choosing. Eliminate options that contradict a definition stated in the card.

Question 2

When $x$ feeds two different consumers, $\partial C/\partial x$ is:

Hint

Reread the paragraph on $x$ feeding two different consumers in "Bookkeeping partials on a small lattice of variables" before choosing. Eliminate options that contradict a definition stated in the card.

Question 3

Differentiating matrix-valued nodes correctly requires:

Hint

Reread the paragraph on differentiating matrix-valued nodes in "Bookkeeping partials on a small lattice of variables" before choosing. Eliminate options that contradict a definition stated in the card.

Question 4

Why do matrix-calculus references warn about 'layout' (numerator vs denominator)?

Hint

Reread the paragraph on layout conventions (numerator vs denominator) in "Bookkeeping partials on a small lattice of variables" before choosing. Eliminate options that contradict a definition stated in the card.
