Projections and least-squares bridge

Projecting $\mathbf{b}$ onto a subspace $W$ picks the vector $\mathbf{p}\in W$ that minimizes $\|\mathbf{b}-\mathbf{p}\|$. The error $\mathbf{e}=\mathbf{b}-\mathbf{p}$ is orthogonal to every vector in $W$, which is the geometric heart of least squares.

When $W=\mathrm{Col}(A)$, the normal equations $A^T A\mathbf{x}=A^T\mathbf{b}$ encode that orthogonality in coordinates: they say that every column of $A$ is orthogonal to the residual $\mathbf{b}-A\mathbf{x}$. An orthogonal projection matrix $P$ onto $W$ satisfies $P^2=P$ and $P^T=P$, and sends $\mathbf{b}$ to its closest point in $W$.
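The step from orthogonality to the normal equations can be written out as a short derivation. This is a sketch under one added assumption: the columns of $A$ are linearly independent, so that $A^T A$ is invertible. The symbol $\hat{\mathbf{x}}$ for the least-squares solution is notation introduced here.

$$
\mathbf{e}\perp\mathrm{Col}(A)
\iff A^T(\mathbf{b}-A\hat{\mathbf{x}})=\mathbf{0}
\iff A^T A\,\hat{\mathbf{x}}=A^T\mathbf{b}
$$

$$
\mathbf{p}=A\hat{\mathbf{x}}=A(A^T A)^{-1}A^T\mathbf{b},
\qquad
P=A(A^T A)^{-1}A^T
$$

$$
P^2=A(A^T A)^{-1}\,(A^T A)\,(A^T A)^{-1}A^T=A(A^T A)^{-1}A^T=P
$$

Under the same assumption, $P^T=P$ follows because $A^T A$ is symmetric.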

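Decompose $\mathbf{b}=\mathbf{p}+\mathbf{e}$ with $\mathbf{p}\in W$ and $\mathbf{e}\perp W$. Because $\mathbf{p}\perp\mathbf{e}$, the Pythagorean theorem gives $\|\mathbf{b}\|^2=\|\mathbf{p}\|^2+\|\mathbf{e}\|^2$. The same argument applied to any other $\mathbf{w}\in W$ gives $\|\mathbf{b}-\mathbf{w}\|^2=\|\mathbf{e}\|^2+\|\mathbf{p}-\mathbf{w}\|^2\geq\|\mathbf{e}\|^2$, so shrinking the residual is the same as finding the best approximation inside $W$, and that best approximation is $\mathbf{p}$. In regression language, you keep only the part of $\mathbf{b}$ explainable as a combination of the columns of $A$.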

Example sketch: if $W$ is a line through $\mathbf{u}\neq\mathbf{0}$, then $\mathbf{p}=(\frac{\mathbf{u}\cdot\mathbf{b}}{\mathbf{u}\cdot\mathbf{u}})\mathbf{u}$. The residual $\mathbf{e}$ is perpendicular to $\mathbf{u}$, so $\mathbf{u}\cdot\mathbf{e}=0$.
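The line case is small enough to check by hand or in a few lines of code. The following is a minimal sketch in plain Python; the helper names `dot` and `project_onto_line` and the sample vectors are illustrative choices, not part of the original card.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def project_onto_line(b, u):
    """Orthogonal projection of b onto the line spanned by a nonzero vector u."""
    uu = dot(u, u)
    if uu == 0:
        raise ValueError("u must be nonzero")
    scale = dot(u, b) / uu          # the coefficient (u.b)/(u.u)
    return [scale * x for x in u]


# Sample vectors chosen so the arithmetic is exact (an assumption for illustration).
u = [1.0, 2.0, 2.0]
b = [3.0, 0.0, 3.0]

p = project_onto_line(b, u)             # [1.0, 2.0, 2.0]
e = [bi - pi for bi, pi in zip(b, p)]   # [2.0, -2.0, 1.0]

print(dot(u, e))                        # 0.0, so e is perpendicular to u
print(dot(b, b), dot(p, p) + dot(e, e)) # 18.0 18.0, the Pythagorean identity
```

Here $\mathbf{u}\cdot\mathbf{b}=9$ and $\mathbf{u}\cdot\mathbf{u}=9$, so the coefficient is $1$ and $\mathbf{p}=\mathbf{u}$.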

Numerical note: forming $A^T A$ explicitly squares the condition number, since $\kappa_2(A^T A)=\kappa_2(A)^2$. A QR factorization $A=QR$ avoids forming $A^T A$ and is the more stable route when the normal equations are ill-conditioned. The dot-product language from this chapter is what makes the normal equations feel inevitable rather than memorized.
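A small experiment can make the conditioning issue concrete. The sketch below is an illustration under stated assumptions, not a production solver: the test matrix (two nearly parallel columns with `eps = 1e-8`), the helper names `dot` and `mgs_least_squares`, and the choice of modified Gram-Schmidt as the QR method are all choices made here. Library routines typically use Householder reflections instead. The right-hand side is orthogonalized alongside the columns, a standard variant of modified Gram-Schmidt for least squares.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def mgs_least_squares(cols, b):
    """Minimize ||A x - b|| via modified Gram-Schmidt QR.

    `cols` holds the columns of A, assumed linearly independent.
    """
    n = len(cols)
    work = [list(c) for c in cols]      # columns, orthogonalized in place
    rhs = list(b)                       # right-hand side, reduced in step
    R = [[0.0] * n for _ in range(n)]
    c = [0.0] * n                       # coordinates of b along q_1..q_n
    for j in range(n):
        R[j][j] = math.sqrt(dot(work[j], work[j]))
        q = [x / R[j][j] for x in work[j]]
        # Remove the q-component from every later column and from rhs.
        for k in range(j + 1, n):
            R[j][k] = dot(q, work[k])
            work[k] = [x - R[j][k] * y for x, y in zip(work[k], q)]
        c[j] = dot(q, rhs)
        rhs = [x - c[j] * y for x, y in zip(rhs, q)]
    # Back-substitution on the upper-triangular system R x = c.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = c[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / R[i][i]
    return x


eps = 1e-8
a1 = [1.0, eps, 0.0]
a2 = [1.0, 0.0, eps]
b = [s + t for s, t in zip(a1, a2)]     # b = A (1, 1), so the exact answer is (1, 1)

# Normal equations: eps**2 is lost when added to 1.0 in double precision,
# so the computed A^T A is exactly singular.
gram = [[dot(a1, a1), dot(a1, a2)],
        [dot(a2, a1), dot(a2, a2)]]
gram_det = gram[0][0] * gram[1][1] - gram[0][1] * gram[1][0]
print(gram_det)                         # 0.0

# QR route: works on A directly and recovers a solution close to [1.0, 1.0].
x = mgs_least_squares([a1, a2], b)
print(x)
```

The columns of $A$ are independent, yet the computed $A^T A$ cannot be inverted, while the QR route never forms that product.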

Picture data points nearly lying on a line but not exactly: least squares finds the line minimizing total squared vertical error. Every step of that story is a projection statement written with dot products.
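The line-fitting picture can be sketched as follows. This minimal example assumes a model $y\approx c+mt$, so $A$ has a column of ones and a column of $t$ values; the function name `fit_line` and the four data points are made up for illustration. It solves the $2\times 2$ normal equations directly, which is reasonable for tiny, well-conditioned data like this.

```python
def fit_line(ts, ys):
    """Least-squares fit of y ≈ c + m*t via the 2x2 normal equations."""
    n = len(ts)
    st = sum(ts)
    stt = sum(t * t for t in ts)
    sy = sum(ys)
    sty = sum(t * y for t, y in zip(ts, ys))
    # A^T A = [[n, st], [st, stt]] and A^T b = [sy, sty]; solve by Cramer's rule.
    det = n * stt - st * st
    if det == 0:
        raise ValueError("need at least two distinct t values")
    c = (stt * sy - st * sty) / det
    m = (n * sty - st * sy) / det
    return c, m


# Hypothetical data: nearly, but not exactly, on a line.
ts = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 4.0]

c, m = fit_line(ts, ys)
residuals = [y - (c + m * t) for t, y in zip(ts, ys)]

# The residual is orthogonal to both columns of A (the ones and the t values),
# so both sums below are zero up to rounding.
print(c, m)
print(sum(residuals), sum(t * r for t, r in zip(ts, residuals)))
```

For these points the fitted intercept and slope both come out near $0.9$, and the two printed sums are the dot products of the residual with the columns of $A$.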
