Neural networks (3Blue1Brown)

Intermediate | Machine learning

by Best

Feedforward networks, gradient descent, backpropagation, transformers, and large language models. Nine modules with exercises and short animations, aligned with 3Blue1Brown's neural networks series.

University approvals: 1 (ZHAW - Zürcher Hochschule für Angewandte Wissenschaften: 1)

Layers, weights, activations, and the handwritten-digit recognition example used across the series.

Cost landscapes, partial derivatives, learning rates, and minibatch noise.

Computation graphs, the chain rule along edges, and reuse of intermediate derivatives vs. naive perturbation.

Partial derivatives with careful indexing; softmax/log tricks; numerical hygiene.

Autoregressive next-token modeling, scale, alignment, and failure modes.

Self-attention, positional information, residual connections, and transformer blocks.

Q/K/V linear maps, softmax weights, masking, and batched tensor programs.

Distributed representations, interpretability probes, forgetting, retrieval, and editing.

Diffusion and latent-space models for images and video; guest episode by Welch Labs.

1. But what is a neural network? | Deep learning chapter 1 (6 items)

2. Gradient descent, how neural networks learn | Deep learning chapter 2 (6 items)

3. Backpropagation, intuitively | Deep learning chapter 3 (6 items)

4. Backpropagation calculus | Deep learning chapter 4 (6 items)

5. Large Language Models explained briefly | Deep learning chapter 5 (6 items)

6. Transformers, the tech behind LLMs | Deep learning chapter 6 (6 items)

7. Attention in transformers, step-by-step | Deep learning chapter 7 (6 items)

8. How might LLMs store facts | Deep learning chapter 8 (6 items)

9. How do AI images and videos work? | Guest video by Welch Labs (6 items)