Parametric memory is distributed, not a tidy file cabinet

Advanced Machine learning
Created by Best · 01.06.2026 at 06:20 UTC

When a model answers "Paris is the capital of France," where did that fact live? Not in a single addressable row. Parametric memory spreads associations across millions of weights; factual recall emerges from high-dimensional interference patterns and context-dependent activations .

Mechanistic interpretability tries to reverse-engineer circuits: subgraphs of attention heads and MLP neurons that implement recognizable operations. Progress is real but partial; most internal structure remains opaque .

Polysemantic neurons fire on multiple unrelated features in superposition: one unit might respond to both a syntactic pattern and a factual association. Knockout and ablation experiments perturb activations to test causal influence on outputs .

Claiming "neuron 1737 stores the capital of France" without caveats is misleading. Features are distributed, context-dependent, and may shift under prompt or domain change .

Interpretability progress is uneven across layers: early layers track local syntax while mid-layer MLPs correlate with factual recall in some probes. None of this yields a complete map; it motivates experiments rather than confident neuron labels .

Probing classifiers on activations can predict attributes without revealing a simple internal lookup table; high probe accuracy still does not imply a clean causal circuit usable for editing .

University approvals: 0
Related cards
Video Content
Tasks
Question 1

Mechanistic interpretability tries to:

Hint

Skim the paragraphs on Mechanistic interpretability tries in Parametric memory is distributed, not a tidy file cabinet before choosing. Eliminate options that contradict a definition stated in the card.

Question 2

Polysemantic neurons:

Hint

Skim the paragraphs on Polysemantic neurons in Parametric memory is distributed, not a tidy file cabinet before choosing. Eliminate options that contradict a definition stated in the card.

Question 3

Knockout / ablation experiments work by:

Hint

Skim the paragraphs on Knockout ablation experiments work in Parametric memory is distributed, not a tidy file cabinet before choosing. Eliminate options that contradict a definition stated in the card.

Question 4

Why is it misleading to say 'neuron 1737 stores the capital of France' without heavy caveats?

Hint

Skim the paragraphs on it misleading to say 'neuron 1737 stores the in Parametric memory is distributed, not a tidy file cabinet before choosing. Eliminate options that contradict a definition stated in the card.

Card Info
  • Topic: Machine learning
  • Difficulty: Advanced
  • Completed: 0 users
Creator
Best
Best
BestBuddy