Catastrophic forgetting and continual learning pain

Advanced Machine learning
Created by Best · 01.06.2026 at 06:20 UTC

Fine-tuning updates weights to fit new data. Those same weights encoded prior abilities, so changing them can erase what the model already knew. Catastrophic forgetting is the sharp drop in performance on old tasks after new-task training.

The plasticity-stability tradeoff has no free lunch: more adaptation to new data generally means more drift from the old optimum. Two common mitigations are replay, which mixes old examples into new training, and elastic weight consolidation (EWC), which penalizes changes to parameters deemed important for prior tasks.
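The EWC penalty can be sketched as a weighted quadratic term added to the new-task loss. This is a minimal illustration, assuming a diagonal Fisher-information estimate as the importance weight (the usual choice for EWC); the names `ewc_penalty`, `fisher_diag`, `theta_star`, and `lam` are hypothetical and not taken from the card.

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher_diag, lam):
    """Quadratic penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    theta       -- current parameters while training on the new task
    theta_star  -- parameters saved after training on the old task
    fisher_diag -- per-parameter importance (assumed: diagonal Fisher estimate)
    lam         -- penalty strength (hypothetical hyperparameter name)
    """
    diff = theta - theta_star
    return 0.5 * lam * float(np.sum(fisher_diag * diff ** 2))

def total_loss(new_task_loss, theta, theta_star, fisher_diag, lam=1.0):
    # The optimizer minimizes the new-task loss plus the drift penalty.
    return new_task_loss + ewc_penalty(theta, theta_star, fisher_diag, lam)

# Toy values: the first parameter is important, the last one is not.
theta_star = np.array([1.0, -2.0, 0.5])
fisher_diag = np.array([10.0, 0.1, 0.0])
theta = np.array([1.5, -1.0, 3.0])

penalty = ewc_penalty(theta, theta_star, fisher_diag, lam=2.0)
```

In the toy values, the third parameter moves the furthest but contributes nothing to the penalty because its importance is zero, while a small move in the first parameter dominates.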

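LoRA (low-rank adaptation) freezes the base weights and injects trainable rank-$r$ factors: $W' = W + BA$, where $W \in \mathbb{R}^{d \times k}$, $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times k}$, and $r \ll \min(d, k)$. Because the base model never changes, enterprises can version a frozen base model and iterate on task-specific adapters, which simplifies rollback and audit.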
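The update $W' = W + BA$ can be sketched in NumPy as follows. This is an illustration under stated assumptions, not a training implementation: the layer sizes are arbitrary, the function names `lora_forward` and `merge` are hypothetical, and initializing $B$ to zero with a small random $A$ is a common convention rather than something the card specifies.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 32, 4  # arbitrary sizes chosen for illustration

W = rng.normal(size=(d_out, d_in))          # frozen base weight, never updated
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable low-rank factor
B = np.zeros((d_out, r))                    # trainable; zero init makes W' = W at start

def lora_forward(x, W, B, A):
    """Compute x @ (W + B A)^T without materializing the full update matrix."""
    return x @ W.T + (x @ A.T) @ B.T

def merge(W, B, A):
    """Fold the adapter into one dense matrix, e.g. for deployment."""
    return W + B @ A

x = rng.normal(size=(5, d_in))
y_start = lora_forward(x, W, B, A)

# Pretend training has moved B away from zero.
B_trained = rng.normal(scale=0.01, size=(d_out, r))
y_adapted = lora_forward(x, W, B_trained, A)

full_params = d_out * d_in          # parameters in the dense layer
lora_params = r * (d_out + d_in)    # parameters in the adapter
```

Rolling back an adapter amounts to discarding `B` and `A`; the base matrix `W` is untouched throughout, which is what makes separate versioning practical.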

Continual learning remains an open research area; production systems often prefer retrieval and tooling over endless parametric updates.

Enterprise pipelines often freeze a base checkpoint, version adapters per customer, and run regression suites on legacy tasks before promotion. In practice, forgetting tends to appear as silent quality drift rather than as a single loss spike.
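A regression gate of this kind can be sketched as a comparison of per-task scores before an adapter is promoted. Everything here is hypothetical: the function name `promotion_check`, the task names, the scores, and the `tolerance` threshold are illustrative assumptions, and a real pipeline would define its own metrics and thresholds.

```python
def promotion_check(baseline, candidate, tolerance=0.01):
    """Flag legacy tasks whose score dropped by more than `tolerance`.

    baseline  -- {task_name: score} for the currently deployed adapter
    candidate -- {task_name: score} for the adapter under review
    Returns (ok, regressions), where regressions maps each failing task
    to its (baseline_score, candidate_score) pair.
    """
    regressions = {}
    for task, base_score in baseline.items():
        new_score = candidate.get(task)
        # A legacy task missing from the candidate run counts as a failure.
        if new_score is None or base_score - new_score > tolerance:
            regressions[task] = (base_score, new_score)
    return (not regressions, regressions)

# Made-up scores for illustration only.
baseline = {"legacy_qa": 0.91, "summarization": 0.84}
candidate = {"legacy_qa": 0.905, "summarization": 0.79, "new_customer_task": 0.88}

ok, regressions = promotion_check(baseline, candidate)
```

Checking every legacy task on every promotion is what surfaces slow drift: no single training run shows a loss spike, but the per-task comparison does.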

Replay buffers store a fraction of old task data during continual learning. Because that data must be retained, the buffer size becomes a privacy and storage policy decision, not only an algorithmic knob.
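A fixed-capacity replay buffer can be sketched with reservoir sampling, which keeps a uniform random sample of everything seen so far. This is one possible design, not the card's prescribed method; the class name `ReplayBuffer` and the `capacity` and `replay_fraction` parameters are hypothetical.

```python
import random

class ReplayBuffer:
    """Fixed-size store of old examples, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity  # the storage and privacy budget
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def mixed_batch(self, new_examples, replay_fraction=0.25):
        """Return the new examples plus a sample of stored old ones."""
        k = min(len(self.items), int(len(new_examples) * replay_fraction))
        return list(new_examples) + self.rng.sample(self.items, k)

buf = ReplayBuffer(capacity=100)
for i in range(1000):
    buf.add(("old", i))

batch = buf.mixed_batch([("new", i) for i in range(8)], replay_fraction=0.25)
```

The `capacity` argument is the single place where the retention policy is enforced: however much old data streams through, at most that many examples are kept.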

Tasks
Question 1

Catastrophic forgetting refers to:

Hint

Reread the paragraph that defines catastrophic forgetting before choosing. Eliminate options that contradict a definition stated in the card.

Question 2

LoRA (low-rank adaptation) trains a model by:

Hint

Reread the paragraph on LoRA (low-rank adaptation) before choosing. Eliminate options that contradict a definition stated in the card.

Question 3

Elastic weight consolidation reduces forgetting by penalizing:

Hint

Reread the paragraph that covers replay and elastic weight consolidation before choosing. Eliminate options that contradict a definition stated in the card.

Question 4

Why do enterprises version a frozen base model separately from task-specific fine-tunes?

Hint

Reread the paragraphs on versioning a frozen base model and task-specific adapters before choosing. Eliminate options that contradict a definition stated in the card.

Card Info
  • Topic: Machine learning
  • Difficulty: Advanced