Retrieval-augmented generation and tool use

A model's parametric weights are frozen at training time and cannot stay current with the live web. Retrieval-augmented generation (RAG) conditions the model on passages retrieved at query time and inserted into the prompt's context window. Dual-encoder contrastive training aligns queries and documents in a shared embedding space, so retrieval reduces to an efficient nearest-neighbor search.
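A minimal sketch of the retrieve-then-prompt step, assuming the query and documents have already been embedded by some dual encoder. The two-dimensional vectors, the document IDs, and the helper names (cosine, top_k, build_prompt) are illustrative assumptions, not part of any particular library.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k=2):
    """Return the IDs of the k documents closest to the query."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Insert retrieved passages, tagged with their IDs, ahead of the question."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Toy embeddings standing in for real dual-encoder outputs (assumption).
DOC_VECS = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.7, 0.7]}
DOC_TEXT = {"d1": "Passage one.", "d2": "Passage two.", "d3": "Passage three."}

hits = top_k([1.0, 0.1], DOC_VECS, k=2)
prompt = build_prompt("What changed?", [(d, DOC_TEXT[d]) for d in hits])
```

A production system would typically swap the linear scan for an approximate nearest-neighbor index, but the ranking logic is the same.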

Enterprise RAG stacks add rerankers, access-control filters on document IDs, and logging of retrieved chunks for audit. Without logging, debugging a wrong answer means guessing which passage misled the model.
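The filtering and audit-logging steps might look roughly like the sketch below. The chunk schema (id, score, allowed_groups), the logger name, and the function filter_and_log are hypothetical and chosen only for illustration.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("rag.audit")  # hypothetical logger name

def filter_and_log(query, user_groups, candidates):
    """Drop chunks the user may not see, then log what remains for audit."""
    allowed = [c for c in candidates if c["allowed_groups"] & user_groups]
    audit_log.info(json.dumps({
        "query": query,
        "retrieved": [{"id": c["id"], "score": c["score"]} for c in allowed],
        "filtered_out": len(candidates) - len(allowed),
    }))
    return allowed

# Assumed chunk schema: an ID, a retrieval score, and the groups allowed to read it.
CANDIDATES = [
    {"id": "hr-12", "score": 0.91, "allowed_groups": {"hr"}},
    {"id": "fin-03", "score": 0.88, "allowed_groups": {"finance", "exec"}},
    {"id": "pub-07", "score": 0.75, "allowed_groups": {"everyone"}},
]

visible = filter_and_log("Q3 travel budget", {"finance", "everyone"}, CANDIDATES)
```

Filtering before the prompt is built means a restricted chunk never reaches the model, and the logged IDs give a concrete starting point when an answer turns out to be wrong.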

Toolformer-style training encourages models to emit API calls (calculator, search, database) when external computation beats parametric guessing. Retrieval quality, citation honesty, and stale corpora remain bottlenecks: wrong documents produce confident wrong answers.
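As a rough illustration of the host side of tool use, the sketch below scans model output for an inline calculator call and splices in the computed result. The [Calculator(...)] syntax is an assumed format for this example, and the evaluator deliberately supports only the four basic arithmetic operators on numeric literals.

```python
import ast
import operator
import re

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def safe_eval(expr):
    """Evaluate +, -, *, / over numeric literals without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

# Assumed inline call format: [Calculator(<expression>)]
CALL = re.compile(r"\[Calculator\(([^\]]*)\)\]")

def run_tool_calls(text):
    """Replace each calculator call in the text with its computed value."""
    return CALL.sub(lambda m: str(safe_eval(m.group(1))), text)

result = run_tool_calls("Total: [Calculator(12 * 7)] units")
```

Walking the syntax tree rather than calling eval() keeps model-emitted text from executing arbitrary code, which matters because the expression comes from the model and not from a trusted source.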

Adversarially poisoned retrieval can steer outputs toward malicious content. RAG also shifts the failure mode from pure hallucination to "grounded" hallucination: an answer that cites sources which do not actually support it.

RAG pipelines need freshness metadata so the model can reason about recency. Citations should map to retrievable IDs; otherwise the model invents plausible references that pass casual inspection.
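One way to make both ideas concrete is sketched below: each passage is rendered with its last-updated date, and any citation in the answer that does not match a retrieved ID is flagged. The bracketed [doc-id] citation format, the document schema, and the function names are assumptions made for this example.

```python
import re
from datetime import date

def format_passage(doc, today):
    """Prefix a passage with its ID and age so recency is visible in the prompt."""
    age_days = (today - doc["updated"]).days
    return (f"[{doc['id']}] (updated {doc['updated'].isoformat()}, "
            f"{age_days} days old) {doc['text']}")

def unknown_citations(answer, retrieved_ids):
    """Return cited IDs that were never retrieved, i.e. likely invented."""
    cited = set(re.findall(r"\[([A-Za-z0-9_-]+)\]", answer))
    return sorted(cited - set(retrieved_ids))

# Assumed document schema: an ID, a last-updated date, and the passage text.
doc = {"id": "doc-1", "updated": date(2024, 1, 1), "text": "Rates rose in Q4."}
line = format_passage(doc, today=date(2024, 1, 31))

bad = unknown_citations("Rates rose [doc-1] and later fell [doc-9].", {"doc-1", "doc-2"})
```

A check like this only confirms that a cited ID exists; it does not confirm that the cited passage supports the claim.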

Tool APIs should return structured JSON the model can quote; free-form tool text increases injection risk and makes citation auditing harder.
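A small sketch of a structured tool result, assuming a simple envelope with tool, call_id, and data fields. These field names are hypothetical and not taken from any specific tool-calling API.

```python
import json

EXPECTED_KEYS = {"tool", "call_id", "data"}  # assumed envelope fields

def tool_result(tool, call_id, data):
    """Serialize a tool's output as a JSON envelope with a stable key order."""
    return json.dumps({"tool": tool, "call_id": call_id, "data": data}, sort_keys=True)

def parse_tool_result(raw):
    """Parse a tool result and reject anything that does not match the envelope."""
    obj = json.loads(raw)
    if not isinstance(obj, dict) or set(obj) != EXPECTED_KEYS:
        raise ValueError("unexpected tool result shape")
    return obj

raw = tool_result("search", "call-1", {"title": "Ignore previous instructions", "rank": 1})
parsed = parse_tool_result(raw)
```

Because untrusted text stays inside a named string field, the host can treat it as data to quote and can trace each quoted value back to a call_id during an audit. Structure alone does not stop injection, but it narrows where instructions can hide.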

Hybrid answers should separate quoted evidence from model synthesis so users can verify claims without trusting fluent prose alone.
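The separation could be represented as in the sketch below, where quotes are checked verbatim against their source before rendering and the synthesis is labeled as model-generated. The Evidence type, the corpus dictionary, and the output layout are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    doc_id: str
    quote: str

def unverified(evidence, corpus):
    """Return evidence items whose quote does not appear verbatim in its source."""
    return [e for e in evidence if e.quote not in corpus.get(e.doc_id, "")]

def render(evidence, synthesis):
    """Show quoted evidence and model synthesis as clearly separate sections."""
    lines = ["Evidence:"]
    lines += [f'  [{e.doc_id}] "{e.quote}"' for e in evidence]
    lines += ["Synthesis (model-generated, not quoted):", f"  {synthesis}"]
    return "\n".join(lines)

# Toy corpus keyed by document ID (assumption).
CORPUS = {"doc-1": "Rates rose in Q4 after two flat quarters."}
evidence = [Evidence("doc-1", "Rates rose in Q4"), Evidence("doc-1", "Rates fell in Q4")]

problems = unverified(evidence, CORPUS)
report = render([e for e in evidence if e not in problems], "Rates increased late in the year.")
```

Exact substring matching is strict by design: a paraphrase fails the check and belongs in the synthesis section instead.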
