When should I fine-tune instead of using RAG or prompting?

Fine-tuning shines when you need to change how the model behaves — a consistent voice, a strict output format, a niche classification skill, or a style the base model can't reliably follow from instructions alone. If your real problem is giving the model facts it doesn't know, retrieval-augmented generation usually wins, because it injects fresh, citable knowledge at request time without retraining. Many production systems do both: fine-tune for behavior, retrieve for knowledge.

Does fine-tuning teach the model new facts?

Not reliably. Fine-tuning is better at shaping behavior, style, and skills than at injecting durable factual knowledge, and pushing too many facts in can cause the model to lose earlier capabilities, a failure mode known as catastrophic forgetting. Facts also go stale, and retraining to update them is slow and costly. For knowledge that changes or must be cited, keep it in an external source the model retrieves at inference time rather than baking it into the weights.
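To make "retrieves at inference time" concrete, here is a minimal sketch of the idea, not a production retriever. The document store, its contents, and the function names are all hypothetical, and word overlap stands in for whatever search index or embedding lookup a real system would use.

```python
# Hypothetical in-memory knowledge store with made-up sample text.
# A real system would query a search index or vector database instead.
DOCUMENTS = {
    "refund-policy": "Refunds are available within 30 days of purchase.",
    "shipping": "Standard shipping takes 5 business days.",
}


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())


def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question (a stand-in for real retrieval)."""
    query = tokenize(question)
    ranked = sorted(
        documents.items(),
        key=lambda item: len(query & tokenize(item[1])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Put the retrieved text, tagged with its id, in front of the question."""
    hits = retrieve(question, documents)
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in hits)
    return (
        "Answer using only the sources below and cite their ids.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


prompt = build_prompt("How long does shipping take?", DOCUMENTS)
```

Updating a fact here means editing one entry in the store; the model's weights are untouched, and the id in brackets gives the answer something to cite.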

Glossary

Fine-tuning

Fine-tuning is the process of further training a pretrained model on task- or domain-specific data to adjust its behavior, style, or skills.

Updated 2026


Fine-tuning means continuing the training of a model that has already learned the basics. A large language model is first pretrained on a vast, general corpus until it has broad command of language and reasoning. Fine-tuning then takes that base model and trains it a little more on a curated set of examples that show the specific behavior you want — adjusting the model's weights so it leans toward your task, domain, format, or voice.
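As a toy illustration of "continuing the training", the sketch below starts from an existing weight and takes a few more gradient steps on a small set of examples. It assumes a one-parameter model y = w * x, which is nothing like a real language model; the names and numbers are illustrative only.

```python
def mse(w, examples):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in examples) / len(examples)


def fine_tune(w, examples, learning_rate=0.05, steps=20):
    """Continue gradient descent from an existing weight rather than from scratch."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        w -= learning_rate * grad
    return w


pretrained_w = 1.0  # stand-in for a weight learned during pretraining
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy target behavior: y = 2x
tuned_w = fine_tune(pretrained_w, examples)

print(mse(pretrained_w, examples), mse(tuned_w, examples))
```

The starting point matters: the loop adjusts a weight that already exists, which is the sense in which fine-tuning builds on pretraining instead of replacing it.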

How it works: you assemble training pairs that demonstrate the target behavior, such as customer messages mapped to the exact JSON your system expects, or prompts paired with answers in your brand's tone. During fine-tuning, the model's parameters shift to reduce the gap between its predictions and your examples. Lightweight methods such as LoRA freeze the original weights and train a small set of added low-rank matrices instead, which keeps the process cheaper and faster. Afterward, every request at inference time uses the adjusted weights, so the new behavior is baked in rather than re-explained in each prompt.
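The cost saving from LoRA can be seen with a little arithmetic. The sketch below is plain Python rather than any library's API: it assumes a single weight matrix W of shape d_out x d_in that stays frozen, plus two small trainable matrices B (d_out x r) and A (r x d_in) whose product is added to W. The function names and the `scale` parameter are hypothetical stand-ins.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, b, a, scale=1.0):
    """Return W + scale * (B @ A) as a new matrix; W itself is never modified."""
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Trainable parameter counts for one matrix: full fine-tuning vs. adapters only."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Tiny 2x2 example with rank 1: the frozen identity matrix plus a low-rank update.
w = [[1, 0], [0, 1]]
b = [[1], [2]]
a = [[0.5, 0]]
adapted = lora_effective_weight(w, b, a)

full, lora = trainable_params(d_out=4096, d_in=4096, rank=8)
print(full, lora)  # 16777216 65536
```

For a 4096 x 4096 matrix at rank 8, the adapters hold 65,536 trainable values against roughly 16.8 million in the full matrix, which is under half a percent.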

Why it matters: fine-tuning earns its keep when you need consistent behavior that instructions alone can't guarantee — a rigid output schema, a specialized classification skill, or a distinctive style held across thousands of calls. It is less suited to supplying fresh facts, since knowledge baked into weights goes stale and is expensive to refresh; for that, teams reach for retrieval-augmented generation (RAG) instead, and often combine the two — fine-tune for behavior, retrieve for knowledge.

Concrete example: a legal team wants summaries that always follow a fixed five-section template. Prompting gets close but drifts. By fine-tuning on a few hundred hand-formatted summaries, the model learns the template so well that it reproduces the structure reliably from a short instruction — saving prompt tokens and removing the formatting errors that crept in before.
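A dataset like the one in this example is only as good as its consistency, so it can help to check every summary against the template before training. The sketch below is one way to do that under stated assumptions: the five section names are invented (the example only says there are five), and the prompt/completion JSONL layout is a common convention rather than any particular provider's required format.

```python
import json

# Hypothetical section names; the example only specifies that there are five.
SECTIONS = ["Parties", "Background", "Key Terms", "Risks", "Next Steps"]


def follows_template(summary, sections=SECTIONS):
    """True if each section heading appears exactly once, in the expected order."""
    labels = [line.split(":", 1)[0].strip() for line in summary.splitlines() if ":" in line]
    headings = [label for label in labels if label in sections]
    return headings == list(sections)


def to_jsonl(pairs, instruction="Summarize this document using the standard template."):
    """Keep only template-conforming pairs and serialize one JSON object per line."""
    lines = []
    for document, summary in pairs:
        if follows_template(summary):
            record = {"prompt": f"{instruction}\n\n{document}", "completion": summary}
            lines.append(json.dumps(record))
    return "\n".join(lines)


good = "Parties: A and B\nBackground: x\nKey Terms: y\nRisks: z\nNext Steps: w"
bad = "Background: x\nParties: A and B\nKey Terms: y\nRisks: z\nNext Steps: w"
dataset = to_jsonl([("contract text 1", good), ("contract text 2", bad)])
```

Only the first pair survives here, because the second puts its sections out of order. Filtering this way keeps drifting examples from teaching the model the same drift the fine-tune is meant to remove.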

Related terms

Concepts that connect to fine-tuning

Large language model: The pretrained base model that fine-tuning specializes for a task or domain.
Inference: The runtime stage where a fine-tuned model's adjusted weights produce its outputs.
RAG: Retrieval-augmented generation, the usual alternative when the goal is adding knowledge rather than changing behavior.

FAQ

Fine-tuning FAQ

What is fine-tuning?

Fine-tuning is the practice of taking a model that has already been pretrained on a broad dataset and training it further on a smaller, focused set of examples. The pretraining gives the model general language and reasoning ability; the fine-tuning nudges its weights toward a particular task, domain, tone, or output format. The result is a specialized version of the base model that behaves the way your examples demonstrate, without starting training from scratch.

Keep learning

Go deeper

RAG vs fine-tuning (deep dive): When to retrieve, when to retrain, when to do both
Large language model: The pretrained base that fine-tuning specializes
RAG: Add knowledge at request time without retraining
