Language models do not know your organization's private data. Retrieval-augmented generation (RAG) addresses this by retrieving relevant documents and passing them to the model at question time.
How it works
Documents are split into chunks, converted into embedding vectors, and indexed. At question time, the question is converted into a vector the same way, the most similar chunks are retrieved, and the model generates an answer grounded in those chunks, citing its sources.
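The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the embed function is a toy word-count vector standing in for a real embedding model, the index is a plain list rather than a vector database, and the document names and contents are hypothetical. The final prompt would be sent to whatever language model you use, which is left out here.

```python
import math
import re
from collections import Counter

def chunk(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Toy stand-in for an embedding model (assumption): sparse word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Map {source name: text} to a list of (source, chunk, vector) entries."""
    index = []
    for source, text in documents.items():
        for piece in chunk(text):
            index.append((source, piece, embed(piece)))
    return index

def retrieve(index, question, k=2):
    """Return the k entries whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return ranked[:k]

def build_prompt(question, hits):
    """Assemble a prompt that carries the retrieved chunks and their sources."""
    context = "\n".join(f"[{source}] {piece}" for source, piece, _ in hits)
    return (
        "Answer using only the context below and cite the bracketed sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Hypothetical private documents, for illustration only.
documents = {
    "vacation-policy.txt": "Employees accrue vacation days each month. Unused vacation days expire in March.",
    "expense-policy.txt": "Expense reports must be submitted within 30 days of purchase.",
}

index = build_index(documents)
question = "When do unused vacation days expire?"
hits = retrieve(index, question, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

Keeping the source name next to each chunk is what lets the model cite where an answer came from. In a real system, embed would call an embedding model and the list would be replaced by a vector index, but the flow of chunk, embed, index, retrieve, and prompt stays the same.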
RAG is usually a better fit than fine-tuning when the knowledge base changes frequently, because new or updated documents only need to be re-indexed, not trained into the model.