Azure OpenAI Service: implement RAG with Azure AI Search
João Barros
15 October 2024
Retrieval-Augmented Generation (RAG) is an architectural pattern for building AI assistants that answer from organization-specific knowledge. It combines Azure AI Search, which retrieves the relevant documents, with Azure OpenAI Service, which generates the answer from them.
RAG architecture
1. Ingestion (offline):
Documents → Chunking → Embedding (text-embedding-ada-002) → AI Search Index
2. Query (runtime):
User question
→ Question embedding
→ AI Search (vector + keyword search) → Top-K relevant chunks
→ Prompt: "Based on these documents: {chunks} — answer: {question}"
→ Azure OpenAI GPT-4o → Grounded answer
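The "vector + keyword search" step has to merge two ranked lists into one. Azure AI Search performs this merge on the service side using Reciprocal Rank Fusion; the sketch below only illustrates the idea in plain Python. The function name, the document IDs, and the constant k=60 are illustrative assumptions, not the service's implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) for every list it appears in
    (rank starts at 1), and the scores are summed across lists.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc-a", "doc-b", "doc-c"]  # hypothetical keyword ranking
vector_hits = ["doc-b", "doc-d", "doc-a"]   # hypothetical vector ranking
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

A document that ranks well in both lists (here doc-b and doc-a) ends up ahead of one that appears in only one of them, which is why hybrid retrieval tends to be more robust than either search on its own.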
Index documents in AI Search
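import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from openai import AzureOpenAI

openai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version="2024-02-01"
)

# Assumption: the AZURE_SEARCH_* variable names are illustrative, and the
# index already exists with "id", "content", "embedding" and "source" fields.
search_client = SearchClient(
    endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
    index_name=os.environ["AZURE_SEARCH_INDEX"],
    credential=AzureKeyCredential(os.environ["AZURE_SEARCH_KEY"])
)

def get_embedding(text: str) -> list[float]:
    """Return the embedding vector for a piece of text."""
    response = openai_client.embeddings.create(
        model="text-embedding-ada-002",  # name of the embedding deployment
        input=text
    )
    return response.data[0].embedding

# For each document chunk (chunk_id, chunk_text and document_path
# come from your own chunking step):
search_client.upload_documents(documents=[{
    "id": chunk_id,
    "content": chunk_text,
    "embedding": get_embedding(chunk_text),
    "source": document_path
}])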
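The upload above assumes the documents have already been split into chunks. A minimal sketch of that chunking step follows, using fixed-size character windows with overlap. The function name and the default sizes are assumptions for illustration; real pipelines often split on tokens, sentences, or headings instead.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so a sentence cut at a
    boundary still appears whole in at least one chunk when it is short enough.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample_chunks = split_into_chunks("abcdefghij", chunk_size=4, overlap=1)
# sample_chunks == ['abcd', 'defg', 'ghij']
```

Each returned string can then be embedded and uploaded as one index document, with an ID derived from the source path and the chunk's position.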
Full RAG query
from azure.search.documents.models import VectorizedQuery

def rag_query(question: str) -> str:
    # 1. Question embedding
    q_embedding = get_embedding(question)
    # 2. Retrieve relevant chunks (hybrid: keyword + vector search)
    results = search_client.search(
        search_text=question,
        vector_queries=[VectorizedQuery(vector=q_embedding, k_nearest_neighbors=5, fields="embedding")],
        select=["content", "source"],
        top=5  # cap the merged list; k_nearest_neighbors only limits the vector side
    )
    # Keep each chunk's source next to its text so the model can cite it
    context = "\n\n".join(f"[Source: {r['source']}]\n{r['content']}" for r in results)
    # 3. Generate the answer with GPT-4o
    response = openai_client.chat.completions.create(
        model="gpt-4o",  # name of the chat deployment
        messages=[
            {"role": "system", "content": "Answer only based on the provided documents. Cite the sources."},
            {"role": "user", "content": f"Documents:\n{context}\n\nQuestion: {question}"}
        ]
    )
    return response.choices[0].message.content
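Because the system prompt asks the model to cite sources, each chunk needs its source attached when it goes into the prompt, and the combined context has to fit within the model's context window. A minimal sketch of a helper that does both is shown here. The function name, the "[Source: ...]" label format, the character budget, and the sample data are all assumptions for illustration; a production version would more likely count tokens than characters.

```python
def build_context(chunks: list[dict], max_chars: int = 6000) -> str:
    """Join retrieved chunks into one context string.

    Each chunk is labeled with its source. Chunks are added in ranking
    order until the next one would exceed the max_chars budget.
    """
    parts: list[str] = []
    used = 0
    for chunk in chunks:
        block = f"[Source: {chunk['source']}]\n{chunk['content']}"
        separator = 2 if parts else 0  # length of the "\n\n" joiner
        if used + separator + len(block) > max_chars:
            break
        parts.append(block)
        used += separator + len(block)
    return "\n\n".join(parts)


# Hypothetical retrieval results, shaped like the selected index fields
retrieved = [
    {"content": "Vacation policy: 22 days per year.", "source": "hr/policy.pdf"},
    {"content": "Expenses are reimbursed monthly.", "source": "finance/expenses.pdf"},
]
full_context = build_context(retrieved)
short_context = build_context(retrieved, max_chars=60)  # only the first chunk fits
```

Dropping lower-ranked chunks when the budget is reached keeps the most relevant material in the prompt, at the cost of possibly omitting a useful passage; the budget is worth tuning against real questions.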
Conclusion
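RAG with Azure OpenAI and AI Search suits enterprise assistants that need answers grounded in internal documents. For knowledge bases that change frequently, it is more reliable than fine-tuning, because new content only has to be re-indexed rather than trained into the model. It is also more controllable than relying on what the model learned in its base training, because every answer can be traced back to the retrieved sources.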