Product

RAG API Core provides APIs for Retrieval-Augmented Generation (RAG) and Chat.

What You Get

RAG API Core provides a complete API platform for building intelligent, knowledge-aware applications. It enables businesses to:

Build Q&A systems that provide accurate, sourced answers from your knowledge base
Build conversational interfaces with context-aware chat capabilities
Manage and update knowledge bases through secure document upload and indexing
Monitor system health and performance with comprehensive dashboards and alerts
Collect user feedback to continuously improve response quality
Conduct A/B testing of different prompts, models, and retrieval strategies

The API integrates with Azure services for reliability, security, and scalability.

Use Cases

Knowledge-base Q&A: Build systems that answer questions using your organization's documents, with citations and source grounding
Chatbots: Create chat interfaces with guardrails and templated responses
Content management: Upload, index, and search through large document collections
Experimentation platform: Test different AI approaches and measure their effectiveness

How It Works

A user or system sends a request to the API (e.g., ask a question, chat, upload knowledge).
The API validates the request and determines the best way to fulfill it (RAG, chat, etc.).
If retrieval is needed, the API fetches relevant knowledge from Azure AI Search or other sources.
The API assembles a prompt and calls Azure OpenAI (or other LLMs) to generate a response.
The response is returned, optionally with sources, metadata, and feedback options.
Health, feedback, and monitoring endpoints help ensure reliability and continuous improvement.
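
The request flow above can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual code: `handle_request`, `retrieve`, `generate`, `ApiResponse`, and the `question`/`mode` payload fields are hypothetical names, and the retrieval and model calls are replaced by local stand-ins so the example runs without any Azure resources.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ApiResponse:
    """Hypothetical response shape: answer plus optional sources and metadata."""
    answer: str
    sources: list[str] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, top_k: int = 3) -> list[dict[str, str]]:
    """Stand-in for a search call (for example Azure AI Search); uses word overlap."""
    corpus = [
        {"id": "doc-1", "text": "Password changes happen inside your account portal."},
        {"id": "doc-2", "text": "Support hours run 9am until 5pm on weekdays."},
    ]
    wanted = tokens(question)
    hits = [doc for doc in corpus if wanted & tokens(doc["text"])]
    return hits[:top_k]


def generate(prompt: str) -> str:
    """Stand-in for an LLM call (for example Azure OpenAI)."""
    return f"(model answer based on a {len(prompt)}-character prompt)"


def handle_request(payload: dict) -> ApiResponse:
    # Validate the request and decide how to fulfill it.
    question = str(payload.get("question", "")).strip()
    if not question:
        raise ValueError("question must be a non-empty string")
    mode = payload.get("mode", "rag")
    if mode not in ("rag", "chat"):
        raise ValueError(f"unsupported mode: {mode!r}")

    # Fetch knowledge only when the chosen mode needs retrieval.
    docs = retrieve(question) if mode == "rag" else []

    # Assemble the prompt and call the model.
    context = "\n".join(f"[{doc['id']}] {doc['text']}" for doc in docs)
    prompt = f"Context:\n{context}\n\nQuestion: {question}" if docs else question
    answer = generate(prompt)

    # Return the answer together with sources and metadata.
    return ApiResponse(
        answer=answer,
        sources=[doc["id"] for doc in docs],
        metadata={"mode": mode},
    )
```

Calling `handle_request({"question": "How do I reset my password?"})` goes through the retrieval path, while passing `"mode": "chat"` skips retrieval and sends the question to the model directly.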

Key Features

Unified RAG & Chat Endpoints: Consistent, flexible APIs for both retrieval-augmented and conversational AI. Test responses interactively through the /unified-test endpoint.
Prompt Management: Customize system prompts and response templates. Create versions, revert changes, and test prompts in the interface.
Feedback Loop: Collect user feedback on responses for ongoing improvement.
Knowledge Upload & Management: Add, update, and manage knowledge base content. Documents are indexed in Azure AI Search with processing, chunking, and embedding creation.
Health Monitoring: Live, ready, and deep health checks with JSON and HTML dashboards.
Testing Interface: Interactive testing through /unified-test for experimenting with prompts, parameters, and real-time responses.
OpenAPI & Docs Center: Auto-generated OpenAPI JSON and a modern, markdown-driven documentation center.
Input Validation: Pydantic v2 models ensure data quality and safety.
A/B Testing Support: Built-in fields for experiment tracking and variant selection.
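
The source says only that requests carry fields for experiment tracking and variant selection. It does not describe how a variant is chosen. One common approach, shown here purely as an assumption, is to hash an experiment identifier together with a user or session identifier so that the same caller always lands in the same variant. The function name `assign_variant` and its parameters are hypothetical.

```python
import hashlib


def assign_variant(experiment_id: str, subject_id: str, variants: list[str]) -> str:
    """Deterministically map a subject to one of the variants.

    Assumption: stable hashing is used for assignment; the actual API may
    select variants differently (for example from an explicit request field).
    """
    if not variants:
        raise ValueError("variants must not be empty")
    key = f"{experiment_id}:{subject_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer bucket.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]
```

Because the assignment depends only on its inputs, repeated requests from the same subject see a consistent prompt, model, or retrieval strategy for the lifetime of the experiment, and no assignment table has to be stored.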

Azure Integrations

Azure AI Search: For fast, scalable retrieval of knowledge base content.
Azure Blob/Table Storage: For storing documents, indexes, and feedback.
Azure OpenAI: For advanced language model responses.
Azure API Management (APIM) & Entra ID (JWT): For secure, enterprise-grade authentication and API fronting.
Customize templates in rag_api_core/templates/v2

Validation

Pydantic validation on inputs/outputs; clear error responses (400/4xx/5xx)
Multimodal validation: at least 1 image, up to 5; supported types: jpeg/jpg/png/webp; ≤ 5 MB per data URL
Feedback idempotency keyed by response_id
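
The multimodal limits and the feedback idempotency rule above can be sketched with the standard library alone. The service itself uses Pydantic v2 models; this dependency-free version only illustrates the checks. Several details are assumptions: `validate_images` and `FeedbackStore` are hypothetical names, the 5 MB limit is applied to the decoded bytes (read here as 5 MiB), and a repeated feedback submission for the same `response_id` is ignored rather than overwritten.

```python
import base64
import binascii
import re

MAX_IMAGES = 5
MAX_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" read as 5 MiB of decoded data
ALLOWED_TYPES = {"jpeg", "jpg", "png", "webp"}
DATA_URL = re.compile(
    r"^data:image/([a-z0-9.+-]+);base64,(.*)$", re.IGNORECASE | re.DOTALL
)


def validate_images(data_urls: list[str]) -> None:
    """Raise ValueError if the image list breaks any of the stated limits."""
    if not 1 <= len(data_urls) <= MAX_IMAGES:
        raise ValueError(f"expected 1 to {MAX_IMAGES} images, got {len(data_urls)}")
    for index, url in enumerate(data_urls):
        match = DATA_URL.match(url)
        if match is None:
            raise ValueError(f"image {index}: not a base64 image data URL")
        subtype, payload = match.group(1).lower(), match.group(2)
        if subtype not in ALLOWED_TYPES:
            raise ValueError(f"image {index}: unsupported type {subtype!r}")
        try:
            raw = base64.b64decode(payload, validate=True)
        except binascii.Error as exc:
            raise ValueError(f"image {index}: invalid base64 payload") from exc
        if len(raw) > MAX_BYTES:
            raise ValueError(f"image {index}: larger than {MAX_BYTES} bytes")


class FeedbackStore:
    """In-memory stand-in for feedback storage, keyed by response_id."""

    def __init__(self) -> None:
        self._items: dict[str, dict] = {}

    def submit(self, response_id: str, feedback: dict) -> bool:
        """Store feedback once per response_id; return False for a repeat."""
        if response_id in self._items:
            return False  # assumption: the first submission wins
        self._items[response_id] = feedback
        return True
```

In an API handler, a `ValueError` from `validate_images` would typically be translated into a 4xx response, and a `False` from `submit` lets a client retry a feedback call safely without creating duplicate records.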

Last updated: September 2025