RAG Service

Deployment

This page documents deployment options and best practices for RAG API Core. It is intended for DevOps engineers, cloud architects, and anyone responsible for running the platform in production or staging environments.

Supported Deployment Targets

  • Docker Compose: For local development and quick prototyping. Uses docker-compose.yml to spin up the API and dependencies locally.
  • Azure App Service: For production and staging. The API is built as a Docker container, pushed to Azure Container Registry (ACR), and deployed to Azure App Service. Supports environment variables, managed identity, and Azure integrations.

Note: Kubernetes and bare-metal/VM deployments are neither supported nor recommended for this platform.

Typical Deployment Workflow

  1. Clone the Repository
       git clone <repo-url>
       cd RAG_API_Core
  2. Local Development (Optional)
     • Use docker-compose.yml to run the API and dependencies locally for development and testing.
       docker-compose up --build
  3. Provision Azure Infrastructure
     • Deploy an ARM template from a Template Spec (generated from a Bicep template) to create all required Azure resources and permissions. This includes App Service, Storage, Key Vault, AI Search, and managed identities.
     • Define all required variables/parameters for the environment (resource names, locations, etc.) as part of the deployment.
  4. Build and Push Docker Image
     • Build the Docker image for the API.
     • Push the image to Azure Container Registry (ACR).
  5. Deploy to Azure App Service
     • Configure App Service to pull the image from ACR.
     • Set environment variables and assign managed identity as needed.
  6. Manual Steps
     • Check that all permissions and role assignments were created correctly by the ARM deployment.
     • Deploy LLM and embedding models to Azure OpenAI manually (using Azure Portal or CLI).
     • Update the configuration YAML files for the specific environment with the correct endpoints, deployment names, and model details.
     • Connect to the deployment services through a service connection (for CI/CD or automation).
  7. Connect Repos and Run Pipelines
     • Connect your code repositories to Azure DevOps or GitHub.
     • Run the deployment pipelines to build and deploy the relevant services.
     • Deploy the data pipelines to the Function App and the API container to App Service.
     • Finalize setup and verify all services are running as expected.
  8. Verify Health
     • Check /api/health or /api/v2/health/check for service status.
  9. Monitor Logs
     • Use Azure monitoring tools (App Service logs, Application Insights, etc.).
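
The "Provision Azure Infrastructure" and "Build and Push Docker Image" steps could be scripted roughly as follows. This is a minimal sketch, not the project's actual pipeline. The resource group, registry name, image name, tag, and parameters file are hypothetical placeholders. It assumes the Azure CLI and Docker are installed, that you are already logged in with az login, and that the registry uses the default <name>.azurecr.io login server. TEMPLATE_SPEC_ID must be set to the resource ID of your Template Spec version.

```sh
#!/bin/sh
# Sketch: provision from a Template Spec, then build and push the API image.
# Every default below is a hypothetical placeholder; override via environment.
RESOURCE_GROUP="${RESOURCE_GROUP:-rg-rag-staging}"
ACR_NAME="${ACR_NAME:-ragacrstaging}"
IMAGE_NAME="${IMAGE_NAME:-rag-api-core}"
IMAGE_TAG="${IMAGE_TAG:-latest}"
TEMPLATE_SPEC_ID="${TEMPLATE_SPEC_ID:-}"   # resource ID of the Template Spec version
PARAMS_FILE="${PARAMS_FILE:-params.staging.json}"

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Full image reference, assuming the default ACR login server name.
image_ref() {
  printf '%s.azurecr.io/%s:%s\n' "$ACR_NAME" "$IMAGE_NAME" "$IMAGE_TAG"
}

# Deploy the Template Spec into the resource group with environment parameters.
provision() {
  run az deployment group create \
    --resource-group "$RESOURCE_GROUP" \
    --template-spec "$TEMPLATE_SPEC_ID" \
    --parameters "@$PARAMS_FILE"
}

# Authenticate Docker against ACR, then build and push the image.
build_and_push() {
  ref="$(image_ref)"
  run az acr login --name "$ACR_NAME"
  run docker build --tag "$ref" .
  run docker push "$ref"
}
```

Source the script and call provision followed by build_and_push from the repository root. Setting DRY_RUN=1 first prints the commands without touching Azure, which is a cheap way to review them before a real run.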

Azure-Specific Notes

  • ARM/Bicep Templates: Infrastructure is provisioned using an ARM template generated from a Bicep template. This automates creation of all required services and permissions.
  • Manual Steps: After ARM deployment, verify permissions, deploy models, and update config YAMLs as needed.
  • Managed Identity: Used for secure access to Azure resources (Key Vault, Storage, AI Search).
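  • App Settings: Set environment variables as App Service application settings, either in the Azure Portal or through ARM template parameters.
  • Scaling: Scale up by changing the App Service plan tier, or scale out by adding instances manually or with autoscale rules.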
  • Monitoring: Integrate with Azure Monitor and Application Insights.
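
The manual permission check and the app settings step can also be done from the Azure CLI. The sketch below assumes the App Service uses a system-assigned managed identity (a user-assigned identity would need a different lookup). The resource group and app names are hypothetical placeholders.

```sh
#!/bin/sh
# Sketch: inspect the managed identity's role assignments and set app settings.
# Both defaults are hypothetical placeholders; override via environment.
RESOURCE_GROUP="${RESOURCE_GROUP:-rg-rag-staging}"
APP_NAME="${APP_NAME:-app-rag-api-staging}"

# Print the principal ID of the App Service's system-assigned identity.
identity_principal_id() {
  az webapp identity show \
    --resource-group "$RESOURCE_GROUP" \
    --name "$APP_NAME" \
    --query principalId \
    --output tsv
}

# List every role assignment held by the given principal, for manual review.
list_role_assignments() {
  az role assignment list \
    --assignee "$1" \
    --all \
    --output table
}

# Set one or more KEY=VALUE application settings on the App Service.
set_app_settings() {
  az webapp config appsettings set \
    --resource-group "$RESOURCE_GROUP" \
    --name "$APP_NAME" \
    --settings "$@"
}
```

A typical review would run list_role_assignments "$(identity_principal_id)" and compare the output against the roles the ARM template is expected to grant on Key Vault, Storage, and AI Search. Settings are passed as arguments, for example set_app_settings ENVIRONMENT=staging, where ENVIRONMENT is an illustrative name rather than one the API is known to read.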

Example: Azure App Service Deployment

  1. Build and push your Docker image to Azure Container Registry (ACR)
  2. Deploy the ARM template (from Bicep) to provision all required Azure resources
  3. Create or update the App Service to pull from ACR
  4. Set environment variables and assign managed identity
  5. Deploy models and update configuration YAMLs as needed
  6. Monitor health endpoints and logs
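
For the final step, a small polling helper is often enough to confirm that a fresh deployment has come up. The sketch below treats any successful HTTP status from the health endpoint as healthy, because the response body format is not documented here. The retry count, delay, and app names are assumptions to tune for your environment.

```sh
#!/bin/sh
# Sketch: poll a health endpoint until it responds successfully, then tail logs.
# Defaults are hypothetical placeholders; override via environment.
RESOURCE_GROUP="${RESOURCE_GROUP:-rg-rag-staging}"
APP_NAME="${APP_NAME:-app-rag-api-staging}"

# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as the URL answers with a successful HTTP status,
# or 1 if every attempt fails.
wait_for_health() {
  url="$1"
  attempts="${2:-10}"
  delay="${3:-15}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl --silent --fail --max-time 10 --output /dev/null "$url"; then
      echo "healthy after $i attempt(s): $url"
      return 0
    fi
    echo "attempt $i/$attempts failed: $url" >&2
    i=$((i + 1))
    # Only sleep if another attempt remains.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Stream live App Service logs to the terminal.
tail_logs() {
  az webapp log tail \
    --resource-group "$RESOURCE_GROUP" \
    --name "$APP_NAME"
}
```

For example, wait_for_health "https://<your-app-hostname>/api/health" polls the first endpoint listed above, and the same call works with /api/v2/health/check. If it returns non-zero, tail_logs is a reasonable next place to look.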

Value

  • Consistent, repeatable deployments across environments
  • Secure integration with Azure services
  • Health checks and monitoring for production readiness
  • Scalable and maintainable infrastructure
