RAG Service

Settings

Manage system configuration and parameters

🧠 Default Models

Used when no override is provided.
Affects retrieval and vectorizer selection.
💾 Saving updates config and hot-reloads components
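The fallback rule described above (defaults apply only when no override is provided) can be sketched as follows. The model names and the `resolve_model` helper are hypothetical placeholders, not the service's actual identifiers.

```python
from typing import Optional

# Hypothetical defaults; the real values come from the service's saved config.
DEFAULT_MODELS = {"llm": "default-llm", "embedding": "default-embedder"}


def resolve_model(kind: str, override: Optional[str] = None) -> str:
    """Return the per-request override if one is given, else the configured default."""
    if override:
        return override
    return DEFAULT_MODELS[kind]
```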

🤖 LLM Parameters

💡 Tip: Changes are saved to the config file and components reload automatically, so they take effect immediately.
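A minimal sketch of the save-then-reload behavior, assuming the config is stored as a JSON file and components register a reload callback. `ConfigStore` and `on_reload` are illustrative names, not the service's API.

```python
import json
import os
import tempfile
from typing import Any, Callable, Dict, List


class ConfigStore:
    """Hypothetical store: persists settings to a JSON file and notifies listeners."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._listeners: List[Callable[[Dict[str, Any]], None]] = []

    def on_reload(self, listener: Callable[[Dict[str, Any]], None]) -> None:
        """Register a component callback to run after every successful save."""
        self._listeners.append(listener)

    def save(self, settings: Dict[str, Any]) -> None:
        # Write to a temporary file first, then swap it into place atomically,
        # so a reader never sees a half-written config file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(settings, handle, indent=2)
        os.replace(tmp_path, self.path)
        # Hot reload: hand the new settings to every registered component.
        for listener in self._listeners:
            listener(settings)
```

Writing the file before notifying listeners means a component that re-reads the file during its reload sees the new values.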

🎨 Creativity & Randomness

Temperature: controls randomness (0 = deterministic, 2 = very creative)
Top-p (nucleus sampling): higher = more diverse responses
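To make the two sampling controls concrete, here is a small pure-Python sketch of temperature scaling and nucleus (top-p) filtering over a toy token distribution. It illustrates the general technique only; how the configured LLM backend applies these settings internally is not stated on this page. Treating a temperature of 0 as greedy selection is a common convention and is an assumption here.

```python
import math
from typing import Dict


def apply_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    if temperature == 0:
        # Assumed convention: temperature 0 means always pick the top token.
        best = max(logits, key=logits.get)
        return {tok: (1.0 if tok == best else 0.0) for tok in logits}
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of top tokens whose cumulative probability reaches top_p."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}  # renormalize survivors
```

Raising the temperature flattens the distribution, and raising top-p lets more low-probability tokens survive the filter, which is why both settings increase diversity.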

📏 Length & Quality

Maximum response length
Reduce repetition: higher = less repetitive
Encourage new topics: higher = more diverse
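The two penalties above can be illustrated with the commonly documented OpenAI-style formulation, in which a token's logit is reduced in proportion to how often it has already appeared (frequency) and by a flat amount once it has appeared at all (presence). Whether this service's backend uses exactly this formula is an assumption; `penalize_logits` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, List


def penalize_logits(
    logits: Dict[str, float],
    generated: List[str],
    frequency_penalty: float = 0.0,
    presence_penalty: float = 0.0,
) -> Dict[str, float]:
    """Lower the logits of tokens that already occur in the generated text."""
    counts = Counter(generated)
    adjusted: Dict[str, float] = {}
    for tok, logit in logits.items():
        count = counts.get(tok, 0)
        # Frequency penalty grows with each repeat; presence penalty is applied once.
        presence = presence_penalty if count > 0 else 0.0
        adjusted[tok] = logit - count * frequency_penalty - presence
    return adjusted
```

With both penalties at 0 the logits are unchanged, so the settings only matter once a token has been emitted at least once.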

⚙️ Advanced Settings

Number of completions to generate
Penalty for repetition
Sequences that stop generation
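As an illustration of how stop sequences behave, the sketch below cuts a generated string at the earliest occurrence of any stop sequence. In practice the LLM backend usually enforces this during generation; `truncate_at_stop` is a hypothetical helper that only mimics the visible effect.

```python
from typing import Sequence


def truncate_at_stop(text: str, stop: Sequence[str]) -> str:
    """Return text up to (not including) the earliest stop sequence, if any occurs."""
    cut = len(text)
    for marker in stop:
        if not marker:
            continue  # ignore empty strings, which would match at position 0
        index = text.find(marker)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]
```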

Configuration Sections

📊 Default Models Health

Shows health for current defaults: LLM and Embeddings

🚀 API Usage Examples

💡 Performance Tip: Set skip_knowledge_base to true for faster responses to general questions, or whenever an answer does not need a document search.

🔍 Normal RAG (with Knowledge Base)

Searches documents first, then answers (default behavior)
{
  "question": "What can you tell me about yourself?",
  "skip_knowledge_base": false,
  "history": []
}
⏱️ ~2-3 seconds (includes document search)

⚡ Fast Mode (Skip Knowledge Base)

Direct LLM response without document search
{
  "question": "What can you tell me about yourself?",
  "skip_knowledge_base": true,
  "history": []
}
⏱️ ~0.5-1 second (direct LLM response)
📋 Parameter Details:
  • skip_knowledge_base: boolean (default: false) - Set to true to bypass document search
  • When true: No fetchers are used, direct LLM call for instant responses
  • When false: Normal RAG flow with document search and context building
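The two request modes can be sketched in Python as follows. The payload fields match the JSON examples above; everything else (the `answer` handler, the `search` and `llm` callables, and the prompt layout) is a hypothetical stand-in, since the page does not show the endpoint or the server code.

```python
import json
from typing import Callable, Dict, List, Optional


def build_payload(
    question: str,
    skip_knowledge_base: bool = False,
    history: Optional[List[Dict[str, str]]] = None,
) -> str:
    """Serialize a request body with the three fields used in the examples."""
    return json.dumps(
        {
            "question": question,
            "skip_knowledge_base": skip_knowledge_base,
            "history": history or [],
        }
    )


def answer(
    payload: str,
    search: Callable[[str], List[str]],
    llm: Callable[[str], str],
) -> str:
    """Hypothetical handler: branch on skip_knowledge_base before calling the LLM."""
    request = json.loads(payload)
    question = request["question"]
    if request.get("skip_knowledge_base", False):
        # Fast mode: no fetchers are used, the question goes straight to the LLM.
        return llm(question)
    # Normal RAG flow: search documents first, then build context for the LLM.
    context = "\n".join(search(question))
    return llm(f"Context:\n{context}\n\nQuestion: {question}")
```

Because the flag defaults to false, a client that omits it gets the normal RAG flow.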

🔍 Raw Configuration JSON

For debugging - shows the actual config data being used

config.to_dict() result (cached):


Fresh object attributes (current):

Compare the cached and fresh values to check whether config reloading is working.
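One way to automate that check is to diff the cached `config.to_dict()` result against the fresh object attributes. The sketch below assumes both are available as plain dictionaries; `stale_keys` is a hypothetical helper.

```python
from typing import Any, Dict, List

_MISSING = object()  # sentinel so a key missing on one side still counts as a difference


def stale_keys(cached: Dict[str, Any], fresh: Dict[str, Any]) -> List[str]:
    """Return the sorted keys whose cached value differs from the fresh value."""
    keys = set(cached) | set(fresh)
    return sorted(
        key
        for key in keys
        if cached.get(key, _MISSING) != fresh.get(key, _MISSING)
    )
```

An empty result after saving a change suggests the cached view was refreshed; any listed key points to a setting that did not reload.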