Settings

Manage system configuration and parameters


🧠 Default Models

Select default LLM and Embeddings aliases

Default LLM Alias: Used when no override is provided.

Default Embedding Alias: Affects retrieval and vectorizer selection.

💾 Saving updates the config and hot-reloads components.


🤖 LLM Parameters

Configure language model behavior

💡 Tip: Changes are saved to the config file and components are reloaded automatically, so new values take effect immediately.

🎨 Creativity & Randomness

Temperature (0.0-2.0)

Controls randomness: 0 = effectively deterministic, 2 = very random (most creative, least predictable)

Top P (0.0-1.0)

Nucleus sampling: samples only from the most likely tokens whose cumulative probability reaches Top P; higher = more diverse responses
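How temperature and Top P interact can be sketched as follows. This is a minimal illustration of the common technique (temperature-scaled softmax followed by nucleus filtering), not necessarily how the configured model backend implements it; the function name `sample_token` and its signature are hypothetical.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_p=1.0, rng=random):
    """Pick a token index from raw logits using temperature and top-p."""
    if temperature <= 0:
        # Assumption: temperature 0 is treated as greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature scaling: divide the logits before the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Nucleus: keep the most likely tokens until their mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Sample among the kept tokens, weighted by their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

Lowering either value narrows the set of tokens that can realistically be chosen, which is why both settings are usually tuned together rather than independently.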

📏 Length & Quality

Max Tokens: Maximum response length

Frequency Penalty (-2.0 to 2.0): Reduce repetition: higher = less repetitive

Presence Penalty (-2.0 to 2.0): Encourage new topics: higher = more diverse

⚙️ Advanced Settings

N (Completions): Number of completions to generate

Repetition Penalty (0.0-2.0): Penalty for repetition

Stop Sequences: Sequences that stop generation
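The ranges listed on this page can be checked before saving. The sketch below assumes hypothetical snake_case key names (the page only shows display labels) and assumes that Max Tokens and N must be positive integers, which the page does not state.

```python
# Ranges copied from the settings page; the key names are hypothetical.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "repetition_penalty": (0.0, 2.0),
}

def validate_llm_params(params):
    """Return a list of error messages; an empty list means all values pass."""
    errors = []
    for name, (low, high) in RANGES.items():
        if name in params and not (low <= params[name] <= high):
            errors.append(f"{name}={params[name]} is outside [{low}, {high}]")
    # Assumption: these two must be positive integers.
    for name in ("max_tokens", "n"):
        value = params.get(name)
        if value is not None and (not isinstance(value, int) or value < 1):
            errors.append(f"{name} must be a positive integer")
    return errors
```

For example, `validate_llm_params({"temperature": 2.5})` reports that the temperature is out of range, while a dictionary containing only in-range values yields an empty list.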


📊 Default Models Health

Shows health for current defaults: LLM and Embeddings

🚀 API Usage Examples

How to use the skip_knowledge_base parameter

💡 Performance Tip: Use skip_knowledge_base: true for faster responses on general questions or when you want instant answers without document search.

🔍 Normal RAG (with Knowledge Base)

Searches documents first, then answers (default behavior)

{
  "question": "What can you tell me about yourself?",
  "skip_knowledge_base": false,
  "history": []
}

⏱️ ~2-3 seconds (includes document search)

⚡ Fast Mode (Skip Knowledge Base)

Direct LLM response without document search

{
  "question": "What can you tell me about yourself?",
  "skip_knowledge_base": true,
  "history": []
}

⏱️ ~0.5-1 second (direct LLM response)

📋 Parameter Details:

skip_knowledge_base: boolean (default: false) - Set to true to bypass document search
When true: No fetchers are used; the LLM is called directly for a faster response
When false: Normal RAG flow with document search and context building
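A client call using this parameter might look like the sketch below. The page does not give the endpoint URL, the HTTP method, or the response format, so `url` is left to the caller, and the use of POST with a JSON response is an assumption; `build_payload` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

def build_payload(question, skip_knowledge_base=False, history=None):
    """Build a request body with the three fields shown in the examples."""
    return {
        "question": question,
        "skip_knowledge_base": skip_knowledge_base,
        "history": history if history is not None else [],
    }

def ask(url, question, skip_knowledge_base=False, timeout=30):
    """Send the payload as JSON. Assumes a POST endpoint that returns JSON."""
    body = json.dumps(build_payload(question, skip_knowledge_base)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping `skip_knowledge_base` as an explicit keyword argument makes it easy to switch a single call to fast mode without touching the rest of the request.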

🔍 Raw Configuration JSON

For debugging - shows the actual config data being used

The panel shows two views: the config.to_dict() result (cached) and the fresh object attributes (current).

Compare the two to test whether config reloading is working.
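One way to run that check is to compare the cached dictionary with the freshly read attributes and list whatever differs. The sketch below assumes both views are available as plain dictionaries with string keys; `diff_config` is a hypothetical helper, not part of the application.

```python
MISSING = "<missing>"  # placeholder for a key present in only one view

def diff_config(cached, fresh):
    """Return {key: (cached_value, fresh_value)} for every key that differs."""
    changes = {}
    for key in sorted(set(cached) | set(fresh)):
        old = cached.get(key, MISSING)
        new = fresh.get(key, MISSING)
        if old != new:
            changes[key] = (old, new)
    return changes
```

An empty result means the two views agree; any entries show which values one view holds that the other does not.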