RAG Service

Link Ingestion

This page describes how web links (URLs) are ingested and processed by the data pipelines.

Supported Features

  • Fetches content from web pages via HTTP
  • Extracts main text, metadata, and relevant sections
  • Handles HTML, PDF, and other common formats
  • Supports scheduled or manual crawling
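
As an illustration of how format handling might be routed, the sketch below maps an HTTP Content-Type header to a coarse format label. The function name detect_format and the labels it returns are hypothetical and are not part of the RAG Service API; the set of formats recognized here is an assumption limited to the ones named above.

```python
def detect_format(content_type_header: str) -> str:
    """Map a Content-Type header value to a coarse format label.

    Hypothetical helper for illustration; the real pipeline may
    detect formats differently (for example by sniffing file bytes).
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type_header.split(";", 1)[0].strip().lower()
    if mime in ("text/html", "application/xhtml+xml"):
        return "html"
    if mime == "application/pdf":
        return "pdf"
    if mime.startswith("text/"):
        return "text"
    return "unsupported"
```

A fetcher could call detect_format on the response header and hand the body to the matching extractor, skipping anything labeled "unsupported".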

Typical Workflow

  1. A link (URL) is added or discovered
  2. The pipeline fetches the page and extracts its content
  3. The extracted content is chunked and indexed into the knowledge base
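
The three steps above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the service's actual implementation: the names fetch, extract_text, chunk_text, and ingest are hypothetical, the chunk size and overlap are arbitrary, only HTML is handled, and the "knowledge base" is a plain in-memory list.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def fetch(url, timeout=10.0):
    """Step 2a: download the page body over HTTP and decode it."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def extract_text(html):
    """Step 2b: reduce HTML to its visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_text(text, size=200, overlap=20):
    """Step 3a: split text into fixed-size, overlapping character chunks."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def ingest(url, index):
    """Steps 1-3 end to end; `index` is a list standing in for the knowledge base."""
    chunks = chunk_text(extract_text(fetch(url)))
    index.extend(
        {"source": url, "position": n, "text": chunk}
        for n, chunk in enumerate(chunks)
    )
    return len(chunks)
```

Calling ingest("https://example.com/", index) with an empty list would append one record per chunk and return the chunk count. Character-based chunking is used here only for simplicity; a real pipeline would more likely split on sentences or tokens.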
