
Overview

Overview of Data Pools, RAG, and source connectivity

Data Pools (RAG)

Data Pools are the foundation for retrieval-augmented generation (RAG) in meinGPT.
Content from connected sources is indexed and provided to your assistants as knowledge.
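To illustrate the general idea behind indexing and retrieval, here is a minimal, self-contained sketch of a RAG-style lookup. It is not meinGPT's implementation: the function names (`build_index`, `retrieve`, `build_prompt`), the sample documents, and the simple word-count similarity are all assumptions chosen for illustration. Production systems typically use embedding models and a vector store instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each document id to a term-frequency vector (hypothetical toy index)."""
    return {doc_id: Counter(tokenize(text)) for doc_id, text in documents.items()}


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(index, query, k=2):
    """Return the ids of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(index, key=lambda doc_id: cosine(index[doc_id], q), reverse=True)
    return ranked[:k]


def build_prompt(documents, doc_ids, question):
    """Put the retrieved passages in front of the question as context."""
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical sample content standing in for documents from a connected source.
documents = {
    "vacation-policy": "Employees receive 30 vacation days per year.",
    "expense-policy": "Travel expenses must be submitted within 14 days.",
    "onboarding": "New employees get a laptop on their first day.",
}

index = build_index(documents)
question = "How many vacation days do employees receive?"
top = retrieve(index, question, k=1)
prompt = build_prompt(documents, top, question)
print(prompt)
```

The sketch separates the two phases a data pool covers conceptually: indexing happens once per source sync, while retrieval runs for each question an assistant receives.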

For Most Teams (Default)

In the default setup, you connect sources directly in meinGPT and use data pools without running your own infrastructure.

  • Connect sources in meinGPT
  • Select the data pool
  • Use it in assistants/workflows

You do not need to run or configure your own Data Vault for this.

Advanced: Customer-Managed Data Vault (On-Premise)

If you need to keep your knowledge infrastructure on-premises, you can deploy and operate your own Data Vault.

When to use Data Pools

  • You want to manage large document collections centrally
  • You need reusable knowledge across multiple assistants
  • You want controlled source sync and ingestion

Sources

All supported sources are listed here:

Typical source types:

  • SharePoint / OneDrive
  • Google Drive
  • Confluence
  • Amazon S3
  • SMB / WebDAV
  • Local filesystems

Custom Data Preparation Pipelines

For the dedicated pattern with S3 handover for third-party systems, see:

