Amazon S3

In meinGPT (UI)

For most teams, setup is done directly in meinGPT without editing local config files.

1. Open the admin settings in meinGPT.
2. Go to Data Pools / Data Sources.
3. Click Add Source and choose Amazon S3 as the source type.
4. Configure credentials and scope in the UI.
5. Save and trigger the first sync.

If you do not run your own DataVault runtime, this is usually all you need.

On-Prem Runtime Configuration (Advanced)

data_pools:
  - id: s3-documents
    type: s3
    access_key_id: $AWS_ACCESS_KEY_ID
    secret_access_key: $AWS_SECRET_ACCESS_KEY
    endpoint: https://s3.amazonaws.com
    bucket_name: my-bucket
    provider: "AWS"

Configuration Options

Field	Type	Default	Required	Description
`id`	string	-	✅	Unique identifier for the data pool
`type`	string	-	✅	Must be "s3"
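`access_key_id`	string	-	✅	Access key ID (for AWS, the IAM user's access key ID)
`secret_access_key`	string	-	✅	Secret access key that belongs to the access key ID
`endpoint`	string	-	✅	S3 endpoint URL, e.g. https://s3.amazonaws.com for AWS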
`bucket_name`	string	-	✅	S3 bucket name
`provider`	string	"Other"	❌	Provider ("AWS", "MinIO", "DigitalOcean", "Other")
`base_path`	string	""	❌	Optional folder prefix that limits which objects are synced
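
The table above can be turned into a quick pre-flight check before a runtime is started. The following is a minimal sketch, not part of DataVault: `validate_s3_pool` is a hypothetical helper, and the rule that unknown `provider` values are rejected is an assumption based on the four values listed in the table.

```python
# Required fields and optional defaults, copied from the options table.
REQUIRED_FIELDS = (
    "id", "type", "access_key_id", "secret_access_key", "endpoint", "bucket_name",
)
OPTIONAL_DEFAULTS = {"provider": "Other", "base_path": ""}
KNOWN_PROVIDERS = ("AWS", "MinIO", "DigitalOcean", "Other")


def validate_s3_pool(pool):
    """Return a copy of `pool` with defaults applied, or raise ValueError.

    Hypothetical helper for illustration; DataVault may validate differently.
    """
    missing = [name for name in REQUIRED_FIELDS if not pool.get(name)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    if pool["type"] != "s3":
        raise ValueError('type must be "s3"')
    result = {**OPTIONAL_DEFAULTS, **pool}
    # Assumption: only the provider values named in the table are accepted.
    if result["provider"] not in KNOWN_PROVIDERS:
        raise ValueError("unknown provider: " + str(result["provider"]))
    return result


example_pool = {
    "id": "s3-documents",
    "type": "s3",
    "access_key_id": "example-key-id",      # placeholder, not a real key
    "secret_access_key": "example-secret",  # placeholder, not a real secret
    "endpoint": "https://s3.amazonaws.com",
    "bucket_name": "my-bucket",
}
validated = validate_s3_pool(example_pool)
```

Here `validated` carries the defaults for the fields the example omits, so a missing `provider` or `base_path` does not need special handling later.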

Synchronization

DataVault connects to the configured bucket and syncs the objects under base_path.
Subsequent runs are incremental and process only what changed since the previous sync.
For large buckets, start with a narrow base_path for a safer rollout.
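
To illustrate what "incremental" can mean for a bucket, here is a minimal sketch that compares two object listings and restricts them to a prefix. It is an assumption about the general technique, not a description of DataVault internals: the function names are hypothetical, and using ETags as the change signal is only one common choice.

```python
from typing import Dict, List


def filter_by_base_path(listing: Dict[str, str], base_path: str) -> Dict[str, str]:
    """Keep only the keys that start with `base_path` (empty prefix keeps all)."""
    return {key: etag for key, etag in listing.items() if key.startswith(base_path)}


def diff_listings(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify differences between two {object_key: etag} listings."""
    added = sorted(key for key in current if key not in previous)
    removed = sorted(key for key in previous if key not in current)
    changed = sorted(
        key for key in current if key in previous and current[key] != previous[key]
    )
    return {"added": added, "changed": changed, "removed": removed}


# Listing from the previous run and from the current run (made-up sample data).
previous_run = {"docs/a.pdf": "etag-1", "docs/b.pdf": "etag-2", "tmp/x.log": "etag-9"}
current_run = {"docs/a.pdf": "etag-1", "docs/b.pdf": "etag-3", "docs/c.pdf": "etag-4"}

changes = diff_listings(
    filter_by_base_path(previous_run, "docs/"),
    filter_by_base_path(current_run, "docs/"),
)
```

With the sample data, only `docs/b.pdf` (changed) and `docs/c.pdf` (added) would need processing; `tmp/x.log` is ignored because it lies outside the prefix, which is why a narrow base_path keeps a first rollout small.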

Setup

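1. Create an IAM user in the AWS IAM Console.
2. Attach an S3 policy that grants read-only access to your bucket.
3. Generate access keys and copy the Access Key ID and the Secret Access Key.
4. Add both values to the runtime environment as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, the variables the configuration references.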
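
A read-only policy for the steps above could look like the output of the following sketch. `read_only_policy` is a hypothetical helper, and the assumption is that listing the bucket (`s3:ListBucket`) and reading objects (`s3:GetObject`) are sufficient for a sync; check which permissions your runtime actually requires before relying on it.

```python
import json


def read_only_policy(bucket_name, base_path=""):
    """Build an IAM policy document granting read-only access to one bucket.

    Hypothetical helper; `base_path` optionally narrows object reads to a prefix.
    """
    bucket_arn = "arn:aws:s3:::" + bucket_name
    prefix = base_path.strip("/")
    object_arn = bucket_arn + "/" + prefix + "/*" if prefix else bucket_arn + "/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # s3:ListBucket is a bucket-level action, so it targets the bucket ARN.
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": [bucket_arn]},
            # s3:GetObject is an object-level action, so it targets object ARNs.
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": [object_arn]},
        ],
    }


policy = read_only_policy("my-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor of the IAM Console. Passing a `base_path` such as `"docs"` limits object reads to `arn:aws:s3:::my-bucket/docs/*` while still allowing the bucket to be listed.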

→ Amazon S3 Documentation

On-prem only: the runtime configuration on this page is relevant when you operate your own DataVault runtime and configure data_pools yourself.