Local Deployment
Simple local DataVault setup with an OpenAI-compatible embedding service
This page guides you through the basic setup steps for a self-managed/on-premise DataVault deployment on your own hardware. In the default meinGPT cloud setup, you usually do not need these steps.
After following this tutorial, you will have a working local instance of the DataVault, connected to meinGPT, running on your server. Make sure you have the required prerequisites in place before following this guide.
Overview
The local DataVault consists of these services managed via Docker Compose:
| Service | Purpose |
|---|---|
| vault | API server: serves queries and file downloads |
| vault-worker | Ingestion pipeline: processes and indexes documents |
| database | PostgreSQL with VectorChord: stores both ingestion metadata and vector embeddings |
| ollama | Local embedding model server (OpenAI-compatible API) |
| piko | Tunnel to meinGPT Cloud: connects the vault without exposing ports publicly |
Your final project directory will look like this:
datavault-local/
├── config/
│   ├── vault.env         # Credentials and IDs (from meinGPT settings section)
│   └── app_config.yaml   # Vault configuration
├── data/
│   ├── vault/            # Local documents for ingestion (optional)
│   ├── postgres/         # Postgres data (auto-populated)
│   └── ollama/           # Embedding model cache (auto-populated)
└── docker-compose.yaml   # Service definitions

data/vault/ is only needed if you want to ingest local files from disk. If all your sources are cloud-based (SharePoint, Google Drive, etc.), you can skip it.
Step 1: Create directory structure
mkdir datavault-local
cd datavault-local
mkdir -p config data/vault data/postgres data/ollama

Step 2: Create the environment file
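Copy your Vault ID and Vault Secret from the meinGPT dashboard into config/vault.env as shown below, and choose your own Postgres password. The Data Pool ID from the same dashboard is not part of this file; it is used in app_config.yaml in Step 3.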
VAULT_ID=your-vault-id
VAULT_SECRET=your-vault-secret
MEINGPT_URL=https://app.meingpt.com
POSTGRES_USER=datavault
POSTGRES_PASSWORD=your-postgres-password
OPENAI_BASE_URL=http://ollama:11434/v1
OPENAI_API_KEY=local-dev
OPENAI_EMBEDDING_MODEL=bge-m3
OPENAI_EMBEDDING_DIMENSIONS=1024

Step 3: Create the vault configuration
Values like $VAULT_ID reference the environment variables from vault.env; they are resolved automatically at runtime.
version: 1.0
meingpt_url: $MEINGPT_URL

vault:
  id: $VAULT_ID
  secret: $VAULT_SECRET
  standalone_mode: false
  data_dir: ./tmp
  ingestion_interval: 300
  tasks_batch_size: 3
  chunk_size: 256
  chunk_overlap: 26

metadata:
  # 'deployment_type' is a legacy label only; storage always uses PostgreSQL/VectorChord.
  # Both 'vault' and 'vault-worker' must point at the SAME postgres so they share the task queue.
  deployment_type: "cloud"
  postgres:
    user: $POSTGRES_USER
    password: $POSTGRES_PASSWORD
    host: database
    port: 5432
    database: $POSTGRES_USER

embedding_model:
  provider: "openai"
  model: $OPENAI_EMBEDDING_MODEL
  base_url: $OPENAI_BASE_URL
  api_key: $OPENAI_API_KEY
  embedding_dimensions: $OPENAI_EMBEDDING_DIMENSIONS
  rpm: 1000
  tpm: 100000

logging:
  log_level: "INFO"
  log_to_file: true
  log_file_path: "logs/app.log"
  uvicorn_log_file_path: "logs/uvicorn.log"

data_pools:
  - id: your-datapool-id-from-meinGPT
    type: local
    base_path: /data/vault

The data_pools entry above is only needed if you ingest local files from data/vault/. To get up and running without local files, you can leave data_pools out. Cloud sources like SharePoint or Google Drive can be added directly via meinGPT. For on-prem source types like SMB or WebDAV, you can add entries to this file later; see Sources for all available types.
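Because app_config.yaml pulls its values from vault.env, a variable that is missing or left empty there may only surface as an error at runtime. The following is a minimal sketch of a pre-flight check, not part of DataVault itself: the function name missing_env_vars is hypothetical, and it assumes variables are referenced in the plain $NAME form shown above.

```shell
# Print every $NAME referenced in the config file ($1) that has no
# non-empty NAME=value line in the env file ($2).
# Prints nothing when all referenced variables are defined.
missing_env_vars() {
  grep -v '^[[:space:]]*#' "$1" \
    | grep -oE '\$[A-Za-z_][A-Za-z0-9_]*' \
    | tr -d '$' \
    | sort -u \
    | while read -r name; do
        # Require at least one character after the equals sign.
        grep -qE "^${name}=.+" "$2" || printf '%s\n' "$name"
      done
}

# Usage (run from the datavault-local directory):
#   missing_env_vars config/app_config.yaml config/vault.env
```

Full-line comments in the config are skipped, so a $NAME mentioned only in a comment is not reported.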
Step 4: Create the Docker Compose file
Replace your-vault-id in the piko command below with the Vault ID from vault.env.
services:
  vault:
    image: meingpt/vault:2.22.0
    ports:
      - 8080:8080
    depends_on:
      database:
        condition: service_started
      ollama:
        condition: service_healthy
    restart: unless-stopped
    networks:
      - vault_network
    volumes:
      - ./config/app_config.yaml:/app/src/vault/config/app_config.yaml:ro
      - ./data/vault:/data/vault
      # Optional: mount a local directory for ingestion
      - ./documents:/app/documents:ro
    environment:
      - VAULT_CONFIG_FILE_PATH=/app/src/vault/config/app_config.yaml
    env_file:
      - ./config/vault.env

  vault-worker:
    image: meingpt/vault:worker-2.22.0
    ports:
      - 8081:8080
    depends_on:
      database:
        condition: service_started
      ollama:
        condition: service_healthy
    restart: unless-stopped
    networks:
      - vault_network
    volumes:
      - ./config/app_config.yaml:/app/src/vault/config/app_config.yaml:ro
      - ./data/vault:/data/vault
      # Optional: mount a local directory for ingestion
      - ./documents:/app/documents:ro
    environment:
      - VAULT_CONFIG_FILE_PATH=/app/src/vault/config/app_config.yaml
    env_file:
      - ./config/vault.env

  database:
    image: ghcr.io/tensorchord/vchord-postgres:pg18-v1.1.1
    ports:
      - 5432:5432
    env_file:
      - ./config/vault.env
    volumes:
      - ./data/postgres/:/var/lib/postgresql/
    networks:
      - vault_network

  ollama:
    image: ollama/ollama:latest
    # Pre-pull the embedding model on first start so vault can use it immediately.
    entrypoint: ["/bin/sh", "-c"]
    command: ["ollama serve & sleep 3 && ollama pull bge-m3 && wait"]
    volumes:
      - ./data/ollama:/root/.ollama
    networks:
      - vault_network
    healthcheck:
      # Becomes healthy only once the embedding model is fully pulled,
      # which gates vault/vault-worker startup so first embed calls don't fail.
      test: ["CMD-SHELL", "ollama list | grep -q bge-m3"]
      interval: 10s
      timeout: 5s
      retries: 60
      start_period: 10s

  piko:
    image: ghcr.io/andydunstall/piko:latest
    command:
      - agent
      - http
      # Replace "your-vault-id" with the vault ID from vault.env
      - your-vault-id
      - vault:8080
      - --connect.url
      - https://vault-proxy.meingpt.com
    env_file:
      - ./config/vault.env
    networks:
      - vault_network

networks:
  vault_network:

Embedding model: This config uses bge-m3 (1024 dim, multilingual, works well for German). Other Ollama embedding models like nomic-embed-text (768 dim) or mxbai-embed-large (1024 dim) work too. To switch, adjust OPENAI_EMBEDDING_MODEL and OPENAI_EMBEDDING_DIMENSIONS in vault.env, and replace bge-m3 in the ollama service's command and healthcheck in docker-compose.yaml so the new model is the one that gets pulled and checked.
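When switching models, the configured dimension has to match what the model actually produces. One way to check this is to read the model details that Ollama reports. The sketch below assumes that `ollama show <model>` prints an "embedding length" line, which is how current Ollama releases format it, but treat that as an assumption; the function names embedding_length and dimensions_match are hypothetical helpers.

```shell
# Read `ollama show <model>` output on stdin and print the value of its
# "embedding length" line (assumed output format). Carriage returns are
# stripped in case the output passed through a pseudo-TTY.
embedding_length() {
  tr -d '\r' | awk '/embedding length/ { print $NF; exit }'
}

# Succeed only if the reported length equals the expected dimension ($1),
# for example the OPENAI_EMBEDDING_DIMENSIONS value from vault.env.
dimensions_match() {
  [ "$(embedding_length)" = "$1" ]
}

# Usage against the running stack:
#   docker compose exec -T ollama ollama show bge-m3 | dimensions_match 1024 \
#     && echo "dimensions match"
```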
GPU: Ollama runs CPU-only inside Docker by default. If you have an NVIDIA GPU on the host, add deploy: { resources: { reservations: { devices: [{ driver: nvidia, count: all, capabilities: [gpu] }] } } } to the ollama service for major throughput gains.
Step 5: Deploy
- If you have local files to ingest, add them to data/vault/
- Start services: docker compose up -d
- Check health: curl http://localhost:8080/health
- Monitor logs: docker compose logs -f vault vault-worker
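On the first start, the health check may fail for several minutes: vault and vault-worker only start once the ollama service reports healthy, which happens after the embedding model has been pulled. Instead of re-running curl by hand, a small retry helper can poll the endpoint. This is a sketch; retry is a hypothetical helper, and the 60 attempts at 10-second intervals simply mirror the ollama healthcheck settings from the Compose file.

```shell
# Run a command until it succeeds, at most $1 times,
# sleeping $2 seconds between attempts.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  retry_max="$1"
  retry_delay="$2"
  shift 2
  retry_n=1
  while ! "$@"; do
    if [ "$retry_n" -ge "$retry_max" ]; then
      return 1
    fi
    retry_n=$((retry_n + 1))
    sleep "$retry_delay"
  done
}

# Usage: wait up to about 10 minutes for the vault API to come up.
#   retry 60 10 curl -fsS http://localhost:8080/health
```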
Troubleshooting
- Check service status: docker compose ps
- View API logs: docker compose logs vault
- View ingestion logs: docker compose logs vault-worker
- Test Postgres: docker compose exec database pg_isready -U datavault
- Test Ollama: docker compose exec ollama ollama list
- Restart services: docker compose restart
Common pitfall: ingestion not picked up
If ingestion never runs (no errors, but documents stay un-indexed), check that the metadata.postgres block in app_config.yaml is present and that both vault and vault-worker resolve to the same database host. They must share a Postgres so the worker sees tasks the API enqueues. Earlier images defaulted to a per-container SQLite, which silently caused this exact symptom.
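The first of these checks can be scripted. The following is a rough sketch rather than a real YAML parser: has_metadata_postgres is a hypothetical helper, and it assumes top-level keys start in column 0 and nested keys are indented with spaces, as in the Step 3 configuration.

```shell
# Succeed if the config file ($1) has a "postgres:" key nested
# under the top-level "metadata:" key.
has_metadata_postgres() {
  awk '
    # Any non-indented, non-comment line starts a new top-level block.
    /^[^[:space:]#]/ { in_meta = ($0 ~ /^metadata:/) }
    # An indented postgres key inside the metadata block is what we want.
    in_meta && /^[[:space:]]+postgres:/ { found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Usage (run from the datavault-local directory):
#   has_metadata_postgres config/app_config.yaml || echo "metadata.postgres is missing"
```

Since both services mount the same app_config.yaml in the Compose file above, a passing check here covers vault and vault-worker alike.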