DataVault Deployment

Azure Cloud Deployment

Deploy DataVault on Azure with Azure OpenAI embeddings and optional SharePoint integration.

Prerequisites

  • Azure subscription with Azure OpenAI access
  • SharePoint Online site containing the documents to index (optional)
  • Azure AD app registration for SharePoint access (optional, only needed with SharePoint)

Setup

mkdir datavault-azure && cd datavault-azure
mkdir -p {config,data}

Configuration

Environment File

config/vault.env
VAULT_ID=your-vault-id-from-dashboard
VAULT_SECRET=your-vault-secret-from-dashboard
MEINGPT_URL=https://api.meingpt.com

# Azure OpenAI
AZURE_OPENAI_API_KEY=your-azure-openai-api-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/

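# SharePoint (optional)
SHAREPOINT_CLIENT_ID=your-app-registration-client-id
SHAREPOINT_CLIENT_SECRET=your-app-registration-secret
SHAREPOINT_TENANT_ID=your-azure-tenant-id
SHAREPOINT_REFRESH_TOKEN=your-oauth-refresh-token
SHAREPOINT_DRIVE_ID=your-document-library-drive-id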
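
The template values above are easy to leave in place by accident. As a rough sketch (the function name find_placeholders is hypothetical, and the "your-" pattern is only a heuristic based on the sample values in this guide), a small shell helper can list the variables that still look unfilled:

```shell
# List variables in an env file whose values still contain "your-",
# i.e. the template placeholders used in this guide (heuristic only).
find_placeholders() {
  local env_file="$1"
  # Drop comments and blank lines, keep NAME=value lines whose value
  # contains "your-", and print only the variable names.
  grep -Ev '^[[:space:]]*(#|$)' "$env_file" \
    | grep -E '^[A-Za-z_][A-Za-z0-9_]*=.*your-' \
    | cut -d= -f1
}

# Usage: prints one variable name per line, or nothing if all are filled in.
# find_placeholders config/vault.env
```

An empty result suggests every value has been replaced. It does not prove the values are valid.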

Main Configuration

config/app_config.yaml
version: 1.0
meingpt_url: $MEINGPT_URL

vault:
  id: $VAULT_ID
  secret: $VAULT_SECRET
  standalone_mode: false
  data_dir: ./tmp
  ingestion_interval: 900
  tasks_batch_size: 10
  chunk_size: 512
  chunk_overlap: 51

weaviate:
  connection_type: local
  host: weaviate
  port: 8001
  grpc_host: weaviate
  grpc_port: 50051
  api_key: ""

embedding_model:
  provider: "azure"
  api_key: $AZURE_OPENAI_API_KEY
  api_version: "2023-05-15"
  model: "text-embedding-3-small"
  endpoint: $AZURE_OPENAI_ENDPOINT
  embedding_dimensions: 1536
  rpm: 3000
  tpm: 1000000

logging:
  log_level: "INFO"
  log_to_file: true
  log_file_path: "logs/app.log"
  uvicorn_log_file_path: "logs/uvicorn.log"

# Optional SharePoint integration (omit data_pools if you do not use SharePoint)
data_pools:
  - id: sharepoint-docs
    type: onedrive
    client_id: $SHAREPOINT_CLIENT_ID
    client_secret: $SHAREPOINT_CLIENT_SECRET
    refresh_token: $SHAREPOINT_REFRESH_TOKEN
    drive_id: $SHAREPOINT_DRIVE_ID
    drive_type: "documentLibrary"
    tenant_id: $SHAREPOINT_TENANT_ID
    base_path: "/"
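
The configuration refers to several `$NAME` placeholders. Assuming the vault resolves them from the variables in config/vault.env (this guide implies that but does not state it), a placeholder with no matching variable is a likely source of startup errors. The sketch below, with the hypothetical helper name missing_vars, prints every placeholder that has no `NAME=` line in the env file:

```shell
# Print each $NAME referenced in a config file that has no NAME= line
# in the given env file.
missing_vars() {
  local config_file="$1" env_file="$2" name
  # Collect unique $NAME tokens, strip the leading "$", then look each up.
  grep -oE '\$[A-Z_][A-Z0-9_]*' "$config_file" | sort -u | tr -d '$' \
    | while read -r name; do
        grep -qE "^${name}=" "$env_file" || echo "$name"
      done
}

# Usage: no output means every placeholder has a definition.
# missing_vars config/app_config.yaml config/vault.env
```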

Docker Compose

docker-compose.yaml
services:
  vault:
    image: meingpt/vault:latest
    ports:
      - 8080:8080
    depends_on:
      - weaviate
    networks:
      - vault_network
    volumes:
      - ./config/app_config.yaml:/app/src/vault/config/app_config.yaml:ro
      - ./data:/data/vault
    environment:
      - VAULT_CONFIG_FILE_PATH=/app/src/vault/config/app_config.yaml
    env_file:
      - ./config/vault.env

  piko:
    image: ghcr.io/andydunstall/piko:latest
    command:
      - agent
      - http
      - ${VAULT_ID}
      - vault:8080
      - --connect.url
      - https://piko.deploy.selectcode.dev
    env_file:
      - ./config/vault.env
    networks:
      - vault_network

  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.28.3
    command:
      - --host
      - 0.0.0.0
      - --port
      - '8001'
      - --scheme
      - http
    expose:
      - 8001
      - 50051
    volumes:
      - weaviate_data:/var/lib/weaviate
    restart: on-failure:3
    environment:
      QUERY_DEFAULTS_LIMIT: 100
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_API_BASED_MODULES: 'true'
      CLUSTER_HOSTNAME: 'node1'
    ports:
      - 8001:8001
      - 50051:50051
    networks:
      - vault_network

volumes:
  weaviate_data:

networks:
  vault_network:

Deploy

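  1. Create an Azure OpenAI resource with an embedding model deployment, then copy its API key and endpoint into config/vault.env
  2. Set up the SharePoint app registration (if using SharePoint)
  3. Start services: docker compose --env-file config/vault.env up -d (--env-file lets Compose resolve ${VAULT_ID} in the piko command; env_file only sets variables inside the containers)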
  4. Check health: curl http://localhost:8080/health
  5. Monitor logs: docker compose logs -f vault
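
The health endpoint may not answer immediately after the containers start. A minimal sketch of a wait loop, assuming a generic retry helper (the name retry and its arguments are illustrative, not part of DataVault):

```shell
# Run a command until it succeeds, at most max_attempts times,
# sleeping delay_seconds between attempts. Returns 1 if every attempt fails.
retry() {
  local max_attempts="$1" delay_seconds="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
}

# Usage: poll the health endpoint for up to about one minute.
# retry 30 2 curl -fsS -o /dev/null http://localhost:8080/health \
#   && echo "vault is healthy" || echo "vault did not become healthy"
```

The -f flag makes curl exit non-zero on HTTP error responses, so the loop keeps waiting until the endpoint returns a success status.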

SharePoint Setup (Optional)

To connect SharePoint, create an Azure AD app registration with these permissions:

  • Sites.Read.All
  • Files.Read.All

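Obtain the OAuth refresh token through the OAuth 2.0 authorization code flow, and include the offline_access scope in the request, since no refresh token is issued without it. Store the token as SHAREPOINT_REFRESH_TOKEN and the ID of the document library's drive as SHAREPOINT_DRIVE_ID in config/vault.env.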
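
As a sketch of the first step of that flow, the helper below builds the sign-in URL for the Microsoft identity platform v2.0 authorize endpoint. The function name authorize_url and the redirect URI http://localhost:8400 are assumptions for illustration; the redirect URI must match one registered on your app registration and must be passed URL-encoded.

```shell
# Build the authorization URL for the OAuth 2.0 authorization code flow.
# offline_access is the scope that makes the token response include a
# refresh token. redirect_uri must already be URL-encoded.
authorize_url() {
  local tenant_id="$1" client_id="$2" redirect_uri="$3"
  local scope="offline_access%20Sites.Read.All%20Files.Read.All"
  printf 'https://login.microsoftonline.com/%s/oauth2/v2.0/authorize?client_id=%s&response_type=code&redirect_uri=%s&scope=%s\n' \
    "$tenant_id" "$client_id" "$redirect_uri" "$scope"
}

# Usage (hypothetical redirect URI): open the printed URL in a browser and sign in.
# authorize_url "$SHAREPOINT_TENANT_ID" "$SHAREPOINT_CLIENT_ID" "http%3A%2F%2Flocalhost%3A8400"
#
# After sign-in, the browser is redirected with ?code=... in the URL.
# Exchange that code (stored here in AUTH_CODE) for tokens:
# curl -sS -X POST "https://login.microsoftonline.com/$SHAREPOINT_TENANT_ID/oauth2/v2.0/token" \
#   -d grant_type=authorization_code \
#   -d client_id="$SHAREPOINT_CLIENT_ID" \
#   --data-urlencode client_secret="$SHAREPOINT_CLIENT_SECRET" \
#   --data-urlencode code="$AUTH_CODE" \
#   --data-urlencode redirect_uri="http://localhost:8400"
```

The JSON response to the token request carries the refresh token in its refresh_token field.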

Troubleshooting

  • Check service status: docker compose ps
  • View logs: docker compose logs vault
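  • Test Azure OpenAI: confirm that AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT in config/vault.env match the key and endpoint of your Azure OpenAI resource
  • Test SharePoint: verify that the refresh token has not expired or been revoked; if it has, repeat the OAuth flow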
  • Restart services: docker compose restart
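
To test Azure OpenAI independently of the vault, you can call the embeddings REST endpoint directly. The sketch below assumes your embedding deployment is named text-embedding-3-small, which may differ from the model name in your resource. The helper name embeddings_url is illustrative.

```shell
# Build the Azure OpenAI embeddings URL for a deployment, tolerating a
# trailing slash on the endpoint (as in the sample vault.env).
embeddings_url() {
  local endpoint="${1%/}" deployment="$2" api_version="$3"
  printf '%s/openai/deployments/%s/embeddings?api-version=%s\n' \
    "$endpoint" "$deployment" "$api_version"
}

# Usage: load the env file, then request one embedding.
# set -a; . config/vault.env; set +a
# curl -sS "$(embeddings_url "$AZURE_OPENAI_ENDPOINT" text-embedding-3-small 2023-05-15)" \
#   -H "api-key: $AZURE_OPENAI_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d '{"input": "ping"}'
```

A JSON response containing an embedding array indicates that the key, endpoint, and deployment name are correct. An authentication or not-found error points to the key or the deployment name, respectively.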

That's it. Your Azure DataVault will use Azure OpenAI for embeddings and optionally sync SharePoint documents.