Performance Tuning
Optimize DataVault performance for your document volume and usage patterns
Core Performance Settings
Adjust these settings in your config/app_config.yaml based on your document volume:
Processing Settings
vault:
  ingestion_interval: 900   # Seconds between ingestion runs (0 to disable)
  tasks_batch_size: 10      # Tasks processed per datapool at once
  chunk_size: 256           # Text chunk size in tokens
  chunk_overlap: 26         # Overlapping tokens between chunks
Embedding Model Rate Limits
embedding_model:
  rpm: 3000      # Requests per minute
  tpm: 1000000   # Tokens per minute
Search Rate Limiting
search_requests_per_minute: 30 # API rate limit
search_results_limit: 20 # Max results per search
Weaviate Memory Settings
Configure memory in your docker-compose.yaml:
weaviate:
  environment:
    GOMEMLIMIT: 4GiB            # Memory limit for Weaviate
    QUERY_DEFAULTS_LIMIT: 100   # Default query result limit
Monitoring Settings
Enable monitoring based on your needs:
logging:
  system_monitoring_interval: 60     # System stats (0 to disable)
  storage_monitoring_interval: 300   # Storage stats (0 to disable)
  database_monitoring_interval: 60   # Database stats (0 to disable)
  heartbeat_interval_minutes: 1      # Uptime monitoring interval
Tuning Guidelines
For Small Setups (< 1,000 documents)
- tasks_batch_size: 5
- ingestion_interval: 300 (5 minutes)
- GOMEMLIMIT: 1GiB
For Medium Setups (1,000 - 10,000 documents)
- tasks_batch_size: 10 (default)
- ingestion_interval: 900 (15 minutes)
- GOMEMLIMIT: 4GiB
For Large Setups (10,000+ documents)
- tasks_batch_size: 20
- ingestion_interval: 3600 (1 hour)
- GOMEMLIMIT: 8GiB
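The three tiers can also be written as a small lookup, which is handy when provisioning several installations. This is a minimal Python sketch, not part of DataVault: the function name `recommended_settings` is hypothetical, and treating exactly 10,000 documents as a large setup is an assumption, since the medium and large ranges overlap at that boundary.

```python
def recommended_settings(document_count: int) -> dict:
    """Return the starting values from the tuning guidelines for a document count.

    Hypothetical helper for illustration only; DataVault does not ship it.
    """
    if document_count < 0:
        raise ValueError("document_count must be non-negative")
    if document_count < 1_000:
        # Small setup: fewer than 1,000 documents
        return {"tasks_batch_size": 5, "ingestion_interval": 300, "GOMEMLIMIT": "1GiB"}
    if document_count < 10_000:
        # Medium setup: the documented defaults
        return {"tasks_batch_size": 10, "ingestion_interval": 900, "GOMEMLIMIT": "4GiB"}
    # Large setup. Assumption: exactly 10,000 documents is treated as large.
    return {"tasks_batch_size": 20, "ingestion_interval": 3600, "GOMEMLIMIT": "8GiB"}


if __name__ == "__main__":
    for count in (500, 5_000, 50_000):
        print(count, recommended_settings(count))
```

Note that tasks_batch_size and ingestion_interval belong in config/app_config.yaml, while GOMEMLIMIT is set in docker-compose.yaml. Treat the returned values as starting points and adjust them after watching real ingestion runs.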
Troubleshooting
Check Processing Performance
# Monitor ingestion logs
docker compose logs -f vault | grep -i "ingestion\|batch"
# Check current resource usage
docker stats --no-stream vault weaviate
Common Issues
- Slow ingestion: Increase tasks_batch_size or reduce ingestion_interval
- Memory issues: Reduce tasks_batch_size or increase GOMEMLIMIT
- Rate limit errors: Reduce rpm/tpm in embedding model config
- Search timeouts: Increase QUERY_DEFAULTS_LIMIT or reduce search_results_limit
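Before changing rpm or tpm, it can help to estimate which of the two limits is reached first. The following is a rough Python estimate, not DataVault's implementation. It assumes sliding-window chunking (each additional chunk advances by chunk_size minus chunk_overlap tokens) and one embedding request per chunk; this page confirms neither assumption. Both function names are hypothetical.

```python
import math


def chunks_per_document(tokens: int, chunk_size: int = 256, chunk_overlap: int = 26) -> int:
    """Estimate the chunk count, assuming a sliding window with the given overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if tokens <= 0:
        return 0
    if tokens <= chunk_size:
        return 1
    stride = chunk_size - chunk_overlap  # new tokens covered by each extra chunk
    return 1 + math.ceil((tokens - chunk_size) / stride)


def max_chunks_per_minute(rpm: int = 3000, tpm: int = 1_000_000, chunk_size: int = 256) -> int:
    """Upper bound on chunks embedded per minute, assuming one full chunk per request."""
    return min(rpm, tpm // chunk_size)


if __name__ == "__main__":
    per_doc = chunks_per_document(10_000)
    per_min = max_chunks_per_minute()
    print(f"chunks per 10,000-token document: {per_doc}")
    print(f"max chunks per minute: {per_min}")
```

With the default values on this page, the request limit (3000) is reached before the token limit (1,000,000 // 256 = 3906), so under these assumptions rpm is the tighter bound. If requests carry several chunks each, the balance shifts toward tpm.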
That's it. Focus on these core settings rather than complex optimizations.