Embedding Models

Overview and configuration of supported embedding models

Choose an embedding provider based on your requirements:

Supported Providers

  • OpenAI (provider: "openai")
  • Azure (provider: "azure")
  • Nebius (provider: "nebius")
  • HuggingFace, local (provider: "huggingface_local")

Common Configuration

All providers support these parameters:

Parameter   Type      Default    Description
provider    string    -          Provider name
rpm         integer   3000       Requests per minute
tpm         integer   1000000    Tokens per minute
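
In the configuration file, these parameters sit directly under the embedding_model block. A minimal example with the default limits written out explicitly:

embedding_model:
  provider: "openai"   # any supported provider
  rpm: 3000            # requests per minute (default)
  tpm: 1000000         # tokens per minute (default)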

Basic Setup

embedding_model:
  provider: "openai"  # or azure, nebius, huggingface_local
  # Provider-specific parameters...
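
The exact provider-specific parameters differ per provider. As a rough sketch of a completed cloud setup, where the model and api_key key names and the environment-variable reference are assumptions for illustration only:

embedding_model:
  provider: "openai"
  model: "text-embedding-3-small"   # assumed key name; model shown for illustration
  api_key: "${OPENAI_API_KEY}"      # assumed key name; keep secrets out of version control
  rpm: 3000
  tpm: 1000000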

Selection Guide

Cloud Providers (OpenAI, Azure, Nebius):

  • No local hardware required
  • Pay-per-use pricing
  • Easy to get started

Local Provider (HuggingFace):

  • Complete data privacy
  • No API costs after setup
  • Requires local compute resources
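
As a rough sketch of a local setup, where model_name and device are assumed key names for illustration rather than documented parameters:

embedding_model:
  provider: "huggingface_local"
  model_name: "intfloat/multilingual-e5-large"  # assumed key; a multilingual model suits German and English search
  device: "cuda"                                # assumed key; use "cpu" if no GPU is available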

Important: Changing embedding models requires complete reindexing of all documents. Choose carefully before initial setup.

Comparing Embedding Quality

Test Criteria

  • Language: German and English document search
  • Domain: Domain-specific vs. general content
  • Length: Short vs. long text passages
  • Cost: API costs vs. hardware investment

Test different models with your own documents:

  1. Upload a small document collection
  2. Perform typical search queries
  3. Evaluate the relevance of results
  4. Compare response times

Migration Between Models

Preparation

  1. Backup: Back up the current configuration
  2. Test Environment: Test the new model in a separate environment first
  3. Time Planning: Plan downtime for the reindexing

Migration Process

  1. Stop DataVault
  2. Clear the Weaviate database
  3. Activate the new embedding configuration (see the sketch after this list)
  4. Restart DataVault
  5. Wait for reindexing to complete
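
Step 3 typically amounts to replacing the embedding_model block, for example when switching from a cloud provider to the local one:

# Before migration
embedding_model:
  provider: "openai"

# After migration
embedding_model:
  provider: "huggingface_local"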

Support and Troubleshooting

Common Issues

  • API Limits: Adjust the rpm/tpm parameters (see the example after this list)
  • Authentication: Check that API keys are valid
  • Performance: Optimize GPU usage for local models
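
If the provider rejects requests due to rate limits, lowering the documented rpm and tpm values is usually sufficient; the numbers below are illustrative:

embedding_model:
  provider: "openai"
  rpm: 1000     # below the default of 3000
  tpm: 500000   # below the default of 1000000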

Contact

For specific questions about embedding configuration: 📧 Enterprise Support: enterprise@meingpt.com