Embedding Models

Choose an embedding provider based on your requirements for data privacy, cost, and language coverage.

Supported Providers

OpenAI

Cloud-based embeddings via OpenAI API

Azure OpenAI

OpenAI models via Microsoft Azure

Nebius

Multilingual embeddings optimized for European languages

HuggingFace (Local)

Local execution for maximum data privacy

Common Configuration

All providers support these parameters:

Parameter	Type	Default	Description
`provider`	string	-	Provider name: `openai`, `azure`, `nebius`, or `huggingface_local`
`rpm`	integer	3000	Maximum requests per minute
`tpm`	integer	1000000	Maximum tokens per minute

Basic Setup

embedding_model:
  provider: "openai"  # or azure, nebius, huggingface_local
  # Provider-specific parameters...
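
The common parameters can be set alongside the provider. The following is a minimal sketch, assuming `rpm` and `tpm` sit directly under `embedding_model` next to `provider` (the exact nesting is an assumption, since only the parameter names are documented here). The values shown are the documented defaults:

```yaml
embedding_model:
  provider: "openai"
  rpm: 3000       # requests per minute (documented default)
  tpm: 1000000    # tokens per minute (documented default)
  # Provider-specific parameters...
```

Leaving out `rpm` and `tpm` should have the same effect as the values above, since they match the defaults.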

Selection Guide

Cloud Providers (OpenAI, Azure, Nebius):

No local hardware required
Pay-per-use pricing
Easy to get started

Local Provider (HuggingFace):

Complete data privacy
No API costs after setup
Requires local compute resources

Important: Changing the embedding model requires complete reindexing of all documents, because vectors produced by different models are not comparable with one another. Choose carefully before initial setup.

Comparing Embedding Quality

Test Criteria

Language: German and English document search
Domain: Domain-specific vs. general content
Length: Short vs. long text passages
Cost: API costs vs. hardware investment

Migration Between Models

Preparation

Backup: Back up the current configuration
Test Environment: Test the new model in a separate environment
Time Planning: Schedule downtime for the reindexing period

Migration Process

1. Stop DataVault
2. Clear Weaviate database
3. Activate new embedding configuration
4. Restart DataVault
5. Wait for complete reindexing
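
The "Activate new embedding configuration" step might look like the following when moving from OpenAI to the local HuggingFace provider. This is only a sketch: the provider names are taken from the Basic Setup example, keeping the previous value as a comment is a suggested convention rather than a requirement, and the provider-specific parameters are not shown because they depend on the provider chosen.

```yaml
embedding_model:
  # provider: "openai"            # previous provider, kept as a comment for rollback
  provider: "huggingface_local"   # new provider
  # Provider-specific parameters...
```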

Support and Troubleshooting

Common Issues

API Limits: Adjust the `rpm`/`tpm` parameters to stay within your provider's rate limits
Authentication: Check that the API keys are valid and correctly configured
Performance: Optimize GPU usage for local models

Contact

For specific questions about embedding configuration, contact Enterprise Support: enterprise@meingpt.com