Deployment Verification

Verify your DataVault deployment is working correctly

After deploying DataVault, verify that every component works before relying on the deployment. This guide covers health checks, functional, performance, integration, and security tests, monitoring and load testing, and a final verification checklist.

Initial Health Checks

1. Service Status

Check all services are running:

# Docker deployment
docker-compose ps

# Kubernetes deployment
kubectl get pods -n datavault
kubectl get svc -n datavault

# Systemd services
systemctl status datavault-api
systemctl status weaviate
systemctl status postgresql

Expected output:

  • All services show "running" or "healthy" status
  • No restart loops
  • No error states
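
For Kubernetes deployments, the "no restart loops" and "no error states" checks can be automated. The sketch below inspects the JSON that `kubectl get pods -n datavault -o json` produces. The function names and the restart threshold of 3 are illustrative assumptions, not part of DataVault.

```python
import json
import subprocess


def find_unhealthy_pods(pod_list, max_restarts=3):
    """Return (pod_name, reason) pairs for pods that look unhealthy.

    max_restarts is an assumed threshold; tune it for your environment.
    """
    problems = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase")
        if phase not in ("Running", "Succeeded"):
            problems.append((name, f"phase is {phase}"))
            continue
        for container in status.get("containerStatuses", []):
            restarts = container.get("restartCount", 0)
            if restarts > max_restarts:
                problems.append((name, f"{container['name']} restarted {restarts} times"))
            elif phase == "Running" and not container.get("ready", False):
                problems.append((name, f"{container['name']} is not ready"))
    return problems


def load_pods(namespace="datavault"):
    """Fetch the pod list as JSON via kubectl (requires cluster access)."""
    output = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(output)
```

Calling `find_unhealthy_pods(load_pods())` on a healthy deployment should return an empty list.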

2. API Accessibility

Test API endpoint:

# Health check
curl -X GET http://localhost:8000/health

# Expected response:
{
  "status": "healthy",
  "version": "2.0.0",
  "services": {
    "database": "connected",
    "vector_db": "connected",
    "cache": "connected"
  }
}

# API docs
curl -X GET http://localhost:8000/docs
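
The health response can also be checked programmatically instead of by eye. This is a minimal sketch that assumes the response has exactly the shape shown above; `health_problems` and `fetch_health` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

# Service keys taken from the expected health response above.
EXPECTED_SERVICES = ("database", "vector_db", "cache")


def health_problems(payload):
    """Return a list of human-readable problems found in a /health payload."""
    problems = []
    if payload.get("status") != "healthy":
        problems.append(f"status is {payload.get('status')!r}")
    services = payload.get("services", {})
    for name in EXPECTED_SERVICES:
        state = services.get(name)
        if state != "connected":
            problems.append(f"{name} is {state!r}")
    return problems


def fetch_health(base_url="http://localhost:8000", timeout=5):
    """Download and decode the health payload from a running API."""
    with urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.load(response)
```

An empty list from `health_problems(fetch_health())` means the API and all three backing services report as connected.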

3. Database Connectivity

PostgreSQL:

# Test connection
psql -h localhost -U datavault -d datavault -c "SELECT version();"

# Check tables
psql -h localhost -U datavault -d datavault -c "\dt"

Weaviate:

# Check Weaviate status
curl -X GET http://localhost:8080/v1/meta

# List schemas
curl -X GET http://localhost:8080/v1/schema

Functional Testing

1. Authentication

Test login:

# Get access token
curl -X POST http://localhost:8000/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "admin",
    "password": "your-password"
  }'

# Use token for authenticated requests
export TOKEN="your-access-token"
curl -X GET http://localhost:8000/api/v1/user/profile \
  -H "Authorization: Bearer $TOKEN"

2. Document Ingestion

Upload test document:

# Create test file
echo "This is a test document for DataVault verification." > test.txt

# Upload document
curl -X POST http://localhost:8000/api/v1/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@test.txt" \
  -F "metadata={\"category\":\"test\"}"

# Check processing status (set DOCUMENT_ID to the ID of the uploaded document)
curl -X GET "http://localhost:8000/api/v1/documents/status/$DOCUMENT_ID" \
  -H "Authorization: Bearer $TOKEN"
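
Processing is asynchronous, so a single status request may return before the document is ready. The sketch below polls until a terminal state is reached or a timeout passes. The state names "completed" and "failed" are assumptions, as is the `get_status` callable, which stands in for whatever code queries the status endpoint.

```python
import time


def wait_for_processing(get_status, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll get_status() until it reports a terminal state.

    "completed" and "failed" are assumed state names; replace them with
    the values the status endpoint actually returns.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_status()
        if state in ("completed", "failed"):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still {state!r} after {timeout}s")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.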

3. Semantic Search

Test semantic search:

# Search for similar documents
curl -X POST http://localhost:8000/api/v1/search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "test document verification",
    "limit": 10,
    "threshold": 0.7
  }'

4. RAG Functionality

Test question answering:

# Ask a question
curl -X POST http://localhost:8000/api/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is DataVault?"}],
    "use_context": true,
    "model": "gpt-3.5-turbo"
  }'

Performance Verification

1. Response Times

Measure API latency:

# Simple latency test
time curl -X GET http://localhost:8000/health

# Load test with Apache Bench
ab -n 100 -c 10 -H "Authorization: Bearer $TOKEN" \
   http://localhost:8000/api/v1/documents/list

# Expected results:
# - Health check: <100ms
# - Document list: <500ms
# - Search queries: <2s
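
A single `time curl` run is noisy, so it helps to compare several samples against the targets. The sketch below applies the budgets listed above to the 95th percentile of each endpoint's latencies. Using p95 rather than the mean is an assumption made here, and the endpoint keys are illustrative labels.

```python
import statistics

# Budgets in milliseconds, taken from the expected results above.
BUDGETS_MS = {"health": 100, "document_list": 500, "search": 2000}


def p95(samples_ms):
    """95th percentile of a list of latencies (needs at least two samples)."""
    return statistics.quantiles(samples_ms, n=20, method="inclusive")[-1]


def over_budget(measurements):
    """Return {endpoint: p95} for endpoints whose p95 meets or exceeds its budget."""
    failures = {}
    for endpoint, samples in measurements.items():
        value = p95(samples)
        if value >= BUDGETS_MS[endpoint]:
            failures[endpoint] = value
    return failures
```

Feed it a dictionary that maps each endpoint label to the latencies you measured, in milliseconds.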

2. Resource Usage

Monitor system resources:

# CPU and Memory
top -b -n 1 | grep -E "datavault|weaviate|postgres"

# Disk usage
df -h | grep -E "datavault|docker"

# Network connections
netstat -tulpn | grep -E "8000|8080|5432"
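
Where `netstat` is not installed, the same port check can be done with a short script. This sketch tries a TCP connection to each port used in this guide. The service labels in `SERVICE_PORTS` are only descriptive names.

```python
import socket

# Ports used in this guide: API, Weaviate, PostgreSQL.
SERVICE_PORTS = {"datavault-api": 8000, "weaviate": 8080, "postgresql": 5432}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_services(host="localhost", ports=SERVICE_PORTS):
    """Return the names of services whose port does not accept connections."""
    return [name for name, port in ports.items() if not port_open(host, port)]
```

An empty list from `closed_services()` means all three ports are accepting connections.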

3. Database Performance

Check query performance:

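-- PostgreSQL slow queries (requires the pg_stat_statements extension;
-- on PostgreSQL 12 and earlier the column is named mean_time)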
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Table sizes
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

Integration Testing

1. Data Source Connections

Test each configured source:

# S3 source
curl -X POST http://localhost:8000/api/v1/sources/test \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "s3",
    "config": {
      "bucket": "test-bucket",
      "region": "us-east-1"
    }
  }'

# Confluence source
curl -X POST http://localhost:8000/api/v1/sources/test \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "confluence",
    "config": {
      "url": "https://company.atlassian.net",
      "space": "TEST"
    }
  }'

2. Embedding Provider

Verify embedding generation:

# Test embedding endpoint
curl -X POST http://localhost:8000/api/v1/embeddings/generate \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Test embedding generation",
    "model": "text-embedding-ada-002"
  }'

3. LLM Provider

Test LLM connectivity:

# Test completion without context
curl -X POST http://localhost:8000/api/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello"}],
    "use_context": false,
    "model": "gpt-3.5-turbo"
  }'

Security Verification

1. Authentication Tests

# Test invalid token
curl -X GET http://localhost:8000/api/v1/documents/list \
  -H "Authorization: Bearer invalid-token"
# Expected: 401 Unauthorized

# Test missing token
curl -X GET http://localhost:8000/api/v1/documents/list
# Expected: 401 Unauthorized

# Test expired token
# Wait for token expiry, then:
curl -X GET http://localhost:8000/api/v1/documents/list \
  -H "Authorization: Bearer $EXPIRED_TOKEN"
# Expected: 401 Unauthorized

2. Authorization Tests

# Test accessing another user's document (set OTHER_USER_DOC_ID to a document owned by a different user)
curl -X GET "http://localhost:8000/api/v1/documents/$OTHER_USER_DOC_ID" \
  -H "Authorization: Bearer $TOKEN"
# Expected: 403 Forbidden

# Test admin-only endpoints as regular user
curl -X GET http://localhost:8000/api/v1/admin/users \
  -H "Authorization: Bearer $USER_TOKEN"
# Expected: 403 Forbidden
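
The authentication and authorization checks above all follow the same pattern: send a request and compare the status code. A table-driven sketch keeps them repeatable. The `send` callable and the literal "USER_TOKEN" placeholder are assumptions; substitute a real regular-user token before running against a live API.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# (description, path, token, expected HTTP status), mirroring the curl tests above.
# "USER_TOKEN" is a placeholder for a real non-admin token.
CASES = [
    ("invalid token", "/api/v1/documents/list", "invalid-token", 401),
    ("missing token", "/api/v1/documents/list", None, 401),
    ("admin endpoint as regular user", "/api/v1/admin/users", "USER_TOKEN", 403),
]


def run_cases(send, cases=CASES):
    """Call send(path, token) for each case; return descriptions of mismatches."""
    failures = []
    for description, path, token, expected in cases:
        actual = send(path, token)
        if actual != expected:
            failures.append(f"{description}: expected {expected}, got {actual}")
    return failures


def http_send(path, token, base_url="http://localhost:8000"):
    """Send a GET request and return the HTTP status code."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    try:
        with urlopen(Request(base_url + path, headers=headers), timeout=5) as response:
            return response.status
    except HTTPError as error:
        return error.code
```

Running `run_cases(http_send)` against the deployment should return an empty list when every request is rejected as expected.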

3. Input Validation

# Test SQL injection ('\'' embeds a literal single quote inside the single-quoted shell string)
curl -X POST http://localhost:8000/api/v1/search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "test'\''; DROP TABLE documents; --",
    "limit": 10
  }'
# Expected: Normal results, no database damage

# Test XSS
curl -X POST http://localhost:8000/api/v1/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@test.txt" \
  -F "metadata={\"title\":\"<script>alert('xss')</script>\"}"
# Expected: Escaped or rejected
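
To see why these payloads should be harmless, it helps to look at the two defenses being tested. The sketch below is not DataVault code. It uses an in-memory SQLite database in place of PostgreSQL to show a bound query parameter treating the injection string as plain data, and the standard library's `html.escape` to neutralize the script tag.

```python
import html
import sqlite3

# The same injection string used in the curl test above.
PAYLOAD = "test'; DROP TABLE documents; --"


def make_db():
    """Create a throwaway in-memory database with one document."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE documents (title TEXT)")
    connection.execute("INSERT INTO documents VALUES ('test document')")
    return connection


def search(connection, query):
    """Search titles with a bound parameter, so the query is never parsed as SQL."""
    rows = connection.execute(
        "SELECT title FROM documents WHERE title LIKE ?", (f"%{query}%",)
    )
    return [title for (title,) in rows]


def render_title(title):
    """Escape a user-supplied title before placing it in HTML."""
    return html.escape(title)
```

After `search(make_db(), PAYLOAD)` runs, the `documents` table still exists and ordinary searches keep working, which is the behavior the curl test expects from the API.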

Monitoring Setup

1. Prometheus Metrics

# Check metrics endpoint
curl -X GET http://localhost:8000/metrics

# Verify key metrics:
# - http_requests_total
# - http_request_duration_seconds
# - document_processing_duration_seconds
# - vector_search_duration_seconds
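
Checking for the key metrics can be scripted against the text the metrics endpoint returns. This sketch parses the Prometheus text exposition format and reports which of the metrics listed above are absent. It treats a histogram's `_bucket`, `_sum`, and `_count` samples as evidence of the base metric. The function names are illustrative.

```python
REQUIRED_METRICS = (
    "http_requests_total",
    "http_request_duration_seconds",
    "document_processing_duration_seconds",
    "vector_search_duration_seconds",
)

# Histogram and summary samples carry these suffixes on the base metric name.
SUFFIXES = ("_bucket", "_sum", "_count")


def exposed_metric_names(text):
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split()
            # "# TYPE <name> <type>" declares a metric family.
            if len(parts) >= 3 and parts[1] == "TYPE":
                names.add(parts[2])
            continue
        # A sample line is "<name>{labels} <value>" or "<name> <value>".
        sample = line.split("{", 1)[0].split()[0]
        names.add(sample)
        for suffix in SUFFIXES:
            if sample.endswith(suffix):
                names.add(sample[: -len(suffix)])
    return names


def missing_metrics(text, required=REQUIRED_METRICS):
    """Return the required metrics that do not appear in the output."""
    names = exposed_metric_names(text)
    return [name for name in required if name not in names]
```

Pass it the body returned by the metrics endpoint. An empty list means all four key metrics are exported.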

2. Logging

# Check application logs
tail -f /var/log/datavault/api.log

# Check error logs
tail -f /var/log/datavault/error.log

# Verify log format and content:
# - Timestamp
# - Log level
# - Request ID
# - User ID
# - Action performed
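
Besides confirming the expected fields, it is worth scanning the logs for secrets, since the checklist requires that logs contain no sensitive data. The sketch below flags lines that appear to contain a bearer token or a password. The two patterns are illustrative assumptions and far from exhaustive.

```python
import re

# Patterns that suggest a secret was written to the log.
# These are illustrative assumptions, not a complete list.
SENSITIVE_PATTERNS = {
    "bearer token": re.compile(r'Bearer\s+[A-Za-z0-9._~+/-]{8,}=*'),
    "password": re.compile(r'password"?\s*[=:]\s*"?[^\s",}]+', re.IGNORECASE),
}


def find_sensitive(lines):
    """Return (line_number, label) for each line that matches a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```

For example, `find_sensitive(open("/var/log/datavault/api.log"))` lists the line numbers that need a closer look.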

3. Alerts

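Test alert conditions. These commands deliberately degrade the host, so run them only outside production:

# Simulate high CPU (requires the stress utility)
stress --cpu 8 --timeout 60s

# Simulate disk full, then remove the file once the alert has fired
dd if=/dev/zero of=/tmp/testfile bs=1G count=10
rm /tmp/testfile

# Simulate service down, then bring the service back
docker stop datavault-api
docker start datavault-api

# Verify that each alert is triggered and clears after recovery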

Load Testing

Gradual Load Test

# load_test.py
# Sends 100 search requests at each concurrency level and prints a summary.
import concurrent.futures
import time

import requests

BASE_URL = "http://localhost:8000"
TOKEN = "your-token"
REQUESTS_PER_LEVEL = 100

def search_request():
    """Send one search request; return (status_code or None, elapsed seconds)."""
    headers = {"Authorization": f"Bearer {TOKEN}"}
    data = {"query": "test query", "limit": 10}
    start = time.perf_counter()
    try:
        response = requests.post(
            f"{BASE_URL}/api/v1/search",
            json=data,
            headers=headers,
            timeout=30,  # avoid hanging forever if the API stalls
        )
        status = response.status_code
    except requests.RequestException:
        status = None  # connection errors and timeouts count as failures
    return status, time.perf_counter() - start

# Test with increasing load
for num_users in [1, 5, 10, 20, 50]:
    print(f"\nTesting with {num_users} concurrent users:")

    with concurrent.futures.ThreadPoolExecutor(max_workers=num_users) as executor:
        start_time = time.perf_counter()
        futures = [executor.submit(search_request) for _ in range(REQUESTS_PER_LEVEL)]
        results = [f.result() for f in futures]

    total_time = time.perf_counter() - start_time
    success_count = sum(1 for status, _ in results if status == 200)
    avg_response_time = sum(t for _, t in results) / len(results)

    print(f"Total time: {total_time:.2f}s")
    print(f"Success rate: {success_count}/{REQUESTS_PER_LEVEL}")
    print(f"Avg response time: {avg_response_time:.2f}s")

Verification Checklist

Core Functionality

  • API health endpoint returns healthy
  • Authentication works
  • Document upload succeeds
  • Document processing completes
  • Vector search returns results
  • RAG queries work
  • All configured sources connect

Performance

  • Response times meet requirements
  • Resource usage is acceptable
  • System handles expected load
  • No memory leaks observed

Security

  • Authentication is enforced
  • Authorization rules work
  • Input validation is effective
  • SSL/TLS is configured
  • Logs don't contain sensitive data

Reliability

  • Services auto-restart on failure
  • Data persists across restarts
  • Backups are created
  • Monitoring is active
  • Alerts are configured

Troubleshooting Failed Checks

If any verification fails:

  1. Check service logs for errors
  2. Verify configuration files
  3. Ensure all dependencies are installed
  4. Check network connectivity
  5. Verify resource availability
  6. Review security settings

For detailed troubleshooting, see the operations guide.