Ingestion & Scan

How data is loaded and analyzed in Dataset Manager

Ingestion path

Data is loaded into per-dataset tables, typically via exports from third-party systems or prepared pipelines.

Supported upload formats:

  • CSV
  • JSON
  • SQLite
  • XML

During upload, tables are created or updated (optionally with the dropExisting option), and the dataset's metrics are refreshed.
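
As a rough illustration of the create-or-update step, the sketch below loads CSV text into a SQLite table using only the Python standard library. The function name load_csv and its drop_existing parameter are hypothetical stand-ins, and the sketch assumes that dropExisting means "drop the table before recreating it"; the actual Dataset Manager upload interface may behave differently.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text, drop_existing=False):
    """Load CSV text into `table` and return the number of rows inserted.

    Hypothetical helper: `drop_existing` mimics the assumed meaning of the
    dropExisting option. Table and column names are interpolated into SQL,
    so only use trusted names here.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row holds the column names
    cols = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)

    if drop_existing:
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    # Without drop_existing, an existing table is kept and rows are appended.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')

    rows = list(reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
load_csv(conn, "orders", "id,amount\n1,10\n2,20\n")
load_csv(conn, "orders", "id,amount\n3,30\n")  # appends to the existing table
count_appended = conn.execute('SELECT COUNT(*) FROM "orders"').fetchone()[0]

load_csv(conn, "orders", "id,amount\n4,40\n", drop_existing=True)  # replaces it
count_replaced = conn.execute('SELECT COUNT(*) FROM "orders"').fetchone()[0]
```

In this sketch the second call appends to the table, while the third call replaces its contents because the table is dropped first.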

What the scan does

The schema scan analyzes the dataset and caches an enriched schema.

The enriched schema includes:

  • table inventory and estimated table size
  • columns with types, nullability, defaults
  • basic statistics (distinct count, null share, top values, type-dependent min/max)
  • primary/foreign keys and references
  • index information
  • detected join paths across tables
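
To make the items above more concrete, here is a minimal sketch of how a scan of this kind could be assembled for a SQLite table using PRAGMA statements and aggregate queries. The function scan_table and the shape of its result are assumptions for illustration, not Dataset Manager's actual scan output; top values and table size estimates are left out for brevity.

```python
import sqlite3


def scan_table(conn, table):
    """Collect columns, keys, indexes, and basic column statistics (hypothetical)."""
    columns = []
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    for _cid, name, ctype, notnull, default, pk in conn.execute(
        f'PRAGMA table_info("{table}")'
    ):
        total, distinct, nulls, lo, hi = conn.execute(
            f'SELECT COUNT(*), COUNT(DISTINCT "{name}"), '
            f'SUM("{name}" IS NULL), MIN("{name}"), MAX("{name}") '
            f'FROM "{table}"'
        ).fetchone()
        columns.append({
            "name": name,
            "type": ctype,
            "nullable": not notnull,
            "default": default,
            "primary_key": bool(pk),
            "distinct": distinct,
            "null_share": (nulls / total) if total else 0.0,
            "min": lo,
            "max": hi,
        })

    # PRAGMA foreign_key_list yields: id, seq, table, from, to, ...
    foreign_keys = [
        {"column": row[3], "ref_table": row[2], "ref_column": row[4]}
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
    ]
    # PRAGMA index_list yields: seq, name, unique, ...
    indexes = [row[1] for row in conn.execute(f'PRAGMA index_list("{table}")')]
    return {"table": table, "columns": columns,
            "foreign_keys": foreign_keys, "indexes": indexes}


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    CREATE INDEX idx_orders_customer ON orders(customer_id);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, NULL), (3, 2, 30.0), (4, NULL, 5.0);
""")
schema = scan_table(conn, "orders")
amount = next(c for c in schema["columns"] if c["name"] == "amount")
```

For the sample data, one of the four amount values is NULL, so the null share for that column is 0.25.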

The scan can be refreshed on demand and is used to improve agent query quality.
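
One plausible way that detected join paths could help query generation is sketched below: foreign keys are treated as edges of an undirected graph, and a breadth-first search returns the shortest chain of join conditions between two tables. The function find_join_path and the tuple layout of the foreign-key list are hypothetical; the documentation does not say how Dataset Manager detects join paths.

```python
from collections import deque


def find_join_path(foreign_keys, start, goal):
    """Return the shortest list of join conditions from `start` to `goal`.

    `foreign_keys` is a list of (table, column, ref_table, ref_column) tuples.
    Returns None when the two tables are not connected.
    """
    graph = {}
    for table, column, ref_table, ref_column in foreign_keys:
        cond = f"{table}.{column} = {ref_table}.{ref_column}"
        # Joins can be followed in either direction.
        graph.setdefault(table, []).append((ref_table, cond))
        graph.setdefault(ref_table, []).append((table, cond))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for neighbor, cond in graph.get(table, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [cond]))
    return None


fks = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
]
path = find_join_path(fks, "order_items", "customers")
no_path = find_join_path(fks, "customers", "products")
```

Here path contains the two conditions needed to join order_items to customers through orders, and no_path is None because no foreign key reaches a products table.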

Documentation per dataset/table

Beyond technical introspection, you can maintain human-readable docs:

  • dataset documentation (Markdown)
  • table documentation
  • stored query descriptions and parameters

Together, these combine automatic introspection with business context in a single model.
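
A minimal sketch of such a combined model follows, assuming the scanned schema and the human-written docs are merged into one text block that could be handed to a query agent. The class names TableContext and DatasetContext, their fields, and the render method are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TableContext:
    name: str
    columns: dict   # column name -> type, as reported by the scan
    doc: str = ""   # human-written table documentation


@dataclass
class DatasetContext:
    name: str
    doc: str = ""   # dataset documentation (Markdown)
    tables: list = field(default_factory=list)

    def render(self):
        """Merge introspected schema and written docs into one text block."""
        lines = [f"# {self.name}"]
        if self.doc:
            lines.append(self.doc)
        for table in self.tables:
            cols = ", ".join(f"{col} {typ}" for col, typ in table.columns.items())
            lines.append(f"## {table.name} ({cols})")
            if table.doc:
                lines.append(table.doc)
        return "\n".join(lines)


context = DatasetContext(
    name="shop",
    doc="Orders exported nightly from the billing system.",
    tables=[
        TableContext(
            name="orders",
            columns={"id": "INTEGER", "amount": "REAL"},
            doc="One row per order; amount is in EUR.",
        )
    ],
)
rendered = context.render()
```

Keeping the scan output and the documentation in separate fields means a refreshed scan could replace the column data without touching the written docs.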
