Ingestion & Scan
How data is loaded and analyzed in Dataset Manager
Ingestion path
Data is loaded into per-dataset tables, typically via exports from third-party systems or prepared pipelines.
Supported upload formats:
- CSV
- JSON
- SQLite
- XML
During upload, the affected tables are created or updated (optionally with dropExisting), and the dataset metrics are updated.
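To make the create/update behavior concrete, here is a minimal sketch of a CSV load into a SQLite table. It is not Dataset Manager's implementation: the `load_csv` function and its `drop_existing` parameter are hypothetical names, and the sketch assumes that dropExisting drops the table before it is recreated.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text, drop_existing=False):
    """Load CSV text into `table` and return the number of rows inserted.

    Hypothetical helper: table and column names are assumed to be trusted,
    because they are interpolated into the SQL as quoted identifiers.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first CSV row supplies the column names
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)

    if drop_existing:
        # Assumed meaning of dropExisting: discard the old table entirely.
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    # Create the table on first upload; later uploads append to it.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')

    rows = [r for r in reader if r]  # skip blank lines
    conn.executemany(f'INSERT INTO "{table}" ({cols}) VALUES ({marks})', rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
load_csv(conn, "orders", "id,total\n1,10\n2,20\n")
# With drop_existing=True the earlier rows are discarded before loading.
load_csv(conn, "orders", "id,total\n3,30\n", drop_existing=True)
```

Without `drop_existing`, a repeated upload appends to the existing table, which is one plausible reading of "updated" above.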
What the scan does
The schema scan analyzes the dataset and caches an enriched schema.
The cached schema includes:
- table inventory and estimated table size
- columns with types, nullability, defaults
- basic column statistics (distinct count, null share, top values, type-dependent min/max)
- primary/foreign keys and references
- index information
- detected join paths across tables
The scan can be refreshed on demand and is used to improve agent query quality.
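As an illustration of what such a scan might collect, the sketch below introspects a SQLite database with standard PRAGMA statements. The function names `scan_schema` and `join_paths` and the shape of the returned dict are assumptions made for this example, not the product's actual format. Join paths are derived only from declared foreign keys here; a real scan may use further heuristics.

```python
import sqlite3


def scan_schema(conn):
    """Return {table: {row_count, columns, foreign_keys, indexes}}."""
    schema = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'").fetchall()]
    for table in tables:
        row_count = conn.execute(
            f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        columns = []
        # table_info rows: cid, name, type, notnull, dflt_value, pk
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        for _cid, name, ctype, notnull, default, pk in info:
            distinct, nulls, lo, hi = conn.execute(
                f'SELECT COUNT(DISTINCT "{name}"), SUM("{name}" IS NULL), '
                f'MIN("{name}"), MAX("{name}") FROM "{table}"').fetchone()
            nulls = nulls or 0  # SUM is NULL on an empty table
            columns.append({
                "name": name, "type": ctype, "nullable": not notnull,
                "default": default, "primary_key": bool(pk),
                "distinct": distinct,
                "null_share": nulls / row_count if row_count else 0.0,
                "min": lo, "max": hi,
            })
        # foreign_key_list rows: id, seq, table, from, to, ...
        fks = [{"from": r[3], "table": r[2], "to": r[4]} for r in
               conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()]
        # index_list rows: seq, name, unique, ...
        indexes = [r[1] for r in
                   conn.execute(f'PRAGMA index_list("{table}")').fetchall()]
        schema[table] = {"row_count": row_count, "columns": columns,
                         "foreign_keys": fks, "indexes": indexes}
    return schema


def join_paths(schema):
    """List join conditions implied by declared foreign keys."""
    return [f"{t}.{fk['from']} = {fk['table']}.{fk['to']}"
            for t, info in schema.items() for fk in info["foreign_keys"]]
```

Caching the returned dict and rebuilding it when a refresh is requested would mirror the on-demand refresh described above.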
Documentation per dataset/table
Beyond technical introspection, you can maintain human-readable docs:
- dataset documentation (Markdown)
- table documentation
- stored query descriptions and parameters
Together, these form a combined model: automatic introspection plus business context.
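One way to picture the combined model is a function that merges scanned structure with the human-written docs into a single text context. This is a sketch under assumptions: `build_context`, its arguments, and the dict layout (tables mapping to a list of columns with `name` and `type`) are hypothetical and chosen only for illustration.

```python
def build_context(schema, dataset_doc, table_docs):
    """Combine scanned schema facts with documentation into one text block.

    schema:      {table: {"columns": [{"name": ..., "type": ...}, ...]}}
    dataset_doc: Markdown description of the whole dataset
    table_docs:  {table: short human-written description}
    """
    lines = [dataset_doc.strip(), ""]
    for table, info in sorted(schema.items()):
        doc = table_docs.get(table, "no documentation")
        lines.append(f"Table {table}: {doc}")
        for col in info["columns"]:
            lines.append(f"  - {col['name']} ({col['type']})")
    return "\n".join(lines)


context = build_context(
    {"orders": {"columns": [{"name": "id", "type": "INTEGER"}]}},
    "# Sales\nOrders exported from the shop system.",
    {"orders": "One row per order."},
)
```

Tables without documentation still appear in the output, so gaps in the business context stay visible next to the introspected structure.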