DataMynd Docs
Synthetic Data Generator Docs

Quality Report and Preview

The Data Explorer page provides several tools for evaluating results. It appears as the 'Explore' page in the app.


Last updated 7 months ago

Explore Your Data: This page shows how well the new synthetic data matches the original data. The Layout option in the left sidebar lets you select which exploration views to display.

The Quality Report includes a histogram comparison of the real and synthetic data as well as several quality metrics that show how well the synthetic data represents the real data.

Histogram View: Generates a histogram for each field, with the real data as a green line and the synthetic data as a blue line. The grey bars indicate the variance % between the two histograms. As of the first release of the app, this view does not react to the filters. Large variances suggest that you may need to adjust the training parameters (such as the number of epochs). If you see 100% variance for all bins, you likely have a column type issue (see Configure Page – Column Types).
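To make the variance bars more concrete, here is a minimal Python sketch of one way a per-bin variance % could be computed. The app's actual formula is not documented on this page, so the function names (`histogram_counts`, `bin_variance_pct`) and the formula itself (absolute difference in bin share, divided by the larger share) are assumptions for illustration only.

```python
from typing import List, Sequence


def histogram_counts(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Count values per bin. Bins are [edges[i], edges[i+1]); the last bin also includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Values outside the shared bin range are ignored.
        if v < edges[0] or v > edges[-1]:
            continue
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts


def bin_variance_pct(
    real: Sequence[float], synthetic: Sequence[float], edges: Sequence[float]
) -> List[float]:
    """Hypothetical per-bin variance %: |real share - synthetic share| / larger share * 100."""
    real_counts = histogram_counts(real, edges)
    synth_counts = histogram_counts(synthetic, edges)
    # Avoid division by zero when no values fall inside the bins.
    real_total = sum(real_counts) or 1
    synth_total = sum(synth_counts) or 1
    result = []
    for r, s in zip(real_counts, synth_counts):
        r_share = r / real_total
        s_share = s / synth_total
        larger = max(r_share, s_share)
        result.append(0.0 if larger == 0 else 100.0 * abs(r_share - s_share) / larger)
    return result


edges = [0, 10, 20]
real = [1, 2, 11, 12]

# Synthetic data with a somewhat different distribution.
skewed = bin_variance_pct(real, [1, 11, 12, 13], edges)

# Synthetic values that land entirely outside the real data's bins,
# as might happen if a column were parsed with the wrong type.
mismatched = bin_variance_pct(real, [1000, 2000], edges)
```

Under this assumed formula, the `mismatched` case yields 100% in every bin, which mirrors the symptom described above for column type problems.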

The Table View allows you to preview your new synthetic data. Filters in the left sidebar can be used for quick exploration (table + histograms).

Sidebar Options:

  • Layout: Provides several views of the data, plus options for toggling whether certain fields (IDs and anonymized fields) are shown in histogram comparisons.

  • Select Table: For multi-table projects, use this selector to choose which table to view metrics for.

  • Filters: Filters affect both the Table View and the Histogram View. Use these to explore your new synthetic data.

  • Note: Currently, the filters do not affect the quality or fit scores and metrics.