Train and Generate
This page covers training the synthetic data model and using the trained model to generate synthetic data. It appears as the 'Execute' page in the app.
Model Training:
By default, "10000" rows of the real data are sampled to train the configured model. Increasing this number can improve the quality of the results but may also increase training time.
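As a rough illustration of what sampling a fixed number of rows for training means, here is a minimal Python sketch. It is not the app's actual implementation: the function name `sample_training_rows`, the uniform sampling without replacement, and the fixed seed are all assumptions made for the example.

```python
import random

def sample_training_rows(rows, sample_size=10000, seed=0):
    """Return at most `sample_size` rows, chosen uniformly without replacement.

    Hypothetical helper: the app's real sampling strategy is not documented here.
    """
    if sample_size >= len(rows):
        # Fewer rows than requested: train on everything.
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return rng.sample(rows, sample_size)

# Stand-in for a real table with 25,000 rows.
real_rows = [{"id": i} for i in range(25000)]
training_rows = sample_training_rows(real_rows)
print(len(training_rows))  # 10000
```

A larger `sample_size` gives the model more of the real data to learn from, at the cost of more work per training run.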
For projects with multiple tables, select a root table. This should be the most central fact table in your data model.
After clicking "Train Model", please allow some time for the model to be generated. Depending on the amount of data used for training, this can take anywhere from a few minutes to a couple of hours.
You may leave this page while the model trains (and even start or edit another model!). All training and generation tasks run in the background.
The Status should automatically update once the model has been trained.
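The background behavior described above can be pictured with a small Python sketch. This is only an analogy, assuming a thread pool and the hypothetical status labels "Training", "Trained", and "Failed"; the app's real task runner and status names may differ.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def train_model(sample_size):
    """Stand-in for a long-running training job (hypothetical)."""
    time.sleep(0.2)  # pretend to train
    return f"model trained on {sample_size} rows"

def status_of(future):
    """Map a background job to a status label (labels are assumptions)."""
    if future.done():
        return "Failed" if future.exception() is not None else "Trained"
    return "Training"

executor = ThreadPoolExecutor(max_workers=1)
job = executor.submit(train_model, 10000)  # returns immediately

# The caller is free to do other work here while the job runs.
while not job.done():
    time.sleep(0.05)  # periodic status check

final_status = status_of(job)
executor.shutdown()
print(final_status)
```

Because the job is submitted rather than awaited, nothing blocks while training runs, which mirrors being able to leave the page and come back to an updated Status.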
Data Generation:
After training, the last step is to generate data. Enter the number of rows to generate, and choose whether to overwrite any existing data in the new synthetic data table. For projects with multiple tables, you must enter the number of rows to generate for each table.
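The effect of the per-table row counts and the overwrite option can be sketched in Python as follows. The table names, the `generate_rows` and `write_synthetic_rows` helpers, and the assumption that not overwriting means appending to the existing rows are all illustrative, not taken from the app.

```python
def generate_rows(table, count):
    """Stand-in for the trained model producing `count` rows (hypothetical)."""
    return [{"table": table, "row": i} for i in range(count)]

def write_synthetic_rows(existing, generated, overwrite):
    """Replace existing rows when `overwrite` is set; otherwise append (assumed)."""
    if overwrite:
        return list(generated)
    return list(existing) + list(generated)

# One row count per table, as required for multi-table projects.
row_counts = {"orders": 3, "customers": 2}

# Synthetic tables as they look before this run; "orders" already has one row.
store = {"orders": [{"table": "orders", "row": "old"}], "customers": []}

for table, count in row_counts.items():
    new_rows = generate_rows(table, count)
    store[table] = write_synthetic_rows(store[table], new_rows, overwrite=False)

print({table: len(rows) for table, rows in store.items()})
# {'orders': 4, 'customers': 2}
```

With `overwrite=True`, the "orders" table would end up with only the 3 newly generated rows.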
Click "Generate Data" to generate your synthetic data, then move on to the next page to view the results after a few moments! The speed of this step also depends on data size and model complexity.
Once generation is complete, you will get a notification showing where the new data was stored.