Test Data Generator

Selecting Tables

Test Data generator allows you to generate data for all tables in a database, for selected group of tables or for a single table. When generating data for multiple tables, it automatically sets the order in which data will be generated according to tables' dependencies based on referential integrity constraints (primary/foreign keys dependencies) so that data are generated first for tables with primary keys and only after that for tables that have foreign keys referring to these primary keys.

To select multiple tables, launch Test Data Generator using 'Generate Test Data' menu on a database node in the Object Explorer or select Tools -> Data Manager -> Generate Test Data top-level menu.

Test Data generator uses default value generation methods for all tables' columns to satisfy columns' datatypes and primary/foreign key dependencies but you can explicitly set your own rules for each column, such as specific ranges of numbers or dates to generate, patterns of text data or simply loading data from provided text files. It is especially important to set rules manually for columns that have CHECK constraints with user-defined expressions. Test Data Generation default values do not take these constraints into account and some inserted rows might fail because of it. You will see a warning message about existing columns with CHECK or UNIQUE constraints among the selected tables when Test Data Generator wizard opens. Use Warning radiobutton in the wizard’s panel to filter tables that have these constraints and set data generation rules for each column with check or unique constraint accordingly. For example, if CHECK constraint for OPERATION_CODE column with datatype char(1) is set to be 'A' or 'B' or 'C', you can enter text expression [ABC] into the Format field for this column to force data generator create only values A, B or C.

Number of Rows to Generate

You can set a number of rows to generate for each selected table. For example, for typical business applications lookup tables, such as list of countries or states, have a small number of rows, while transactional tables, such as user_orders, have millions of rows.

For your convenience, when multiple tables are selected for data generation, initially you can set a single default number of rows to generate for all selected tables. You should choose the number that applies for most of selected tables so that you would not have to override it afterwards on individual tables’ level. After the wizard panel is opened, you should select in table grid a table that requires a different number of rows to generate and enter a correct value into provided field in details area. If the specified value also applies to other tables, press Apply to Other Tables button located next to the 'Number of Rows to be generated' field. It allows choosing a group of other tables, for which the same number of rows to generate apply.

NOTE: If you attempt to generate test data for large database with many tables, the good practice is to initially set a number of rows to generate to some small number (for example, 10) for all tables and perform a sample test data generation run. It will allow you to make sure that all referential integrity and other constraints are set correctly without having to spend a lot of time before you run into a specific problem generating data for hundred’s table. For example, if there are 500 tables in your database, it will take 2-3 minutes to generate 10 rows for each table in a database and if there are some issues with referential constraints or other default data generation rules, you will know it right away. After you have a result of sample run, fix all issues by manually adjusting data generation rules for tables that have issues, for example provide correct default values for columns that have CHECK constraints, and run a data generation process again with a required large number of rows, which will take a significantly longer time in case you need to generate millions of rows for many tables. When generating test data for a second time, set the required number of rows for any table and then use Apply To Other Tables button to replicate this value for all other tables or selected group of tables. Also use 'Delete All Table Data' checkbox to make sure that during a second data generation attempt, the old data is deleted from tables and there will be no issues with inserting duplicate values, which potentially might violate primary key constraints. Use 'Apply to Other Tables' button located next to 'Delete All Data' checkbox to set the same option for selected group of tables.

Default Test Data Generation Rules

When you choose to perform test data generation for multiple tables, for example using Generate Test Data menu from Tables or a database node in the Object Explorer, the Options dialog box opens that allows to set default rules for data generation as well as default number of rows to generate for all selected tables. Even though you can adjust these rules for each selected table at later time when Test Data Generator wizard is opened, use this dialog box to set options that apply for most of selected tables – it will save you time adjusting those parameters for each table individually. Switch to Defaults tab of Options dialog box to set default rules.

There are two group of parameters Primary Key Data and Random Values. Primary Key Data rules apply to columns that are part of a primary key in selected tables. Random Values rules apply to columns that are not part of primary key. You can set rules per datatype – for example, for numeric and date columns you can set ranges of values to generate. For text columns you can set a pattern – default pattern is “*”, which means to generate random number of alpha-numeric characters but not exceeding the datatype limitation on a number of characters. You can also define a percentage of NULL values to generate for columns that allow NULL values.

Test Data Generator Wizard

Test Data generator Wizard is opened after you press OK button in the Test Data Generator dialog box where you select a list of tables and define a default number of rows to generate for all selected tables. In Test Data Generator wizard you can customize data generation rules for each table and column and actually start data generation process.

First page of the wizard displays a list of tables ordered by their dependencies at the top of the panel and details for a table selected in the list at the bottom. The details area is split into two panels. The left panel displays a list of all columns for the selected table. The right panel allows to define a data generation method for a column selected in the column's list and preview a sample of the data, which will be inserted into this column.

Use Find field located under the list of tables to search for a required table in the list. To search for a table, type a first few characters of the required table name in the Find input field and press Next button. 'Warnings' and 'Failures' radiobuttons allow to display only tables that had either warning messages or errors during test data generation process.

First you should select, which table columns you want to explicitly generate the data for, by checking Include checkbox in the list. You might want to skip some columns, for example, if they have default values assigned in the CREATE TABLE statement and you want to use these default values or you want to have NULL values inserted in these columns. When you open the Test Data Generator wizard, all columns except SERIAL columns are included by default. The SERIAL columns values are usually generated by the server according to rules set up in the CREATE TABLE statement. You still can explicitly override this behavior by checking Include checkbox for SERIAL columns.

For each included column, you should define the Value Generation Method. There are four value generation methods available - RANDOM, SEQUENTIAL, FROM FILE, FROM DATABASE OBJECT.

By default, the program assigns the value generation method for each column according to its referential integrity constraints and its datatype. So if you do not have any special requirements for the generated data and want to accept defaults, just press Next button to proceed to a second page of the wizard. For columns, which do not have a primary or foreign key constraints, the default value generation method is usually RANDOM. If the column is part of a foreign key constraint, the data value is taken from another table column, which is linked to this foreign key column. If the column has a primary key constraint and has a numeric datatype, the default value generation method is SEQUENTIAL.

To see example of data, which will be inserted into selected column according with the selected value generation method, press the Sample Data button.

Press Next button to switch to the Data Generation page, where you will be able to enter required number of rows to generate and specify error handling options.

Value Generation Methods

Each value generation method has common options and its own specific options. The common option is if the NULL values should be generated and the percentage of NULL values relative to a total number of generated rows. For example, if you specify to generate 1000 rows and you choose percentage of NULL values 30%, it will generate 300 NULL values for this column. This option is disabled when the selected column does not allow NULL values at all as a result NOT NULL constraint in the CREATE TABLE statement.

RANDOM value generation method

When this method is selected, the data are generated randomly to comply with the column datatype and defined data generation rules specific to this datatype.

For numeric columns you can specify the range of values to generate random data.

For text columns, such as CHAR, VARCHAR, TEXT, you can specify a format of string value by providing a pattern. There are also pre-defined patterns, such as Email Address, US Phone Number, US SS#, US ZIP Code. The patterns comply with standard Informix and Perl patterns for match function. For a complete description of allowed pattern values see Text Format Expressions topic.

For date and datetime columns you can enter the range of possible dates.

For BLOB and CLOB columns the data is taken from binary files with extensions .BLOB and .CLOB, which should be created in the TEST_DATA subdirectory of the Server Studio JE directory. The Data Manager provides by default three .BLOB files and three .CLOB files. The files are selected randomly. If you want to add your own BLOB and CLOB data to insert, you have to put each of the value into the binary file, rename these files to have extension . BLOB or .CLOB depending on the datatype, which you inserting the data into, and copy these files into TEST_DATA sub-directory.

SEQUENTIAL value generation method

This method is available only for columns having numeric, SERIAL and DATE datatypes.

For numeric columns you can specify starting value and an increment.

For DATE columns you can specify starting date, an increment and type of increment value. For example, you can increment in years, months, days, hours, etc.

FROM DATABASE OBJECT value generation method

This method allows you to use data in another table to populate the selected column. Data from the selected database table is retrieved sequentially row by row. If the number of rows in the referenced table is less than a number of rows required to generate, the values will be repeated  from the beginning when the end of the referenced table is reached.

This is the most suitable method to handle foreign key columns and is defaulted by the application for all foreign key columns.

FROM FILE value generation method

This method allows you to use data from the text files to populate required test table. It is very convenient when you want to have meaningful data in your test table. For example you can use UNLOAD command or Export Data Wizard to create a text file out of the real production table and then use this text file to populate test data table.

Data from the selected file field is retrieved sequentially record by record. If the number of rows in the file is less than a number of rows required to generate, the values will be repeated  from the beginning when the end of the file is reached.

You should select the file name, where to take the data from , the field number in the file and the field separator character.

Use Browse button to select a file name. By default when you select the file, the application tries to guess its field separator by scanning first 100 rows of the file. If it did not guess correctly you can explicitly specify the correct separator.

You required to enter the file field number from where to take the values from.

Use Preview File button to view the file content if you are not sure what the field number or the field separator is.

Use Sample Data button to make sure that the correct values from the file are retrieved.

Generating Test Data

Test data generation process is launched at the second page of the Test Data Generator Wizard. The data generation is an asyncronious process so that you can switch to other document within the Server Studio framework and continue to use the application while the data is generated.

To start test data generation:

  1. Press Next button on the first page of Test Data Generation wizard
  2. If you want to stop automatically the generation process when any error occurs, check "Stop if error occurs" checkbox. Otherwise, the generation will attempt to continue and the error messages will be displayed in the Status Messages area.
  3. Press Generate button to start data generation process. Press Cancel button if you want to break the data generation process. After data generation is completed, use Failures radiobutton located under the list of tables to filter tables that had problems. Select each table with a problem in the list to see detailed error message in the Status Messages area.

You can press Back button to change data generation rules for a selected table or several tables and repeat data generation. After table's data was generated successfully, the wizard unchecks Run checkbox in the table's list for this table. If you want to repeat test data generation for the table, check 'Run' checkbox in the top table list and press Generate button.