Test Data -> How do test data generators support me nowadays?

Some topics are chased through the technical media landscape with a lot of thunder. Others, although no less important, do not receive this attention. This post is about one of them.

The Importance, Provisioning and Management of Test Data in Databases

Doesn't sound very exciting at first. And yet there is a lot of potential in this area!

This article focuses on the background factors that play a role in providing test data for software with database connectivity. It does not cover the basic functional tests that every piece of software has to pass to show that it works correctly in principle.

The topic "Test data in databases" is not new. It has existed for as long as there have been databases. Nevertheless, it has never become a focus of public discussion, or of discussion among software developers.

Practically every developer who builds a system with a database connection has to deal with this topic. To this day, many developers still write SQL scripts to create a few test records for their databases in order to perform system tests.

Because data models in today's databases are more complex than they used to be, this approach soon reaches its limits: the time advantage of self-written scripts quickly disappears once a data model has more than about 30 tables. In addition, mastering the interdependencies between tables (referential integrity) becomes increasingly difficult as the data model grows.

Developers face a dilemma: they are expected to test their application according to requirements, i.e. also under realistic load, while spending as few resources as possible on providing adequate volumes of test data.
The reason is that even today, comprehensive software testing is often regarded as an "accessory" on which the developer should spend only the bare minimum of time.
Unfortunately, these interests contradict each other, and the conflict is usually resolved at the expense of meaningful test series.

Unfortunately, the software industry still faces the accusation that it brings solutions to market that have not been fully tested, true to the motto: the sooner a software product is on the market, the sooner money can be earned with it.
That this is a delusion is illustrated by the automotive industry, which has to spend considerable sums on recalls due to quality defects, not to mention the loss of customer confidence.

Therefore, the software industry is well advised to only market products that have been tested according to solid test strategies.

The provision of test data can be treated as a stand-alone topic or as part of a larger framework. The dividing line is whether the software is to be tested manually or automatically.
Automatic tests rely on special test systems that treat the generation of test data as part of the test process itself and include it as an inherent part of their feature set. Such test systems are expensive to purchase and also require considerable effort to set up and operate.
They generally pay off only in larger organizations where many developers work on extensive software systems. Several employees are often required to set up such test systems and keep them running smoothly.

The alternative is to test the software manually. Solutions are now available on the market that offer software developers and testers comprehensive support for this.

As already mentioned, the first problem is the complexity of the data model. Most SQL databases in use work with foreign keys, which means that you can only insert a record into a dependent table if the referenced record already exists in the parent table, and, unless cascading deletes are defined, you can only delete a record from the parent table once no dependent records reference it.
Taking these dependencies into account is not a problem as long as the model does not consist of too many tables. However, once about 20 to 30 tables in the data model depend on each other, it becomes difficult to keep track of the branches of the dependencies.
Beyond 50 to 60 interdependent tables, the whole thing becomes a gamble: incorrect references can easily slip in during test data generation, and the effort rises rapidly. At this point, a tool that recognizes the dependencies between the tables of the data model and takes them into account by itself is helpful, because it frees the developer from searching for them. The developer then only has to worry about what the test data should look like.
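
The dependency tracking described above can be sketched in a few lines. This is only an illustration, not how any particular tool works. It assumes a SQLite database and a hypothetical three-table schema (customer, orders, order_item), reads the foreign keys with SQLite's PRAGMA foreign_key_list, and derives an insert order in which every parent table comes before its dependents.

```python
import sqlite3
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical example schema; the table names are assumptions for illustration.
SCHEMA = """
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(id)
);
CREATE TABLE order_item (
    id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(id)
);
"""

def insert_order(conn):
    """Return table names so that every parent precedes its dependents."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    graph = {}
    for table in tables:
        # Column 2 of each foreign_key_list row is the referenced (parent) table.
        parents = {fk[2] for fk in conn.execute(
            f'PRAGMA foreign_key_list("{table}")')}
        # A self-reference is not an ordering constraint between tables.
        graph[table] = parents - {table}
    # static_order() raises CycleError if tables reference each other in a cycle.
    return list(TopologicalSorter(graph).static_order())

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
order = insert_order(conn)
print(order)
```

Deleting test data would use the same list in reverse. Cyclic references between tables are not handled in this sketch and would need separate treatment, for example inserting NULLs first and updating them afterwards.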

Under realistic test conditions, a single test data scenario is rarely enough. Several scenarios are required just for testing under different load conditions.
On top of that comes partial testing of the software, which does not require new test data in all tables and must not interfere with the test data provided for parallel tests of other parts of the software.
Whatever the combination, a considerable number of different test data scenarios can be expected, and creating them with manually written scripts takes a lot of work.

Thanks to a graphical representation, some tools offer the option of excluding certain tables in a database from the test data creation process. This is much faster than rewriting an existing SQL script.

This means that managing these test data scenarios also plays an important role. A helpful developer tool bundles all scenarios under one interface and can initiate each of them on request.
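
One way to picture such scenario management is as a set of named configurations, each with its own data volume and its own list of excluded tables. The sketch below is an assumption made for illustration: the names Scenario, rows_per_table, exclude, and the example tables are hypothetical and not taken from any product.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Scenario:
    """A named test data scenario (hypothetical structure for illustration)."""
    name: str
    rows_per_table: int
    exclude: frozenset = field(default_factory=frozenset)

    def tables_to_fill(self, all_tables):
        # Keep the given order, which is assumed to respect table dependencies.
        return [t for t in all_tables if t not in self.exclude]

# Assumed to be listed parents first.
ALL_TABLES = ["customer", "orders", "order_item", "audit_log"]

SCENARIOS = {
    s.name: s for s in (
        Scenario("smoke", rows_per_table=10),
        Scenario("load", rows_per_table=100_000),
        Scenario("orders_only", rows_per_table=500,
                 exclude=frozenset({"audit_log"})),
    )
}

def plan(scenario_name):
    """Return (table, row_count) pairs for the chosen scenario."""
    scenario = SCENARIOS[scenario_name]
    return [(t, scenario.rows_per_table)
            for t in scenario.tables_to_fill(ALL_TABLES)]

print(plan("orders_only"))
```

Excluding a table only makes sense if the tables that reference it are excluded as well or already contain suitable rows. A real tool would have to check this. The sketch does not.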

The work of creating test data can be centralized even further. Many DBMS vendors offer their own simple tools, such as SQL editors for creating and executing SQL scripts.
For developers who work with different DBMSs, this means switching to a different tool each time.
A single tool that can generate test data in any SQL database would save time and be easier to use.
Products that meet this requirement already exist. They are usually based on ODBC/JDBC/.NET technology, which lets them access practically any DBMS on the market. In addition, database drivers for these technologies are generally available free of charge.

The crowning glory of developer support would, of course, be a tool that relieves the developer of all the work of creating test data, that is, a tool that

  • automatically recognizes the organization of the database tables by scanning the database
  • automatically assigns a test data definition to each table column

The developer would then only have to define the number of records to be created (if it differs from the default value) and press the 'Start' button to generate the test data in the target database.
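
As a rough illustration of this idea, the following sketch scans a table's columns and assigns a default generator to each one based on its declared type. It assumes SQLite and a hypothetical product table. The type-to-generator mapping is invented for this example and is far simpler than what a real tool would use.

```python
import random
import sqlite3
import string

def _random_text(rng):
    return "".join(rng.choices(string.ascii_lowercase, k=8))

# Default generators keyed by declared column type (an assumption for this sketch).
DEFAULT_GENERATORS = {
    "INTEGER": lambda rng: rng.randint(0, 1_000_000),
    "REAL": lambda rng: round(rng.uniform(0, 1000), 2),
    "TEXT": _random_text,
}

def scan_columns(conn, table):
    """Return (name, declared_type) for each non-primary-key column."""
    # PRAGMA table_info rows: cid, name, type, notnull, dflt_value, pk
    return [(row[1], row[2].upper())
            for row in conn.execute(f'PRAGMA table_info("{table}")')
            if row[5] == 0]

def generate_rows(conn, table, count, seed=0):
    """Fill `table` with `count` rows using the default generator per column."""
    rng = random.Random(seed)  # a fixed seed keeps the test data reproducible
    columns = scan_columns(conn, table)
    names = ", ".join(f'"{name}"' for name, _ in columns)
    marks = ", ".join("?" for _ in columns)
    rows = [tuple(DEFAULT_GENERATORS[ctype](rng) for _, ctype in columns)
            for _ in range(count)]
    conn.executemany(f'INSERT INTO "{table}" ({names}) VALUES ({marks})', rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product ("
             "id INTEGER PRIMARY KEY, name TEXT, price REAL, stock INTEGER)")
written = generate_rows(conn, "product", count=100)
print(written, "rows written")
```

The sketch leaves the primary key to the database, ignores foreign keys, and raises a KeyError for any column type it has no generator for. These are exactly the places where the developer would step in with more specific test data definitions.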
This would of course be an unbeatable productivity advantage over other methods and procedures. The purchase cost of such a tool would pay for itself after a single use!

Such an ideal use is, of course, only conceivable if the requirements for the test data are not particularly high, since it relies on standard algorithms designed to produce a theoretically unlimited number of test records.

However, if such a tool lets developers refine the look of the test data in the areas that matter to them, it still offers a huge productivity advantage, because the remaining test data definitions are generated automatically within seconds by the tool itself.

Some tools offer further assistance by displaying the test data before it is written to the database.
This lets the developer see which test data meets the requirements and which does not, and also whether test data definitions are still missing.
This saves a lot of time compared with checking the test data only after it has been written to the database. In that case, errors have to be eliminated laboriously by repeating the cycle "Write - Check - Change script - Reset DB - Write again".
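
The benefit of checking before writing can be shown with a small sketch. The rows are generated in memory, a few of them are printed, and simple checks flag rows that do not meet the requirements, all without touching a database. The customer fields and the checks are assumptions chosen for illustration, and the age range is deliberately too wide so that the preview has something to find.

```python
import random

def generate_customers(count, seed=0):
    """Generate candidate rows in memory; nothing is written to a database."""
    rng = random.Random(seed)
    return [{"id": i + 1,
             "name": f"customer_{i + 1}",
             "age": rng.randint(-5, 90)}  # range deliberately too wide
            for i in range(count)]

def preview(rows, checks, limit=5):
    """Print the first rows and return (row, label) for every failed check."""
    for row in rows[:limit]:
        print(row)
    return [(row, label) for row in rows
            for label, ok in checks.items() if not ok(row)]

# Hypothetical requirements for the test data.
CHECKS = {
    "age must not be negative": lambda r: r["age"] >= 0,
    "name must not be empty": lambda r: bool(r["name"]),
}

rows = generate_customers(200)
problems = preview(rows, CHECKS)
print(f"{len(problems)} of {len(rows)} rows need a better definition")
```

Correcting the age range and running the preview again takes seconds, whereas the same mistake found after writing would mean resetting the database first.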

Conclusion:
Thanks to the powerful tools available today, the "wallflower existence" of test data management described above is no longer justified.
With the help of such tools, developers working on software connected to an SQL database can significantly increase their productivity in provisioning and managing test data.
What's more, these tools provide functionality that was not previously available, or could only be achieved at an unjustifiably high cost.

Some of these tools have a modern look, are intuitive to use, and are pleasing to the eye. This, too, shows appreciation for a developer's work.

In addition, some of these tools are available in the lower price segment, so the purchase of such software is a ridiculously low investment compared to the resources saved.

If only to make the work of creating test data as painless as possible, every developer should consider purchasing such a tool.

It is time to make companies (and developers) aware of the benefits of such solutions. They are available on the market, some for little money, and offer real potential to make the work of developers and testers easier.

It would be foolish for companies to ignore the competitive advantages offered by these tools. This is where a significant increase in productivity can actually be achieved, helping companies to make their products fit for their markets and position themselves better against their competitors.