Chapter 13. Generating and Loading Bulk Datasets

Table of Contents

1. Generate the Dataset
2. Generate the Dataset with the CLI
3. Generate the template database
3.1. Capture and run the table creation DDL
3.1.1. Oracle
3.1.2. SQL Server
3.1.3. Db2
3.1.4. MySQL
3.1.5. PostgreSQL/Amazon Redshift
4. Run the bulk data load
4.1. Oracle
4.2. SQL Server
4.3. Db2
4.4. MySQL
4.5. MariaDB
4.6. PostgreSQL/Amazon Redshift

For all workloads, HammerDB can create the schema and generate and load the data without requiring a staging area; in many circumstances this is the preferred method of loading, especially for OLTP workloads. Nevertheless, in some circumstances it is preferable to generate the data externally as flat files and then use a database vendor's bulk loading command to load the data into pre-created tables. This option may be preferred, for example, where the target database is located in the cloud, or where the target database has a columnar (column-store) structure, meaning that load performance using batch inserts is poor. Additionally, bulk loading enables more flexibility to modify the schema according to preference and to reload during testing.

This chapter details how to generate and load large datasets with HammerDB. From version 4.2 the limit for generating the TPROC-C schema has increased from 30,000 to 100,000 warehouses. Note that this is an interface limit to prevent over-provisioning (100,000 warehouses may generate up to 10TB of data); it is straightforward to exceed this capacity by manually modifying the generated datagen build script to increase the value.
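As a rough illustration of the CLI-driven flow covered in the sections below, a dataset-generation script might look like the following. This is a minimal sketch only: the `dgset` option names and values shown here are assumptions that may differ between HammerDB versions, and the directory path and counts are hypothetical, so check them against the documentation for your release before use.

```tcl
# Hypothetical HammerDB CLI sketch: generate TPROC-C flat files
# into a staging directory instead of loading the database directly.
dbset db pg                      ;# target database dialect (illustrative)
dbset bm TPC-C                   ;# the TPROC-C (TPC-C derived) workload
dgset directory /tmp/hammerdata  ;# staging area for the generated files
dgset vu 8                       ;# virtual users generating data in parallel
datagenrun                       ;# start dataset generation
```

The generated flat files can then be loaded into pre-created tables with the vendor bulk loader for the target database, as described in the sections that follow.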