How to Graph HammerDB Response Times

The HammerDB workload derived from TPC-C includes a feature to record in the logfile the response times of the stored procedures that are run. This post shows how you can easily convert this data into spreadsheet format to view the response times captured over a run.

Firstly, to activate the time profile feature select Time Profile in the GUI or set the equivalent parameter in the CLI to true.
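For example, in the CLI the setting is a key in the tpcc dictionary for the selected database; the key name carries the database prefix, so the PostgreSQL key shown below is an assumption to check against print dict for your own database:

hammerdb>dbset db pg
hammerdb>diset tpcc pg_timeprofile true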

Time Profile

Note that when loaded, the driver script is modified to report response times for the first active virtual user running the workload (only the first virtual user is tracked to minimise the impact on the workload). This data is reported every 10 seconds into the HammerDB log. The output is in percentile format, reporting the minimum, 50th percentile, 95th percentile, 99th percentile and maximum for each of the procedures during that 10 second interval, with an extract shown below.

+-----------------+--------------+------+--------+--------------+
 |PERCENTILES 2019-08-23 04:54:21 to 2019-08-23 04:54:31
 |neword|MIN-812|P50%-2320|P95%-3614.5|P99%-4838|MAX-10610|SAMPLES-2221
 |payment|MIN-281|P50%-1136|P95%-1892|P99%-2344|MAX-3725|SAMPLES-2133
 |delivery|MIN-2931|P50%-4048|P95%-5247|P99%-5663|MAX-5989|SAMPLES-197
 |slev|MIN-1521|P50%-2108|P95%-2629|P99%-2859|MAX-3138|SAMPLES-248
 |ostat|MIN-212|P50%-398.5|P95%-610|P99%-770|MAX-1732|SAMPLES-205
 |gettimestamp|MIN-3|P50%-5|P95%-6|P99%-6|MAX-27|SAMPLES-4753
 +-----------------+--------------+------+--------+--------------+
 |PERCENTILES 2019-08-23 04:54:31 to 2019-08-23 04:54:41
 |neword|MIN-797|P50%-2301.5|P95%-3590|P99%-6458|MAX-10130|SAMPLES-2147
 |payment|MIN-299|P50%-1136|P95%-1840|P99%-2301|MAX-3470|SAMPLES-2124
 |delivery|MIN-2922|P50%-4164.5|P95%-5334|P99%-5802|MAX-6179|SAMPLES-247
 |slev|MIN-1342|P50%-2074|P95%-2700|P99%-2945|MAX-3038|SAMPLES-218
 |ostat|MIN-193|P50%-409|P95%-571|P99%-620|MAX-897|SAMPLES-220
 |gettimestamp|MIN-3|P50%-5|P95%-6|P99%-6|MAX-34|SAMPLES-4735
 +-----------------+--------------+------+--------+--------------+

In text form this data can be useful, however it is often better to view it graphically. The simple script below can be run at the command line, provided with a logfile containing the data for one run only. (Note that it is important that the logfile you provide contains only the run you want to convert, otherwise the data from all runs will be combined.) When run on a logfile with data such as that shown above it will output the data in tab delimited format that can be interpreted by a spreadsheet.

#!/bin/tclsh
set filename [lindex $argv 0]
set fp [open "$filename" r]
set file_data [ read $fp ]
set data [split $file_data "\n"]
foreach line $data {
    if {[ string match *PERCENTILES* $line ]} {
        set timeval "[ lindex [ split $line ] 3 ]"
        append xaxis "$timeval\t"
    }
}
puts "TIME INTERVALS"
puts "\t$xaxis"
foreach storedproc {neword payment delivery slev ostat} {
    puts [ string toupper $storedproc ]
    foreach line $data {
        if {[ string match *PROCNAME* $line ]} { break }
        if {[ string match *$storedproc* $line ]} {
            regexp {MIN-[0-9.]+} $line min
            regsub {MIN-} $min "" min
            append minlist "$min\t"
            regexp {P50%-[0-9.]+} $line p50
            regsub {P50%-} $p50 "" p50
            append p50list "$p50\t"
            regexp {P95%-[0-9.]+} $line p95
            regsub {P95%-} $p95 "" p95
            append p95list "$p95\t"
            regexp {P99%-[0-9.]+} $line p99
            regsub {P99%-} $p99 "" p99
            append p99list "$p99\t"
            regexp {MAX-[0-9.]+} $line max
            regsub {MAX-} $max "" max
            append maxlist "$max\t"
        }
    }
    puts -nonewline "MIN\t"
    puts $minlist
    unset -nocomplain minlist
    puts -nonewline "P50\t"
    puts $p50list
    unset -nocomplain p50list
    puts -nonewline "P95\t"
    puts $p95list
    unset -nocomplain p95list
    puts -nonewline "P99\t"
    puts $p99list
    unset -nocomplain p99list
    puts -nonewline "MAX\t"
    puts $maxlist
    unset -nocomplain maxlist
}
close $fp

In this example we run the script above, pass the name of the logfile for the run where response times were captured and output the result to a file that can then be opened in a spreadsheet. Note that it is important to output the data to a file rather than outputting to a terminal and then cutting and pasting into a spreadsheet. If output to a terminal, the terminal may reformat the output by removing the tab characters which are essential to the formatting.

$ ./extracttp.tcl pgtp.log > pgtp.txt

If we look in this file we can see that the data has been extracted and output into rows of values.

$ more pgtp.txt
TIME INTERVALS
     04:49:51    04:50:01    04:50:11    04:50:21    04:50:31    04:50:41    04:50:51    04:51:01    04:51:11    04:51:21    04:51:31    04:51:41    04:51:51    04:52:01    04:52:11    04:52:21    04:52:31    04:52:41    04:52:51    04:53:01    04:53:11    04:53:21    04:53:31    04:53:41    04:53:51    04:54:01    04:54:11    04:54:21    04:54:31    04:54:41    04:54:51    04:55:01    04:55:11    04:55:21    04:55:31    04:55:41    04:55:51    04:56:01    04:56:11    04:56:21    04:56:31    04:56:41    
 NEWORD
 MIN    457 519 684 734 828 829 795 894 803 880 775 840 774 851 821 792 720 849 751 813 715 879 823 778 853 739 807 812 797 781 841 852 775 865 832 686 805 845 813 822 863 833 
 P50    866 1099    1455    1918.5  2318    2267    2271    2307    2315    2347    2299    2296    2301.5  2313    2314    2351    2324    2312    2284    2347    2344 ...

At this point we have a choice. In testing with Excel 2013 we can simply give this file a .xls extension and open it. If we do, Excel gives the following warning; however, clicking OK opens the file with the correctly formatted data.

Excel extension

Alternatively, if we open the file with the .txt extension Excel will show the 3 steps of the Text Import Wizard. Click through the Wizard until Finish.

Text Import 1
Text Import 2
Text Import 3

After clicking Finish the data has been imported into the spreadsheet without warnings.

Imported Data

As an example we will highlight the rows we want to graph by clicking on the row numbers.

Highlighted Spreadsheet

If we then click on Insert and Recommended Charts, the default graph produced by Excel is shown below with the addition of a vertical axis title and a chart header.

Finally, when saving the spreadsheet make sure it is saved in Excel format rather than the imported Tab (Text Delimited) format.

With this approach it is straightforward to turn the captured response times into a visual representation to bring further analysis to your workloads.

HammerDB v3.3 event driven scaling

HammerDB v3.3 includes a new feature called event driven scaling to enable the scaling of virtual users to thousands of sessions running with keying and thinking time enabled. This feature adds additional benefit to your testing scenarios with the ability to handle large numbers of connections or testing with connection pooling. This post explains the benefits that this feature brings and how to enable it.

When running transactional workloads with HammerDB the default mode is CPU intensive, meaning that one virtual user will run as many transactions as possible. As an example we will run a simple test by creating a 10 warehouse TPC-C schema and configuring a single virtual user to run a test in this default mode.

Once we’ve created this Virtual User and run the test for 1 minute on a simple PC configuration we receive a result showing that the Virtual User ran many thousands of transactions a minute.

Using HammerDB metrics we can also see that this single Virtual User was using a fair amount of CPU.

Now let's enable keying and thinking time by checking the “Keying and Thinking Time” option. We can now see that the transaction rate has dropped considerably.

It is clear that we would need to create a large number of Virtual Users to even begin approaching the same number of transactions that a single virtual user could process without keying and thinking time. Event driven scaling is a feature that enables each Virtual User to create multiple database sessions and manage the keying and thinking time for each asynchronously in an event-driven loop enabling HammerDB to create a much larger session count.  It should be clear that this feature is only designed to work with keying and thinking time enabled as it is only the keying and thinking time that is managed asynchronously.
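Conceptually this is possible because Tcl can schedule work with the event loop rather than blocking, so a single virtual user can manage the keying and thinking time of many client connections at once. The sketch below is a simplified illustration of the idea only and not the actual HammerDB implementation:

# simplified illustration: each client "sleeps" through its keying and
# thinking time as an event-loop callback instead of blocking the thread
proc think { client } {
    # schedule this client's next transaction after a random think time
    set ms [ expr { 1000 + int(rand() * 9000) } ]
    after $ms [ list run_next_transaction $client ]
}
proc run_next_transaction { client } {
    # ...run the next stored procedure for this client here...
    puts "client $client running next transaction"
    think $client
}
# start 1000 clients, all managed by one event-driven loop
for { set client 1 } { $client <= 1000 } { incr client } {
    think $client
}
vwait forever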

To configure this feature we now select Asynchronous Scaling, noting that Keying and Thinking Time is automatically selected. We have also selected 1000 clients for our one Virtual User and a Login Delay of 60ms, meaning that each client will wait 60ms after the previous client has logged in before logging in itself. Finally we have chosen verbose output. Note that with this feature it is important to allow the clients enough time to log in fully before measuring performance; equally, at the end it will take additional time for the clients to complete their current keying and thinking time and exit before the virtual user reports all clients as complete.
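For reference, the same configuration can be set from the CLI; the asynchronous dictionary keys below (shown with the mssqls_ prefix for SQL Server) are given as an assumption to verify against the print dict output of your own build:

dbset db mssqls
diset tpcc mssqls_async_scale true
diset tpcc mssqls_async_client 1000
diset tpcc mssqls_async_delay 60
diset tpcc mssqls_async_verbose true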

We now create a single virtual user as before. However, this time once all the clients have connected we can see the 1000 asynchronous clients connected to the test database.

SELECT DB_NAME(dbid) as DBName, COUNT(dbid) as NumberOfConnections FROM sys.sysprocesses WHERE dbid > 0 GROUP BY dbid;
DBName NumberOfConnections
master 43
tempdb 1
tpcc 1003

When the workload has completed (noting as before that additional time must be given to allow all of the asynchronous clients to complete their work and log off) we can see that HammerDB reports the active sessions based on the asynchronous client count, and that the transaction rate for the single virtual user is considerably higher than before while still using keying and thinking time.

The event driven scaling feature is not intended to replace the default CPU intensive mode of testing and it is expected that this will continue to be the most popular methodology. Instead, being able to scale up client sessions with keying and thinking time adds additional test scenarios for highly scalable systems.

How to maximize CPU performance for PostgreSQL 12.0 benchmarks on Linux

HammerDB doesn’t publish competitive database benchmarks; instead we always encourage people to be better informed by running their own. Nevertheless, in this blog we do sometimes publish performance data to highlight best practices or potential configuration pitfalls, and although we’ve mentioned this one before it is worth dedicating an entire post to it, as the issue appears time and again when running database workloads on Linux.

As noted previously, the main developer of HammerDB is an Intel employee (#IAMINTEL); however HammerDB is a personal open source project and any opinions are my own, specific to the context of HammerDB as an independent personal project and not representing Intel.

So over at Phoronix some database benchmarks were published titled PostgreSQL 12 Performance With AMD EPYC 7742 vs. Intel Xeon Platinum 8280 Benchmarks.

What jumps out immediately here is the comment “The single-threaded PostgreSQL 12 performance was most surprising.” Usually when benchmark results are surprising it is a major hint that something could be misconfigured, and that certainly seems to be the case here, so what could it be? Well, it's difficult to be entirely sure, however the tests have all the characteristics of tests observed previously where the CPUs are running in powersave mode.

So let's take a system with Platinum 8280 CPUs running the following Ubuntu release, reboot, and check the CPU configuration before running any tests.

# uname -a
Linux ubuntu19 5.3.0-rc3-custom #1 SMP Mon Aug 12 14:07:33 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Welcome to Ubuntu 19.04 (GNU/Linux 5.3.0-rc3-custom x86_64)

By default the system boots into powersave mode. You can see this by running the “cpupower frequency-info” command and checking the governor line. Note that this is also the case on the developer's Ubuntu based laptop and is therefore a logical default for systems that may be running on battery power. It is also extremely important to note that for Intel CPUs the driver must show “intel_pstate”; if it doesn't then either your kernel does not support the CPU or you have misconfigured BIOS settings. (Yes, it is worth reiterating that for Intel CPUs the driver MUST show intel_pstate; if it doesn't then something is set up wrong.)

cpupower frequency-info
analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: Cannot determine or is not supported.
hardware limits: 1000 MHz - 4.00 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 1000 MHz and 4.00 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 1.00 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes

After installing PostgreSQL 12.0 we can set up and run the single-threaded pgbench test fairly simply (as this is a single-threaded test we verified that the scale factor does not impact the result).

./bin/createdb pgbench
./bin/pgbench -i -s $SCALING_FACTOR pgbench
./bin/pgbench -c 1 -S -T 60 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1000
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1377426
latency average = 0.044 ms
tps = 22957.058444 (including connections establishing)
tps = 22957.628615 (excluding connections establishing)

As this result is pretty lacklustre, let's check the frequency the CPU is running at using the turbostat command. Note that turbostat will show you the maximum frequency your system can run at depending on how many cores are utilised. Therefore in this case, for a single-threaded workload, we should be able to run at up to 4GHz depending on what else is running at the time.

# ./turbostat
turbostat version 18.07.27 - Len Brown <lenb@kernel.org>
CPUID(0): GenuineIntel 0x16 CPUID levels; 0x80000008 xlevels;
...
33 * 100.0 = 3300.0 MHz max turbo 28 active cores
35 * 100.0 = 3500.0 MHz max turbo 24 active cores
37 * 100.0 = 3700.0 MHz max turbo 20 active cores
37 * 100.0 = 3700.0 MHz max turbo 16 active cores
37 * 100.0 = 3700.0 MHz max turbo 12 active cores
37 * 100.0 = 3700.0 MHz max turbo 8 active cores
38 * 100.0 = 3800.0 MHz max turbo 4 active cores
40 * 100.0 = 4000.0 MHz max turbo 2 active cores
...

Now, running turbostat while pgbench is running, we can see that one core is busy; however it is averaging only 2.7 – 2.8 GHz over the snapshot time, with a couple of examples shown below.

Package Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz 
0 1 1 2784 73.00 3823 2694
--
0 4 4 2864 74.83 3834 2696 

Remember that the system is configured to save power rather than maximise performance, so it will not be running anywhere near its full potential. So let's switch the system to performance mode with one command.

./cpupower frequency-set --governor=performance
Setting cpu: 0
Setting cpu: 1
Setting cpu: 2
Setting cpu: 3
Setting cpu: 4
Setting cpu: 5
Setting cpu: 6
Setting cpu: 7
Setting cpu: 8
Setting cpu: 9
...

We can now see that the CPUs are set to performance mode.

./cpupower frequency-info
analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: Cannot determine or is not supported.
hardware limits: 1000 MHz - 4.00 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 1000 MHz and 4.00 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 3.52 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes

We don’t have to stop there, we can also change the energy performance bias setting to performance as well.

./x86_energy_perf_policy
cpu0: EPB 7
cpu0: HWP_REQ: min 10 max 40 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
cpu0: HWP_CAP: low 7 eff 10 guar 27 high 40
cpu1: EPB 7
cpu1: HWP_REQ: min 10 max 40 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
cpu1: HWP_CAP: low 7 eff 10 guar 27 high 40
cpu2: EPB 7
...

./x86_energy_perf_policy performance

./x86_energy_perf_policy
cpu0: EPB 0
cpu0: HWP_REQ: min 10 max 40 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
cpu0: HWP_CAP: low 7 eff 10 guar 27 high 40
cpu1: EPB 0
cpu1: HWP_REQ: min 10 max 40 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
cpu1: HWP_CAP: low 7 eff 10 guar 27 high 40
cpu2: EPB 0
cpu2: HWP_REQ: min 10 max 40 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
cpu2: HWP_CAP: low 7 eff 10 guar 27 high 40
cpu3: EPB 0

Note how the EPB value changed from 7 to 0 to change the balance towards performance. It is also worth noting that C-states are an important part of balancing power and performance and in this case we leave them enabled.

./cpupower idle-info
CPUidle driver: intel_idle
CPUidle governor: menu
analyzing CPU 0:
Number of idle states: 4
Available idle states: POLL C1 C1E C6

So now let's see what we get in performance mode: an almost 32% improvement (and 53% higher than the published benchmarks).

./bin/pgbench -c 1 -S -T 60 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1000
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1815653
latency average = 0.033 ms
tps = 30260.836480 (including connections establishing)
tps = 30261.819189 (excluding connections establishing)

and turbostat tells us that we are running at almost 3.8GHz, making the reason for the performance gain clear: we asked the system to emphasise performance over power.


Package Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz
0 0 56 3779 98.12 3860 2694
--
0 25 22 3755 97.88 3846 2694

pgbench is based on TPC-B and there is an excellent post on what it is actually running called Do you know what you are measuring with pgbench? So instead, to run a more enterprise oriented transactional workload, we will run the HammerDB OLTP workload derived from TPC-C to see the overall impact of running transactional benchmarks in powersave mode.

One of the most important aspects of running your own database benchmarks is that you can generate a performance profile. This means that instead of taking a single data point as a measure of database performance you run a series of tests back-to-back, increasing the virtual user count each time until you have reached maximum performance. HammerDB has the autopilot feature in the GUI, or scripting in the command line, to run these profiles unattended. So in this case we will do just that, once in performance mode and once in powersave mode, with the results shown below.
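As an illustration, a minimal CLI driving script for such a profile might look like the sketch below (run non-interactively with the hammerdbcli auto option described in a later post); the pg_ dictionary keys, timings and virtual user counts are assumptions to adapt to your own configuration:

#!/bin/tclsh
# sketch of an unattended performance profile - adjust keys and counts to suit
proc runtimer { seconds } {
    # wait for the virtual users to complete or the timeout to expire
    set x 0
    while {![vucomplete] && $x < $seconds} {
        incr x
        after 1000
        update
    }
}
dbset db pg
diset tpcc pg_driver timed
diset tpcc pg_rampup 2
diset tpcc pg_duration 5
loadscript
foreach vu {1 2 4 8 16 32 48 64 80 96} {
    puts "Starting test with $vu Virtual Users"
    vuset vu $vu
    vucreate
    vurun
    runtimer 1200
    vudestroy
    after 5000
}
puts "Performance profile complete"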

Clearly it is a good job we took the time to measure the full performance profile. Whereas in performance mode the results measured in NOPM reach a peak and then remain fairly constant, in powersave mode we see the system throttle back performance with the balance towards saving power, exactly as the setting has asked it to do. Also note that this is not an either/or scenario and you can have multiple configurations in between; however, by default powersave is chosen.

It is also important to note, however, that at the level of peak performance, approached at 80 virtual users in this case, we are only seeing around 50% utilization across all of the CPUs in the system. The screenshot below of a subsection of the CPUs in the system (using the HammerDB metrics option in the GUI) shows that this load is evenly distributed across all of the cores. As you increase the number of virtual users you increase the CPU utilization but don't increase the throughput (NOPM), so why is this?

So is there a bottleneck in HammerDB or the TPC-C workload? The answer is no. HammerDB has the significant advantage of being able to run against multiple databases, allowing this sort of question to again be answered by running your own tests. HammerDB and the TPC-C derived workload can achieve significantly higher levels of throughput, running at 95%+ CPU utilization, with some of the commercial database software offerings. Instead, in this case we are hitting the ceiling of PostgreSQL scalability for this type of transactional workload. There is also an excellent presentation on exactly this topic called Don’t Stop the World which is well worth reviewing to understand the limitations. If we review the workload we can see that PostgreSQL is spending most of its time in locking.

and at the CPU level we can see that PostgreSQL is using the atomic operation cmpxchg.

At this point PostgreSQL is already doing considerably more than 2M PostgreSQL TPM and 1M NOPM, so the scalability is already incredibly good. However, it is clear that adding CPU resource beyond this point will only result in more time spent waiting in locking and will not improve performance for this intensive OLTP workload. (It is accepted that there may be options to tune the stored procedures or postgresql.conf for some gains, however this does not change the core argument around database scalability.) Therefore it is especially important that you ensure that your CPUs are not running in powersave mode to see the maximum return on your configuration.

Finally it is worth noting that our focus has been on OLTP transactional workloads. However the CPU configuration advice applies equally to OLAP read only workloads.  For these workloads HammerDB includes an implementation derived from the TPC-H benchmark to run complex ad-hoc analytic queries against a PostgreSQL database.  In particular this is ideally suited to testing the PostgreSQL Parallel Query feature utilizing multiple cores for a single query and there is an example of this type of parallel query test here.

Benchmark on a Parallel Processing Monster!

And if we haven’t emphasised it enough: whatever database benchmark you are running, first make sure that your CPUs are not configured to run in powersave mode.

How to add your database to HammerDB – Pt4 Commit changes and pull request

In parts 1 to 3 of this series we have gone through the steps of taking the HammerDB code and adding support for a new database. At the moment these changes are on the local development system. In summary, we have modified database.xml in the config directory, added a new mariadb.xml file, created a mariadb directory in the src directory and then added and updated the metrics, options, oltp, olap and transaction counter files for our new database. So the next stage is to add these changes:

~/HammerDB-Fork/HammerDB$ git add .
~/HammerDB-Fork/HammerDB$ git status
On branch 54
Changes to be committed:
(use "git reset HEAD ..." to unstage)

modified: config/database.xml
new file: config/mariadb.xml
new file: src/mariadb/mariamet.tcl
new file: src/mariadb/mariaolap.tcl
new file: src/mariadb/mariaoltp.tcl
new file: src/mariadb/mariaopt.tcl
new file: src/mariadb/mariaotc.tcl

and commit them to the branch we are working on.

~/HammerDB-Fork/HammerDB$ git commit -m "Template for adding MariaDB as separate database"
[54 8ababeb] Template for adding MariaDB as separate database
7 files changed, 3471 insertions(+)
create mode 100755 config/mariadb.xml
create mode 100755 src/mariadb/mariamet.tcl
create mode 100755 src/mariadb/mariaolap.tcl
create mode 100755 src/mariadb/mariaoltp.tcl
create mode 100755 src/mariadb/mariaopt.tcl
create mode 100755 src/mariadb/mariaotc.tcl

They can then be pushed to the repository.

~/HammerDB-Fork/HammerDB$ git push origin 54
...
remote: Resolving deltas: 100% (4/4), completed with 4 local objects.
remote:
remote: Create a pull request for '54' on GitHub by visiting:
remote: https://github.com/sm-shaw/HammerDB/pull/new/54
remote:
To https://github.com/sm-shaw/HammerDB.git
* [new branch] 54 -> 54

If you think the changes for your additional database are ready to be included in HammerDB then you should open a pull request. In this example there is more work to do and therefore we will not be opening the pull request at this point in time.

At the same time as opening the pull request you should update the Issue you opened in part 1 to discuss the changes you made. In particular, if new binaries are required to support both Linux and Windows then this will need to be discussed so that the new binaries can be included in a future release.

Finally, bear in mind that the pull request to add a new database is very much the start rather than the end of the process. As users begin to use your contribution to test your chosen database they will inevitably have support questions, therefore you will be expected to provide assistance and respond to any questions they may have.

How to add your database to HammerDB – Pt3 Adding to and modifying the source

From the previous post we now have source and binary copies of HammerDB for modification and test. The next step is to see if there is an Issue already created for the database we want to add, or whether we should create one. In this example using MariaDB there is already an Issue created, Add MariaDB as a separate database #54, although note that this can be for any database you choose. You now want to create a new branch for this work, in this case named after the Issue number.

git checkout -b 54
Switched to a new branch '54'
git branch
* 54
61
master

git status will also show you the current branch. You can now begin modifying the source to add your database. Firstly, in database.xml in the config directory add the information for your new database. Note in particular the prefix that will be used for variable, file and procedure names. The commands section is used only to identify words for highlighting in the script editor.

<mariadb>
<name>MariaDB</name>
<description>MariaDB</description>
<prefix>maria</prefix>
<library>mysqltcl</library>
<workloads>TPC-C TPC-H</workloads>
<commands>mysql::sel mysqluse mysqlescape mysqlsel mysqlnext mysqlseek mysqlmap mysqlexec mysqlclose mysqlinfo mysqlresult mysqlcol mysqlstate mysqlinsertid mysqlquery mysqlendquery mysqlbaseinfo mysqlping mysqlchangeuser mysqlreceive</commands>
</mariadb>

Then create an xml file with the name of the database, in this case mariadb.xml with values that will become the user defined variables in your workload.

<?xml version="1.0" encoding="utf-8"?>
<mariadb>
<connection>
<maria_host>127.0.0.1</maria_host>
<maria_port>3306</maria_port>
</connection>
<tpcc>
<schema>
<maria_count_ware>1</maria_count_ware>
<maria_num_vu>1</maria_num_vu>
<maria_user>root</maria_user>
<maria_pass>mysql</maria_pass>
<maria_dbase>tpcc</maria_dbase>
<maria_storage_engine>innodb</maria_storage_engine>
<maria_partition>false</maria_partition>
</schema>
<driver>
...

As a test, having created the config files, copied them to the binaries directory and run HammerDB, we can see that MariaDB has been added as a database but that it cannot find the files that define the configuration and workloads; clearly this is because we have not created them yet!

If you run the CLI you will see the same message. However you can now set your database to MariaDB and print dict will show the variables you defined in the XML file.

~/HammerDB-Fork/HammerDB-3.2$ ./hammerdbcli 
HammerDB CLI v3.2
Copyright (C) 2003-2019 Steve Shaw
Type "help" for a list of commands
The xml is well-formed, applying configuration
Error loading database source files/mariadb/mariaopt.tcl
Error loading database source files/mariadb/mariaoltp.tcl
Error loading database source files/mariadb/mariaolap.tcl
Error loading database source files/mariadb/mariaotc.tcl
hammerdb>dbset db maria
Database set to MariaDB

hammerdb>print dict
Dictionary Settings for MariaDB
connection {
maria_host = 127.0.0.1
maria_port = 3306
}
tpcc {
maria_count_ware = 1
maria_num_vu = 1
maria_user = root
maria_pass = mysql
maria_dbase = tpcc
maria_storage_engine = innodb
maria_partition = false
maria_total_iterations = 1000000
maria_raiseerror = false
maria_keyandthink = false
maria_driver = test
maria_rampup = 2
maria_duration = 5
maria_allwarehouse = false
maria_timeprofile = false
}

What you clearly need to do next is create the directory named after your database and the required files: mariadb/mariaopt.tcl for the graphical options screens, mariadb/mariamet.tcl for metrics, mariadb/mariaoltp.tcl for the TPC-C workload scripts, mariadb/mariaolap.tcl for the TPC-H workloads and mariadb/mariaotc.tcl for the online transaction counter. You then need to modify these files according to your needs. In this case, as MariaDB is a fork of MySQL and the workloads are so close, we are going to copy all of the required files from the MySQL directory, rename them and update the variable and procedure names from mysql to mariadb.

A key feature is that the XML file where you defined the variables is stored as a dict structure named with the prefix config and the name of your database, in this case configmariadb. It is this dict that you use in the scripts to fetch and update the variables.
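As a brief illustration (a sketch only; in practice you would follow the pattern in the MySQL files you copied), values from the connection section of this dict can be fetched as follows:

# sketch: fetch the connection values defined in mariadb.xml from the
# configmariadb dict inside a proc
upvar #0 configmariadb configmariadb
set maria_host [ dict get $configmariadb connection maria_host ]
set maria_port [ dict get $configmariadb connection maria_port ]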

Starting with mariadb/mariaopt.tcl we set the graphical options to interact with this dict. In this case the options are the same for MariaDB as they are for MySQL, however they will differ for other databases. Copying the modified source to the binary directories enables testing.

After modifying the graphical options we can then update and test the TPC-C schema build and tests in mariadb/mariaoltp.tcl. As well as ensuring we are using configmariadb and updating the variable names we also need to change the procedure names to include the prefix defined in the XML file, in this case maria. Therefore the following procedures will exist for the schema build, test and timed test scripts respectively.

proc build_mariatpcc
proc loadmariatpcc
proc loadtimedmariatpcc

Once these changes have been made we can test the schema build.

and timed test.

and the transaction counter is updated in mariadb/mariaotc.tcl

Similarly we can update the TPC-H workload in mariadb/mariaolap.tcl and then test the schema build.

and TPC-H query test

At this point we have created a new database entry for MariaDB but are running exactly the same schema builds and tests as the regular MySQL schemas. However, this now gives us the opportunity to make changes for MariaDB only, such as adding PL/SQL compatibility for the TPC-C workload or changes to support column stores for TPC-H. The important point is that we have added support for a completely new database and none of the code changes impact any of the other databases in any way.

Note that if you do not want your database to appear as an option in HammerDB it can be commented out in the database.xml file.


<!--
<trafodion>
<name>Trafodion</name>
<description>Trafodion</description>
<prefix>traf</prefix>
<library>tdbc::odbc</library>
<workloads>TPC-C</workloads>
<commands>tdbc::odbc</commands>
</trafodion>
-->

How to add your database to HammerDB – Pt2 Fork, clone and binary downloads

As noted in the previous post, to add a database to HammerDB you will need to change the source code. There are different ways to do this, however the recommended way is as follows. Firstly, make a fork of the latest HammerDB source on GitHub to your account. To do this select fork in the top right hand corner of the GitHub page and follow the instructions. Shortly you should have your own copy of HammerDB under your account, showing that it was forked from the main site.

Next you will want a development copy on your own system. This can be done with Clone or download. In this example we will use the command line, using the web URL shown in the Clone or download link.

For this example with a development system on Linux we will make a directory to show that we are working in a fork of HammerDB.

mkdir HammerDB-Fork
cd HammerDB-Fork

Then in this directory clone using the URL above.

git clone https://github.com/sm-shaw/HammerDB.git
Cloning into 'HammerDB'...
...
Resolving deltas: 100% (193/193), done.

Under the working directory we now have the HammerDB source code in which to add the new database.

~/HammerDB-Fork$ ls
HammerDB
~/HammerDB-Fork$ ls HammerDB/
agent hammerdbcli.bat hammerdbws.bat license
ChangeLog hammerdb modules config hammerdb.bat readme
config.xml hammerdbcli hammerdbws images src

It is also good practice to add the upstream HammerDB site as a remote so that you can apply changes made there to your fork.

git remote add upstream https://github.com/TPC-Council/HammerDB.git

The git remote command will show the remote repositories.

git remote
origin
upstream

So just for interest let’s try and run HammerDB from the source to see what happens.

~/HammerDB-Fork/HammerDB$ ./hammerdb
While loading component file "gentheme.tcl"...
can't find package ttk::theme::clearlooks
while executing
"package require ttk::theme::$theme"
(procedure "ttk::setTheme" line 4)
invoked from within
"ttk::setTheme $theme"
(file "./src/generic/gentheme.tcl" line 354)
invoked from within
"source [ file join $UserDefaultDir src generic $f ]"

So HammerDB started, as I already had Tcl installed on my system, but then failed to find a package it needed. As identified in the previous post we don't have the correct bin and lib directories, so we need to download the binaries version from here. Install these alongside your source directory. In the newly installed binaries directory you will notice the bin and lib directories we need, and HammerDB will run in this directory.

~/HammerDB-Fork$ ls
HammerDB HammerDB-3.2 HammerDB-3.2-Linux.tar.gz

As the source directory will be updated from time to time from the upstream remote, and the bin and lib directories are different for Linux and Windows, it is best to make your changes to the source and copy these to the binary build for testing, as you do not want to commit any of the binary files into the source.
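To bring those upstream changes into your fork at a later point a standard git workflow can be used, for example:

git fetch upstream
git checkout master
git merge upstream/master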

Now we have a copy of HammerDB that we can make changes to in order to add a new database. In the next post we will show adding MariaDB as a separate database into HammerDB. As a fork of MySQL this has the advantage that, at least initially, we can use the already provided mysqltcl3.052 interface.

How to add your database to HammerDB – Pt1 Opening an issue

A very common request is for HammerDB to add support for a new database. Before reaching out with a request, your first reference should be the DB-Engines Ranking to gauge the relative popularity of a database. It is no coincidence that HammerDB supports the most popular databases, with all of the databases currently supported being in the top 10 of this ranking. There may be compelling reasons to add a new database outside of the top 10 to HammerDB, however clearly the HammerDB developers cannot add and maintain support for them all. Therefore this series of posts explains how any contributor can add support for a new database to HammerDB.

Firstly you will need both the source code from the HammerDB GitHub site and the binaries for your chosen platform for testing. Note that the key difference between the source code download and the binaries is the addition of the bin and lib directories in the binaries. Any changes that you make to the source download can be run on your chosen platform by adding these bin and lib directories to the source or, functionally equivalent, by keeping the binary download as a test directory and copying the modified source over the existing files there to test any changes you have made.

While on the subject of the lib directories, one important aspect to consider before you begin making changes is the compiled library interface that you are going to use to communicate with your new database. Using MySQL as an example, if you load the driver script you can see the following lines:

set library mysqltcl ;# MySQL Library
...
if [catch {package require $library} message] { error "Failed to load $library - $message" }

The package require line loads the compiled library from the lib directory, in this case the library in the lib/mysqltcl3.052 directory. Looking in this directory on Windows, for example, there is the file libmysqltcl.dll – opening this file in an application such as Dependency Walker shows that it also requires the MySQL client library libmysql.dll. This library provides the interface between HammerDB, the Tcl language it uses and the database provided client library. The source code for this interface is here and all interfaces used are open source and GPL compliant. (Note for clarity it is the Tcl interface that must be open source rather than the database client library itself.) Therefore you have a choice as to whether you use an existing interface already provided in HammerDB, an interface for your database already written that you will compile and put in your test lib directory, a new interface you write yourself, or the generic ODBC interface provided. If the latter is a consideration then you should use the TDBC interface that is already provided with HammerDB. As an example, although not currently visible in HammerDB, the previously supported Trafodion database was interfaced with TDBC and is still present in the src directory. Therefore this can provide an example of adding a new database with TDBC.
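As a brief illustration of the TDBC route (a sketch only, with a placeholder connection string), a driver script could load and use the generic interface along the following lines:

# sketch: connecting through the generic TDBC ODBC interface
package require tdbc::odbc
# placeholder DSN - replace with the ODBC data source for your database
tdbc::odbc::connection create db "DSN=mydatabase;UID=myuser;PWD=mypassword"
set stmt [ db prepare "select 1" ]
set rs [ $stmt execute ]
puts [ $rs allrows ]
$rs close
$stmt close
db close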

If you plan for your database to be included in a HammerDB release then you will need to ensure that the client library you use works on both Linux and Windows and that the interface is open source.
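A quick way to check that a candidate interface loads on a given platform is to require the package from the tclsh shipped in the HammerDB bin directory (./bin/tclsh8.6 on Linux, as used by the scripts later in this blog); a minimal sketch:

# check the interface library loads; mysqltcl is used as the example here
set library mysqltcl
if [ catch { package require $library } message ] {
    puts "Failed to load $library - $message"
} else {
    puts "Loaded $library version $message"
}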

Once you have decided to go ahead and add support for a new database to HammerDB, and have a working client for Linux and Windows, create an Issue on the TPC GitHub site. This will show the HammerDB developers and others that you are considering adding support for a new database and provide the opportunity for discussion of your plans. It is also a place to reach out for help if you get stuck. As an example, there is an existing Issue to add MariaDB as a separate database on the HammerDB site.

Therefore this example series of posts will take the steps to show how to add support for this database.

Using HammerDB as a Web Service

HammerDB already has two interfaces with which to run the commands to build and test databases, the GUI and the CLI. From HammerDB version 3.2 there is an additional interface that enables HammerDB to run as a Web Service. This allows HammerDB to be driven by a REST type client using an HTTP interface to call and retrieve output from the CLI commands. Additional json, rest and huddle packages have been added with which to format and process input and output. This interface is started using the hammerdbws command, at which point HammerDB will listen on a predefined port (see the documentation for setting the port).

$ ./hammerdbws 
HammerDB Web Service v3.2
Copyright (C) 2003-2019 Steve Shaw
Type "help" for a list of commands
The xml is well-formed, applying configuration
Initialized new SQLite in-memory database
Starting HammerDB Web Service on port 8080
Listening for HTTP requests on TCP port 8080

Using a browser to navigate to that port will show the help screen.

It is important to note that scripts to drive this interface can be written in any language of choice. When using Tcl the additional packages provided can be used for formatting. The following test script shows how this interaction can be done, also including deliberate errors to demonstrate error handling.


$ more restchk.tcl
#!/bin/sh
#########################################################################
## \
export LD_LIBRARY_PATH=./lib:$LD_LIBRARY_PATH
## \
export PATH=./bin:$PATH
## \
exec ./bin/tclsh8.6 "$0" ${1+"$@"}
########################################################################
set UserDefaultDir [ file dirname [ info script ] ]
::tcl::tm::path add "$UserDefaultDir/modules"
package require rest
package require huddle
puts "TEST DIRECT PRINT COMMANDS"
#clear any existing script
set res [rest::post http://localhost:8080/clearscript "" ]
puts "--------------------------------------------------------"
foreach i {db bm dict script vuconf vucreated vustatus datagen} {
puts "Printing output for $i and converting JSON to text"
set res [rest::get http://localhost:8080/$i "" ]
puts "JSON format"
puts $res
puts "TEXT format"
set res [rest::format_json $res]
puts $res
}
puts "--------------------------------------------------------"
puts "PRINT COMMANDS COMPLETE"
puts "--------------------------------------------------------"
puts "TEST PRINT COMMANDS AS OPTION TO PRINT ie print?dict "
foreach i {db bm dict script vuconf vucreated vustatus datagen} {
puts "Printing output for $i and converting JSON to text"
set res [rest::get http://localhost:8080/print?$i "" ]
puts "JSON format"
puts $res
puts "TEXT format"
set res [rest::format_json $res]
puts $res
}
puts "PRINT COMMANDS COMPLETE"
puts "--------------------------------------------------------"
puts "TEST DISET"
puts "Setting Warehouse Count to 800"
set body { "dict": "tpcc", "key": "count_ware", "value": "800" }
set res [rest::post http://localhost:8080/diset $body ]

puts "Setting password to new password"
set body { "dict": "tpcc", "key": "tpcc_pass", "value": "new_password" }
set res [rest::post http://localhost:8080/diset $body ]

puts "Setting password error 1 invalid string"
set body { "dict": "tpcc", "ke }
set res [rest::post http://localhost:8080/diset $body ]
puts $res

puts "Setting password error 2 invalid number of arguments"
set body { "dict": "tpcc" }
set res [rest::post http://localhost:8080/diset $body ]
puts $res

puts "Setting password error 3 invalid key"
set body { "dict": "tpcc", "key": "tpcds_pass", "value": "new_password" }
set res [rest::post http://localhost:8080/diset $body ]
puts $res

puts "Setting password error 4 invalid string"
set body { "dict": "tpcfg", "key": "tpcds_pass", "value": "new_password" }
set res [rest::post http://localhost:8080/diset $body ]
puts $res

puts "Setting Driver Script"
set body { "dict": "tpcc", "key": "ora_driver", "value": "timed" }
set res [rest::post http://localhost:8080/diset $body ]
puts $res

puts "Setting Driver Script Error"
set body { "dict": "tpcc", "key": "ora_driver", "value": "timid" }
set res [rest::post http://localhost:8080/diset $body ]
puts $res

puts "Clearscript"
set res [rest::post http://localhost:8080/clearscript "" ]
puts $res

puts "Loadscript"
set res [rest::post http://localhost:8080/loadscript "" ]
puts $res
set res [rest::get http://localhost:8080/print?script "" ]
#uncomment to print script
#puts $res
puts "Script in TEXT format"
set res [rest::format_json $res]
#uncomment to print script
#puts $res

puts "VU Status"
set res [rest::get http://localhost:8080/vustatus "" ]
puts $res

puts "Testing dbset"
set body { "db": "mssqs" }
set res [ rest::post http://localhost:8080/dbset $body ]
puts $res

puts "Testing dbset"
set body { "dx": "mssqls" }
set res [ rest::post http://localhost:8080/dbset $body ]
puts $res

puts "Testing dbset"
set body { "db": "pg" }
set res [ rest::post http://localhost:8080/dbset $body ]
puts $res

puts "Testing dbset"
set body { "bm": "TPC-X" }
set res [ rest::post http://localhost:8080/dbset $body ]
puts $res

puts "Testing dbset"
set body { "bm": "TPC-H" }
set res [ rest::post http://localhost:8080/dbset $body ]
puts $res

puts "Testing vuset"
set body { "vuxx": "109" }
set res [ rest::post http://localhost:8080/vuset $body ]
puts $res

puts "Testing vuset"
set body { "vu": "10" }
set res [ rest::post http://localhost:8080/vuset $body ]
puts $res

puts "Testing dgset"
set body { "directory": "/home/oracle" }
set res [ rest::post http://localhost:8080/dgset $body ]
puts $res

puts "Testing Custom Script : Open File convert to JSON and post"
set customscript "testscript.tcl"
set _ED(file) $customscript
if {$_ED(file) == ""} {return}
if {![file readable $_ED(file)]} {
puts "File \[$_ED(file)\] is not readable."
return
}
if {[catch "open \"$_ED(file)\" r" fd]} {
puts "Error while opening $_ED(file): \[$fd\]"
} else {
set _ED(package) "[read $fd]"
close $fd
}
set huddleobj [ huddle compile {string} "$_ED(package)" ]
set jsonobj [ huddle jsondump $huddleobj ]
set body [ subst { {"script": $jsonobj}} ]
set res [ rest::post http://localhost:8080/customscript $body ]
puts $res
set res [rest::get http://localhost:8080/print?script "" ]
#uncomment to print script
#puts $res
puts "Custom Script in TEXT format"
set res [rest::format_json $res]
#uncomment to print script in text
#puts $res

puts "Testing vuset"
set body { "vu": "5" }
set res [ rest::post http://localhost:8080/vuset $body ]
puts $res

puts "Testing vucreate"
set res [ rest::post http://localhost:8080/vucreate "" ]
puts $res

puts "Testing vucreate"
set res [ rest::post http://localhost:8080/vucreate "" ]
puts $res

puts "VU Status after create"
set res [rest::get http://localhost:8080/vustatus "" ]
puts $res

puts "Testing vudestroy"
set res [ rest::post http://localhost:8080/vudestroy "" ]
puts $res

puts "VU Status after destroy"
set res [rest::get http://localhost:8080/vustatus "" ]
puts $res

Running this script shows the following output.

$ ./restchk.tcl 
TEST DIRECT PRINT COMMANDS
--------------------------------------------------------
Printing output for db and converting JSON to text
JSON format
{
"ora": "Oracle",
"mssqls": "MSSQLServer",
"db2": "Db2",
"mysql": "MySQL",
"pg": "PostgreSQL",
"redis": "Redis"
}
TEXT format
ora Oracle mssqls MSSQLServer db2 Db2 mysql MySQL pg PostgreSQL redis Redis
Printing output for bm and converting JSON to text
JSON format
{"benchmark": "TPC-C"}
TEXT format
benchmark TPC-C
Printing output for dict and converting JSON to text
JSON format
{
"connection": {
"system_user": "system",
"system_password": "manager",
"instance": "oracle",
"rac": "0"
},
"tpcc": {
"count_ware": "1",
"num_vu": "1",
"tpcc_user": "tpcc",
"tpcc_pass": "tpcc",
"tpcc_def_tab": "tpcctab",
"tpcc_ol_tab": "tpcctab",
"tpcc_def_temp": "temp",
"partition": "false",
"hash_clusters": "false",
"tpcc_tt_compat": "false",
"total_iterations": "1000000",
"raiseerror": "false",
"keyandthink": "false",
"checkpoint": "false",
"ora_driver": "test",
"rampup": "2",
"duration": "5",
"allwarehouse": "false",
"timeprofile": "false"
}
}
TEXT format
connection {system_user system system_password manager instance oracle rac 0} tpcc {count_ware 1 num_vu 1 tpcc_user tpcc tpcc_pass tpcc tpcc_def_tab tpcctab tpcc_ol_tab tpcctab tpcc_def_temp temp partition false hash_clusters false tpcc_tt_compat false total_iterations 1000000 raiseerror false keyandthink false checkpoint false ora_driver test rampup 2 duration 5 allwarehouse false timeprofile false}
Printing output for script and converting JSON to text
JSON format
{"error": {"message": "No Script loaded: load with loadscript"}}
TEXT format
error {message {No Script loaded: load with loadscript}}
Printing output for vuconf and converting JSON to text
JSON format
{
"Virtual Users": "1",
"User Delay(ms)": "500",
"Repeat Delay(ms)": "500",
"Iterations": "1",
"Show Output": "1",
"Log Output": "0",
"Unique Log Name": "0",
"No Log Buffer": "0",
"Log Timestamps": "0"
}
TEXT format
{Virtual Users} 1 {User Delay(ms)} 500 {Repeat Delay(ms)} 500 Iterations 1 {Show Output} 1 {Log Output} 0 {Unique Log Name} 0 {No Log Buffer} 0 {Log Timestamps} 0
Printing output for vucreated and converting JSON to text
JSON format
{"Virtual Users created": "0"}
TEXT format
{Virtual Users created} 0
Printing output for vustatus and converting JSON to text
JSON format
{"Virtual User status": "No Virtual Users found"}
TEXT format
{Virtual User status} {No Virtual Users found}
Printing output for datagen and converting JSON to text
JSON format
{
"schema": "TPC-C",
"database": "Oracle",
"warehouses": "1",
"vu": "1",
"directory": "\/tmp\""
}
TEXT format
schema TPC-C database Oracle warehouses 1 vu 1 directory /tmp\"
--------------------------------------------------------
PRINT COMMANDS COMPLETE
--------------------------------------------------------
TEST PRINT COMMANDS AS OPTION TO PRINT ie print?dict
Printing output for db and converting JSON to text
JSON format
{
"ora": "Oracle",
"mssqls": "MSSQLServer",
"db2": "Db2",
"mysql": "MySQL",
"pg": "PostgreSQL",
"redis": "Redis"
}
TEXT format
ora Oracle mssqls MSSQLServer db2 Db2 mysql MySQL pg PostgreSQL redis Redis
Printing output for bm and converting JSON to text
JSON format
{"benchmark": "TPC-C"}
TEXT format
benchmark TPC-C
Printing output for dict and converting JSON to text
JSON format
{
"connection": {
"system_user": "system",
"system_password": "manager",
"instance": "oracle",
"rac": "0"
},
"tpcc": {
"count_ware": "1",
"num_vu": "1",
"tpcc_user": "tpcc",
"tpcc_pass": "tpcc",
"tpcc_def_tab": "tpcctab",
"tpcc_ol_tab": "tpcctab",
"tpcc_def_temp": "temp",
"partition": "false",
"hash_clusters": "false",
"tpcc_tt_compat": "false",
"total_iterations": "1000000",
"raiseerror": "false",
"keyandthink": "false",
"checkpoint": "false",
"ora_driver": "test",
"rampup": "2",
"duration": "5",
"allwarehouse": "false",
"timeprofile": "false"
}
}
TEXT format
connection {system_user system system_password manager instance oracle rac 0} tpcc {count_ware 1 num_vu 1 tpcc_user tpcc tpcc_pass tpcc tpcc_def_tab tpcctab tpcc_ol_tab tpcctab tpcc_def_temp temp partition false hash_clusters false tpcc_tt_compat false total_iterations 1000000 raiseerror false keyandthink false checkpoint false ora_driver test rampup 2 duration 5 allwarehouse false timeprofile false}
Printing output for script and converting JSON to text
JSON format
{"error": {"message": "No Script loaded: load with loadscript"}}
TEXT format
error {message {No Script loaded: load with loadscript}}
Printing output for vuconf and converting JSON to text
JSON format
{
"Virtual Users": "1",
"User Delay(ms)": "500",
"Repeat Delay(ms)": "500",
"Iterations": "1",
"Show Output": "1",
"Log Output": "0",
"Unique Log Name": "0",
"No Log Buffer": "0",
"Log Timestamps": "0"
}
TEXT format
{Virtual Users} 1 {User Delay(ms)} 500 {Repeat Delay(ms)} 500 Iterations 1 {Show Output} 1 {Log Output} 0 {Unique Log Name} 0 {No Log Buffer} 0 {Log Timestamps} 0
Printing output for vucreated and converting JSON to text
JSON format
{"Virtual Users created": "0"}
TEXT format
{Virtual Users created} 0
Printing output for vustatus and converting JSON to text
JSON format
{"Virtual User status": "No Virtual Users found"}
TEXT format
{Virtual User status} {No Virtual Users found}
Printing output for datagen and converting JSON to text
JSON format
{
"schema": "TPC-C",
"database": "Oracle",
"warehouses": "1",
"vu": "1",
"directory": "\/tmp\""
}
TEXT format
schema TPC-C database Oracle warehouses 1 vu 1 directory /tmp\"
PRINT COMMANDS COMPLETE
--------------------------------------------------------
TEST DISET
Setting Warehouse Count to 800
Setting password to new password
Setting password error 1 invalid string
{"error": {"message": "Not a valid JSON string: '{ \"dict\": \"tpcc\", \"ke }'"}}
Setting password error 2 invalid number of arguments
{"error": {"message": "Incorrect number of parameters to diset dict key value"}}
Setting password error 3 invalid key
{"error": {"message": "Dictionary \"tpcc\" for Oracle exists but key \"tpcds_pass\" doesn't"}}
Setting password error 4 invalid string
{"error": {"message": "Dictionary \"tpcfg\" for Oracle does not exist"}}
Setting Driver Script
{"success": {"message": "Set driver script to timed, clearing Script, reload script to activate new setting"}}
Setting Driver Script Error
{"error": {"message": "Error: Driver script must be either \"test\" or \"timed\""}}
Clearscript
{"success": {"message": "Script cleared"}}
Loadscript
{"success": {"message": "script loaded"}}
Script in TEXT format
VU Status
{"Virtual User status": "No Virtual Users found"}
Testing dbset
{"error": {"message": "Unknown prefix mssqs, choose one from ora mssqls db2 mysql pg redis"}}
Testing dbset
{"error": {"message": "Invalid option to dbset key value"}}
Testing dbset
{"success": {"message": "Database set to PostgreSQL"}}
Testing dbset
{"error": {"message": "Unknown benchmark TPC-X, choose one from TPC-C TPC-H"}}
Testing dbset
{"success": {"message": "Benchmark set to TPC-H for PostgreSQL"}}
Testing vuset
{"error": {"message": "Invalid option to vuset key value"}}
Testing vuset
{"success": {"message": "Virtual users set to 10"}}
Testing dgset
{"success": {"message": "Set directory to \/tmp for data generation"}}
Testing Custom Script : Open File convert to JSON and post
{"success": {"message": "Set custom script"}}
Custom Script in TEXT format
Testing vuset
{"success": {"message": "Virtual users set to 5"}}
Testing vucreate
{"success": {"message": "6 Virtual Users Created with Monitor VU"}}
Testing vucreate
{"error": {"message": "Virtual Users exist, destroy with vudestroy before creating"}}
VU Status after create
{"Virtual User status": "1 {WAIT IDLE} 2 {WAIT IDLE} 3 {WAIT IDLE} 4 {WAIT IDLE} 5 {WAIT IDLE} 6 {WAIT IDLE}"}
Testing vudestroy
{"success": {"message": "vudestroy success"}}
VU Status after destroy
{"Virtual User status": "No Virtual Users found"}

When the environment is configured you can build schemas and run workloads with the same commands used for the CLI, for example:

set res [rest::post http://localhost:8080/buildschema "" ]
set res [rest::post http://localhost:8080/vurun "" ]
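For example, after posting vurun the same Tcl rest client shown above can poll the vustatus endpoint until the virtual users report completion; the status strings checked below are an assumption based on the CLI output and should be verified against your own results:

# poll virtual user status every 10 seconds until the users report finished
while { 1 } {
    set res [ rest::get http://localhost:8080/vustatus "" ]
    puts $res
    # assumed status strings: virtual users report FINISH SUCCESS or FINISH FAILED when done
    if { [ string match "*FINISH*" $res ] } { break }
    after 10000
}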

The key difference is that output is now stored in a job format that can be retrieved at a later point. An example is shown where a build generates a jobid.

{"success": {"message": "Building 5 Warehouses with 6 Virtual Users, 5 active + 1 Monitor VU(dict value num_vu is set to 5): JOBID=5D23464E58D203E273738333"}}

That is then used to query the status of the build.

The output is stored in a SQLite database that by default runs in in-memory mode and is not persistent over restarts; however, it can be configured to use a permanent database.

The web service is not intended to replace the GUI or CLI environments, which remain the most popular ways to run HammerDB; rather, the aim is to provide an additional way that HammerDB can be integrated into tests, in cloud environments in particular.

Running hammerdbcli from a bash or batch script updated for v3.2

By default the HammerDB command line interface was designed to run interactively. However, many people wanted to run it from a script and an example of how to do this was shown here. This only worked on Linux, and therefore @xychar proposed a modification to run command line scripts directly instead of typing commands by hand on Windows, as using a pipe for standard input does not work on Windows.

Reviewing the proposed change it was clear that the same approach used to run an autopilot script in the GUI could be adapted for use with the CLI, and therefore from version 3.2 it is possible to add the auto keyword and the name of a script to run it non-interactively, for example hammerdbcli auto script_to_run.tcl on Linux or hammerdbcli.bat auto script_to_run.tcl on Windows.

An example script called buildcli.tcl is shown below to build a schema. Note that the line “vwait forever” has been added so that the interpreter enters the event loop, which happens automatically in interactive mode.

#!/bin/tclsh
puts "SETTING CONFIGURATION"
global complete
proc wait_to_complete {} {
    global complete
    set complete [vucomplete]
    if {!$complete} { after 5000 wait_to_complete } else { exit }
}
dbset db ora
diset connection system_password oracle
diset connection instance vulpdb1
diset tpcc count_ware 2
diset tpcc num_vu 2
diset tpcc tpcc_def_tab users
print dict
buildschema
wait_to_complete
vwait forever

This can then be called as follows from a driving script. The auto keyword then makes hammerdbcli enter a non-interactive mode to run the specified script.

$ more build.sh
#!/bin/bash
echo "BASH SCRIPT BEFORE BUILD.."
./hammerdbcli auto buildcli.tcl
echo "BASH SCRIPT AFTER BUILD.."

When run the output looks as follows:

./build.sh
BASH SCRIPT BEFORE BUILD..
HammerDB CLI v3.2
Copyright (C) 2003-2019 Steve Shaw
Type "help" for a list of commands
The xml is well-formed, applying configuration
SETTING CONFIGURATION
Database set to Oracle
...
Vuser 2:FINISHED SUCCESS
Vuser 1:Workers: 0 Active 2 Done
Vuser 1:CREATING TPCC INDEXES
Vuser 1:CREATING TPCC STORED PROCEDURES
Vuser 1:GATHERING SCHEMA STATISTICS
Vuser 1:TPCC SCHEMA COMPLETE
Vuser 1:FINISHED SUCCESS
ALL VIRTUAL USERS COMPLETE
BASH SCRIPT AFTER BUILD..

As per the original intention this can also be driven from a Windows batch script for automation.
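For example, a minimal Windows batch equivalent of the bash driving script above (a sketch only) would be:

@echo off
echo BATCH SCRIPT BEFORE BUILD..
call hammerdbcli.bat auto buildcli.tcl
echo BATCH SCRIPT AFTER BUILD..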

HammerDB Source Code moved to the TPC Council

The TPC Council recently announced that the TPC is now hosting the HammerDB open source project's GitHub repository.

HammerDB has increased dramatically in popularity and use and has been identified as the industry default for database benchmarking, illustrating the popularity of both open source and TPC based benchmarks. Consequently, with the wish to engage with the open source database benchmarking community, the TPC approached HammerDB to collaborate on development and bring together the communities around the TPC benchmarks and HammerDB. The first step towards this goal is the recently announced move of the HammerDB source code to the TPC-Council GitHub repository. The previous developer owned sm-shaw GitHub repository has been deleted and the TPC-Council repository will be the source code repository for ongoing and future HammerDB development. Anyone wishing to contribute to the development of HammerDB should do so from the HammerDB TPC-Council GitHub repository. From the first release up to version 3.1 HammerDB has been released on the HammerDB SourceForge site, which currently hosts the runnable binary downloads and the support site. Over time downloads and support will also transition to the TPC-Council GitHub site. The HammerDB website hosts the latest documentation in docbook format and links to published benchmarks and will be maintained as the primary website. Downloads from the download page on this site will show the current release. To understand the difference between the downloadable binaries and the source code view the post on HammerDB Concepts and Architecture.

The license of HammerDB remains GPLv3 with copyright held by Steve Shaw, the developer of HammerDB. For additional clarity, Steve Shaw is an employee of Intel, however HammerDB is not Intel software and was developed as an approved personal open source project, where a stipulation of this approval was that there would be no association of the software with Intel or the developer's status as an employee of the company.