{"id":1205,"date":"2022-07-05T16:47:28","date_gmt":"2022-07-05T16:47:28","guid":{"rendered":"https:\/\/www.hammerdb.com\/blog\/?p=1205"},"modified":"2022-07-05T16:47:28","modified_gmt":"2022-07-05T16:47:28","slug":"why-tcl-is-700-faster-than-python-for-database-benchmarking","status":"publish","type":"post","link":"https:\/\/www.hammerdb.com\/blog\/uncategorized\/why-tcl-is-700-faster-than-python-for-database-benchmarking\/","title":{"rendered":"Why Tcl is 700% faster than Python for database benchmarking"},"content":{"rendered":"<p>Python is a popular programming language, especially for beginners, and consequently we see it occurring in places where it just shouldn&#8217;t be used, such as database benchmarking.\u00a0 In contrast, a frequent question when it comes to HammerDB is why is it written in Tcl? Surely any language will do?<\/p>\n<p>This post addresses both questions to illustrate exactly why HammerDB is written in Tcl and why using Python to benchmark a database can result in poor performance and misleading results.\u00a0 To do this, we are going to rewrite HammerDB in Python and run a series of tests on a 2 socket Intel(R) Xeon(R) Platinum 8280L server to see how and why Tcl is 700% faster than Python for database benchmarking*<\/p>\n<h3>Background and Concepts<\/h3>\n<p>To begin with, you should be familiar with the concepts of parallelism vs concurrency and why it is vital that your benchmark should run in parallel so that you are testing the concurrency of the database. An introduction to these concepts is given in the following post.<\/p>\n<blockquote class=\"wp-embedded-content\" data-secret=\"oFeyLDIlwE\"><p><a href=\"https:\/\/www.hammerdb.com\/blog\/uncategorized\/what-programming-languages-does-hammerdb-use-and-why-does-it-matter\/\">What programming languages does HammerDB use and why does it matter?<\/a><\/p><\/blockquote>\n<p><iframe loading=\"lazy\" class=\"wp-embedded-content\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; clip: rect(1px, 1px, 1px, 1px);\" title=\"&#8220;What programming languages does HammerDB use and why does it matter?&#8221; &#8212; HammerDB Blog\" src=\"https:\/\/www.hammerdb.com\/blog\/uncategorized\/what-programming-languages-does-hammerdb-use-and-why-does-it-matter\/embed\/#?secret=UvxTuANoJi#?secret=oFeyLDIlwE\" data-secret=\"oFeyLDIlwE\" width=\"525\" height=\"296\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\"><\/iframe><\/p>\n<p>You should also be familiar with the concepts of <a href=\"https:\/\/en.wikipedia.org\/wiki\/ACID\">ACID compliance and locking versus multiversioning<\/a> as well as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Isolation_(database_systems)\">database transaction isolation levels,<\/a> as it is essential to ensure that the database benchmark itself is not serialized.<\/p>\n<h3>Rewriting HammerDB in Python<\/h3>\n<p>So we are not actually going to rewrite HammerDB in Python, however it is trivial to write a command line driver script in Python as all the hard work in creating the schema and writing the stored procedures has already been done. (We use stored procedures because, as the introductory post shows, using single SQL statements turns our database benchmark into a network test).\u00a0 So instead we are going to take a cut down version of the HammerDB TPROC-C driver script and do the same in Python and use the HammerDB infrastructure to measure performance. For this example, we are going to use PostgreSQL stored procedures in PostgreSQL 14.1.<\/p>\n<p>So firstly we have the Tcl based driver script called pgtest_thread.tcl<\/p>\n<div class=\"dm-code-snippet dark default  dm-normal-version\" style=\"background-color:#abb8c3;\" snippet-height=\"\">\n\t\t\t<div class=\"control-language\">\n                <div class=\"dm-buttons\">\n                    <div class=\"dm-buttons-left\">\n                        <div class=\"dm-button-snippet red-button\"><\/div>\n                        <div class=\"dm-button-snippet orange-button\"><\/div>\n                        <div class=\"dm-button-snippet green-button\"><\/div>\n                    <\/div>\n                    <div class=\"dm-buttons-right\">\n                        <a id=\"dm-copy-raw-code\">\n                        <span class=\"dm-copy-text\">Copy Code<\/span>\n                        <span class=\"dm-copy-confirmed\" style=\"display:none\">Copied<\/span>\n                        <span class=\"dm-error-message\" style=\"display:none\">Use a different Browser<\/span><\/a>\n                    <\/div>\n                <\/div>\n                <pre class=\"no-line-numbers\"><code id=\"dm-code-raw\" class=\"no-wrap language-clike\"><\/p>\n<pre class=\"dm-pre-admin-side\">#!\/usr\/local\/bin\/tclsh8.6\r\npackage require Thread\r\nset maxvuser 64\r\nfor { set vuser 0 } {$vuser &lt; $maxvuser } {incr vuser} {\r\n        set threadID [thread::create {\r\n#EDITABLE OPTIONS##################################################\r\nset library Pgtcl ;# PostgreSQL Library\r\nset total_iterations 10000000 ;# Number of transactions before logging off\r\nset host \"localhost\" ;# Address of the server hosting PostgreSQL\r\nset port \"5432\" ;# Port of the PostgreSQL Server\r\nset sslmode \"disable\" ;# SSLMode of the PostgreSQL Server\r\nset user \"tpcc\" ;# PostgreSQL user\r\nset password \"tpcc\" ;# Password for the PostgreSQL user\r\nset db \"tpcc\" ;# Database containing the TPC Schema\r\n#EDITABLE OPTIONS##################################################\r\n#RANDOM NUMBER\r\n  proc RandomNumber {m M} {return [expr {int($m+rand()*($M+1-$m))}]}\r\n  proc NURand { iConst x y C } {return [ expr {((([RandomNumber 0 $iConst] | [RandomNumber $x $y]) + $C) % ($y - $x + 1)) + $x }]} \r\n   #RANDOM NAME\r\n  proc randname { num } {\r\n    array set namearr { 0 BAR 1 OUGHT 2 ABLE 3 PRI 4 PRES 5 ESE 6 ANTI 7 CALLY 8 ATION 9 EING }\r\n    set name [ concat $namearr([ expr {( $num \/ 100 ) % 10 }])$namearr([ expr {( $num \/ 10 ) % 10 }])$namearr([ expr {( $num \/ 1 ) % 10 }]) ]\r\n    return $name\r\n  }\r\n\r\n#LOAD LIBRARIES AND MODULES\r\nif [catch {package require $library} message] { error \"Failed to load $library - $message\" }\r\n\r\n#TIMESTAMP\r\nproc gettimestamp { } {\r\n    set tstamp [ clock format [ clock seconds ] -format %Y%m%d%H%M%S ]\r\n    return $tstamp\r\n}\r\n#POSTGRES CONNECTION\r\nproc ConnectToPostgres { host port sslmode user password dbname } {\r\n    global tcl_platform\r\n    if {[catch {set lda [pg_connect -conninfo [list host = $host port = $port sslmode = $sslmode user = $user password = $password dbname = $dbname ]]} message]} {\r\n        set lda \"Failed\" ; puts $message\r\n        error $message\r\n    } else {\r\n        if {$tcl_platform(platform) == \"windows\"} {\r\n            #Workaround for Bug #95 where first connection fails on Windows\r\n            catch {pg_disconnect $lda}\r\n            set lda [pg_connect -conninfo [list host = $host port = $port sslmode = $sslmode user = $user password = $password dbname = $dbname ]]\r\n        }\r\n        pg_notice_handler $lda puts\r\n        set result [ pg_exec $lda \"set CLIENT_MIN_MESSAGES TO 'ERROR'\" ]\r\n        pg_result $result -clear\r\n    }\r\n    return $lda\r\n}\r\n#NEW ORDER\r\nproc neword { lda no_w_id w_id_input } {\r\n    #2.4.1.2 select district id randomly from home warehouse where d_w_id = d_id\r\n    set no_d_id [ RandomNumber 1 10 ]\r\n    #2.4.1.2 Customer id randomly selected where c_d_id = d_id and c_w_id = w_id\r\n    set no_c_id [ RandomNumber 1 3000 ]\r\n    #2.4.1.3 Items in the order randomly selected from 5 to 15\r\n    set ol_cnt [ RandomNumber 5 15 ]\r\n    #2.4.1.6 order entry date O_ENTRY_D generated by SUT\r\n    set date [ gettimestamp ]\r\n    set result [pg_exec $lda \"call neword($no_w_id,$w_id_input,$no_d_id,$no_c_id,$ol_cnt,0.0,'','',0.0,0.0,0,TO_TIMESTAMP('$date','YYYYMMDDHH24MISS')::timestamp without time zone)\" ]\r\n    if {[pg_result $result -status] != \"PGRES_TUPLES_OK\"} {\r\n            error \"[pg_result $result -error]\"\r\n        pg_result $result -clear\r\n    } else {\r\n        #puts \"New Order: $no_w_id $w_id_input $no_d_id $no_c_id $ol_cnt 0 [ pg_result $result -list ]\"\r\n        pg_result $result -clear\r\n    }\r\n}\r\n#PAYMENT\r\nproc payment { lda p_w_id w_id_input } {\r\n    #2.5.1.1 The home warehouse id remains the same for each terminal\r\n    #2.5.1.1 select district id randomly from home warehouse where d_w_id = d_id\r\n    set p_d_id [ RandomNumber 1 10 ]\r\n    #2.5.1.2 customer selected 60% of time by name and 40% of time by number\r\n    set x [ RandomNumber 1 100 ]\r\n    set y [ RandomNumber 1 100 ]\r\n    if { $x &lt;= 85 } {\r\n        set p_c_d_id $p_d_id\r\n        set p_c_w_id $p_w_id\r\n    } else {\r\n        #use a remote warehouse\r\n        set p_c_d_id [ RandomNumber 1 10 ]\r\n        set p_c_w_id [ RandomNumber 1 $w_id_input ]\r\n        while { ($p_c_w_id == $p_w_id) &amp;&amp; ($w_id_input != 1) } {\r\n            set p_c_w_id [ RandomNumber 1  $w_id_input ]\r\n        }\r\n    }\r\n    set nrnd [ NURand 255 0 999 123 ]\r\n    set name [ randname $nrnd ]\r\n    set p_c_id [ RandomNumber 1 3000 ]\r\n    if { $y &lt;= 60 } {\r\n        #use customer name\r\n        #C_LAST is generated\r\n        set byname 1\r\n    } else {\r\n        #use customer number\r\n        set byname 0\r\n        set name {}\r\n    }\r\n    #2.5.1.3 random amount from 1 to 5000\r\n    set p_h_amount [ RandomNumber 1 5000 ]\r\n    #2.5.1.4 date selected from SUT\r\n    set h_date [ gettimestamp ]\r\n    #2.5.2.1 Payment Transaction\r\n    #change following to correct values\r\n    set result [pg_exec $lda \"call payment($p_w_id,$p_d_id,$p_c_w_id,$p_c_d_id,$byname,$p_h_amount,'0','$name',$p_c_id,'','','','','','','','','','','','','','','','','','',TO_TIMESTAMP('$h_date','YYYYMMDDHH24MISS')::timestamp without time zone,0.0,0.0,0.0,'',TO_TIMESTAMP('$h_date','YYYYMMDDHH24MISS')::timestamp without time zone)\" ]\r\n    if {[pg_result $result -status] != \"PGRES_TUPLES_OK\"} {\r\n            error \"[pg_result $result -error]\"\r\n        pg_result $result -clear\r\n    } else {\r\n        #puts \"Payment: $p_w_id $p_d_id $p_c_w_id $p_c_d_id $p_c_id $byname $p_h_amount $name 0 0 [ pg_result $result -list ]\"\r\n        pg_result $result -clear\r\n    }\r\n}\r\n#ORDER_STATUS\r\nproc ostat { lda w_id } {\r\n    #2.5.1.1 select district id randomly from home warehouse where d_w_id = d_id\r\n    set d_id [ RandomNumber 1 10 ]\r\n    set nrnd [ NURand 255 0 999 123 ]\r\n    set name [ randname $nrnd ]\r\n    set c_id [ RandomNumber 1 3000 ]\r\n    set y [ RandomNumber 1 100 ]\r\n    if { $y &lt;= 60 } {\r\n        set byname 1\r\n    } else {\r\n        set byname 0\r\n        set name {}\r\n    }\r\n            set date [ gettimestamp ]\r\n            set result [pg_exec $lda \"call ostat($w_id,$d_id,$c_id,$byname,'$name','','',0.0,0,TO_TIMESTAMP('$date','YYYYMMDDHH24MISS')::timestamp without time zone,0,'')\" ] \r\n    if {[pg_result $result -status] != \"PGRES_TUPLES_OK\"} {\r\n            error \"[pg_result $result -error]\"\r\n        pg_result $result -clear\r\n    } else {\r\n        #puts \"Order Status: $w_id $d_id $c_id $byname $name [ pg_result $result -list ]\"\r\n        pg_result $result -clear\r\n    }\r\n}\r\n#DELIVERY\r\nproc delivery { lda w_id } {\r\n    set carrier_id [ RandomNumber 1 10 ]\r\n    set date [ gettimestamp ]\r\n            set result [pg_exec $lda \"call delivery($w_id,$carrier_id,TO_TIMESTAMP('$date','YYYYMMDDHH24MISS')::timestamp without time zone)\" ] \r\n    if {[pg_result $result -status] ni {\"PGRES_TUPLES_OK\" \"PGRES_COMMAND_OK\"}} {\r\n            error \"[pg_result $result -error]\"\r\n        pg_result $result -clear\r\n    } else {\r\n        #puts \"Delivery: $w_id $carrier_id [ pg_result $result -list ]\"\r\n        pg_result $result -clear\r\n    }\r\n}\r\n#STOCK LEVEL\r\nproc slev { lda w_id stock_level_d_id } {\r\n    set threshold [ RandomNumber 10 20 ]\r\n            set result [pg_exec $lda \"call slev($w_id,$stock_level_d_id,$threshold,0)\"]\r\n    if {[pg_result $result -status] ni {\"PGRES_TUPLES_OK\" \"PGRES_COMMAND_OK\"}} {\r\n            error \"[pg_result $result -error]\"\r\n        pg_result $result -clear\r\n    } else {\r\n        #puts \"Stock Level: $w_id $stock_level_d_id $threshold [ pg_result $result -list ]\"\r\n        pg_result $result -clear\r\n    }\r\n}\r\n\r\n#RUN TPC-C\r\nset lda [ ConnectToPostgres $host $port $sslmode $user $password $db ]\r\nif { $lda eq \"Failed\" } {\r\n    error \"error, the database connection to $host could not be established\"\r\n} \r\npg_select $lda \"select max(w_id) from warehouse\" w_id_input_arr {\r\n    set w_id_input $w_id_input_arr(max)\r\n}\r\n#2.4.1.1 set warehouse_id stays constant for a given terminal\r\nset w_id  [ RandomNumber 1 $w_id_input ]  \r\npg_select $lda \"select max(d_id) from district\" d_id_input_arr {\r\n    set d_id_input $d_id_input_arr(max)\r\n}\r\nset stock_level_d_id  [ RandomNumber 1 $d_id_input ]  \r\nputs \"Processing $total_iterations transactions without output suppressed...\"\r\nfor {set it 0} {$it &lt; $total_iterations} {incr it} {\r\n    set choice [ RandomNumber 1 23 ]\r\n    if {$choice &lt;= 10} {\r\n        neword $lda $w_id $w_id_input \r\n    } elseif {$choice &lt;= 20} {\r\n        payment $lda $w_id $w_id_input\r\n    } elseif {$choice &lt;= 21} {\r\n        delivery $lda $w_id\r\n    } elseif {$choice &lt;= 22} {\r\n        slev $lda $w_id $stock_level_d_id\r\n    } elseif {$choice &lt;= 23} {\r\n        ostat $lda $w_id\r\n    }\r\n}\r\npg_disconnect $lda\r\nthread::wait }]\r\n}\r\nvwait forever\r\n<\/pre>\n<p><\/code><\/pre>\n\t\t\t<\/div>\n        <\/div>\n<p>and the same in Python called pgtest_thread.py<\/p>\n<div class=\"dm-code-snippet dark default  dm-normal-version\" style=\"background-color:#abb8c3;\" snippet-height=\"\">\n\t\t\t<div class=\"control-language\">\n                <div class=\"dm-buttons\">\n                    <div class=\"dm-buttons-left\">\n                        <div class=\"dm-button-snippet red-button\"><\/div>\n                        <div class=\"dm-button-snippet orange-button\"><\/div>\n                        <div class=\"dm-button-snippet green-button\"><\/div>\n                    <\/div>\n                    <div class=\"dm-buttons-right\">\n                        <a id=\"dm-copy-raw-code\">\n                        <span class=\"dm-copy-text\">Copy Code<\/span>\n                        <span class=\"dm-copy-confirmed\" style=\"display:none\">Copied<\/span>\n                        <span class=\"dm-error-message\" style=\"display:none\">Use a different Browser<\/span><\/a>\n                    <\/div>\n                <\/div>\n                <pre class=\"no-line-numbers\"><code id=\"dm-code-raw\" class=\"no-wrap language-python\"><\/p>\n<pre class=\"dm-pre-admin-side\">#!\/usr\/bin\/python\r\nimport psycopg2\r\nimport sys\r\nimport random\r\nimport math\r\nfrom datetime import datetime\r\nimport threading\r\n\r\ndef gettimestamp():\r\n    tstamp = datetime.now().strftime(\"%Y%m%d%H%M%S\")\r\n    return tstamp\r\n\r\ndef nurand(iConst,x,y,C): \r\n    nrnd = ((((random.randint(0,iConst) | random.randint(x,y)) + C) % (y - x + 1 )) + x)\r\n    return nrnd\r\n\r\ndef randname(num):\r\n    namearr = [ \"BAR\", \"OUGHT\", \"ABLE\", \"PRI\", \"PRES\", \"ESE\", \"ANTI\", \"CALLY\", \"ATION\", \"EING\" ]\r\n    pt1 = namearr[math.floor((num \/ 100) % 10)]\r\n    pt2 = namearr[math.floor((num \/ 10) % 10)]\r\n    pt3 = namearr[math.floor((num \/ 1) % 10)]\r\n    name = pt1 + pt2 + pt3\r\n    return name\r\n\r\ndef neword(cur, no_w_id, w_id_input):\r\n    #print(\"neword\")\r\n    no_d_id = random.randint(1,10)\r\n    no_c_id = random.randint(1,3000)\r\n    ol_cnt = random.randint(5,15)\r\n    date = gettimestamp()\r\n    cur.execute('call neword(%s,%s,%s,%s,%s,0.0,\\'\\',\\'\\',0.0,0.0,0,TO_TIMESTAMP(%s,\\'YYYYMMDDHH24MISS\\')::timestamp without time zone)',(no_w_id,w_id_input,no_d_id,no_c_id,ol_cnt,date))\r\n    rows = cur.fetchall()\r\n    #print(rows)\r\n\r\ndef payment(cur, p_w_id, w_id_input):\r\n    #print(\"payment\")\r\n    p_d_id = random.randint(1,10)\r\n    x = random.randint(1,100)\r\n    y = random.randint(1,100)\r\n    if (x &lt;= 85):\r\n            p_c_d_id = p_d_id\r\n            p_c_w_id = p_w_id\r\n    else:\r\n            p_c_d_id = random.randint(1,10)\r\n            p_c_w_id = random.randint(1,w_id_input) \r\n            while (p_c_w_id == p_w_id) and (w_id_input != 1):\r\n                p_c_w_id = random.randint(1,w_id_input) \r\n\r\n    nrnd = nurand(255,0,999,123)\r\n    name = randname(nrnd)\r\n    p_c_id = random.randint(1,3000)\r\n    if (y &lt;= 85):\r\n            byname = 1\r\n    else:\r\n            byname = 0\r\n            name = \"\"\r\n\r\n    p_h_amount = random.randint(1,5000)\r\n    h_date = gettimestamp()\r\n    p_c_since = gettimestamp()\r\n    cur.execute('call payment(%s,%s,%s,%s,%s,%s,\\'0\\',%s,%s,\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',\\'\\',TO_TIMESTAMP(%s,\\'YYYYMMDDHH24MISS\\')::timestamp without time zone,0.0,0.0,0.0,\\'\\',TO_TIMESTAMP(%s,\\'YYYYMMDDHH24MISS\\')::timestamp without time zone)',(p_w_id,p_d_id,p_c_w_id,p_c_d_id,byname,p_h_amount,name,p_c_id,p_c_since,h_date))\r\n    rows = cur.fetchall()\r\n    #print(rows)\r\n\r\ndef delivery(cur, w_id):\r\n    #print(\"delivery\")\r\n    carrier_id = random.randint(1,10)\r\n    date = gettimestamp()\r\n    cur.execute('call delivery(%s,%s,TO_TIMESTAMP(%s,\\'YYYYMMDDHH24MISS\\')::timestamp without time zone)',(w_id,carrier_id,date))\r\n    #rows = cur.fetchall()\r\n    #print(rows)\r\n\r\ndef slev(cur, w_id, stock_level_d_id):\r\n    #print(\"slev\" )\r\n    threshold = random.randint(10, 20)\r\n    cur.execute('call slev(%s,%s,%s,0)',(w_id,stock_level_d_id,threshold))\r\n    rows = cur.fetchall()\r\n    #print(rows)\r\n\r\ndef ostat(cur, w_id):\r\n    #print(\"ostat\")\r\n    d_id = random.randint(1, 10)\r\n    nrnd = nurand(255,0,999,123)\r\n    name = randname(nrnd)\r\n    c_id = random.randint(1, 3000)\r\n    y = random.randint(1, 100)\r\n    if (y &lt;= 60):\r\n        byname = 1\r\n    else:\r\n        byname = 0\r\n        name = \"\"\r\n\r\n    date = gettimestamp()\r\n    cur.execute('call ostat(%s,%s,%s,%s,%s,\\'\\',\\'\\',0.0,0,TO_TIMESTAMP(%s,\\'YYYYMMDDHH24MISS\\')::timestamp without time zone,0,\\'\\')',(w_id,d_id,c_id,byname,name,date))\r\n    rows = cur.fetchall()\r\n    #print(rows)\r\n\r\ndef runtpcc():\r\n    total_iterations = 1000000\r\n    conn = psycopg2.connect(host=\"localhost\",port=\"5432\",database=\"tpcc\",user=\"tpcc\",password=\"tpcc\")\r\n    conn.set_session(autocommit=True)\r\n    cur = conn.cursor()\r\n    cur.execute(\"select max(w_id) from warehouse\")\r\n    w_id_input = cur.fetchone()[0]\r\n    w_id = random.randint(1, w_id_input)\r\n    cur.execute(\"select max(d_id) from district\")\r\n    d_id_input = cur.fetchone()[0]\r\n    stock_level_d_id = random.randint(1, d_id_input)\r\n    print (\"Processing\", total_iterations, \"transactions without output suppressed...\")\r\n    for x in range(total_iterations):\r\n        choice = random.randint(1, 23)\r\n        if (choice &lt;= 10):\r\n            neword (cur, w_id, w_id_input)\r\n        elif (choice &lt;= 20):\r\n            payment (cur,w_id, w_id_input)\r\n        elif (choice &lt;= 21):\r\n            delivery (cur, w_id)\r\n        elif (choice &lt;= 22):\r\n            slev (cur, w_id, stock_level_d_id)\r\n        elif (choice &lt;= 23):\r\n            ostat (cur, w_id)\r\n\r\n        if (x == total_iterations):\r\n            conn.close()\r\n\r\ndef main():\r\n    threads = []\r\n    for n in range(64):\r\n        t = threading.Thread(target=runtpcc)\r\n        threads.append(t)\r\n        t.start()\r\n\r\n    for t in threads:\r\n        t.join()\r\n\r\nif __name__ == '__main__':\r\n    main()\r\n<\/pre>\n<p><\/code><\/pre>\n\t\t\t<\/div>\n        <\/div>\n<p>It should be clear that\u00a0 these driver scripts do exactly the same thing, they create multiple threads and loop calling the 5 TPROC-C stored procedures meaning that on the PostgreSQL database the workload itself is identical only the language calling the stored procedures is different.<\/p>\n<h3>Building the Test Schema and Timing Script<\/h3>\n<p>Next we are going to build a PostgreSQL test schema using the HammerDB CLI as follows:<\/p>\n<div class=\"row\">\n<div class=\"small-12 columms\">\n<div class=\"dm-code-snippet dark default  dm-normal-version\" style=\"background-color:#abb8c3;\" snippet-height=\"\">\n\t\t\t<div class=\"control-language\">\n                <div class=\"dm-buttons\">\n                    <div class=\"dm-buttons-left\">\n                        <div class=\"dm-button-snippet red-button\"><\/div>\n                        <div class=\"dm-button-snippet orange-button\"><\/div>\n                        <div class=\"dm-button-snippet green-button\"><\/div>\n                    <\/div>\n                    <div class=\"dm-buttons-right\">\n                        <a id=\"dm-copy-raw-code\">\n                        <span class=\"dm-copy-text\">Copy Code<\/span>\n                        <span class=\"dm-copy-confirmed\" style=\"display:none\">Copied<\/span>\n                        <span class=\"dm-error-message\" style=\"display:none\">Use a different Browser<\/span><\/a>\n                    <\/div>\n                <\/div>\n                <pre class=\"no-line-numbers\"><code id=\"dm-code-raw\" class=\"no-wrap language-clike\"><\/p>\n<pre class=\"dm-pre-admin-side\">dbset db pg\r\ndbset bm TPC-C\r\nvuset logtotemp 0\r\ndiset tpcc pg_superuser postgres\r\ndiset tpcc pg_superuserpass postgres\r\ndiset tpcc pg_storedprocs true\r\ndiset tpcc pg_count_ware 1000\r\ndiset tpcc pg_num_vu 64\r\nbuildschema\r\nwaittocomplete\r\nquit<\/pre>\n<p><\/code><\/pre>\n\t\t\t<\/div>\n        <\/div>\n<\/div>\n<\/div>\n<p>Finally we are going to build a HammerDB timing script. The key thing to note is that we are going to set pg_total_iterations to 1, this means that we have a special form of driver script that will time the transactions on the database but will not run any transactions itself to impact the load. This way we can run our external Tcl and Python test scripts and capture the results.<\/p>\n<div class=\"dm-code-snippet dark default  dm-normal-version\" style=\"background-color:#abb8c3;\" snippet-height=\"\">\n\t\t\t<div class=\"control-language\">\n                <div class=\"dm-buttons\">\n                    <div class=\"dm-buttons-left\">\n                        <div class=\"dm-button-snippet red-button\"><\/div>\n                        <div class=\"dm-button-snippet orange-button\"><\/div>\n                        <div class=\"dm-button-snippet green-button\"><\/div>\n                    <\/div>\n                    <div class=\"dm-buttons-right\">\n                        <a id=\"dm-copy-raw-code\">\n                        <span class=\"dm-copy-text\">Copy Code<\/span>\n                        <span class=\"dm-copy-confirmed\" style=\"display:none\">Copied<\/span>\n                        <span class=\"dm-error-message\" style=\"display:none\">Use a different Browser<\/span><\/a>\n                    <\/div>\n                <\/div>\n                <pre class=\"no-line-numbers\"><code id=\"dm-code-raw\" class=\"no-wrap language-clike\"><\/p>\n<pre class=\"dm-pre-admin-side\">dbset db pg\r\ndbset bm TPC-C\r\ndiset tpcc pg_superuser steve\r\ndiset tpcc pg_defaultdbase postgres\r\ndiset tpcc pg_storedprocs true\r\ndiset tpcc pg_total_iterations 1\r\ndiset tpcc pg_driver timed\r\ndiset tpcc pg_rampup 0\r\ndiset tpcc pg_duration 2\r\nvuset logtotemp 1\r\nloadscript\r\nvuset vu 1\r\ntcstart\r\nvucreate\r\nvurun\r\nruntimer 130\r\nvudestroy\r\ntcstop<\/pre>\n<p><\/code><\/pre>\n\t\t\t<\/div>\n        <\/div>\n<p>If all is configured correctly when running either our Tcl or Python driver script we will see output such as follows, note that Vuser 2 shows FINISHED SUCCESS just after it starts so the only activity that this script is doing on the database is capturing the NOPM and TPM over a time period, in this example 2 minutes.<\/p>\n<pre>$ .\/hammerdbcli auto pgtime.tcl\r\nHammerDB CLI v4.4\r\nCopyright (C) 2003-2022 Steve Shaw\r\nType \"help\" for a list of commands\r\nThe xml is well-formed, applying configuration\r\nDatabase set to PostgreSQL\r\nBenchmark set to TPC-C for PostgreSQL\r\nChanged tpcc:pg_superuser from postgres to steve for PostgreSQL\r\nValue postgres for tpcc:pg_defaultdbase is the same as existing value postgres, no change made\r\nChanged tpcc:pg_storedprocs from false to true for PostgreSQL\r\nChanged tpcc:pg_total_iterations from 10000000 to 1 for PostgreSQL\r\nClearing Script, reload script to activate new setting\r\nScript cleared\r\nChanged tpcc:pg_driver from test to timed for PostgreSQL\r\nChanged tpcc:pg_rampup from 2 to 0 for PostgreSQL\r\nChanged tpcc:pg_duration from 5 to 2 for PostgreSQL\r\nScript loaded, Type \"print script\" to view\r\nTransaction Counter Started\r\nVuser 1 created MONITOR - WAIT IDLE\r\nVuser 2 created - WAIT IDLE\r\nLogging activated\r\nto \/tmp\/hammerdb.log\r\n2 Virtual Users Created with Monitor VU\r\nVuser 1:RUNNING\r\nVuser 1:Beginning rampup time of 0 minutes\r\nVuser 1:Rampup complete, Taking start Transaction Count.\r\n0 PostgreSQL tpm\r\nVuser 1:Timing test period of 2 in minutes\r\nVuser 2:RUNNING\r\nVuser 2:Processing 1 transactions with output suppressed...\r\nVuser 2:FINISHED SUCCESS\r\n3650220 PostgreSQL tpm\r\n4245132 PostgreSQL tpm\r\n4203948 PostgreSQL tpm\r\n4211748 PostgreSQL tpm\r\n4203648 PostgreSQL tpm\r\nTimer: 1 minutes elapsed\r\nVuser 1:1 ...,\r\n4367244 PostgreSQL tpm\r\n4265898 PostgreSQL tpm\r\n4320510 PostgreSQL tpm\r\n4258518 PostgreSQL tpm\r\n4426578 PostgreSQL tpm\r\n4413780 PostgreSQL tpm\r\nTimer: 2 minutes elapsed\r\nVuser 1:2 ...,\r\nVuser 1:Test complete, Taking end Transaction Count.\r\nVuser 1:1 Active Virtual Users configured\r\nVuser 1:TEST RESULT : System achieved 1844256 NOPM from 4232155 PostgreSQL TPM\r\nVuser 1:FINISHED SUCCESS\r\nALL VIRTUAL USERS COMPLETE<\/pre>\n<h3>Running a single process pre-test<\/h3>\n<p>Before we test the multithreading capabilities it is important to callibrate how our respective scripts. For this we will use a slightly modified version for both that removes the threading. We start these running as follows for Python<\/p>\n<pre>python3 pgtest_proc.py<\/pre>\n<p>and as follows from the HammerDB directory (export the HammerDB .\/lib directory into the LD_LIBRARY_PATH first)<\/p>\n<pre>.\/bin\/tclsh8.6 pgtest_proc.tcl<\/pre>\n<p>Once the workload is started we then run the timing script in another command window and once we get the result we Ctrl-C the driver script to stop the test.<\/p>\n<p>In our single process test we see 62257 NOPM for Python and 64142 NOPM for Tcl giving confidence that the test scripts and methodology are correct (remember that the workload is running in PostgreSQL stored procedures and the test is calling those stored procedures, so we expect the single process result to be very close).<\/p>\n<p><a href=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pytcl1proc.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1223 size-full\" src=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pytcl1proc.png\" alt=\"\" width=\"707\" height=\"409\" srcset=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pytcl1proc.png 707w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pytcl1proc-300x174.png 300w\" sizes=\"auto, (max-width: 707px) 100vw, 707px\" \/><\/a><\/p>\n<p>We can also use <a href=\"https:\/\/github.com\/gaogaotiantian\/viztracer\/\">VizTracer<\/a> to run the same workload under Python this time in a single thread to understand a little about what the workload should look like. For each stored procedure we calculate the parameters with a number of random operations and then call psycopg2.extensions.cursor.execute, during this we sleep while we wait for the PostgreSQL database to execute the stored procedure and then we run psycopg2.extensions.cursor.fetchall to fetch the results. In total the Wall duration to run the neword stored procedure was 1.8ms.<\/p>\n<p><a href=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-1232 size-full\" src=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz1.png\" alt=\"\" width=\"1085\" height=\"675\" srcset=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz1.png 1085w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz1-300x187.png 300w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz1-1024x637.png 1024w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz1-768x478.png 768w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><\/a><\/p>\n<h3>Running the Mulithreaded Test<\/h3>\n<p>Now lets run a the multithreaded tests shown above. For each test we are going to edit the script to change the number of threads testing at 1,2,4,8,16,32 and 64 threads with the following results.<\/p>\n<p><a href=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyvstcl.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-1226 size-full\" src=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyvstcl.png\" alt=\"\" width=\"705\" height=\"403\" srcset=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyvstcl.png 705w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyvstcl-300x171.png 300w\" sizes=\"auto, (max-width: 705px) 100vw, 705px\" \/><\/a><\/p>\n<p>At 1, 2 or 4 threads we barely notice a difference, however at 64 threads Python gives us 247976 NOPM and Tcl 1844256 NOPM meaning Tcl is more than 700% faster for an identical workload once we start running multiple threads.<\/p>\n<p>So why is there such a marked contrast between an identical workload in Python and Tcl? As an early indication we can take a look at the output from top when running 64 threads. Firstly with Python observe that we have 9 running tasks, 5.8% user CPU utilisation and all of the top postgres processes show as idle.<\/p>\n<pre>Tasks: 1144 total, 9 running, 1135 sleeping, 0 stopped, 0 zombie\r\n%Cpu(s): 5.8 us, 1.0 sy, 0.0 ni, 93.0 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st\r\nMiB Mem : 772719.8 total, 581155.8 free, 6252.7 used, 185311.2 buff\/cache\r\nMiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 711977.4 avail Mem\r\n\r\nPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND\r\n52656 postgres 20 0 5347252 25784 13644 S 178.8 0.0 0:44.94 python3 pgtest_thread.py\r\n52794 postgres 20 0 49.3g 2.4g 2.4g S 10.9 0.3 0:02.83 postgres: tpcc tpcc ::1(45366) idle\r\n52663 postgres 20 0 49.3g 2.2g 2.2g S 10.3 0.3 0:02.88 postgres: tpcc tpcc ::1(45234) idle\r\n52721 postgres 20 0 49.3g 2.3g 2.3g S 10.3 0.3 0:02.75 postgres: tpcc tpcc ::1(45292) idle\r\n52778 postgres 20 0 49.3g 2.4g 2.4g S 10.3 0.3 0:02.82 postgres: tpcc tpcc ::1(45350) idle\r\n52784 postgres 20 0 49.3g 2.3g 2.3g S 10.3 0.3 0:02.81 postgres: tpcc tpcc ::1(45356) idle<\/pre>\n<p>in contrast the same workload in Tcl shows that we have 64 running tasks (for 64 threads) 53.1% CPU and the postgres processes are all busy running stored procedures.<\/p>\n<pre>Tasks: 1143 total, 64 running, 1079 sleeping, 0 stopped, 0 zombie\r\n%Cpu(s): 53.1 us, 6.0 sy, 0.0 ni, 40.0 id, 0.1 wa, 0.0 hi, 0.8 si, 0.0 st\r\nMiB Mem : 772719.8 total, 578477.9 free, 9582.1 used, 184659.7 buff\/cache\r\nMiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 708649.4 avail Mem\r\n\r\nPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND\r\n52329 postgres 20 0 5351472 222896 10216 S 441.3 0.0 1:33.25 .\/bin\/tclsh8.6 pgtest_thread.tcl\r\n52362 postgres 20 0 49.3g 5.5g 5.5g R 87.1 0.7 0:17.59 postgres: tpcc tpcc ::1(45114) CALL\r\n52408 postgres 20 0 49.3g 6.0g 6.0g R 87.1 0.8 0:17.49 postgres: tpcc tpcc ::1(45160) CALL\r\n52338 postgres 20 0 49.3g 6.0g 6.0g R 86.8 0.8 0:17.63 postgres: tpcc tpcc ::1(45086) CALL\r\n52344 postgres 20 0 49.3g 6.2g 6.2g R 86.8 0.8 0:17.63 postgres: tpcc tpcc ::1(45096) CALL\r\n52348 postgres 20 0 49.3g 4.5g 4.5g S 86.8 0.6 0:17.42 postgres: tpcc tpcc ::1(45100) CALL<\/pre>\n<h3>Analyzing the Python GIL<\/h3>\n<p>So, if you have read this far you won&#8217;t be surprised that we suspect that the roadblock we have encountered with Python is a major one called the Global Interpreter Lock or GIL that means only one thread can run at any one time. But lets confirm this using the <a href=\"https:\/\/github.com\/chrisjbillington\/gil_load\">GIL load tool<\/a> to time the amount of time that the GIL is held for our 64 thread workload. The tool is installed and run as follows:<\/p>\n<pre>pip install gil_load\r\npython3 -m gil_load pgtest_thread_gil.py<\/pre>\n<p>and we add gil_load.start and gil_load.stop to the Python script and run a smaller number of transactions without timing the test. The following is the result.\u00a0 The GIL is held for 77% of the time and the threads wait for the GIL 99.9% of the time. For Tcl this is 0% and 0% respectively because there is no GIL meaning Tcl threads execute in parallel and the workload continues to scale.<\/p>\n<p><a href=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/gilohead.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1240 size-full aligncenter\" src=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/gilohead.png\" alt=\"\" width=\"473\" height=\"282\" srcset=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/gilohead.png 473w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/gilohead-300x179.png 300w\" sizes=\"auto, (max-width: 473px) 100vw, 473px\" \/><\/a><\/p>\n<p>The difference between the two also shows that there is time when we are synchronizing between the threads and not running any workload at all.\u00a0 As our results show the more threads we add the worse the performance gets due to this GIL synchronization.<\/p>\n<pre>held: 0.771 (0.774, 0.771, 0.771)\r\nwait: 0.999 (0.998, 0.999, 0.999)\r\n&lt;140263890196288&gt;\r\nheld: 0.0 (0.0, 0.0, 0.0)\r\nwait: 0.0 (0.0, 0.0, 0.0)\r\n&lt;140263853995776&gt;\r\nheld: 0.008 (0.008, 0.008, 0.008)\r\nwait: 0.903 (0.887, 0.902, 0.903)\r\n&lt;140263845603072&gt;\r\nheld: 0.01 (0.01, 0.01, 0.01)\r\nwait: 0.904 (0.888, 0.903, 0.904)\r\n...\r\n&lt;140257727084288&gt;\r\nheld: 0.01 (0.01, 0.01, 0.01)\r\nwait: 0.904 (0.892, 0.904, 0.904)<\/pre>\n<p>If we now return to VizTracer and perf using the method described<a href=\"https:\/\/www.maartenbreddels.com\/perf\/jupyter\/python\/tracing\/gil\/2021\/01\/14\/Tracing-the-Python-GIL.html\"> here<\/a> to trace Python with 16 threads we can now see that the Wall Duration time for the neword stored procedure has increased by 300% and we are spending a lot longer sleeping with only one thread allowed to execute at any one time. PostgreSQL can sustain higher throughput but our Python client cannot.<\/p>\n<p><a href=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-1238\" src=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz2-1024x614.png\" alt=\"\" width=\"525\" height=\"315\" srcset=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz2-1024x614.png 1024w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz2-300x180.png 300w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz2-768x461.png 768w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/pyviz2.png 1159w\" sizes=\"auto, (max-width: 525px) 100vw, 525px\" \/><\/a><\/p>\n<h3>Python GIL PostgreSQL wait events<\/h3>\n<p>So if the bottleneck is in the client what wait event would we expect to see on our PostgreSQL database? To do this we can use the pg_stat_statements and pg_sentinel packages to query the active session history with the SQL as follows:<\/p>\n<div class=\"dm-code-snippet dark default  dm-normal-version\" style=\"background-color:#abb8c3;\" snippet-height=\"\">\n\t\t\t<div class=\"control-language\">\n                <div class=\"dm-buttons\">\n                    <div class=\"dm-buttons-left\">\n                        <div class=\"dm-button-snippet red-button\"><\/div>\n                        <div class=\"dm-button-snippet orange-button\"><\/div>\n                        <div class=\"dm-button-snippet green-button\"><\/div>\n                    <\/div>\n                    <div class=\"dm-buttons-right\">\n                        <a id=\"dm-copy-raw-code\">\n                        <span class=\"dm-copy-text\">Copy Code<\/span>\n                        <span class=\"dm-copy-confirmed\" style=\"display:none\">Copied<\/span>\n                        <span class=\"dm-error-message\" style=\"display:none\">Use a different Browser<\/span><\/a>\n                    <\/div>\n                <\/div>\n                <pre class=\"no-line-numbers\"><code id=\"dm-code-raw\" class=\"no-wrap language-sql\"><\/p>\n<pre class=\"dm-pre-admin-side\">with ash as (\r\n\t select *,ceil(extract(epoch from max(ash_time)over()-min(ash_time)over()))::numeric samples\r\n\t from pg_active_session_history where ash_time&gt;=current_timestamp - interval '10 minutes'\r\n) select  round(100 * count(*)\/sum(count(*)) over(),0) as \"%\", round(count(*)\/samples,2) as \"AAS\",\r\n   datname,wait_event_type,wait_event\r\n from ash\r\n group by samples,\r\n  datname,wait_event_type,wait_event\r\n order by 1 desc\r\n;<\/pre>\n<p><\/code><\/pre>\n\t\t\t<\/div>\n        <\/div>\n<p>with the results as follows on Python:<\/p>\n<pre> % | AAS | datname | wait_event_type | wait_event\r\n----+------+---------+-----------------+--------------------\r\n62 | 4.10 | tpcc | Client | ClientRead\r\n33 | 2.19 | tpcc | CPU | CPU\r\n3 | 0.18 | tpcc | LWLock | pg_stat_statements\r\n1 | 0.06 | tpcc | IO | DataFileRead\r\n1 | 0.03 | tpcc | LWLock | WALInsert\r\n0 | 0.01 | tpcc | IO | SLRURead\r\n0 | 0.01 | tpcc | IO | DataFileExtend\r\n0 | 0.01 | tpcc | Timeout | VacuumDelay\r\n0 | 0.00 | tpcc | IO | DataFileWrite\r\n(9 rows)<\/pre>\n<p>and as follows on Tcl.<\/p>\n<pre>% | AAS | datname | wait_event_type | wait_event\r\n----+-------+---------+-----------------+----------------------\r\n60 | 34.76 | tpcc | CPU | CPU\r\n26 | 15.12 | tpcc | LWLock | pg_stat_statements\r\n7 | 3.88 | tpcc | LWLock | WALInsert\r\n4 | 2.48 | tpcc | Client | ClientRead\r\n1 | 0.52 | tpcc | IO | DataFileRead\r\n1 | 0.32 | tpcc | Lock | extend\r\n0 | 0.02 | tpcc | IO | SLRURead\r\n0 | 0.03 | tpcc | LWLock | XidGen\r\n0 | 0.26 | tpcc | Lock | transactionid\r\n0 | 0.23 | tpcc | IPC | ProcArrayGroupUpdate\r\n0 | 0.03 | tpcc | LWLock | LockManager\r\n0 | 0.02 | tpcc | LWLock | BufferContent\r\n0 | 0.12 | tpcc | LWLock | WALWrite\r\n0 | 0.08 | tpcc | LWLock | XactSLRU\r\n0 | 0.02 | tpcc | IO | DataFileExtend\r\n0 | 0.26 | tpcc | LWLock | ProcArray\r\n(16 rows)<\/pre>\n<p>So when using Python we are spending 62% of our time on a Client wait event called ClientRead or in other words PostgreSQL is telling us it is spending most of its time waiting for the Python client to respond, either the transaction has finished and it is waiting for Python to fetch the results or it has already fetched the results and is waiting for the next query to be sent.\u00a0 Either way high waits on ClientRead means we are waiting for the client and not the database.\u00a0 Also note that with the Tcl workload we now start to see some of the LWLock wait events that a database engineer working on scalability would focus on and we can use the HammerDB PostgreSQL statistics viewer to drill down on some of these wait events.<\/p>\n<p><a href=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/sent5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-1248\" src=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/sent5-1024x900.png\" alt=\"\" width=\"525\" height=\"461\" srcset=\"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/sent5-1024x900.png 1024w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/sent5-300x264.png 300w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/sent5-768x675.png 768w, https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2022\/07\/sent5.png 1045w\" sizes=\"auto, (max-width: 525px) 100vw, 525px\" \/><\/a><\/p>\n<h3>Summary<\/h3>\n<p>What we have observed is that the Python GIL makes it impossible to run anything but a single threaded benchmark against a database because the GIL means that only one thread can ever run at one time.\u00a0 For decades databases have been designed to scale to process millions of transactions a minute whilst maintaining database consistency, a serialized benchmark client compromises the ability to compare databases.<\/p>\n<p>HammerDB and its predecessor Hammerora were written in the Tcl language because they were written by database engineers looking to write scalable database benchmarks rather than by application engineers looking to use a particular language. At the time Tcl was the only multi-threaded GIL free scripting language for database access and this still holds true today. (And although the Python test script could run in multiple processes HammerDB is an application simulating multiple virtual users in a GUI and CLI environment requiring the close interaction and control of using multiple threads).<\/p>\n<h3>*700% Faster<\/h3>\n<p>Although we have seen that Tcl is 700% faster than Python for our example PostgreSQL benchmark we observed that even with Tcl we were only at approximately 50% CPU Utilization. With commerical databases HammerDB can scale to full CPU utilization and 2-3X the transaction rate seen here meaning that in reality the gap between Tcl and Python running against a commericial database is more than 700%.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Python is a popular programming language, especially for beginners, and consequently we see it occurring in places where it just shouldn&#8217;t be used, such as database benchmarking.\u00a0 In contrast, a frequent question when it comes to HammerDB is why is it written in Tcl? Surely any language will do? This post addresses both questions to &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.hammerdb.com\/blog\/uncategorized\/why-tcl-is-700-faster-than-python-for-database-benchmarking\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Why Tcl is 700% faster than Python for database benchmarking&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"ppma_author":[5],"class_list":["post-1205","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"authors":[{"term_id":5,"user_id":2,"is_guest":0,"slug":"hammerdb","display_name":"HammerDB","avatar_url":{"url":"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2018\/10\/logo-white.png","url2x":"https:\/\/www.hammerdb.com\/blog\/wp-content\/uploads\/2018\/10\/logo-white.png"},"author_category":"","user_url":"http:\/\/www.hammerdb.com","last_name":"","first_name":"","job_title":"","description":""}],"_links":{"self":[{"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/posts\/1205","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/comments?post=1205"}],"version-history":[{"count":39,"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/posts\/1205\/revisions"}],"predecessor-version":[{"id":1224,"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/posts\/1205\/revisions\/1224"}],"wp:attachment":[{"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/media?parent=1205"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/categories?post=1205"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/tags?post=1205"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.hammerdb.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1205"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}