Wednesday, October 1, 2014

Single thread performance in MySQL 5.7.5 versus older releases via sql-bench

MySQL 5.7 does much better on benchmarks with high-concurrency. It might do worse on benchmarks with low-concurrency. I am not surprised as this has been true across many releases. The question is whether anything can be done to reverse it. When testing 5.6 I filed bugs 68825 and 69236 for this problem. Maybe it is time for new bug reports. I measure the following from sysbench for InnoDB but must add the disclaimer that I have yet to explain these results and I am wary of unexplained benchmark results. And also note that these overheads are for the sql-bench workload. Your workload might have a smaller overhead. By the same token, if you have less overhead when running TPC-D queries that doesn't mean there isn't a performance regression for short-running queries at low concurrency.
  • First for 5.7.5 compared to 5.0.85
    • 5.7.5 is 1.45X slower than 5.0.85 when the PS is enabled
    • 5.7.5 is 1.33X slower than 5.0.85 when the PS is disabled
    • Enabling PS for 5.7.5 makes it 1.1X slower
  • Next for 5.6.21 compared to 5.0.85
    • 5.6.21 is 1.45X slower than 5.0.85 when the PS is enabled
    • 5.6.21 is 1.16X slower than 5.0.85 when the PS is disabled
    • Enabling PS for 5.6.21 makes it 1.25X slower

I previously compared MySQL 5.6 to older releases to document performance regressions for single-threaded workloads. This is an important workload even if high-concurrency benchmarks get all of the PR. Database reload and replication apply are cases where low-concurrency performance matters a lot. Others have made a similar case for the importance (Percona, Yoshinori).

In this post I present results from sql-bench from the full result and test-wisconsin. I used the same client binary to test all servers -- sql-bench from MySQL 5.5. I think I used the same compiler options and library runtimes for MySQL 5.0, 5.1, 5.5, 5.7 and 5.7. All servers were linked with jemalloc. I tested the following binaries:
  • 5.0.85, 5.1.63, 5.5.40, 5.6.21, 5.7.5 - MySQL with PS disabled in my.cnf when supported
  • 5.5.40-ps, 5.6.21-ps, 5.7.5-ps - MySQL with PS enabled in my.cnf but no instruments configured
  • 5.6.21-ps2, 5.7.5-ps2 - MySQL with PS enabled and performance_schema_instrument="wait/io/%"
soapbox

I think it is odd to publish benchmark results with the PS disabled. It should be enabled for benchmarks with basic instruments configured (rows changed/read/inserted/deleted, IO requests and time per table, similar metrics per user) because the PS does two things -- per user/table monitoring which is always enabled and performance debugging for things like mutex contention which isn't always enabled. There is one time when benchmarks can be run with PS disabled, and that is when searching for the PS overhead.

sql-bench test-wisconsin

This result is from the Wisconsin benchmark included in sql-bench for InnoDB. This shows a gradual increase in response time and 5.7.5 is about 10% slower than 5.0.85. The overhead from the performance schema for this workload is small which matches what I have seen before. The PS overhead can be significant for short running queries but is small for others.

binary        query-seconds
5.0.85           70
5.1.63           70
5.5.40           70
5.5.40-ps        71      
5.6.21           71
5.6.21-ps        73      
5.6.21-ps2       73      
5.7.5            75
5.7.5-ps         77
5.7.5-ps2        77 

sql-bench full test innodb

This is the result from a full run of sql-bench for all of the MySQL versions with InnoDB. The create test is much slower starting in 5.6.21 and most of the overhead is in the create_key+drop subtest. The insert, select and transaction tests get slower gradually from 5.0 to 5.7. Only one test, alter table, got faster from 5.0 to 5.7. There is also a significant penalty (25% in 5.6, 10% in 5.7) from enabling the performance schema but no penalty from using performance_schema_instrument="wait/io/%" after enabling the PS.

binary        total seconds
5.0.85            924
5.1.63            912
5.5.40            919
5.5.40-ps         988
5.6.21           1080
5.6.21-ps        1341
5.6.21-ps2       1339
5.7.5            1231
5.7.5-ps         1343
5.7.5-ps2        1338

This shows the total time relative to the time in seconds for MySQL 5.0.85. Thus the relative time is about 1 for 5.0.85, 5.1.63 and 5.5.40.


This has data for all of the test types excluding wisconsin that takes 5 to 7 seconds.

version alter-table  ATIS  big-tables  connect create  insert  select transact
5.0.85      17        7       5          62      122     528     172     8
5.1.63      16        7       9          63      102     524     179     8
5.5.40      17        7       8          65      101     530     178     9
5.5.40-ps   18        6       8          68      114     579     181     9
5.6.21      18        7       8          64      219     553     196     10
5.6.21-ps   18        7       8          70      406     601     213     11
5.6.21-ps2  18        7       9          66      406     604     212     11
5.7.5       11        8       9          69      264     630     225     11
5.7.5-ps    11        8       9          73      335     660     225     12
5.7.5-ps2   11        8       9          71      332     663     226     12

sql-bench full test MyISAM

I repeated the test for MyISAM with the PS disabled. There continues to be a regression from 5.0.85 to 5.7.5 but it is smaller than the one for InnoDB -- 1.33X versus 1.16X. So the regression isn't limited to InnoDB. Likely suspects are the parser, optimizer and performance schema (which doesn't narrow the problem much).


setup

Command line for test-wisconsin:
 ./test-wisconsin --server=mysql --host=127.0.0.1 --log --create-options="engine=innodb" --loop-count=1000
Command line for full test:

./run-all-tests --server=mysql --host=127.0.0.1 --log --create-options="engine=innodb"
This is the configuration for MySQL 5.7.5 with the PS disabled. A similar my.cnf file was used for other versions with changes as required:

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES 

table-definition-cache=1000
table-open-cache=2000
table-open-cache-instances=8
max_connections=20000
key_buffer_size=200M
metadata_locks_hash_instances=256 
query_cache_size=0
query_cache_type=0
server_id=9
performance_schema=0

binlog_format=row
skip_log_bin

innodb_buffer_pool_instances=8
innodb_io_capacity=1000
innodb_lru_scan_depth=1000
innodb_checksum_algorithm=CRC32
innodb_thread_concurrency=0
innodb_buffer_pool_size=100G
innodb_log_file_size=1900M
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=0
innodb_flush_method=O_DIRECT
innodb_thread_concurrency=0
innodb_max_dirty_pages_pct=80
innodb_file_format=barracuda
innodb_file_per_table
datadir=/data/orig575/var


1 comment:

  1. As result means that 5.0.85 is more indicatade for performance in microservices?

    ReplyDelete