In other words, benchmarking answers the question
“How well does this perform?” and profiling answers the question “Why does it per-
form the way it does?”
We show detailed examples of real-life profiling code we’ve used in production to help
analyze application performance. We also show you how to log MySQL’s queries,
analyze the logs, and use MySQL’s status counters and other tools to see what
MySQL and your queries are doing.
• Measure how your application currently performs. If you don’t know how fast it
currently runs, you can’t be sure any changes you make are helpful. You can also
use historical benchmark results to diagnose problems you didn’t foresee.
• Validate your system’s scalability. You can use a benchmark to simulate a much
higher load than your production systems handle, such as a thousand-fold
increase in the number of users.
• Plan for growth. Benchmarks help you estimate how much hardware, network
capacity, and other resources you’ll need for your projected future load. This can
help reduce risk during upgrades or major application changes.
• Test your application’s ability to tolerate a changing environment. For example,
you can find out how your application performs during a sporadic peak in con-
currency or with a different configuration of servers, or you can see how it han-
dles a different data distribution.
• Test different hardware, software, and operating system configurations. Is RAID
5 or RAID 10 better for your system? How does random write performance
change when you switch from ATA disks to SAN storage? Does the 2.4 Linux
kernel scale better than the 2.6 series? Does a MySQL upgrade help perfor-
mance? What about using a different storage engine for your data? You can
answer these questions with special benchmarks.
Benchmarking Strategies
full-stack and
single-component benchmarking
• You’re testing the entire application, including the web server, the application
code, and the database. This is useful because you don’t care about MySQL’s
performance in particular; you care about the whole application.
• MySQL is not always the application bottleneck, and a full-stack benchmark can
reveal this.
• Only by testing the full application can you see how each part’s cache behaves.
• Benchmarks are good only to the extent that they reflect your actual applica-
tion’s behavior, which is hard to do when you’re testing only part of it.
Sometimes, however, you don’t really want to know about the entire application.
You may just need a MySQL benchmark, at least initially. Such a benchmark is use-
ful if:
• You want to compare different schemas or queries.
• You want to benchmark a specific problem you see in the application.
• You want to avoid a long benchmark in favor of a shorter one that gives you a
faster “cycle time” for making and measuring changes.
It’s also useful to benchmark MySQL when you can repeat your application’s que-
ries against a real dataset. The data itself and the dataset’s size both need to be realis-
tic. If possible, use a snapshot of actual production data.
You need to identify your goals before you start benchmarking—indeed, before you
even design your benchmarks. Frame your goals as a questions, such
as “Is this CPU better than that one?” or “Do the new indexes work better than the
current ones?”
TPC-C (see http://www.tpc.org) are widely quoted
Common mistakes
- Using a subset of the real data size, such as using only one gigabyte of data when
the application will need to handle hundreds of gigabytes, or using the current
dataset when you plan for the application to grow much larger. -
Using incorrectly distributed data, such as uniformly distributed data when the
real system’s data will have “hot spots.” (Randomly generated data is often unrealistically distributed.) - Using unrealistically distributed parameters, such as pretending that all user profiles are equally likely to be viewed
- Using a single-user scenario for a multiuser application.
Benchmarking a distributed application on a single server. - Failing to match real user behavior, such as “think time” on a web page. Real
users request a page and then read it; they don’t click on links one after another
without pausing. - Running identical queries in a loop. Real queries aren’t identical, so they cause
cache misses. Identical queries will be fully or partially cached at some level. - Failing to check for errors. If a benchmark’s results don’t make sense—e.g., if a
slow operation suddenly completes very quickly—check for errors. You might
just be benchmarking how quickly MySQL can detect a syntax error in the SQL
query! Always check error logs after benchmarks, as a matter of principle. - Ignoring how the system performs when it’s not warmed up, such as right after a
restart. Sometimes you need to know how long it’ll take your server to reach
capacity after a restart, so you’ll want to look specifically at the warm-up period.
Conversely, if you intend to study normal performance, you’ll need to be aware
that if you benchmark just after a restart many caches will be cold, and the
benchmark results won’t reflect the results you’ll get under load when the caches
are warmed up. - Using default server settings.
The first step in planning a benchmark is to identify the problem and the goal. Next,
decide whether to use a standard benchmark or design your own.
Your documentation method might be as simple as a spreadsheet or
notebook, or as complex as a custom-designed database (keep in mind that you’ll
probably want to write some scripts to help analyze the results, so the easier it is to
process the results without opening spreadsheets and text files, the better).
You may find it useful to make a benchmark directory with subdirectories for each
run’s results. (or copy -v mode of unit tests in files) You can then place the results, configuration files, and notes for each
run in the appropriate subdirectory. If your benchmark lets you measure more than
you think you’re interested in, record the extra data anyway. It’s much better to have
unneeded data than to miss important data, and you might find the extra data useful
in the future. Try to record as much additional information as you can during the
benchmarks, such as CPU usage, disk I/O, and network traffic statistics; counters
from SHOW GLOBAL STATUS; and so on.
The best way to get accurate results is to design your benchmark to answer the ques-
tion you want to answer.
Try to change as few parameters as possible each time you run a benchmark. This is
called “isolating the variable” in science. If you must change several things at once,
you risk missing something.
We see a lot of benchmarks that try to predict performance after a migration, such as
migrating from Oracle to MySQL. These are often troublesome, because MySQL
performs well on completely different types of queries than Oracle. If you want to
know how well an application built on Oracle will run after migrating it to MySQL,
you usually need to redesign the schema and queries for MySQL.
You can’t get meaningful results from the default MySQL configuration settings
either, because they’re tuned for tiny applications that consume very little memory.
It’s usually a good idea to automate the benchmark runs. Doing so will improve your
results and their accuracy, because it will prevent you from forgetting steps or acci-
dentally doing things differently on different runs. It will also help you document
how to run the benchmark
When you have it set up correctly, benchmarking can be a one-step
process. If you’re just running a one-off benchmark to check some-
thing quickly, you might not want to automate it.
Full-Stack Tools
http://httpd.apache.org/docs/2.0/programs/ab.html It’s a very simple tool, but its usefulness is also limited because it just hammers one URL as fast as it can.
http://www.acme.com/software/http_load/
You can create an input file with many different URLs, and
http_load will choose from among them at random. You can also instruct it to
issue requests at a timed rate, instead of just running them as fast as it can.
http://jakarta.apache.org/jmeter/
JMeter is much more complex than ab and http_load. For example, it has features that let you simulate real users more flexibly, by controlling such parame-
ters as ramp-up time. It has a graphical user interface with built-in result
graphing, and it offers the ability to record and replay results offline.
http://wwwsearch.sourceforge.net/mechanize/ mechanize witch python Doctest is good for automation of unit tests and benchmarks (is used in test driven development at eecho)
Single-Component Tools
Here are some useful tools to test the performance of MySQL and the system on
which it runs. We show example benchmarks with some of these tools in the next
section:
mysqlslap
mysqlslap (http://dev.mysql.com/doc/refman/5.1/en/mysqlslap.html) simulates
load on the server and reports timing information. It is part of the MySQL 5.1
server distribution, but it should be possible to run it against MySQL 4.1 and
newer servers. You can specify how many concurrent connections it should use,
and you can give it either a SQL statement on the command line or a file con-
taining SQL statements to run. If you don’t give it statements, it can also auto-
generate SELECT statements by examining the server’s schema.
sysbench
sysbench (http://sysbench.sourceforge.net) is a multithreaded system benchmark-
ing tool. Its goal is to get a sense of system performance, in terms of the factors
important for running a database server. For example, you can measure the per-
formance of file I/O, the OS scheduler, memory allocation and transfer speed,
POSIX threads, and the database server itself. sysbench supports scripting in the
Lua language (http://www.lua.org), which makes it very flexible for testing a vari-
ety of scenarios.
Database Test Suite
The Database Test Suite, designed by The Open-Source Development Labs
(OSDL) and hosted on SourceForge at http://sourceforge.net/projects/osdldbt/, is a
test kit for running benchmarks similar to some industry-standard benchmarks,
such as those published by the Transaction Processing Performance Council
(TPC). In particular, the dbt2 test tool is a free (but uncertified) implementation
of the TPC-C OLTP test. It supports InnoDB and Falcon; at the time of this writ-
ing, the status of other transactional MySQL storage engines is unknown.
MySQL Benchmark Suite (sql-bench)
MySQL distributes its own benchmark suite with the MySQL server, and you
can use it to benchmark several different database servers. It is single-threaded
and measures how quickly the server executes queries. The results show which
types of operations the server performs well.
The main benefit of this benchmark suite is that it contains a lot of predefined
tests that are easy to use, so it makes it easy to compare different storage engines
or configurations. It’s useful as a high-level benchmark, to compare the overall
performance of two servers. You can also run a subset of its tests (for example,
just testing UPDATE performance). The tests are mostly CPU-bound, but there are
short periods that demand a lot of disk I/O.
The biggest disadvantages of this tool are that it’s single-user, it uses a very small
dataset, you can’t test your site-specific data, and its results may vary between
runs. Because it’s single-threaded and completely serial, it will not help you
assess the benefits of multiple CPUs, but it can help you compare single-CPU
servers.
Perl and DBD drivers are required for the database server you wish to bench-
mark. Documentation is available at http://dev.mysql.com/doc/en/mysql-
benchmarks.html/.
Before getting started, read the included README file, which explains how to use
the suite and documents the command-line arguments. To run all the tests, use com-
mands like the following:
$ cd /usr/share/mysql/sql-bench/
sql-bench$ ./run-all-tests –server=mysql –user=root –log –fast
Test finished. You can find the result in:
output/RUN-mysql_fast-Linux_2.4.18_686_smp_i686
Super Smack
Super Smack (http://vegan.net/tony/supersmack/) is a benchmarking, stress-
testing, and load-generating tool for MySQL and PostgreSQL. It is a complex,
powerful tool that lets you simulate multiple users, load test data into the data-
base, and populate tables with randomly generated data. Benchmarks are con-
tained in “smack” files, which use a simple language to define clients, tables,
queries, and so on.
The following configuration sample will enable the log, capture all queries that take
more than two seconds to execute, and log queries that don’t use any indexes. It will
also log slow administrative statements, such as OPTIMIZE TABLE:
log-slow-queries = <file_name>
long_query_time = 2
og analysis tools
Now that you’ve logged some queries, it’s time to analyze the results. The general
strategy is to find the queries that impact the server most, check their execution
plans with EXPLAIN, and tune as necessary. Repeat the analysis after tuning, because
your changes might affect other queries. It’s common for indexes to help SELECT que-
ries but slow down INSERT and UPDATE queries, for example.
You should generally look for the following three things in the logs:
Long queries
Routine batch jobs will generate long queries, but your normal queries shouldn’t
take very long.
High-impact queries
Find the queries that constitute most of the server’s execution time. Recall that
short queries that are executed often may take up a lot of time.
New queries
Find queries that weren’t in the top 100 yesterday but are today. These might be
new queries, or they might be queries that used to run quickly and are suffering
because of different indexing or another change.
If your slow query log is fairly small this is easy to do manually, but if you’re logging
all queries (as we suggested), you really need tools to help you. Here are some of the
more common tools for this purpose:
mysqldumpslow
MySQL provides mysqldumpslow with the MySQL server. It’s a Perl script that
can summarize the slow query log and show you how many times each query
appears in the log. That way, you won’t waste time trying to optimize a 30-
second slow query that runs once a day when there are many other shorter slow
queries that run thousands of time per day.
The advantage of mysqldumpslow is that it’s already installed; the disadvantage
is that it’s a little less flexible than some of the other tools. It is also poorly docu-
mented, and it doesn’t understand logs from servers that are patched with the
microsecond slow-log patch.
mysql_slow_log_filter
This tool, available from http://www.mysqlperformanceblog.com/files/utils/mysql_
slow_log_filter, does understand the microsecond log format. You can use it to
extract queries that are longer than a given threshold or that examine more than
a given number of rows. It’s great for “tailing” your log file if you’re running the
microsecond patch, which can make your log grow too quickly to follow with-
out filtering. You can run it with high thresholds for a while, optimize until the
worst offenders are gone, then change the parameters to catch more queries and
continue tuning.
Here’s a command that will show queries that either run longer than half a sec-
ond or examine more than 1,000 rows:
$ tail -f mysql-slow.log | mysql_slow_log_filter -T 0.5 -R 1000
mysql_slow_log_parser
This is another tool, available from http://www.mysqlperformanceblog.com/files/
utils/mysql_slow_log_parser, that can aggregate the microsecond slow log. In
addition to aggregating and reporting, it shows minimum and maximum values
for execution time and number of rows analyzed, prints the “canonicalized”
query, and prints a real sample you can EXPLAIN. Here’s a sample of its output:
### 3579 Queries
### Total time: 3.348823, Average time: 0.000935686784017883
### Taking 0.000269 to 0.130820 seconds to complete
### Rows analyzed 1 – 1
SELECT id FROM forum WHERE id=XXX;
SELECT id FROM forum WHERE id=12345;
mysqlsla
The MySQL Statement Log Analyzer, available from http://hackmysql.com/
mysqlsla, can analyze not only the slow log but also the general log and “raw”
logs containing delimited SQL statements. Like mysql_slow_log_parser, it can
mysql> SHOW SESSION STATUS LIKE 'Created%';
+————————-+——-+
| Variable_name | Value |
+————————-+——-+
| Created_tmp_disk_tables | 0 |
| Created_tmp_files | 0 |
| Created_tmp_tables | 5 |
+————————-+——-+
It’s nice to see that the query didn’t need to use the disk for the temporary tables,
because that’s very slow. But this is a little puzzling; surely MySQL didn’t create five
temporary tables just for this one query?
SHOW PROFILE
SHOW PROFILE is a patch Jeremy Cole contributed to the Community version of
MySQL, as of MySQL 5.0.37.* Profiling is disabled by default but can be enabled at
the session level. Enabling it makes the MySQL server collect information about the
resources the server uses to execute a query. To start collecting statistics, set the
profiling variable to 1:
mysql> SET profiling = 1;
Now let’s run a query:
mysql> SELECT COUNT(DISTINCT actor.first_name) AS cnt_name, COUNT(*) AS cnt
-> FROM sakila.film_actor
-> INNER JOIN sakila.actor USING(actor_id)
-> GROUP BY sakila.film_actor.film_id
-> ORDER BY cnt_name DESC;
…
997 rows in set (0.03 sec)
This query’s profiling data was stored in the session. To see queries that have been
profiled, use SHOW PROFILES:
mysql> SHOW PROFILES\G
*************************** 1. row ***************************
Query_ID: 1
Duration: 0.02596900
Query: SELECT COUNT(DISTINCT actor.first_name) AS cnt_name,…
You can retrieve the stored profiling data with the SHOW PROFILE statement. When you
run it without an argument, it shows status values and durations for the most recent
statement:
mysql> SHOW PROFILE;
+————————+———–+
| Status | Duration |
+————————+———–+
| (initialization) | 0.000005 |