All Projects → felixge → Sqlbench

felixge / Sqlbench

Licence: mit
sqlbench measures and compares the execution time of one or more SQL queries.

Programming Languages

go
31211 projects - #10 most used programming language

Projects that are alternatives of or similar to Sqlbench

Sqlfiddle3
New version based on vert.x and docker
Stars: ✭ 242 (-24.14%)
Mutual labels:  sql, postgresql
Xo
Command line tool to generate idiomatic Go code for SQL databases supporting PostgreSQL, MySQL, SQLite, Oracle, and Microsoft SQL Server
Stars: ✭ 2,974 (+832.29%)
Mutual labels:  sql, postgresql
Sql Lint
An SQL linter
Stars: ✭ 243 (-23.82%)
Mutual labels:  sql, postgresql
Massive Js
A data mapper for Node.js and PostgreSQL.
Stars: ✭ 2,521 (+690.28%)
Mutual labels:  sql, postgresql
Web Tooling Benchmark
JavaScript benchmark for common web developer workloads
Stars: ✭ 290 (-9.09%)
Mutual labels:  performance, benchmarking
Sequelize Auto Migrations
Migration generator && runner for sequelize
Stars: ✭ 233 (-26.96%)
Mutual labels:  sql, postgresql
Swifql
💎 A Swift DSL for type-safe, extensible, and transformable SQL queries.
Stars: ✭ 250 (-21.63%)
Mutual labels:  sql, postgresql
Scany
Library for scanning data from a database into Go structs and more
Stars: ✭ 228 (-28.53%)
Mutual labels:  sql, postgresql
Sqlc
Generate type-safe code from SQL
Stars: ✭ 4,564 (+1330.72%)
Mutual labels:  sql, postgresql
Shiba
Catch bad SQL queries before they cause problems in production
Stars: ✭ 277 (-13.17%)
Mutual labels:  sql, performance
Quickperf
QuickPerf is a testing library for Java to quickly evaluate and improve some performance-related properties
Stars: ✭ 231 (-27.59%)
Mutual labels:  sql, performance
Squeal
Squeal, a deep embedding of SQL in Haskell
Stars: ✭ 308 (-3.45%)
Mutual labels:  sql, postgresql
Sqliterally
Lightweight SQL query builder
Stars: ✭ 231 (-27.59%)
Mutual labels:  sql, postgresql
Ultimatepp
U++ is a C++ cross-platform rapid application development framework focused on programmer's productivity. It includes a set of libraries (GUI, SQL, Network etc.), and integrated development environment (TheIDE).
Stars: ✭ 237 (-25.71%)
Mutual labels:  sql, performance
Mobilitydb
MobilityDB is a geospatial trajectory data management & analysis platform, built on PostgreSQL and PostGIS.
Stars: ✭ 229 (-28.21%)
Mutual labels:  sql, postgresql
E Commerce Db
Database schema for e-commerce (webstores) sites.
Stars: ✭ 245 (-23.2%)
Mutual labels:  sql, postgresql
Db
Data access layer for PostgreSQL, CockroachDB, MySQL, SQLite and MongoDB with ORM-like features.
Stars: ✭ 2,832 (+787.77%)
Mutual labels:  sql, postgresql
Knex
A query builder for PostgreSQL, MySQL, CockroachDB, SQL Server, SQLite3 and Oracle, designed to be flexible, portable, and fun to use.
Stars: ✭ 15,083 (+4628.21%)
Mutual labels:  sql, postgresql
Jsqlparser
JSqlParser parses an SQL statement and translate it into a hierarchy of Java classes. The generated hierarchy can be navigated using the Visitor Pattern
Stars: ✭ 3,405 (+967.4%)
Mutual labels:  sql, postgresql
Loukoum
A simple SQL Query Builder
Stars: ✭ 305 (-4.39%)
Mutual labels:  sql, postgresql

sqlbench

sqlbench measures and compares the execution time of one or more SQL queries.

screen recording

The main use case is benchmarking simple CPU-bound query variants against each other during local development.

Only PostgreSQL is supported at this point, but pull requests for MySQL or other databases are welcome.

Install

You can download a binary from the release page.

If you have Go 1.13 or later installed, you can install or update sqlbench from source:

$ go get -u github.com/felixge/sqlbench

Windows Users: You may also install the chocolatey package maintained by picolino:

$ choco install sqlbench

Examples

Below are a few one-liners to get you started. They assume you're running sqlbench from the directory of a clone of this repo.

# Benchmark a few queries until ctrl+c is hit. Output results in realtime.
sqlbench examples/sum/*.sql

# Benchmark using client wallclock time (instead of explain) until ctrl+c.
sqlbench -m client examples/sum/*.sql

# Run for 3 seconds and only print results once at the end.
sqlbench -t 3 -s examples/sum/*.sql

# Run for 1000 iterations and only print verbose results once at the end
sqlbench -n 1000 -s -v examples/sum/*.sql

# Record the results for 1000 iterations into a csv file.
sqlbench -n 1000 -o baseline.csv examples/sum/*.sql

# Compare 1000 iterations to a baseline recording.
sqlbench -n 1000 -i baseline.csv examples/sum/*.sql

Usage

Usage of sqlbench:
  -c string
    	Connection URL or DSN for connecting to PostgreSQL as understood by pgx [1].
    	E.g.: postgres://user:[email protected]:5432/my_db?sslmode=disable
    	
    	Alternatively you can use standard PostgreSQL environment variables [2] such as
    	PGHOST, PGPORT, PGPASSWORD, ... .
    	
    	[1] https://pkg.go.dev/github.com/jackc/pgx/v4/stdlib?tab=doc
    	[2] https://www.postgresql.org/docs/current/libpq-envars.html
    	(default "postgres://")
  -i string
    	Input path for CSV file with baseline measurements.
  -m string
    	Method for measuring the query time. One of: "client", "explain" (default "explain")
  -n int
    	Terminate after the given number of iterations. (default -1)
  -o string
    	Output path for writing individual measurements in CSV format.
  -p	Include the query planning time. For -m explain this is accomplished by adding
    	the "Planning Time" to the measurement. For -m client this is done by not using
    	prepared statements.
  -s	Silent mode for non-interactive use, only prints stats once after terminating.
  -t float
    	Terminate after the given number of seconds. (default -1)
  -v	Verbose output. Print the content of all SQL queries, as well as the
    	PostgreSQL version.
  -version
    	Print version and exit.

How It Works

sqlbench takes a list of SQL files and keeps executing them sequentially, measuring their execution times. By default the execution time is measured by prefixing the query with EXPLAIN (ANALYZE, TIMING OFF) and capturing the total Execution Time for it.

The query columns are ordered by mean execution time in ascending order, and the relative difference compared to the fastest query is shown in parentheses. If you provide a baseline csv via -i, the relative differences are comparing the corresponding queries in the baseline rather than the current queries with each other.

If the -m client flag is given, the time is measured using the wallclock time of sqlbench which includes network overhead.

Planning time is excluded by default, but can be included using the -p flag.

The filenames init.sql and destroy.sql are special, and are executed once before and after the benchmark respectively. They can be used to setup or teardown tables, indexes, etc..

Tutorial

Let's say you want to compare three different queries for computing the running total of all numbers from 1 to 1000. Your first idea is to use a window function:

SELECT i, sum(i) OVER (ORDER BY i) AS sum
FROM generate_series(1, 1000) g(i);

Then you decide to get fancy and implement it as a recursive CTE:

WITH RECURSIVE sums AS (
	SELECT 1 AS i, 1 AS sum
	UNION
	SELECT i+1, sum+i FROM sums WHERE i <= 1000
)

SELECT * FROM sums;

And finally you become wise and remember that 9 year old Gauss could probably beat both approaches:

SELECT i, (i * (i + 1)) / 2 AS sum
FROM generate_series(1, 1000) g(i);

Now that you have your queries in window.sql, recursive.sql, gauss.sql, you want to summarize the performance differences for your colleagues. However, you know they're a pedantic bunch, and will ask you annoying questions such as:

  • How many times did you run each query?
  • Were you running other stuff on your laptop in the background?
  • How can I reproduce this on my local machine?
  • What version of PostgreSQL were you running on your local machine?
  • Are you sure you're not just measuring the overhead of EXPLAIN ANALYZE?

This could normally be quite annoying to deal with, but luckily there is sqlbench. The command below lets you run your three queries 1000 times with EXPLAIN ANALYZE and report the statistics, the PostgreSQL version and even the SQL of your queries:

$ sqlbench -v -s -n 1000 examples/sum/*.sql | tee explain-bench.txt
         | gauss |    window     |   recursive    
---------+-------+---------------+----------------
  n      |  1000 |          1000 |          1000  
  min    |  0.35 | 1.31 (3.79x)  | 1.80 (5.22x)   
  max    |  4.18 | 23.76 (5.68x) | 11.41 (2.73x)  
  mean   |  0.50 | 1.94 (3.85x)  | 2.67 (5.30x)   
  stddev |  0.16 | 0.81 (4.93x)  | 0.63 (3.87x)   
  median |  0.53 | 2.02 (3.80x)  | 2.91 (5.49x)   
  p90    |  0.67 | 2.53 (3.80x)  | 3.41 (5.12x)   
  p95    |  0.68 | 2.57 (3.81x)  | 3.50 (5.18x)   

Stopping after 1000 iterations as requested.

postgres version: PostgreSQL 11.6 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit
sqlbench -v -s -n 1000 examples/sum/gauss.sql examples/sum/recursive.sql examples/sum/window.sql

==> examples/sum/gauss.sql <==
SELECT i, (i * (i + 1)) / 2 AS sum
FROM generate_series(1, 1000) g(i);

==> examples/sum/window.sql <==
SELECT i, sum(i) OVER (ORDER BY i) AS sum
FROM generate_series(1, 1000) g(i);

==> examples/sum/recursive.sql <==
WITH RECURSIVE sums AS (
	SELECT 1 AS i, 1 AS sum
	UNION
	SELECT i+1, sum+i FROM sums WHERE i <= 1000
)

SELECT * FROM sums;

And finally, you can use the -m client flag to measure the query times without EXPLAIN ANALYZE to see if that had a significant overhead:

$ sqlbench -s -n 1000 -m client examples/sum/*.sql | tee client-bench.txt
         | gauss |    window    |  recursive    
---------+-------+--------------+---------------
  n      |  1000 |         1000 |         1000  
  min    |  0.66 | 1.44 (2.18x) | 2.03 (3.08x)  
  max    |  5.66 | 7.31 (1.29x) | 4.34 (0.77x)  
  mean   |  0.83 | 1.72 (2.08x) | 2.35 (2.83x)  
  stddev |  0.23 | 0.33 (1.41x) | 0.27 (1.18x)  
  median |  0.78 | 1.65 (2.11x) | 2.26 (2.89x)  
  p90    |  0.98 | 1.98 (2.03x) | 2.68 (2.75x)  
  p95    |  1.05 | 2.13 (2.03x) | 2.89 (2.76x)  

Stopping after 1000 iterations as requested.

Indeed, it appears that from the client's perspective the gauss query is a bit slower, while the others are a bit faster when measuring without EXPLAIN ANALYZE. Whether that's a rabbit hole worth exploring depends on you, but either way you now have a much better sense of the errors that might be contained in your measurements.

Todos

Below are a few ideas for todos that I might implement at some point or would welcome as pull requests.

  • [ ] Dynamically adjust unit between ms, s, etc.
  • [ ] Support specifying benchmarks using a single YAML file.
  • [ ] Support for other databases, e.g. MySQL.
  • [ ] Capture query plans for each query, ideally one close to the median execution time.
  • [ ] Provide an easy way to capture all inputs and outputs in a single tar.gz file or GitHub gist.
  • [ ] Plot query times as a histogram (made a proof of concept for this, but didn't like it enough yet to release)
  • [ ] Maybe add db name to verbose output, see request.
  • [x] Compare benchmark results between PG versions
  • [x] Oneliner examples for README
  • [x] Warmup phase (can be done via init.sql and pg_prewarm()
  • [x] Use TIMING OFF to reduce EXPLAIN overhead.
  • [x] A flag to include planning time in -m explain mode.
  • [x] A flag to use prepared queries in -m client mode.

License

sqlbench is licensed under the MIT license.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].