A sample database with an integrated test suite, used to test your applications and database servers
This repository was migrated from Launchpad.
See usage in the MySQL docs
The database requires MySQL 5.0+ or PostgreSQL 12+. The following versions are tested in CI using ProxySQL/dbdeployer on a weekly schedule:
| Vendor | Versions |
|---|---|
| MySQL | 5.6, 5.7, 8.0, 8.4, 9.0, 9.2, 9.5, 9.6 |
| Percona Server | 8.0, 8.4 |
| MariaDB | 10.11, 11.4, 12.1 |
| PostgreSQL | 16, 17 |
Starting with MySQL 9.5, the SOURCE command requires the --commands flag on the client:
mysql --commands < employees.sql
Starting with MySQL 9.6, the MD5() and SHA() functions have been removed from the server.
The integrity test files test_employees_md5.sql and test_employees_sha.sql will not work on 9.6+.
Use test_employees_sha2.sql instead, which uses SHA2(..., 256) and is compatible with all versions:
mysql -t < test_employees_sha2.sql
The SHA-256 checksums are identical across all supported MySQL, Percona, MariaDB, and PostgreSQL versions.
The database requires MySQL 5.0+ or compatible server. The following versions are tested in CI using ProxySQL/dbdeployer on a weekly schedule:
| Vendor | Versions |
|---|---|
| MySQL | 5.6, 5.7, 8.0, 8.4, 9.0, 9.2, 9.5, 9.6 |
| Percona Server | 8.0, 8.4 |
| MariaDB | 10.11, 11.4, 12.1 |
Starting with MySQL 9.5, the SOURCE command requires the --commands flag on the client:
mysql --commands < employees.sql
Starting with MySQL 9.6, the MD5() and SHA() functions have been removed from the server.
The integrity test files test_employees_md5.sql and test_employees_sha.sql will not work on 9.6+.
Use test_employees_sha2.sql instead, which uses SHA2(..., 256) and is compatible with all versions:
mysql -t < test_employees_sha2.sql
The SHA-256 checksums are identical across all supported MySQL, Percona, and MariaDB versions.
The original data was created by Fusheng Wang and Carlo Zaniolo at Siemens Corporate Research. The data is in XML format. http://timecenter.cs.aau.dk/software.htm
Giuseppe Maxia made the relational schema and Patrick Crews exported the data in relational format.
The database contains about 300,000 employee records with 2.8 million salary entries. The export data is 167 MB, which is not huge, but heavy enough to be non-trivial for testing.
The data was generated, and as such there are inconsistencies and subtle problems. Rather than removing them, we decided to leave the contents untouched, and use these issues as data cleaning exercises.
You need a MySQL database server (5.0+) and run the commands below through a user that has the following privileges:
SELECT, INSERT, UPDATE, DELETE,
CREATE, DROP, RELOAD, REFERENCES,
INDEX, ALTER, SHOW DATABASES,
CREATE TEMPORARY TABLES,
LOCK TABLES, EXECUTE, CREATE VIEW
- Download the repository
- Change directory to the repository
Important
If you are using MySQL client >= 9.x, the new default value of the --commands parameter is set to FALSE. Therefore, the SOURCE command won't work. You need to happen --commands to each mysql command line call.
Then run
mysql < employees.sql
If you want to install with two large partitioned tables, run
mysql < employees_partitioned.sql
After installing, you can run one of the following integrity tests:
mysql -t < test_employees_sha2.sql # SHA-256 (works on all versions including 9.6+)
mysql -t < test_employees_md5.sql # MD5 (MySQL 8.0–9.5 only)
mysql -t < test_employees_sha.sql # SHA-1 (MySQL 8.0–9.5 only)
For example:
mysql -t < test_employees_md5.sql
+----------------------+
| INFO |
+----------------------+
| TESTING INSTALLATION |
+----------------------+
+--------------+------------------+----------------------------------+
| table_name | expected_records | expected_crc |
+--------------+------------------+----------------------------------+
| employees | 300024 | 4ec56ab5ba37218d187cf6ab09ce1aa1 |
| departments | 9 | d1af5e170d2d1591d776d5638d71fc5f |
| dept_manager | 24 | 8720e2f0853ac9096b689c14664f847e |
| dept_emp | 331603 | ccf6fe516f990bdaa49713fc478701b7 |
| titles | 443308 | bfa016c472df68e70a03facafa1bc0a8 |
| salaries | 2844047 | fd220654e95aea1b169624ffe3fca934 |
+--------------+------------------+----------------------------------+
+--------------+------------------+----------------------------------+
| table_name | found_records | found_crc |
+--------------+------------------+----------------------------------+
| employees | 300024 | 4ec56ab5ba37218d187cf6ab09ce1aa1 |
| departments | 9 | d1af5e170d2d1591d776d5638d71fc5f |
| dept_manager | 24 | 8720e2f0853ac9096b689c14664f847e |
| dept_emp | 331603 | ccf6fe516f990bdaa49713fc478701b7 |
| titles | 443308 | bfa016c472df68e70a03facafa1bc0a8 |
| salaries | 2844047 | fd220654e95aea1b169624ffe3fca934 |
+--------------+------------------+----------------------------------+
+--------------+---------------+-----------+
| table_name | records_match | crc_match |
+--------------+---------------+-----------+
| employees | OK | ok |
| departments | OK | ok |
| dept_manager | OK | ok |
| dept_emp | OK | ok |
| titles | OK | ok |
| salaries | OK | ok |
+--------------+---------------+-----------+
The database is also available for PostgreSQL 12+. The schema and data are identical
to the MySQL version. All files are in the postgresql/ directory.
- ENUM type: MySQL
ENUM('M','F')is replaced withCHAR(1) CHECK (gender IN ('M','F')) - Stored procedures: MySQL's
delimiter //syntax is replaced with PostgreSQL dollar-quoting ($...$ LANGUAGE plpgsql) show_departments(): Implemented as a function returning TABLE (useSELECT * FROM show_departments();instead ofCALL show_departments();)- User variables: MySQL's
@var := valuepattern is replaced with PL/pgSQL local variables - Integrity tests: Use the same incremental hashing approach but via PL/pgSQL helper functions instead of MySQL user variables
The SHA-256 checksums are identical between MySQL and PostgreSQL. This is verified in CI:
the same expected values in test_employees_sha2.sql and postgresql/test_employees_sha2.sql
produce matching results on both databases. The MySQL version uses SHA2(..., 256) while
PostgreSQL uses encode(digest(..., 'sha256'), 'hex') from the pgcrypto extension.
-
Download the repository
-
Install PostgreSQL (12+)
-
Run the loading script:
cd postgresql bash load_employees_db.shTo use a custom psql command (e.g., from a dbdeployer sandbox):
PSQL=/path/to/psql bash load_employees_db.sh
psql -d employees < postgresql/test_employees_sha2.sql # SHA-256 (recommended)
psql -d employees < postgresql/test_employees_md5.sql # MD5
psql -d employees < postgresql/test_employees_sha.sql # SHA-1 (requires pgcrypto)
psql -d employees < postgresql/objects.sql
Available functions: emp_name(), emp_dept_name(), emp_dept_id(), current_manager(),
show_departments() (use SELECT * FROM show_departments();), employees_help().
To the best of my knowledge, this data is fabricated and it does not correspond to real people. Any similarity to existing people is purely coincidental.
This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.