If your application is performing lot of deletes and updates on MySQL database, then there is a high possibility that your MySQL data files are fragmented.
This will result in lot of unused space, and also might affect performance.
So, it is highly recommended that you defrag your MySQL tables on an ongoing basis.
This tutorial explains how to optimize MySQL to defrag tables and reclaim unused space.
1. Identify Tables for Optimization
The first step is to identify whether you have fragmentation on your MySQL database.
Connect to your MySQL database, and execute the following query, which will display how much unused space are available in every table.
mysql> use thegeekstuff; mysql> select table_name, round(data_length/1024/1024) as data_length_mb, round(data_free/1024/1024) as data_free_mb from information_schema.tables where round(data_free/1024/1024) > 500 order by data_free_mb; +------------+----------------+--------------+ | table_name | data_length_mb | data_free_mb | +------------+----------------+--------------+ | BENEFITS | 7743 | 4775 | | DEPARTMENT | 14295 | 13315 | | EMPLOYEE | 21633 | 19834 | +------------+----------------+--------------+
In the above output:
- This will display list of all tables that has minimum of 500MB of unused space. As we see above, in this example, there are 3 tables that has more than 500MB of unused space.
- data_length_mb column displays the total table size in MB. For example, EMPLOYEE table size is around 21GB.
- data_free_mb column displays the total unused space in that particular table. For example, EMPLOYEE table has around 19GB of unused space in it.
- All these three tables (EMPLOYEE, DEPARTMENT AND BENEFITS) are heavily fragmented and it needs to be optimized to reclaim the unused space.
From the filesystem level, you can see the size of the individual table files as shown below.
The file size will be the same as what you see under “data_length_mb” column in the above output.
# ls -lh /var/lib/mysql/thegeekstuff/ .. -rw-rw----. 1 mysql mysql 7.6G Apr 23 10:55 BENEFITS.MYD -rw-rw----. 1 mysql mysql 14G Apr 23 12:53 DEPARTMENT.MYD -rw-rw----. 1 mysql mysql 22G Apr 23 12:03 EMPLOYEE.MYD ..
In this example, the EMPLOYEE.MYD file is taking up around 22GB at the filesystem level, but it has lot of unused space in it. If we optimize this table, the size of this file should go down dramatically.
2. Defrag using OPTIMIZE TABLE command
There are two ways to optimize a table.
The first method is to use Optimize table command as shown below.
The following example will optimize EMPLOYEE table.
mysql> use thegeekstuff; mysql> OPTIMIZE TABLE EMPLOYEE;
You can also optimize multiple tables in a single command as shown below.
mysql> OPTIMIZE TABLE EMPLOYEE, DEPARTMENT, BENEFITS
Few points to keep in mind about optimize table:
- Optimize table can be performed for InnoDB engine, or MyISAM engine, or ARCHIVE tables.
- For MyISAM tables, it will analyze the table, it will defragment the corresponding MySQL datafile, and reclaim the unused space.
- For InnoDB tables, optimize table will simply perform an alter table to reclaim the space.
- If you have indexes, it will also resort the index pages, and update the statistics.
During optimization, MySQL will create a temporary table for the table, and after the optimization it will delete the original table, and rename this temporary table to the original table.
In the above optimization, the EMPLOYEE table is an MyISAM table.
For this example, before the optimization, you’ll see the following .MYD file for the table.
# ls -lh /var/lib/mysql/thegeekstuff/EMPLOYEE.* -rw-rw----. 1 mysql mysql 22G Apr 23 12:03 EMPLOYEE.MYD
When the “OPTIMIZE TABLE” command is running, you can see that it has created a temporary file for this table with extension .TMD. The size of this temporary file will keep growing until the optimize table is running.
# ls -lh /var/lib/mysql/thegeekstuff/EMPLOYEE.* -rw-rw----. 1 mysql mysql 22G Apr 23 12:03 EMPLOYEE.MYD -rw-rw----. 1 mysql mysql 500M Apr 23 14:10 EMPLOYEE.TMD
After the optimize table command finishes, you’ll not see the temporary table. Instead, you’ll see the original EMPLOYEE.MYD file that is optimized and with reduced file size.
# ls -lh /var/lib/mysql/thegeekstuff/EMPLOYEE.* -rw-rw----. 1 mysql mysql 2G Apr 23 14:20 EMPLOYEE.MYD
3. Defrag using mysqlcheck command
The second method to optimize a table is using mysqlcheck command as shown below.
The following example will optimize the DEPARTMENT table. You’ll execute this command from the Linux prompt (and not on MySQL prompt).
# mysqlcheck -o thegeekstuff DEPARTMENT -u root -pMySQLSecretPwd99 thegeekstuff.DEPARTMENT OK
Note: Internally mysqlcheck command uses “OPTIMIZE TABLE” command.
In the above example:
- mysqlcheck is the command that is executed from the Linux prompt.
- -o option is to indicate that mysqlcheck should perform “optimize table” operation.
- thegeekstuff is the database
- DEPARTMENT is the table inside thegeekstuff database that should be optimized
- -u root indicates that mysqlcheck command should use “root” as the mysql user to connect
- -p indicates the password for the root account of mysql. Please note that there is no space between -p option and the password.
Apart from optimization, you can also use mysqlcheck command to check, analyze and repair tables in your mysql database.
4. Defrag All Tables or All Databases
If you want to optimize all the tables in a particular MySQL database, use the following command.
The following command will optimize all the tables located in thegeekstuff database.
mysqlcheck -o thegeekstuff -u root -pMySQLSecretPwd99
If you have multiple database running on your system, you can optimize all the tables located under all the database on your system using the following command.
The following will optimize ALL database on your system.
mysqlcheck -o --all-databases -u root -pMySQLSecretPwd99
5. After Optimization
After the optimization, using the following query, check the total-size and unused-space-size for the three tables that we optimized in this example.
mysql> use thegeekstuff; mysql> select table_name, round(data_length/1024/1024) as data_length_mb, round(data_free/1024/1024) as data_free_mb from information_schema.tables where table_name in ( 'EMPLOYEE', 'DEPARTMENT', 'BENEFITS' ); +------------+----------------+--------------+ | table_name | data_length_mb | data_free_mb | +------------+----------------+--------------+ | BENEFITS | 2968 | 0 | | DEPARTMENT | 980 | 0 | | EMPLOYEE | 1799 | 0 | +------------+----------------+--------------+
As we see from the above output, the data_length_mb is drastically reduced for these tables after the optimization. Also, the data_free_mb is now at 0, as there are no fragmentation anymore.
The file size for these tables are now lot less when compared to the original size. We have now reclaimed lot of unused space at the filsystem level for these tables.
# ls -lh /var/lib/mysql/thegeekstuff/ .. -rw-rw----. 1 mysql mysql 3G Apr 23 14:23 BENEFITS.MYD -rw-rw----. 1 mysql mysql 980M Apr 23 14:30 DEPARTMENT.MYD -rw-rw----. 1 mysql mysql 2G Apr 23 14:45 EMPLOYEE.MYD ..
In this example, we’ve reclaimed around 37GB of unused space after optimizing these three tables.
Comments on this entry are closed.
It’s a great idea to defragment tables on a routine basis though it’s important to remember that doing so will take the table off-line. This isn’t a big deal if you’re not relying on the system in a production environment. However, in production, a zero-downtime deployment method should be used to avoid the outage. When doing this type of optimization in a replication ring, I will typically turn off binary logging of these statements before doing the alter or optimize table.
Hi,
when i try to optimize my sql dB, i have this note: Table does not support optimize, doing recreate +analyse instead . On which version of mysql your command is use, my version is 5.1.73.
yhanks
Can we run optimize table while running database in production environment..?
Agreed that it will be a big deal if doing this on production machine and ppl trying to reduce downtime. It would be good to “time” the defrag command and record the duration taken for defragmenting X size of database. It gives a rough ideal of duration required and can be added into planned production maintenance downtime such as patching etc.
Defragmentation will not only recover space, it will also help the queries run faster. Using partitions is a better way if you want to avoid frequent downtime for optimization.