Clone a table in MySQL without killing your server

So, I recently ran into one of those situations where a customer complains that his MySQL database is slow, and “it worked great until about two weeks ago”. The first thing I look at in these situations is not the queries or any code, but the indexes. Once I determined that the indexes were almost nonsensical, I had a look at the queries themselves, which proved that the indexes were, in fact, nonsensical. Running the queries as written in the code, from a mysql shell, with EXPLAIN, I was able to further confirm that the indexes (most of them, anyway) were never used.

Easy right? Just reindex the table!

NOTE: I’m going to skip the talk about all of the database design issues on this gig and just go straight to solving the immediate problem at hand. Just know that I had nothing to do with the design.

But, supposing this table has 15 million rows and is running on a machine with 2GB of RAM and only a single (well, mirrored) drive? It will take forever to reindex that table, and the machine can’t be made completely unavailable at any time, either by driving the load up so high that the Linux OOM killer reaps the mysql process, or by putting a lock on the table for, oh, 3 days or so 🙂

The solution is to create a brand new table, which I’ll call “good” using the “SHOW CREATE TABLE” output from the bad table, which I’ll call “bad”. I do this right in the shell, actually. I run “SHOW CREATE TABLE bad”, cut and paste the output, remove the part that defines the indexes, rename the table to “good”, and bam! New, empty table.

Of course, you still have to define your new indexes, so run whatever statements are needed to do that. You might even want to populate it with some test data (say, 10000 rows from the bad table) to test your new indexes are being used as expected (or that they can be by altering the query and getting back the same results… but faster).

Once done, it’s time to start copying rows from bad to good. I’ve written a shell script to help with that part. It’s designed to run on a Linux host running MySQL.

The variables at the top of the script are pretty self-explanatory, except to note that there are separate NEWDB and OLDDB variables in case your new table also lives in a new database. The INCREMENT is the number of rows you want to copy over at a time. If you set INCREMENT to 1000, it’ll copy 1000 rows, check the load average, and if it’s under MAXLOAD, it’ll copy over another 1000 rows. It also keeps track of the number of rows in each database as it goes, since writes are still happening on the bad table while this is going on.

So here’s my nibbler script, in shell. I would’ve written it in Python, but it wasn’t my machine, and I couldn’t install the python mysql module :-/

#!/bin/bash

###
### Written by Brian K. Jones (bkjones@gmail.com)
###
### Takes an increment, old db, and new db, and copies rows from olddb to newdb.
### Along the way, it’ll check system load and sleep if it’s too high.
### There’s too much hard-coding right now, but it’s a useful template, and
### has been tested. The script takes no CLI arguments.
###

INCR=10000
NEWDB=shiny
OLDDB=busted
OLDTABLE=bad
NEWTABLE=good
MAXLOAD=3
DBUSER=mydbuser
DBPASS=mydbpass

rows_old=`mysql -N -D ${OLDDB} -u ${DBUSER} -p${DBPASS} -e “select count(*) from ${OLDTABLE}”`
echo “rows_old is now ${rows_old}”

rows_new=`mysql -N -D ${NEWDB} -u ${DBUSER} -p${DBPASS} -e “select count(*) from ${NEWTABLE}”` ## num. rows in new table
echo “rows_new is now ${rows_new}”

for (( rows_new=$rows_new; rows_new < $rows_old; rows_new+=$INCR )); do if [ $((rows_old - (rows_new + INCR))) -gt 0 ]; then ## Check to see if there are at least $INCR rows left to copy over. mysql -N -D ${NEWDB} -u ${DBUSER} -p${DBPASS} -e "INSERT INTO ${NEWTABLE} SELECT * FROM ${OLDDB}.${OLDTABLE} LIMIT ${rows_new},${INCR}" while [ "`awk -v max=${MAXLOAD} '$1 > max {print “TRUE”}’ /proc/loadavg`” = “TRUE” ]; do
echo “sleeping due to load > ${MAXLOAD}”
sleep 30
done
# we update rows_old because it’ll be growing while this script runs.
rows_old=`mysql -N -D ${OLDDB} -u ${DBUSER} -p${DBPASS} -e “select count(*) from ${OLDTABLE}” `
rows_new=`mysql -N -D ${NEWDB} -u ${DBUSER} -p${DBPASS} -e”select count(*) from ${NEWTABLE}”`
time=`date +%R`
echo “${time} — rows_new = ${rows_new}, rows_old = ${rows_old}”

else ## There are < $INCR rows left. Select remaining rows. remaining=$((rows_old - rows_new)) mysql -N -D ${NEWDB} -u ${DBUSER} -p${DBPASS} -e "INSERT INTO ${NEWTABLE} SELECT * FROM ${OLDDB}.${OLDTABLE} LIMIT ${rows_new},${remaining}" echo "All done!" exit fi done [/sourcecode]

Share this: