I had to take a backup of a MySQL 5.1 database that was serving a WordPress application with the HyperDB plugin. The application had almost 93,000 blogs and, as each blog consisted of 8 tables, there were about 786,000 tables in the database. All of the tables used the MyISAM storage engine, and the number of blogs was increasing day by day.
The structure of the databases was as follows:
SHOW DATABASES
imported_data
rdb
rdb0
rdb1
....
rdb93
Here,
imported_data - used to store data that was imported from the legacy site.
rdb - used to store all the common WordPress tables and blog 1.
rdbN - used to store the per-blog data. Blogs were partitioned across databases using the formula blog_id/1000 (integer division), so blog 0 was stored in rdb0 and blog 92134 was stored in rdb92 (see the sketch below).
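To make that mapping concrete, here is a minimal sketch (the blog_id value is just an example) of how the database name is derived with integer division, exactly as backup_script.sh does later:

blog_id=92134
DBNUM=$(( blog_id / 1000 ))   # integer division: 92134 / 1000 = 92
DB="rdb$DBNUM"
echo $DB                      # prints rdb92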
At first, I tried the plain mysqldump program and it took 22 hours to complete the whole database backup. I was in a fix, so I went with the following steps.
Modified mysqldump.c:
....
static char *get_actual_table_name(const char *old_table_name, MEM_ROOT *root)
{
....
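  /* Modified part: fetch the actual table name directly from
     information_schema.TABLES. */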
DBUG_PRINT("Rumman", ("SELECT TABLE_NAME FROM information_schema.TABLES WHERE TABLE_NAME = \'%s\'",old_table_name));
my_snprintf(query, sizeof(query), "SELECT TABLE_NAME FROM information_schema.TABLES WHERE TABLE_NAME = \'%s\'",old_table_name
);
...
}
...
Compiled MySQL with this change and installed it at /root/dbbackup/mysql_r.
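The build itself was nothing special; roughly like this (a sketch only - the exact source version and configure flags are assumptions, but MySQL 5.1 still used the autotools build):

cd mysql-5.1.xx                               # source tree containing the modified mysqldump.c
./configure --prefix=/root/dbbackup/mysql_r   # install prefix the backup scripts point to
make
make install                                  # modified mysqldump lands in /root/dbbackup/mysql_r/bin/mysqldump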
Backup Scripts
I used the following two scripts for the digitalsports backup:
1. init_backup.sh : This script initiates the backup for all the blogs (at present ~93,000). It creates backup slots and then executes backup_script.sh for each slot in parallel (the slot arithmetic is sketched just below this list). Some important variables in this script are -
SCRIPT - where backup_script.sh exists
LOGDIR - directory where all the log files are created
PARALLEL - how many backup processes we want to start simultaneously
2. backup_script.sh - this script calls the modified mysqldump. It takes two parameters, $START_BLOG_ID and $END_BLOG_ID. Within a for loop it takes a backup of each blog and creates a separate SQL file per blog, named blog_<blog_id>.sql.<date>. Its important variables are -
MYSQLDUMP_r - where the modified mysqldump binary is located
PRODUCTION_TIME - the time after which the backup processes are stopped
Both these scripts are added below.
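Before the full scripts, here is a condensed sketch of how init_backup.sh carves the blog ID range into slots (the values are illustrative; the real script reads MAX_BLOG_ID from rdb.wp_blogs):

MAX_BLOG_ID=93000
PARALLEL=4
MIN_BLOG_ID=50
let STEP=$MAX_BLOG_ID/$PARALLEL               # 23250 blogs per slot
while [ $MIN_BLOG_ID -le $MAX_BLOG_ID ]
do
    let LAST_BLOG_ID="$MIN_BLOG_ID+$STEP"
    if [ "$LAST_BLOG_ID" -gt "$MAX_BLOG_ID" ]; then
        LAST_BLOG_ID=$MAX_BLOG_ID
    fi
    echo "slot: blogs $MIN_BLOG_ID to $LAST_BLOG_ID"   # one backup_script.sh process per slot
    let MIN_BLOG_ID="$LAST_BLOG_ID+1"
done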
The backup process works as follows:
- Take the maximum blog id first
- Make slots for the backup scripts using $PARALLEL and $STEP
- Make $BACKUP_DIR and the other sub-directories; for each day there is a directory named rdb_<%m%d%y> with one sub-directory per database
- Call backup_script.sh and start mysqldump for each blog
- If stop.tmp is found in $BACKUP_DIR, the backup processes are stopped
- If the server time passes $PRODUCTION_TIME, the backup processes are stopped
- To resume the backup, execute init_backup.sh again; it removes stop.tmp and resumes from the progress_<NUMBER> files in $BACKUP_DIR (an example launch is shown below)
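In practice everything is driven by init_backup.sh alone. A typical (hypothetical) way to launch, stop and resume it could be:

# initial run, detached from the terminal
nohup /root/dbbackup/backup_scripts/init_backup.sh > /BACKUP1/log/init_backup.log 2>&1 &

# ask all backup workers to stop after the blog they are currently dumping
touch /BACKUP1/rdb_$(date +%m%d%y)/stop.tmp

# resume later; init_backup.sh removes stop.tmp and continues from the progress_<NUMBER> files
nohup /root/dbbackup/backup_scripts/init_backup.sh >> /BACKUP1/log/init_backup.log 2>&1 &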
-------------------------
init_backup.sh
-------------------------
#!/bin/bash
SCRIPT=/root/dbbackup/backup_scripts/backup_script.sh
LOGDIR=/BACKUP1/log
MIN_BLOG_ID=50
LAST_BLOG_ID=$MIN_BLOG_ID
PARALLEL=4
DBHOST=0.0.0.0
DB=rdb
DBUSER=ruser
DBUSERPWD=********
DBPORT=3306
DT=$(date +%m%d%y)
BACKUP_DIR=/BACKUP1/rdb_"$DT"
LOGDIR=$BACKUP_DIR/log_dir
if [ ! -d $BACKUP_DIR ]; then
    echo "Making backup directory at $BACKUP_DIR with sub-directories"
    mkdir -p $BACKUP_DIR
    mysql -h $DBHOST -P $DBPORT -u $DBUSER --password=$DBUSERPWD -e "show databases like 'rdb%'" -N -s > $BACKUP_DIR/dblist.lst
    for v_dbname in `cat $BACKUP_DIR/dblist.lst`
    do
        BACKUP_DIR=/BACKUP1/rdb_"$DT"/"$v_dbname"
        mkdir -p $BACKUP_DIR
    done #for v_dbname in `cat $BACKUP_DIR/dblist.lst`
    BACKUP_DIR=/BACKUP1/rdb_"$DT"/r_imported_data
    mkdir -p $BACKUP_DIR
    BACKUP_DIR=/BACKUP1/rdb_"$DT"
    echo "done"
    echo "Making log directory"
    LOGDIR=$BACKUP_DIR/log_dir
    mkdir -p $LOGDIR/error
    echo "done"
else
    rm -f $BACKUP_DIR/stop.tmp
    echo "Resuming Backup"
fi
if [ ! -f "$BACKUP_DIR"/MAX_BLOG ];
then
echo "MAX_BLOG file not found"
MAX_BLOG_ID=$(mysql -h $DBHOST -P $DBPORT -u $DBUSER --password=$DBUSERPWD -e "select max(blog_id) from rdb.wp_blogs" -N -s )
echo "$MAX_BLOG_ID" > $BACKUP_DIR/MAX_BLOG
else
echo "MAX_BLOG file found"
MAX_BLOG_ID=$(cat "$BACKUP_DIR"/MAX_BLOG)
fi # if [ -f $BACKUP_DIR/MAX_BLOG ];
echo "MAX BLOG = " $MAX_BLOG_ID
let STEP=$MAX_BLOG_ID/$PARALLEL
echo "$SCRIPT -1 0 $BACKUP_DIR >> $LOGDIR/script_rdb_and_r_imported_data.log.$(date +%m%d%y) &"
$SCRIPT -1 0 $BACKUP_DIR >> $LOGDIR/script_rdb_and_r_imported_data.log.$(date +%m%d%y) &
while [ $MIN_BLOG_ID -le $MAX_BLOG_ID ]
do
    let LAST_BLOG_ID="$MIN_BLOG_ID+$STEP"
    if [ "$LAST_BLOG_ID" -gt "$MAX_BLOG_ID" ];
    then
        let LAST_BLOG_ID="$MAX_BLOG_ID"
    fi
    if [ -f $BACKUP_DIR/progress_"$LAST_BLOG_ID" ];
    then
        MIN_BLOG_ID=$(cat $BACKUP_DIR/progress_"$LAST_BLOG_ID")
        echo $MIN_BLOG_ID
    fi #if [ -f $BACKUP_DIR/progress_"$LAST_BLOG_ID" ];
    echo "$SCRIPT $MIN_BLOG_ID $LAST_BLOG_ID $BACKUP_DIR >> $LOGDIR/script_"$LAST_BLOG_ID".log.$(date +%m%d%y) &"
    $SCRIPT $MIN_BLOG_ID $LAST_BLOG_ID $BACKUP_DIR >> $LOGDIR/script_"$LAST_BLOG_ID".log.$(date +%m%d%y) &
    let MIN_BLOG_ID="$LAST_BLOG_ID+1"
done #while [ $MIN_BLOG_ID -le $MAX_BLOG_ID ]
--------------------------
backup_script.sh
--------------------------
#!/bin/bash
DBHOST=0.0.0.0
DB=rdb_old
DBUSER=ruser
DBUSERPWD=******
DBPORT=3306
PRODUCTION_TIME=090000
MYSQLDUMP_r=/root/dbbackup/mysql_r/bin/mysqldump
if [ -z $1 ]; then
    echo "!!!Start Blog ID for backup cannot be blank!!!"
    exit
elif [ $1 -ne "-1" ] && [ -z $2 ]; then
    echo "!!!End Blog ID for backup cannot be blank!!!"
    exit
fi
START_BLOG_ID=$1
END_BLOG_ID=$2
PARENT_BACKUP_DIR=$3
LOGDIR=$PARENT_BACKUP_DIR/log_dir
OUTFILE=/tmp/blogs_"$START_BLOG_ID"_"$END_BLOG_ID".out
DT=$(date +%m%d%y)
if [ $START_BLOG_ID -eq "-1" ];
then
    BACKUP_DIR=$PARENT_BACKUP_DIR/rdb
    if [ ! -f $BACKUP_DIR/rdb.sql.$(date +%m%d%y) ];
    then
        echo "rdb backup started on the $(date +%m%d%y_%H%M) ..."
        DB=rdb
        $MYSQLDUMP_r $DB -h $DBHOST -u $DBUSER -P $DBPORT --password=$DBUSERPWD --log-error=$LOGDIR/error/rdb.sql.err > $BACKUP_DIR/rdb.sql.$(date +%m%d%y)
        mysqldump_success=$?
        if [ $mysqldump_success -eq '0' ];
        then
            rm -f $LOGDIR/error/rdb.sql.err
        else
            cat $LOGDIR/error/rdb.sql.err
        fi #if [ $mysqldump_success -eq '0' ];
        echo "rdb backup completed on the $(date +%m%d%y_%H%M)"
        touch $LOGDIR/backup_completed_rdb_$(date +%m%d%y_%H%M).tmp
        if [ -f /BACKUP1/rdb_"$DT"/stop.tmp ];
        then
            echo "Stopped"
            touch $LOGDIR/stopped_after_rdb
            exit 1
        fi #if [ -f /BACKUP1/rdb_"$DT"/stop.tmp ];
    fi #if [ ! -f $BACKUP_DIR/rdb.sql.$(date +%m%d%y) ];
    BACKUP_DIR=/BACKUP1/rdb_"$DT"/r_imported_data
    if [ ! -f $BACKUP_DIR/r_imported_data.sql.$(date +%m%d%y) ];
    then
        echo "r_imported_data backup started on the $(date +%m%d%y_%H%M) ..."
        $MYSQLDUMP_r r_imported_data -h $DBHOST -u $DBUSER -P $DBPORT --password=$DBUSERPWD --log-error=$LOGDIR/error/r_imported_data.sql.err > $BACKUP_DIR/r_imported_data.sql.$(date +%m%d%y)
        mysqldump_success=$?
        if [ $mysqldump_success -eq '0' ];
        then
            rm -f $LOGDIR/error/r_imported_data.sql.err
        else
            cat $LOGDIR/error/r_imported_data.sql.err
        fi
        echo "r_imported_data backup completed on the $(date +%m%d%y_%H%M)"
        touch $LOGDIR/backup_completed_r_imported_data_$(date +%m%d%y_%H%M).tmp
        exit
    fi #if [ ! -f $BACKUP_DIR/r_imported_data.sql.$(date +%m%d%y) ];
fi
#Per Blog backup
rm -f $OUTFILE
mysql $DB -h $DBHOST -u $DBUSER -P $DBPORT --password=$DBUSERPWD -e "select blog_id from wp_blogs where blog_id between $START_BLOG_ID and $END_BLOG_ID " -s -N > $OUTFILE
LAST_BLOG_HYPERDB=$(mysql -h $DBHOST -P $DBPORT -u $DBUSER --password=$DBUSERPWD -e "select config_value from r_imported_data.r_config where config_key='last_blog_clustered'" -N -s )
echo $LAST_BLOG_HYPERDB
if [ -s $OUTFILE ]; then
    echo "... Backup started for blog ID $START_BLOG_ID to $END_BLOG_ID on the $(date +%m%d%y_%H%M) ..."
    for i in `cat $OUTFILE`
    do
        #If time has passed $PRODUCTION_TIME then stop
        if [ $(date +%H%M%S) -gt $PRODUCTION_TIME ];
        then
            echo "Stopped as time passed $PRODUCTION_TIME"
            touch $PARENT_BACKUP_DIR/stop.tmp
        fi #if [ $(date +%H%M%S) -gt $PRODUCTION_TIME ];
        #Check for stop
        if [ -f $PARENT_BACKUP_DIR/stop.tmp ];
        then
            echo "Stopped"
            touch $LOGDIR/stopped_blog_"$i"
            exit 1
        fi
        #Find the database name for this blog
        if [ $i -le $LAST_BLOG_HYPERDB ];
        then
            let DBNUM="$i"/1000
            DB=rdb"$DBNUM"
        else
            DB=rdb_old
        fi
        BACKUP_DIR=$PARENT_BACKUP_DIR/"$DB"
        echo $BACKUP_DIR
        echo $DB
        echo "Backup Blog $i ..."
        $MYSQLDUMP_r $DB -h $DBHOST -u $DBUSER -P $DBPORT --password=$DBUSERPWD \
            wp_"$i"_commentmeta \
            wp_"$i"_comments \
            wp_"$i"_links \
            wp_"$i"_options \
            wp_"$i"_postmeta \
            wp_"$i"_posts \
            wp_"$i"_term_relationships \
            wp_"$i"_term_taxonomy \
            wp_"$i"_terms --skip-lock-tables --log-error=$LOGDIR/error/blog_"$i".sql.err > $BACKUP_DIR/blog_"$i".sql.$(date +%m%d%y)
        mysqldump_success=$?
        if [ $mysqldump_success -eq '0' ];
        then
            rm -f $LOGDIR/error/blog_"$i".sql.err
            echo "done"
        else
            cat $LOGDIR/error/blog_"$i".sql.err
            echo "!!!Failed!!!"
        fi
        echo "$i" > $PARENT_BACKUP_DIR/progress_"$END_BLOG_ID"
    done
    echo "... Backup ended for blog ID $START_BLOG_ID to $END_BLOG_ID on the $(date +%m%d%y_%H%M) ..."
    touch $LOGDIR/backup_completed_"$START_BLOG_ID"_"$END_BLOG_ID"_$(date +%m%d%y_%H%M).tmp
fi