Tuesday, December 17, 2013

Wordpress Import Blogs From Production to Development Environment

0 comments
We have a wordpress blog environment for a school sports portal where every state, league, school and team has their own blog. We have almost 160000 blogs in the system so far and which is growing. Often for development and QA purposes, the development team needs to have one or two blogs in their local environment. At the begining, as a DBA, I had to take backup of the blog’s tables and then import those tables in the QA environment and also create records at wp_blogs table. Besides, we have a sports adminitrative system and in wordpress database we setup some link tables with that system with tables like -
 
admin_school - related to school info
admin_team - related to team info
admin_league - related to league info
admin_region - related to state info


When I found that, it has become a regular task to import blogs from latest backup of production to development environment, I created a script “import_blog.py to automate the task.
python import_blog.py -p <config_file> blog_1 blog 2 … blog_N

I created the script on python 2.6. It reads a configuration files as below:

import_blog.conf
….
source_server='0.0.0.0'
source_os_user=root
source_os_password='xxxx'
source_db_backup_root_dir='/BACKUP'
source_domain='digitalsports.com'
source_mysql_user='dsuser'
source_mysql_passwd='dsuser'
source_mysql_db='dsdb'
source_mysql_port='3306'


import_blog_dir='/tmp/import_blogs'


target_server='10.0.0.1'
target_mysql_user='wpuser'
target_mysql_password='xxxx'
target_mysql_port='3307'
hyperdb_factor = 1000



It connects with source Mysql DB Server where we keep our daily backups which consist of one mysqldump file for each blog and one large dump file for common tables like wp_blogs. It finds the latest backup using the function :


def find_backup_dir(ssh):    
 stdin, stdout, stderr = ssh.exec_command('cd /BACKUP; ls')
 cmd_out = stdout.readlines()
 cmd_out= ','.join(cmd_out)
 backup_dirs = re.findall(r'dsdb_\w+', cmd_out)
 backup_dirs.sort()
 backup_dir = backup_dirs[len(backup_dirs)-2]
 backup_dir=source_dbbackupdir + '/' + backup_dir
 #print backup_dir
 return backup_dir


Then it tries to get the given blogs backup files and the common dump files using the function
find_backup_file_for_blogs  which calls find_backup_file_for_blog. Then it makes a tar file of the blog backup files and copy that tar file in target db server using ssh. I used python paramiko library for executing ssh commands.


Then at the target server, the script extracts the copied tar file to “import_blog_dir” config value using function “extract_import_blogs_gz” and modify the url of the backup files contains from production to development using sed tool in the function “make_dumpfiles_for_target_domain”.


def make_dumpfiles_for_target_domain(import_blog_dir,target_domain,source_domain):
 dump_files =  os.listdir(import_blog_dir)
 os.chdir(import_blog_dir)
 for f in dump_files:
  cmd = "sed  -i 's/" + source_domain + "/" + target_domain + "/g' " + f
  #print cmd
  os.system(cmd)
  cmd = "sed  -i 's/photos." + target_domain + "/photos." + source_domain + "/g' " + f
  #print cmd
  os.system(cmd)
  cmd = "sed  -i 's/videos." + target_domain + "/videos." + source_domain + "/g' " + f
  #print cmd
  os.system(cmd)
  cmd = "sed  -i 's/media." + target_domain + "/media." + source_domain + "/g' " + f
  #print cmd
  os.system(cmd)
  return import_blog_dir


Then it connects to target mysqldb and import those files to the target database and also import the wp_blogs record for that blog.

The script is available at
https://github.com/irumman/Wordpress-Import-Blog-from-Prod-To-QA









Leave a Reply

Labels