Wednesday, June 20, 2012

Character Set Conversion for Wordpress MySql

In my wordpress database, there were some tables which were running with character set latin1 while others were running with utf8.
As our application use SET NAMES UTF8 before every query, we were not facing any trouble from client side. But it was not standard that without any requirement we had the database with multiple character set. So I decided to modify all tables character set to utf8.

To modify the character set of tables, we had to look for any char, varchar or text data type column in those tables. If any type of these columns exists, we had to modify it to BLOB and then convert the character set and then again modify to its original data type. In this way, we can ensure no loss of any character.

As primary and unique indexes do not allow to modify the data type of a column, we had to drop indexes from the columns to achive the above task.

Our wordpress database has been running with Hyperdb configuration. So we have 'db' as the common database and then 'dbN' for each 1000 blogs.

I followed the below steps:-

Step 1: Stop connetions

Change database user password to stop connection

Step 2: Create Drop indexes Script

vi 2_drop_indexes.sql
alter table db.table1 drop  index index1;
select 1;
alter table db.table2 drop  index index1;
select 2;
alter table db.table3 drop  primary key;
select 16;

Step 3: Create Script tp Modify Columns to BLOB

concat('alter table `', table_schema, '`.`', table_name,'` modify ',column_name,' BLOB;') as cmd_before
from information_schema.columns as c
where table_schema = 'db'
and data_type in ('char', 'varchar','text')
and exists
select 1
from information_schema.tables as t
where table_schema = 'db'
and t.table_type = 'BASE TABLE'
and t.table_collation <> 'utf8_general_ci'
and t.table_name = c.table_name
and t.table_schema = c.table_schema
INTO OUTFILE '3_col_text_to_blob.sql' ;

Step 4: Create Script tp Change table character set

#ALTER TABLE table_name CONVERT TO CHARACTER SET charset_name [COLLATE collation_name];
select concat('ALTER TABLE `',table_schema, '`.`', table_Name,'` CONVERT TO CHARACTER SET utf8;') as cmd
from information_schema.tables
where table_schema = 'db'
and table_type = 'BASE TABLE'
and table_collation <> 'utf8_general_ci'
INTO OUTFILE '4_table_latin_to_utf8.sql' ;

Step 5: Create Script to Modify wp_N_tems_relationships

select concat('ALTER TABLE `',table_schema, '`.`', table_Name,'` CONVERT TO CHARACTER SET utf8;') as cmd
from information_schema.tables
where table_name like 'wp_%_term_relationships'
and table_type = 'BASE TABLE'
and table_collation <> 'utf8_general_ci'
INTO OUTFILE '5_table_tems_relationships_to_utf8.sql' ;

Step 6: Create Script to Revert column character set

select concat('alter table `',table_schema, '`.`', table_name,'` modify ',column_name,' ', column_type
,';') as cmd_after
from information_schema.columns as c
where table_schema = 'db'
and data_type in ('char', 'varchar','text')
and exists
select 1
from information_schema.tables as t
where table_schema = 'db'
and t.table_type = 'BASE TABLE'
and t.table_collation <> 'utf8_general_ci'
and t.table_name = c.table_name
and t.table_schema = c.table_schema
INTO OUTFILE '6_col_blob_to_text.sql' ;

Step 7: Create Script to Create index

vi 7_crate_index.sql
alter table table1 add unique key index1(col1, col2);
select 1;
alter table db.table2 add unique index index1 (index1);
select 2;
alter table db.table3 add primary key (col1);
select 16;

Step 8: Create Script to Analyze Table

vi 8_analyze_tables.sql
analyze table table1;
select 1;
analyze table db.table2;
select 2;
analyze table db.table3 ;
select 16;

Step 9: Execute Scripts

mysql db -s -N < 2_drop_indexes.sql
mysql db -s -N < 3_col_text_to_blob.sql
mysql db -s -N < 4_table_latin_to_utf8.sql
mysql db -s -N < 5_table_tems_relationships_to_utf8.sql
mysql db -s -N < 6_col_blob_to_text.sql
mysql db -s -N < 7_crate_index.sql
mysql db -s -N < 8_analyze_tables.sql

Step 10: Modify database parameters

vi my.cnf

mysql restart

Continue reading →
Sunday, June 17, 2012

Convert MS Sql script to MySql

Yesterday I got a work to convert a MS SQL CREATE TABLE script to MySql. There were some other stuffs like DROP PROCEDURE, ALTER TABLE in the script. But my client wanted me to convert only the CREATE TABLE parts. So rather converting this manually, I made a python script to achieve the task.
The script was like as follows:-

exec sp_dboption N'testdb', N'autoclose', N'true'
exec sp_dboption N'testdb', N'bulkcopy', N'false'
/****** Object:  Stored Procedure dbo.testproc    Script Date: 11/01/2003 15:25:29 ******/
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[testproc]') and OBJECTPROPERTY(id, N'IsProcedure') = 1)
drop procedure [dbo].[testproc]
CREATE TABLE [dbo].[table1] (
[OrgCode] [varchar] (5) NOT NULL ,
[OrgName] [varchar] (60) NOT NULL ,
[Logo] [image] NULL ,
[Col1] [bit] NOT NULL ,
[Col2] [varchar] (25) NOT NULL ,
[col3] [timestamp] NULL 

/****** Object:  Table [dbo].[table2]    Script Date: 11/01/2003 15:27:18 ******/
CREATE TABLE [dbo].[table2] (
[ORG_CODE] [varchar] (5) NOT NULL ,
[col1] [varchar] (10) NOT NULL ,
[col2] [timestamp] NULL 

I found that every statement has been ended with GO command. I selected it as a delimiter for the script. At first, I read the whole file in my buffer. Then I extracted only the CREATE TABLE ... GO parts. I found there were 233 tables. Then I modified the CREATE TABLE parts by adding ` before column and table names, replacing [ and ] and also adding longblob and decimal(19,4) data type instead of the image and money data types.

The as follows:

#!/usr/bin/python -tt

import sys
import commands
import os
import re

def make_list_create_table(script_text):
  list = script_text.split('\nGO')
  list_create_table = []
  for i in range(len(list)):
    match ='CREATE TABLE',list[i])
    if match:

  return list_create_table
def convert_create_table_for_mysql(list_create_table):
  for i in range(len(list_create_table)):
    s = list_create_table[i]
    m ='create table(.*)\([\r\n a-zA-Z0-9,\t_\)\( \[\]]+\)',s,flags=re.DOTALL|re.IGNORECASE)
    if m:
      s =
      #replace [] from column name and add `
      s = re.sub(r'''([\(,]+[\r\n\t ]*)\[([ a-zA-Z_0-9]+)\]''', r'''\1`\2`''', s, flags=re.DOTALL|re.IGNORECASE)
      #replace all other []
      s = s.replace('[','')
      s = s.replace(']','')
      #replace dbo. and add ` for table name
      s = re.sub(r'''dbo.([a-zA-Z0-9_]+) ''', r'''`\1`''',s, flags=re.DOTALL|re.IGNORECASE)
      s = s + ' ENGINE=InnoDB DEFAULT CHARSET=latin1;\n\n'
      list_create_table[i] = s
      print "!!!not match!!!"
  return   list_create_table

def write_file(file_path,str):
  f = open(file_path,'w')
def data_type_conversion(list_create_table):
  for i in range(len(list_create_table)):
    s = list_create_table[i]
    s = s.replace(' image ',' longblob ');
    s = s.replace(' money ',' decimal (19,4) ');
    #s = s.replace(' numeric ',' decimal ');
    list_create_table[i] = s
  return list_create_table
def main():
  file_path = 'E:\mssql_scriptsql'
  f = open(file_path,'r')
  script_text =

  list_create_table = make_list_create_table(script_text)
  list_create_table = convert_create_table_for_mysql(list_create_table)
  list_create_table = data_type_conversion(list_create_table)
if __name__ == '__main__':

Continue reading →
Sunday, June 10, 2012

Baseball Statistics Data Model

A few months ago I was working in a sports portal project. As a database developer, my first task was designing data model for Baseball sports. I am living in Bangladesh where I have not ever played the game. Even I did not watch the game in television let alone my knowledge on game rules and statistics. But I have to solve the task.
So what I did I read the following two links :-
Baseball Rules
Baseball Statistics

These two links help to gather knowledge on the game. But theoretical knowledge was not enough for me and as I am a sports lover from my childhood, I collect the game BASEBALL 2005 by EA Sports. Playing the game I got myself interested in Baseball and also I got the knowledge about it.
Finally, I designed the data model for the development team which I added here below:-

I designed the data model in 3 layers-
1. Common Layer : 
This layer was designed to support player profile entry, player roster management, teams, stadium. It was named Common as we planned to re-use this part for other games also. The tables are  - Players
- Team_Rosters
- Season
- Roster_details
- Team
- Stadium
- Position
- Game
- Game_Roster
2. Game Layer:
This layer was designed specifically to support Baseball game. Tables are
- Half_innings
- Batter_at_bat
- Pitcher
- Plate_outcome
- Plate_outcome_types
- Picher_statistics_at_half_innigs
- Batter_statistics_at_half_innigs
3. Statistics Layer:
This layer was designed to summarize the game layer's data to display the statistics.Tables are

The rest of the logic was written in application layer.

Continue reading →
Friday, June 8, 2012

Postgresql dblink connect with random port


I was working in a project where I have to build a function which will use dblink to connect to other Postgresql databases. It could be easily done unless my client wanted to use port address dynamically from the function. More precisely, he would going to connect to other databases using dblink_connect and the critical part was in his environment different Postgresql clusters were running in different ports. He was the project owner and he told me that the developers had no idea in which ports the Postgresql instances were running. So he wanted me to write a function which would find out the port of the Postgresql instance from where the dblink_connect had been called. The developer would pass the database name, the user name and the password and my function would use these variables to make the developer connect for operation in that Postgresql instance.

So I build the function as below :-

CREATE OR REPLACE FUNCTION dblink_connect_with_dynamic_port(prm_dbname text, prm_user_name text, prm_password text) RETURNS text AS $$
      str VARCHAR(100);
      v_port int;
       execute 'show port' into v_port ;
       str := '''dbname = '||prm_dbname||' user= '||prm_user_name||' password = '||prm_password||'  port=' ||v_port || '''';
      str :=  'select dblink_connect(\'new_conn\','||str||')';
       execute str;
       return str;
$$ LANGUAGE plpgsql;

Continue reading →
Friday, June 1, 2012

Sql Server Index Rebuild


I had to tune our application database that was running on Sql Server 2005. The first step that I choose to rebuild the indexes. I used sys.dm_db_index_physical_stats view to get size and fragmentation information for the data and indexes of the specified table or view.

As data in this view are refreshed with server restart, I created a table to hold this information.
CREATE TABLE testdb.[dbo].[index_stat](
    [id] [int]  IDENTITY(1,1),
    [objectid] [int] NULL,
    [indexid] [int] NULL,
    [partitionnum] [int] NULL,
    [frag] [float] NULL,
    [start_time] [datetime] NULL,
    [completed_time] [datetime] NULL
Saving data of the above view in the table where avg_fragmentation_in_percent is greater than 10. 
use testdb
insert into testdb.dbo.index_stat([objectid],[indexid],[partitionnum],[frag])
    object_id AS objectid,
    index_id AS indexid,
    partition_number AS partitionnum,
    avg_fragmentation_in_percent AS frag
FROM sys.dm_db_index_physical_stats (5, NULL, NULL , NULL, 'LIMITED')
WHERE avg_fragmentation_in_percent > 10.0 AND index_id > 0;

Now I created the index_rebuild_proc to rebuild the indexes where fragmentation is above 30.0 and reorganize where fragmentation is between 10.0 and 30.0.

-- Ensure a USE statement has been executed first.
CREATE PROCEDURE index_rebuild_proc
DECLARE @id int;
DECLARE @objectid int;
DECLARE @indexid int;
DECLARE @partitioncount bigint;
DECLARE @schemaname nvarchar(130);
DECLARE @objectname nvarchar(130);
DECLARE @indexname nvarchar(130);
DECLARE @partitionnum bigint;
DECLARE @partitions bigint;
DECLARE @frag float;
DECLARE @command nvarchar(4000);
-- Conditionally select tables and indexes from the sys.dm_db_index_physical_stats function
-- and convert object and index IDs to names.

-- Declare the cursor for the list of partitions to be processed.
DECLARE indexes_cur CURSOR FOR SELECT id, objectid, indexid, partitionnum, frag FROM testdb.dbo.index_stat where completed_time is NULL ;
-- Open the cursor.
OPEN indexes_cur;
-- Loop through the partitions.
WHILE (1=1)
        FETCH NEXT
           FROM indexes_cur
           INTO @id, @objectid, @indexid, @partitionnum, @frag;
        IF @@FETCH_STATUS < 0 BREAK;
        SELECT @objectname = QUOTENAME(, @schemaname = QUOTENAME(
        FROM sys.objects AS o
        JOIN sys.schemas as s ON s.schema_id = o.schema_id
        WHERE o.object_id = @objectid;
        SELECT @indexname = QUOTENAME(name)
        FROM sys.indexes
        WHERE  object_id = @objectid AND index_id = @indexid;
        SELECT @partitioncount = count (*)
        FROM sys.partitions
        WHERE object_id = @objectid AND index_id = @indexid;
-- 30 is an arbitrary decision point at which to switch between reorganizing and rebuilding.
        IF @frag < 30.0
            SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REORGANIZE';
        IF @frag >= 30.0
            SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REBUILD';
        IF @partitioncount > 1
            SET @command = @command + N' PARTITION=' + CAST(@partitionnum AS nvarchar(10));
        if getdate() < cast(CONVERT(VARCHAR(20),GETDATE(),101) + ' 05:00:00' as datetime)
            update testdb.dbo.index_stat set start_time = getdate() where id = @id;
            --EXEC (@command);
            PRINT N'Executed: ' + @command;
            update testdb.dbo.index_stat set completed_time = getdate() where id = @id;
-- Close and deallocate the cursor.
CLOSE indexes_cur;
DEALLOCATE indexes_cur;

This procedure will update start_time column of testdb.dbo.index_stat as soon as it starts rebuilding of that index and similarly update completed_time column of the table after completion of rebuilding process. This helps me to monitor the progress of the procedure. Again, as I am going to add this procedure in job scheduler, so I add a condition so that the process might stop as soon as server time passes 05:00 AM.

Now, execute the procedure.
exec index_rebuild_proc
Continue reading →