Tuesday, December 17, 2013

Wordpress Import Blogs From Production to Development Environment

We have a WordPress blog environment for a school sports portal where every state, league, school and team has its own blog. We have almost 160,000 blogs in the system so far, and the number is growing. Often, for development and QA purposes, the development team needs one or two blogs in their local environment. At the beginning, as a DBA, I had to take a backup of the blog's tables, import those tables into the QA environment, and also create records in the wp_blogs table. Besides, we have a sports administrative system, and in the WordPress database we set up some link tables with that system, with tables like -
admin_school - related to school info
admin_team - related to team info
admin_league - related to league info
admin_region - related to state info

When I found that it had become a regular task to import blogs from the latest production backup to the development environment, I created a script "import_blog.py" to automate the task.
python import_blog.py -p <config_file> blog_1 blog_2 … blog_N

I created the script on Python 2.6. It reads a configuration file like the one below:



hyperdb_factor = 1000
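The full config format isn't shown here; assuming simple key = value lines (as in the hyperdb_factor entry above), a minimal parser might look like this (names are illustrative, not the script's actual code):

```python
# Minimal sketch of a key = value config parser, assuming the format
# shown above (e.g. "hyperdb_factor = 1000"). Illustrative only.
def parse_config(lines):
    config = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blanks and comments
        key, _, value = line.partition('=')
        config[key.strip()] = value.strip()
    return config
```

In the script itself this would be fed the lines of the file given by -p, e.g. parse_config(open(config_file)).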

It connects to the source MySQL DB server where we keep our daily backups, which consist of one mysqldump file per blog and one large dump file for common tables like wp_blogs. It finds the latest backup using the function:

def find_backup_dir(ssh):
    # list the contents of the backup directory on the source server
    stdin, stdout, stderr = ssh.exec_command('cd /BACKUP; ls')
    cmd_out = ','.join(stdout.readlines())
    backup_dirs = re.findall(r'dsdb_\w+', cmd_out)
    # take the second-to-last dsdb_* directory
    backup_dir = backup_dirs[-2]
    return source_dbbackupdir + '/' + backup_dir

Then it fetches the given blogs' backup files and the common dump files using the function find_backup_file_for_blogs, which calls find_backup_file_for_blog. It then makes a tar file of the blog backup files and copies that tar file to the target DB server over ssh. I used the Python paramiko library for executing ssh commands.
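The tar step can be sketched with the standard tarfile module (file names and the .tar.gz choice are illustrative, inferred from the extract_import_blogs_gz function mentioned below):

```python
import tarfile

# Sketch: bundle the per-blog dump files into one tar.gz before copying
# it to the target server. Paths and names are illustrative.
def make_blog_tar(tar_path, dump_files):
    with tarfile.open(tar_path, 'w:gz') as tar:
        for f in dump_files:
            # arcname keeps only the file name, not the full source path
            tar.add(f, arcname=f.split('/')[-1])
    return tar_path
```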

Then, at the target server, the script extracts the copied tar file to the "import_blog_dir" config value using the function "extract_import_blogs_gz" and rewrites the URLs contained in the backup files from production to development using the sed tool in the function "make_dumpfiles_for_target_domain".

def make_dumpfiles_for_target_domain(import_blog_dir, target_domain, source_domain):
    for f in os.listdir(import_blog_dir):
        f = os.path.join(import_blog_dir, f)
        # rewrite the production domain to the development domain
        os.system("sed -i 's/" + source_domain + "/" + target_domain + "/g' " + f)
        # revert the media subdomains so they keep pointing at production
        for sub in ('photos.', 'videos.', 'media.'):
            os.system("sed -i 's/" + sub + target_domain + "/" + sub + source_domain + "/g' " + f)
    return import_blog_dir
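The same rewrite can also be done without shelling out to sed, which avoids shell-quoting issues with domain names; a hedged sketch in pure Python (not the script's actual code):

```python
import os

# Sketch: replace the production domain with the development domain in
# each dump file, while reverting the photos/videos/media subdomains
# back to production, mirroring the sed commands above. Illustrative.
def rewrite_domains(import_blog_dir, target_domain, source_domain):
    for name in os.listdir(import_blog_dir):
        path = os.path.join(import_blog_dir, name)
        with open(path) as f:
            text = f.read()
        text = text.replace(source_domain, target_domain)
        for sub in ('photos.', 'videos.', 'media.'):
            text = text.replace(sub + target_domain, sub + source_domain)
        with open(path, 'w') as f:
            f.write(text)
    return import_blog_dir
```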

Then it connects to the target MySQL DB, imports those files into the target database, and also imports the wp_blogs record for each blog.

The script is available at

Continue reading →

pgHA - Postgresql+Pgbouncer High Availability Tools

I was working on a project to set up an easy failover system for Postgresql master-slave replication, where all app servers connected to the DB server through pgbouncer. I had to configure an infrastructure where we could easily promote the slave to master in case of failover.

To achieve this, I found a good tool named pgHA, developed by OpenSCG (https://bitbucket.org/openscg/pgha).

pgHA - working mechanism

pgHA is a Perl-based tool, and its file failoverd.pl is the heart of the tool. Installation procedures are described in the README file.

pgHA works on the slave server. It reads configuration parameters from pgha.conf and runs as a background process on the server.
When we execute
perl failoverd.pl -c ../cfg/pgha.conf --start
it modifies the app servers' pgbouncer config files according to pgbouncer_failover_config and starts a background process. In order to modify the pgbouncer.ini files on the app servers, pgHA requires passwordless access to the app servers from the slave DB server.

When we execute
perl failoverd.pl -c pgha.conf --pushButton
it creates the trigger file on the slave, which allows the slave to act as master, and restarts the app servers' pgbouncer so that it reads a pgbouncer.ini directing to the new master.

pgHA also supports automatic failover.

It also reads configuration parameters used to pause and kill all database connections through pgbouncer to the old master and then resume them against the new master DB server. These parameters are -

# Which databases should be part of failover
# pgBouncer command to pause / fence the db's
# pgBouncer command to reload the config file
# pgBouncer command to unlock the db's

Plug pgHA with Postgresql 8.4:

pgHA was designed to work with Postgresql streaming replication and a pgbouncer infrastructure. At our end, we were running Postgresql 8.4 log-shipping replication. As pgHA was executing queries on the slave server to find out whether the slave was alive, we were not able to implement it at our site. So I decided to modify the pgHA source a bit to work with Postgresql 8.4 log shipping.

In Postgresql 8.4, it was not possible to execute queries on the slave, as it had no hot_standby feature. So I added a function to the pgHA source which checks whether any service is running on the slave at the given port and, if so, reads pg_controldata and looks for 'in archive recovery' mode. Along with this function, I also added some parameters required for its execution.

The function was:

# Sub: CheckSlaveStatusInArchiveRecoveryMode
# Inputs:
#       dbport - Slave Postgres db port
# Output:
#       isDBUp - (Boolean) 1 if the database is up, 0 if it is down
# Checks whether Postgresql is running on that port using the netstat command. The OS
# user running pgHA must have permission to execute netstat. If Postgresql is found
# running on that port, it then parses pg_controldata output and checks "Database
# cluster state": the value should be "in archive recovery" for a running Postgresql
# instance with pg_standby.
sub CheckSlaveStatusInArchiveRecoveryMode
{
  my ( $dbport ) = @_;
  my $isDBUp = 1;
  # Check whether there is any service running on the dbport
  my $cmd = 'netstat -a | grep ' . $dbport;
  my $var = qx{$cmd};
  if ($var) {
    $logger->debug("Slave is running on port " . $dbport);
  } else {
    $logger->debug("Slave is NOT running");
    $isDBUp = 0;
  } #if ($var)
  if ($isDBUp == 1) {
    # If the above passed, then check what pg_controldata says
    my $cmd = $v_slavePgengine . '/pg_controldata ' . $v_slavePgdata . ' | grep "Database cluster state"';
    my $var = qx{$cmd};
    if ($var =~ /in archive recovery/) {
      $logger->debug("Database is in archive recovery mode");
    } else {
      $logger->debug("Database is down");
      $isDBUp = 0;
    }
  } #if ($isDBUp==1)
  return $isDBUp;
} #sub CheckSlaveStatusInArchiveRecoveryMode

The pgHA code was written in such a self-descriptive way that I was able to plug my function into it easily. I communicated with the pgHA developer; he really liked the solution and was ready to merge it into his code. That was really inspirational for me.


Continue reading →
Tuesday, December 3, 2013

Postgresql online space reclaim using pg_repack

Working with Postgresql databases, we often hear the terms table bloat and index bloat. Postgresql uses MVCC, so deleted or updated rows remain in the table, creating table and index bloat. Earlier, we used to run VACUUM FULL or CLUSTER in order to reclaim the unused space of a table. VACUUM FULL and CLUSTER both lock the table so that users cannot make any changes to it.
So in a 24x7 production system, it was difficult to execute those operations. Searching a bit, I found the Postgresql extension project pg_repack.

What is pg_repack:

According to https://github.com/reorg/pg_repack,
pg_repack is a PostgreSQL extension which lets you remove bloat from tables and indexes, and optionally restore the physical order of clustered indexes. Unlike CLUSTER and VACUUM FULL it works online, without holding an exclusive lock on the processed tables during processing. pg_repack is efficient to boot, with performance comparable to using CLUSTER directly.
pg_repack 1.1.8 was released as a drop-in replacement for pg_reorg, addressing some of the shortcomings of the last pg_reorg version (such as support for PostgreSQL 9.2 and EXTENSION packaging) and known bugs.

Install pg_repack:

Download pg_repack from   http://pgxn.org/dist/pg_repack/.
Installation of pg_repack is quite easy and simple. pg_repack looks for pg_config, so make sure pg_config is available in the PATH.

cp -R pg_repack /source/postgresql-9.2.4/contrib/pg_repack
cd pg_repack
make install

Now at psql client

psql# select * from pg_available_extensions where name = 'pg_repack';
name              | pg_repack
default_version   | 1.1.8
installed_version | 1.1.8
comment           | Reorganize tables in PostgreSQL databases with minimal locks

pg_repack creates a schema "repack" in the database.

Views in repack:

\dv repack.*
            List of relations
 Schema |     Name     | Type |  Owner
 repack | primary_keys | view | postgres
 repack | tables       | view | postgres
(2 rows)

Functions in repack:

\df repack.*
 Schema |           Name            |  Type
 repack | array_accum                | agg
 repack | conflicted_triggers        | normal
 repack | disable_autovacuum         | normal
 repack | get_assign                 | normal
 repack | get_columns_for_create_as  | normal
 repack | get_compare_pkey           | normal
 repack | get_create_index_type      | normal
 repack | get_create_trigger         | normal
 repack | get_drop_columns           | normal
 repack | get_enable_trigger         | normal
 repack | get_index_columns          | normal
 repack | get_index_keys             | normal
 repack | oid2text                   | normal
 repack | repack_apply               | normal
 repack | repack_drop                | normal
 repack | repack_indexdef            | normal
 repack | repack_swap                | normal
 repack | repack_trigger             | trigger
 repack | version                    | normal
 repack | version_sql                | normal
(20 rows)                                                                                                                                                 

Example with a dummy table:

Create a table:


Populate the table:

INSERT INTO t1 SELECT c , 'a'||c FROM generate_series(1,10000000) as c;
INSERT 0 10000000

Check table size: 

select pg_size_pretty(pg_table_size('t1'));
 pg_size_pretty
 539 MB
(1 row)

Delete records from table:

DELETE FROM t1 WHERE i < 100000;
DELETE 99999
SELECT pg_size_pretty(pg_table_size('t1'));
 539 MB
(1 row)
DELETE FROM t1 WHERE i < 900000;
DELETE 400000
select pg_size_pretty(pg_table_size('t1'));
 539 MB
(1 row)

Update some records:

UPDATE t1 SET nam = 'c'||i WHERE i < 100000;

Check table size: 

SELECT pg_size_pretty(pg_table_size('t1'));
 828 MB
(1 row)

Execute pg_repack command:

pg_repack -d postgres -t t1 --no-order

Online update during repack:

update t1 set nam = 'c'||i where i = 1000000;

Check table size after repack:

select pg_size_pretty(pg_table_size('t1'));
490 MB
(1 row)

Monitoring locks:

Using the following query, we monitored locks during the repack:

 SELECT locktype, mode, relation, c.relname
 FROM pg_locks as l
 INNER JOIN pg_class as c
   ON l.relation = c.oid;

 locktype |       mode       | relation |          relname

 relation | AccessShareLock  |    33608 | t1_pkey
 relation | AccessShareLock  |    33605 | t1
 relation | RowExclusiveLock |    33641 | log_33605_pkey
 relation | RowExclusiveLock |    33634 | log_33605
 relation | AccessShareLock  |    11069 | pg_locks
 relation | AccessShareLock  |     2663 | pg_class_relname_nsp_index
 relation | AccessShareLock  |     2662 | pg_class_oid_index
 relation | AccessShareLock  |     1259 | pg_class
 relation | AccessShareLock  |    11069 | pg_locks
 relation | SIReadLock       |    33605 | t1
 relation | SIReadLock       |    33634 | log_33605
(11 rows)

After repack operation, locks were -
SELECT locktype, mode, relation, c.relname
FROM pg_locks as l
INNER JOIN pg_class as c
  ON l.relation = c.oid;

 locktype |      mode       | relation |          relname

 relation | AccessShareLock |     2663 | pg_class_relname_nsp_index
 relation | AccessShareLock |     2662 | pg_class_oid_index
 relation | AccessShareLock |     1259 | pg_class
 relation | AccessShareLock |    11069 | pg_locks
(4 rows)

Internal procedures:

pg_repack follows these steps in order to reclaim the unused space with minimal locking:

1. Set up workspaces and a trigger.
2. Copy tuples into a temp table.
3. Create indexes on the temp table.
4. Apply the log to the temp table until no tuples are left in the log and all of the old transactions are finished. When all old transactions are finished, go to the next step.
5. Swap the original table with the new temp table.
6. Drop the temporary tables, triggers and functions.
7. Analyze the new table.
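The copy / log / apply / swap idea above can be illustrated with a toy in-memory version (this is only a conceptual sketch, not pg_repack code): writes that arrive during the copy are captured in a log and replayed onto the new copy before the swap.

```python
# Toy illustration of pg_repack's copy / log / apply / swap sequence.
# Changes made while the copy is in progress are captured in a change
# log (standing in for repack.log_<oid>) and replayed before the swap.
def repack(table, changes_during_copy):
    log = []
    temp = dict(table)            # step 2: copy tuples into the temp table
    for key, value in changes_during_copy:
        table[key] = value        # a concurrent write hits the original...
        log.append((key, value))  # ...and is captured by the trigger
    for key, value in log:        # step 4: apply the log to the temp table
        temp[key] = value
    table.clear()                 # step 5: swap
    table.update(temp)
    return table
```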

Triggers on table:

The pg_repack operation creates a trigger on the specified table:

\d t1
      Table "public.t1"
 Column |  Type   | Modifiers
 i      | integer | not null
 nam    | bytea   |
    "t1_pkey" PRIMARY KEY, btree (i)
Triggers firing always:
    z_repack_trigger BEFORE INSERT OR DELETE OR UPDATE ON t1 FOR EACH ROW EXECUTE PROCEDURE repack.repack_trigger('INSERT INTO repack.log_33648(pk, row) VALUES( CASE WHEN $1 IS NULL THEN NULL ELSE (ROW($1.i)::repack.pk_33648) END, $2)')



Continue reading →
Friday, October 11, 2013

Generate pgbadger report from minimal information

I was working for a client who was experiencing slowness in his application, and I needed to find out whether it was database-related.
He provided me some query-log data in a table generated by the application, as follows:

2012-11-12 06:29:51.982 PST 11834.483 SELECT col1, id FROM table1
2012-11-12 08:09:30.066 PST 339736.318 SELECT id, event_type_cd, event_ts, source_id
2012-11-12 11:59:53.900 PST 54808.922 SELECT id, * FROM report re WHERE ...
2012-11-12 12:03:41.625 PST 53794.590 SELECT id, * FROM report re WHERE ...

They only had two large log files, which they got from their Postgresql 8.4.

First, I looked for a pgbadger or pgfouine report, which they did not have, and that made my life difficult.
I then decided to load the log information into a table so that I could query it myself.
I created a table:

\d logtab
                   Table "public.logtab"
      Column       |            Type             | Modifiers
 t_stamp           | timestamp without time zone |
 execution_time_ms | double precision            |
 query             | text                        | 

Then, I loaded the log data into logtab using the COPY command.

There were logs from Feb 2012 to date.

To find the exact time of slowness, I planned to generate a pgbadger report from the above information.
Usually, pgbadger wants log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '.
And log file looks like:
2013-09-01 00:00:57 EDT [0010]: [10-1] db=dbname,user=postgres LOG: duration: 23.755 ms statement:  SELECT relname,indexrelname,schemaname,pg_relation_size('"'||schemaname||'"."'||indexrelname||'"') as rawsize,idx_scan, idx_tup_read, idx_tup_fetch from pg_stat_user_indexes;

Here, I had
%t = t_stamp = 2013-09-01 00:00:57 EDT
Duration = execution_time_ms = 23.755
Statement = query =  SELECT relname,indexrelname,schemaname,pg_relation_size('"'||schemaname||'"."'||indexrelname||'"') as rawsize,idx_scan, idx_tup_read, idx_tup_fetch from pg_stat_user_indexes;

So, I planned to generate %p (the process ID) and %l (the session line number) using the following query:

select t_stamp || ' EDT [' ||
lpad( (row_number() OVER (ORDER BY t_stamp))::text,4,'0') || ']: ' ||
'['|| (row_number() OVER ( PARTITION BY substr(t_stamp::text,1,10) ORDER BY t_stamp))::text || '-1] db=dbname,user=postgres '||
'LOG:  duration: '|| execution_time_ms::text||' ms  statement: '|| query as c
from logtab as a
where t_stamp >= '2013-09-01 00:00:00'
limit 100
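The same prefix can be assembled outside SQL as well; a Python sketch of the line format being generated (the db=dbname,user=postgres values are placeholders, just as in the query above):

```python
# Sketch: build a pgbadger-compatible log line from (timestamp,
# duration_ms, query), matching the log_line_prefix format described
# above. The db and user values are placeholders.
def make_log_line(t_stamp, row_num, session_num, duration_ms, query):
    return ("%s EDT [%04d]: [%d-1] db=dbname,user=postgres "
            "LOG:  duration: %s ms  statement: %s"
            % (t_stamp, row_num, session_num, duration_ms, query))
```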

Now, I generated a log file using the query:

copy (select t_stamp || ' EDT [' ||
lpad( (row_number() OVER (ORDER BY t_stamp))::text,4,'0') || ']: ' ||
'['|| (row_number() OVER ( PARTITION BY substr(t_stamp::text,1,10) ORDER BY t_stamp))::text || '-1] db=dbname,user=postgres '||
'LOG:  duration: '|| execution_time_ms::text||' ms  statement: '|| query as c
from logtab as a
where t_stamp >= '2013-05-01 00:00:00'
) to '/tmp/pglog.log'

The next step is to generate a pgbadger report.
I downloaded it from http://dalibo.github.io/pgbadger/index.html, installed it on my machine and generated the report using:

pgbadger /tmp/pglog.log  -o /tmp/pgbadger.html

I got the pgbadger report with graphs, for example the average queries duration graph.

From there, we found the dates when the DB worked slowly, and found the exact date when the client pushed some new code into their application which modified the DB schema.

Continue reading →
Thursday, September 26, 2013

Postgresql Radius Based Search using Latitude and Longitude for a Zip Code

We have a table called "image_info", as follows. We have to find the images nearest to a given zip code.

\d image_info
                                    Table "public.image_info"
      Column       |       Type        |                         Modifiers                         
 id                | bigint            | not null default nextval('image_info_id_seq'::regclass)
 content_type      | character varying |
 image_name        | character varying |
 zip_code          | character varying |
    "image_info_pkey" PRIMARY KEY, btree (id)
    "zip_code_5digit_idx" btree (substr(zip_code::text, 1, 5))
    "zip_code_idx" btree (zip_code)
To achieve this, we used data from the commercial version of Maxmind's GeoIP software.
Using Maxmind's GeoIP software, we can narrow down the LONG/LAT of an IP address to relative accuracy within 25 miles around 80% of the time. For more details, please visit

In the crawledphoto database, we created a table named "zip_code_based_lng_lat":

\d zip_code_based_lng_lat
   Table "public.zip_code_based_lng_lat"
 Column |          Type          | Modifiers
 zip    | character varying(50)  |
 state  | character varying(2)   |
 city   | character varying(100) |
 type   | character varying(2)   |
 lat    | double precision       |
 lng    | double precision       |
Indexes:
    "zip_code_based_lat_lng_gist_idx" gist (ll_to_earth(lat, lng))
    "zip_code_based_lng_lat_zip_idx" btree (zip)

ZIP — ZIP Codes identify specific geographic delivery areas. A ZIP code can represent an area within a state, one which
crosses state boundaries, or a single building.
State — The USPS® standard 2-letter state or territory abbreviation.
City — The city, community, station or other name by which a 5-digit ZIP area is known.
Type — Defines the ZIP Code for delivery purposes.
      P = A ZIP code used only for P.O. Boxes.
      U = A unique* ZIP code.
      M = Identifies an APO/FPO ZIP Code.
      Blank = A standard ZIP Code.

*These are assigned to specific organizations and not to a city as a whole. Unique ZIP's are assigned to organizations such as Reader's Digest, educational institutions and government facilities. Use the UNQ.DAT file to find the organization name.
County FIPS — A Federal Information Processing Standard number is assigned to each county in the United States. Use this number to find the county name in the CNTY.DAT file. FIPS numbers begin with the 2-digit state code and are followed by the 3-digit county code.

Latitude — The geographic coordinate of a point measured in degrees north or south of the equator.
Longitude — The geographic coordinate of a point measured in degrees east or west of the Greenwich meridian

This table had two indexes.
zip_code_based_lng_lat_zip_idx - used to match the exact zip code
zip_code_based_lat_lng_gist_idx - a GIST index using "ll_to_earth" functionality of "earth_distance" extension.

CREATE INDEX zip_code_based_lat_lng_gist_idx ON zip_code_based_lng_lat USING gist(ll_to_earth(lat::float, lng::float)) ;

Here, "ll_to_earth" returns the location of a point on the surface of the Earth given its latitude (argument 1) and longitude (argument 2) in degrees.

Example data:

  zip  | state |         city         | type |   lat   |   lng  
 00501 | NY    | HOLTSVILLE           | U    | 40.8172 | 73.0451
 00501 | NY    | I R S SERVICE CENTER | U    | 40.8172 | 73.0451

Now, we created a function which returns data given two parameters: a zip code and a radius in miles.

CREATE OR REPLACE FUNCTION get_zip_code_based_records_fn(prm_zip_code varchar, prm_distance_in_miles int default 10) RETURNS SETOF RECORD AS $$
DECLARE
 cur1 cursor ( vlat float, vlng float, v_area_km float) FOR
  SELECT g.*, z.state, z.city, z.type,
         earth_distance(ll_to_earth(vlat, vlng), ll_to_earth(lat::float, lng::float))::float/(1000*1.609) as distance_in_mile
  FROM image_info as g,
       zip_code_based_lng_lat as z
  WHERE substr(g.zip_code,1,5) = z.zip
  AND earth_box(ll_to_earth(vlat, vlng), v_area_km) @> ll_to_earth(lat::float, lng::float)
  AND earth_distance(ll_to_earth(vlat, vlng), ll_to_earth(lat::float, lng::float)) < v_area_km;
 rec record;
 vlat float;
 vlng float;
 v_area_km float;
BEGIN
  SELECT lat, lng
  INTO vlat, vlng
  FROM zip_code_based_lng_lat
  WHERE zip = prm_zip_code
  LIMIT 1;
  v_area_km := (prm_distance_in_miles*1000)::float*1.609;
  FOR rec IN cur1(vlat, vlng, v_area_km) LOOP
    RETURN NEXT rec;
  END LOOP;
END;
$$ LANGUAGE 'plpgsql';

Here two parameters:
  • prm_zip_code - takes the zip code around which we want to get the data
  • prm_distance_in_miles  - takes an integer data denoting the circle distance radius around the "prm_zip_code" in miles; default 10 miles
In the function, we used a cursor with the query:

SELECT g.*, z.state, z.city, z.type,
earth_distance(ll_to_earth(vlat, vlng), ll_to_earth(lat::float, lng::float))::float/(1000*1.609) as distance_in_mile
FROM image_info as g,
          zip_code_based_lng_lat as z
WHERE substr(g.zip_code,1,5) = z.zip
AND earth_box(ll_to_earth(vlat, vlng), v_area_km) @> ll_to_earth(lat::float, lng::float)
AND earth_distance(ll_to_earth(vlat, vlng), ll_to_earth(lat::float, lng::float)) < v_area_km

This cursor takes parameters:
  • vlat - latitude of the given zip code "prm_zip_code"
  • vlng - longitude of the given zip code "prm_zip_code"
  • v_area_km - the search radius around "prm_zip_code"; the user passes "prm_distance_in_miles" and the function converts it to km for use in the cursor

The query uses two functions -
  • earth_box - Returns a box suitable for an indexed search using the cube @> operator for points within a given great circle distance of a location. Some points in this box are further than the specified great circle distance from the location, so a second check using earth_distance should be included in the query.
  • earth_distance - Returns the great circle distance between two points on the surface of the Earth
For more information, please visit http://www.postgresql.org/docs/9.2/static/earthdistance.html
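For intuition, the great-circle distance that earth_distance returns (in meters) can be approximated with the haversine formula; this standalone sketch (not the extension's code) returns kilometers:

```python
import math

# Haversine sketch of the great-circle distance (in km) between two
# lat/lng points; earth_distance computes the same quantity in meters.
def great_circle_km(lat1, lng1, lat2, lng2):
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))
```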
Now, to get the data within a 10 mile distance of zip code 18938, we use -

SELECT *
FROM get_zip_code_based_records_fn('18938') as t
 (id bigint, content_type character varying,
  image_name character varying,
  zip_code character varying,
  distance_in_mile float)
ORDER BY distance_in_mile

It uses the indexes described above and gives results within a second.

Continue reading →
Friday, August 16, 2013

Postgresql 9.2 Streaming Replication Setup

Node 1 :
Node 2:

i) Prepare Node 1:
1. Check that postgresql is running with the following parameters:

listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
hot_standby = on

2. Modify pg_hba.conf for replication

host all replicator trust
host replication replicator trust

3. Create required user for replication


4. Reload pg_hba

pg_ctl reload

ii) Prepare Node 2:

1. Install Postgresql 9.2 at Node2

2. Modify postgresql.conf for replication

listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
hot_standby = on

3. Modify pg_hba.conf for replication

host all replicator trust
host replication replicator trust

4. Restart database

pg_ctl -mf restart

5. Create script to sync with other node and act as slave

vi sync_with_master.sh
#! /bin/sh
psql -h $MASTER -p $MASTER_PGPORT -U $REPL_USER -d postgres -c "select pg_start_backup('replication_backup')"
rsync -azv "$MASTER":"$MASTER_PGDATA"/* "$SLAVE_PGDATA"/ --exclude="postmaster.pid" --exclude="*.conf*"
psql -h $MASTER -p $MASTER_PGPORT -U $REPL_USER -d postgres -c "select pg_stop_backup()"
echo "standby_mode = 'on' # enables stand-by (readonly) mode" > $SLAVE_PGDATA/recovery.conf
echo "primary_conninfo = 'host= $MASTER port= $MASTER_PGPORT user= $REPL_USER'" >> $SLAVE_PGDATA/recovery.conf
echo "trigger_file = '/tmp/pg_failover_trigger'" >> $SLAVE_PGDATA/recovery.conf

iii) Ensure passwordless ssh access for the "postgres" user between the nodes

iv) Make Node 2 slave:

 sh sync_with_master.sh

v) Check replication:

At master:

SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0') AS offset;

At Slave:

SELECT pg_xlog_location_diff(pg_last_xlog_receive_location(), '0/0') AS receive, pg_xlog_location_diff(pg_last_xlog_replay_location(), '0/0') AS replay;

If we get the same number from the above two queries, we can assume that replication is working.
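The 'X/Y' locations returned by these functions can also be compared by hand: X is the high 32 bits and Y the low 32 bits (both hexadecimal) of the byte position, which is what pg_xlog_location_diff(loc, '0/0') computes. A sketch:

```python
# Sketch: convert an 'X/Y' xlog location to an absolute byte offset,
# as pg_xlog_location_diff(loc, '0/0') does. X is the high 32 bits of
# the position and Y the low 32 bits, both in hexadecimal.
def xlog_location_to_bytes(loc):
    hi, lo = loc.split('/')
    return (int(hi, 16) << 32) + int(lo, 16)

# Replication lag in bytes between the master and slave positions.
def replication_lag_bytes(master_loc, slave_loc):
    return xlog_location_to_bytes(master_loc) - xlog_location_to_bytes(slave_loc)
```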

We may even run "CREATE DATABASE db1" on the master and see whether it appears on the slave.

vi) Make Node 2 Master:

touch /tmp/pg_failover_trigger

At this stage, Node 2 starts acting as master while Node 1 is still alive as a stand-alone server.
Modify some data on Node 2. Now we can make Node 1 a slave of Node 2.

Continue reading →
Friday, June 14, 2013

Nagios Script to find unused databases

I created a nagios script to find unused databases in our data centers. It gives us an alert like:

    ***** Nagios *****

    CRITICAL: 4 unused database(s) found at that were last accessed 7 days ago :-

      1) Database db1 last accessed on the 2013-05-22 04:02:08 that is 22 days ago

      2) Database db2 last accessed on the 2013-05-29 04:02:08 that is 15 days ago

      3) Database db3 last accessed on the 2013-05-17 04:02:08 that is 27 days ago

      4) Database db4 last accessed on the 2013-06-05 04:02:07 that is 8 days ago

    Notification Type: PROBLEM

    Service: Check Postgresql Unused Database

    State: CRITICAL

    Alias: dbserver1

    Host: dbserver1


    Date/Time: Thu Jun 13 23:00:09 EDT 2013

The script is available at check-postgres-unused-db.sh. It takes some parameters :-

    Check Postgresql Unused Database

    Description: Creates an alert based on the number of databases older than the given mode, time and units for a given Postgresql database server

    Author: www.rummandba.com

    # This script expects psql to be in the PATH.

    Usage: ./check_postgres_unused_db.sh [-h] [-v][ -H <host> ] [ -P <port> ] [ -U user ] [ -D dbname] [ -m <mode of time> ] [ -t <time value> ] [ -x <units> ] [-w <warn count>] [-c <critical count>]

    -h --help help

    -v --verbose verbose or debug mode

    -H --host host (default

    -P --port port (default 5432)

    -U --user database user (default postgres)

    -S --password database user password

    -D --dbname dbname to connect with postgresql (default postgres)

    -m --mode mode of time to check with current time to mark as unused (c = create, a = access, m = modified; default : a)

    -t --time number of times after that database should be marked as unused

    -x --units units of measurement to compare (s = seconds; m = minutes; h = hours; D = days; M = months; Y = year)

    -w --warning warning threshold; number of databases older than given mode,time and units (default 1 )

    -c --critical critical threshold;number of databases older than given mode,time and units (default 3 )

Based on the DB connection parameters, it connects to the Postgresql server, creates a function "check_pg_unused_db_f", and after completion removes the function.

The function uses the plpythonu language, so the plpythonu language must be available on the Postgresql DB server.

    CREATE OR REPLACE FUNCTION check_pg_unused_db_f(stat_type varchar, dbid int)
      RETURNS text As \$BODY\$
       import os
       import datetime
       afilename = 'base/' + str(dbid)
       (mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime) = os.stat(afilename)
       if stat_type == 'atime':
         return datetime.datetime.fromtimestamp(atime)
       elif stat_type == 'mtime':
         return datetime.datetime.fromtimestamp(mtime)
       elif stat_type == 'ctime':
         return datetime.datetime.fromtimestamp(ctime)
       else:
         return 'UnknownVariable'
      \$BODY\$ Language plpythonu;

The function takes stat_type (a = access, m = modified, c = created) and finds that specific time for the given dbid. The following query gives the desired result:

    SELECT OID, datname, replace(check_pg_unused_db_f('"$TIME_MODE"', OID::int ),' ','_') as check_time FROM pg_database WHERE datallowconn and NOT datistemplate and datname NOT  IN ('postgres')

Each database's check_time is compared with the following three parameters :-

    -m --mode mode of time to check with current time to mark as unused (c = create, a = access, m = modified; default : a)

    -t --time number of times after that database should be marked as unused

    -x --units units of measurement to compare (s = seconds; m = minutes; h = hours; D = days; M = months; Y = year)

For example, if we want to find databases that have not been accessed for the last 15 days, and get an alert if any exist on the server, we may invoke the script like :-

check_postgres_unused_db.sh -H -P 5432 -D postgres -S password -m a -t 15 -x D -w 1 -c 3

After collecting the data, the check_postgres_unused_db script compares the number of unused databases found against the warning and critical thresholds and raises the appropriate alert.
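That final comparison step can be sketched as follows (names and return values are illustrative, not the script's actual code):

```python
# Sketch of the final alerting step: count databases whose last-access
# age exceeds the configured limit and compare the count against the
# warning and critical thresholds. Illustrative names only.
def classify(ages_in_days, limit_days, warn_count, crit_count):
    unused = [a for a in ages_in_days if a > limit_days]
    if len(unused) >= crit_count:
        return 'CRITICAL'
    if len(unused) >= warn_count:
        return 'WARNING'
    return 'OK'
```

With the alert shown above (ages 22, 15, 27 and 8 days against a 7-day limit and thresholds of 1/3), this classifies as CRITICAL.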

Sample Nagios Configuration:

    define command{

         command_name check_postgres_unused_db

          command_line $USER1$/check_postgres_unused_db.sh   -H $HOSTADDRESS$ -P $ARG1$ -U $ARG2$ -D $ARG3$ -m $ARG4$ -t $ARG5$ -x $ARG6$ -w $ARG7$ -c $ARG8$

    }


    define service{

            use                                    generic-service

            host_name                         dbserver

            service_description            Check Postgresql Unused Database

            check_command               check_postgres_unused_db!5432!postgres!postgres!a!15!D!1!3

            contact_groups                  admins,dev-admins

            check_period                   non-workhours

    }


Continue reading →
Monday, June 10, 2013

Nagios to Check Postgresql Least Vacuum Time

Nagios is one of the finest tools for monitoring systems. We use it in our environment to monitor all our systems: databases, application servers, services and so on. check_postgres is one of the tools for monitoring different aspects of the Postgresql database service. It has a check named check_postgres_last_vacuum, which works well for getting notified when the last vacuum date of a table in a database falls critically far behind.
But our requirement was a bit different. We run a manual vacuum each night during backup, and we needed a method that would notify us of the least (oldest) vacuum date of any table in any database on a server, since we could not name a specific database. The reason: if the vacuum failed for one database, we should be notified.
So we created a nagios plugin named check_postgres_least_vacuum and now its working fine.
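The idea behind the check can be sketched in Python (illustrative only; the real plugin is a shell script that queries pg_stat_user_tables.last_vacuum via psql in every database, and the timestamps below are made up):

```python
from datetime import datetime

def least_vacuum_age_days(last_vacuum_times, now):
    """Age in days of the least recently vacuumed table across all databases."""
    # The oldest last_vacuum timestamp is the worst case we alert on.
    return (now - min(last_vacuum_times)).days

# Hypothetical timestamps collected from every database on the server:
now = datetime(2013, 6, 10)
age = least_vacuum_age_days([datetime(2013, 6, 9), datetime(2013, 6, 5)], now)
# `age` is then compared against the -w / -c thresholds (days, with -x D)
```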
The script is available at:

Then we defined the command in command.cfg:
define command{
       command_name  check-postgres-least-vacuum
       command_line  $USER1$/check_postgres_least_vacuum -H $HOSTADDRESS$ -P $ARG1$ -U $ARG2$ -D $ARG3$ -x $ARG4$ -w $ARG5$ -c $ARG6$
}

Then we defined service in dbserver.cfg:

define service{
        use                             generic-service
        host_name                       DBSERVER1
        service_description             Check Postgresql DB Server Least Vacuum time
        check_command                   check-postgres-least-vacuum!5432!postgres!postgres!D!2!3
        contact_groups                  admins,dev-admins
        check_period                    non-workhours
}

Wednesday, March 13, 2013

Insert records instead of delete to reclaim space of a large table

A few days back, I got a task to reclaim space on a database server by deleting a good number of records from a table. The server was running PostgreSQL 9.1.
Initially, the developers planned to delete the rows and then execute a VACUUM FULL on the database:

delete from usage where cacheid in (
select id from cache_tab
where distid in ('1','2','3')
AND createddatetime < '2011-05-01');

delete from cache_tab where distid in ('1','2','3') AND createddatetime < '2011-05-01';

VACUUM FULL ANALYZE;

The table had 612613 records. The table description was:

search_engine2=# \d cache_tab
                                           Table "public.cache_tab"
         Column         |            Type             |                          Modifiers                          
 id                     | integer                     | not null default nextval('cache_tab_id_seq'::regclass)
 cachekey               | character varying(60)       | not null
 userid                 | bigint                      |
 col1                   | character varying(255)      |
 col2                   | bigint                      | not null
 query                  | character varying(5000)     | not null
 resultxml              | text                        | not null
 col4                   | character varying(50)       | not null
 createddatetime        | timestamp without time zone | not null default now()
 col6                   | character varying(255)      | not null
 distid                 | character varying(1000)     |
 col7                   | character varying(1000)     |
 col8                   | character varying(1000)     |
 col9                   | character varying(10)       |
 col10                  | timestamp without time zone | default now()
 col11                  | integer                     | not null default 0
 col12                  | bigint                      |
 col13                  | character varying(15)       |
 col14                  | timestamp without time zone |
 col15                  | boolean                     | default false
 col16                  | character varying(1000)     |
 col17                  | boolean                     | default false
Indexes:
    "cache_tab_pkey" PRIMARY KEY, btree (id)
    "cache_tab_ccol1_pkey" UNIQUE CONSTRAINT, btree (col1)
    "cache_tab_col2_index" btree (col2)
    "cache_tab_col3_index" btree (col3)
    "cache_tab_col4_index" btree (col4)
    "cache_tab_col5_index" btree (col5)
    "cache_tab_query_index" btree (query)
    "cache_tab_distid_index" btree (distid)

Referenced by:
    TABLE "usage" CONSTRAINT "usage_cacheid_fkey" FOREIGN KEY (cacheid) REFERENCES cache_tab(id)

The table size was 88 GB, where the whole database size was 90 GB. Another table, usage, was small, but the tricky part was that it had a foreign key constraint referencing cache_tab.

 \d usage
                                          Table "public.usage"
        Column        |            Type             |                         Modifiers                        
 id                   | integer                     | not null default nextval('usage_id_seq'::regclass)
 hitid                | bigint                      | not null
 cacheid        | bigint                      | not null
 selectionref         | character varying(1000)     |
 createddatetime      | timestamp without time zone | not null default now()
 populateddatetime    | timestamp without time zone |
 populationstatuscode | character varying(20)       | default 'PENDING'::character varying
 lastmodifieddatetime | timestamp without time zone | default now()
Indexes:
    "usage_pkey" PRIMARY KEY, btree (id)
    "usage_hitid_index" btree (hitid)
    "usage_populationstatuscode_index" btree (populationstatuscode)
    "usage_cacheid_index" btree (cacheid)
Foreign-key constraints:
    "usage_hitid_fkey" FOREIGN KEY (hitid) REFERENCES sch_hit(id)
    "usage_cacheid_fkey" FOREIGN KEY (cacheid) REFERENCES cache_tab(id)

As VACUUM FULL would lock the table, and DELETE alone was not going to reclaim space, I took a different approach.

Instead of deleting records, I planned to insert the active records into a new table. My plan was:

1. Make a create table script for cache_tab using pg_dump
2. Disable the foreign key on usage
3. Rename the existing cache_tab to cache_tab_old
4. Create cache_tab from the script
5. Identify the indexes on cache_tab_old and drop them
6. Create the missing indexes on cache_tab
7. Insert into cache_tab from cache_tab_old according to the filter, in small chunks
8. Delete data from usage that is no longer needed
9. Make a dump backup of cache_tab_old and drop the table to reclaim space
10. Enable the foreign key

Step 1: Make a create table script for cache_tab using pg_dump


pg_dump search_engine2 -t cache_tab -s > cache_tab.sql.schema


Step 2: Disable Foreign Key at usage

ALTER TABLE usage  drop constraint usage_cacheid_fkey;

Step 3: Rename existing cache_tab to cache_tab_old

ALTER TABLE cache_tab rename to cache_tab_old;


Step 4. Create cache_tab from the script

psql search_engine2 < cache_tab.sql.schema

This created the table without indexes; psql reported errors that the indexes already existed (they still belonged to cache_tab_old), which I expected.
I created the table at this point to keep the downtime minimal: the application was ready to go again, and since the new table was very small, running without indexes would not hurt performance.

Step 5: Identify indexes from cache_tab_old and drop those indexes

Created a drop index script (note that the indexes still belong to the renamed table, cache_tab_old):
SELECT 'DROP INDEX '|| c2.relname || ';' as drop_cmd, pg_catalog.pg_get_indexdef(i.indexrelid, 0, true) as cmd
FROM pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_index i
  LEFT JOIN pg_catalog.pg_constraint con ON (conrelid = i.indrelid AND conindid = i.indexrelid AND contype IN ('p','u','x'))
WHERE c.relname = 'cache_tab_old' AND c.oid = i.indrelid AND i.indexrelid = c2.oid
ORDER BY i.indisprimary DESC, i.indisunique DESC, c2.relname;

Executed the generated DROP INDEX commands.
Saved the output of the cmd column (the index definitions) for the next step.

Step 6. Create missing indexes

Replaced cache_tab_old with cache_tab in the saved CREATE INDEX commands and executed them to create the indexes.
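The rewrite is a plain text substitution over the definitions captured in step 5; a quick sketch in Python (the sample definition below is illustrative):

```python
# Rewrite the index definitions captured in step 5 to target the new table.
# `index_defs` holds the pg_get_indexdef output; one example entry shown.
index_defs = [
    "CREATE INDEX cache_tab_col2_index ON cache_tab_old USING btree (col2)",
]
create_cmds = [d.replace("cache_tab_old", "cache_tab") for d in index_defs]
```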

Step 7. Insert into cache_tab from cache_tab_old according to filter in small chunk

At this moment, I started to transfer data from cache_tab_old to cache_tab. I modified the WHERE condition given by the developers and also made a function to do this in small chunks:
insert into cache_tab
select * from cache_tab_old
where (createddatetime >= :START_TIME and  createddatetime < :END_TIME )
and distid not in ('1','2','3');

I used time frames of two months, like:
2011-05-01 - 2011-07-01
2011-07-01 - 2011-09-01
...
2013-03-01 - 2013-05-01

Before executing the insert statement, I created an index on cache_tab_old on (createddatetime,distid).

It took almost an hour to insert all 444279 records into the table.
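The two-month chunking can be sketched like this (illustrative Python; the original used a SQL function, so the window arithmetic here is an assumption about its shape):

```python
from datetime import date

def two_month_windows(start, end):
    """Yield (window_start, window_end) pairs covering [start, end) in two-month steps."""
    cur = start
    while cur < end:
        # Advance two calendar months.
        month = cur.month + 2
        nxt = date(cur.year + (month - 1) // 12, (month - 1) % 12 + 1, cur.day)
        if nxt > end:
            nxt = end
        yield cur, nxt
        cur = nxt

# Each window supplies the :START_TIME / :END_TIME bounds of one
# INSERT ... SELECT statement.
windows = list(two_month_windows(date(2011, 5, 1), date(2013, 5, 1)))
```

Running each window as its own INSERT keeps the transactions small, so the table is never locked for long and progress is easy to track.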

Step 8. Delete data from usage that is no longer needed

I created a backup copy of usage as usage_bak and removed the no-longer-needed rows from usage.

Step 9. Make a dump backup of cache_tab_old and drop the table to reclaim space

Before creating a dump backup of cache_tab_old for archiving, I needed to exclude the records that were already in cache_tab.
Again, instead of deleting them, I created another table containing only the removed records:
CREATE TABLE cache_tab_deleted_data as
select * from cache_tab_old where distid in ('1','2','3') AND createddatetime < '2011-05-01';
DROP TABLE cache_tab_old;

pg_dump search_engine2 -t cache_tab_deleted_data -Fc > cache_tab_deleted_data.sql.mar122013
pg_dump search_engine2 -t usage_bak -Fc > usage_bak.sql.mar122013

Step 10. Enable the foreign key

Re-created the foreign key that I dropped in step 2:

ALTER TABLE usage ADD CONSTRAINT usage_cacheid_fkey FOREIGN KEY (cacheid) REFERENCES cache_tab(id);

Now the cache_tab table size was 26 GB, down from the earlier 88 GB.

In this way, I reclaimed 62 GB of space with almost no downtime or locking.
