Implementing Yes/No to continue
Here is a quick how-to for doing something like “Do you wish to continue Yes/No” within shell scripting:
echo "Will you like to delete $dir under $PWD for a fresh extract? (y/n)"
read opt
case $opt in
Y|y) cleanfolder $dir ;;
[Yy][Ee][Ss]) cleanfolder $dir ;;
N|n) exit ;;
[Nn][Oo]) exit ;;
*) exit;;
esac
This will look for almost all kinds of combination of Yes/No options and do the appropriate action depending on the user input.
--
Shoaib Mir
shoaibmir[@]gmail.com
Option parser in Python
I have just started playing around with Python like a few weeks back and having lots of fun scripting in it. Today at work got to know of a very nice feature with Python which lets you easily configure the options taken from command line, which is called option parser.
Firstly, you need to include the library with ‘import’:
from optparse import OptionParser
Now you can use it like this:
usage = "usage: %prog -H hostname -p port -d diskpath -t threshold"
parser = OptionParser(usage=usage, version="%prog 1.0")
parser.add_option("-H", "--host", action="store", type="string", dest="hostname",
help="input host name for database server")
parser.add_option("-d", "--disk", action="store",
type="string", dest="diskpath", help="input disk path")
parser.add_option("-p", "--port", action="store", type="string",
dest="dbport", help="input port for database server")
parser.add_option("-t", "--thresh", action="store", type="int",
dest="thresh", help="input threshold for disk size ")
Where every add_option means adding a new argument on command line to check for and the best thing is you just put in the help section with the add_option function call and it automagically builds the full help section for the program to be use with “-h”, so that means you cant have an option with “-h” as that clashes with the default help option for option parser, and that was the reason I had to go with “-H” for hostname
In order to get the values into variables just simply use:
(options, args) = parser.parse_args()
if options.hostname:
hostname=options.hostname
if options.diskpath:
diskpath=options.diskpath
if options.dbport:
dbport=options.dbport
if options.thresh:
thresh=options.thresh
Now when you run the program with “-h” option you get the following:
shoaib@shoaib-desktop:~/Desktop/Scripts$ ./mgmnt -h
Usage: mgmnt -H hostname -p port -d diskpath -t threshold
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-H HOSTNAME, --host=HOSTNAME
input host name for database server
-d DISKPATH, --disk=DISKPATH
input disk path
-p DBPORT, --port=DBPORT
input port for database server
-t THRESH, --thresh=THRESH
input threshold for disk size
–
Shoaib Mir
shoaibmir[@]gmail.com
GCC tricks for the Linux Kernel
While debugging some GCC specific linking problems, came across this very handy resource for gcc tips and tricks… very helpful indeed!
Check it out at –> http://www.ibm.com/developerworks/linux/library/l-gcc-hacks/index.html
–
Shoaib Mir
shoaibmir[@]gmail.com
Scripting within psql
In one of my previous posts I was told of using PSQL instead of shell scripting to make things much simpler. That turned out to a good tip and with some help from my colleague at work I am now so used to PSQL in these kind of scenarios.
Just today I had to get the dead tuple percentages (using pgstattuple contrib) for tables within all databases on the server and here is how I was to get it done using PSQL and a simple shell script….
#!/bin/bash
PSQL="psql -A -t"
DBLIST=$($PSQL postgres -c "select datname from pg_database
where not datistemplate and datname not in ('postgres')")
for DB in $DBLIST; do
echo $DB;
psql $DB < gettupd.sql > $DB.out
done
exit 0
gettupd.sql
\a
\t
\o /tmp/tupd.sql
select 'SELECT \'' || relname || '\', * from pgstattuple('''||schemaname||
'.'||relname||''');' from pg_stat_user_tables ;
\o
\i /tmp/tupd.sql
\! rm /tmp/tupd.sql
\a
\t
PSQL is just the way to go…
–
Shoaib Mir
shoaibmir[@]gmail.com
A quick way to change index tablespace
On one of the database servers, I had to quickly move all indexes from all the databases on to a different disk in order to free up some space from the main disk. I thought of sharing it so that someone else wanting to do the same thing can get an idea for doing the same….
A quick how to,
First made a query to get all the indexes within a database using the following:
select pg_namespace.nspname||'.'||pg_class.relname from pg_class, pg_namespace where pg_class.relnamespace = pg_namespace.oid and pg_class.relkind = 'i' and pg_class.relname not like 'pg_%'
and then using a simple script, plugging in the the above mentioned query to do an ALTER INDEX for each index by setting the new table space:
#!/bin/bash
DNAME_LIST=$(psql postgres -A -t -c "select datname from pg_database where
not datistemplate and datname != 'postgres'")
for DATABASE in $DNAME_LIST; do
INAME_LIST=$($PSQL $DATABASE -A -t -c "select pg_namespace.nspname||'.'||pg_class.relname
from pg_class, pg_namespace where
pg_class.relnamespace = pg_namespace.oid and pg_class.relkind = 'i' and pg_class.relname
not like 'pg_%'")
( for INDEX in $INAME_LIST; do echo "ALTER INDEX $INDEX SET TABLESPACE
new_disk_index;"; done ) | psql $DATABASE
done
–
Shoaib Mir
shoaibmir[@]gmail.com
Row size and Btree index
At work today saw an error while inserting data into a table:
ERROR: index row size 2962 exceeds btree maximum, 2713
Looking at the HINT in the error, found out that you cant have the row size of more then 1/3th of the PostgreSQL block size (8K by default) for a btree index on the column.
The best is to start using PostgreSQL full text search option for these kind of scenarios as it will be much faster and easier… but in my case that was not an option so had to somehow get around this problem without using text search engine.
After a little bit of research two possible solutions I could find out were (where column was of text data type):
- Create a hash index on the same column as it will let you store more then 1/3rd of the block size, but a couple of things you need to consider before you go that route…
- It lets you get the data inserted for more then the 1/3rd of the block size but that again has limitation, you cant go above the PG block size and even closer to that.
- With Hash indexing you wont be able to use the text pattern matching functions “LIKE” and “ILIKE”, you are left with “=”
- The other solution is to use the partial index where CREATE INDEX will look something like this…
create index .... where length(clm_name) < 500
…this will require changing the queries as well, now you will have to add “where length(clm_name) < 500″ while doing SELECTS.
–
Shoaib Mir
shoaibmir[@]gmail.com
Finding top read tables in the database
A quick and good way to find the most used tables in a database with regards to reads can be:
PostgreSQL provides a system catalog view “pg_stat_all_tables” for helping DBAs with finding such statistics.
The view looks like this:
Columns ------------- relid relname seq_scan seq_tup_read idx_scan idx_tup_fetch n_tup_ins n_tup_upd n_tup_del
This view has one entry for each table in the database. The seq_scan column tells you how many sequential scans have been performed for a given table and seq_tup_read tells you how many rows were used during those sequence scan. The idx_scan and idx_tup_fetch gives the same kind of information for the index scan usage. The n_tup_ins, n_tup_upd and n_tup_del gives you information on how many rows were inserted, update and deleted.
We can use this view in the following way to get the top 10 most read tables…
SELECT relname, idx_tup_fetch + seq_tup_read as TotalReads from pg_stat_all_tables WHERE idx_tup_fetch + seq_tup_read != 0 order by TotalReads desc LIMIT 10;
(With thanks to my favorite book "PostgreSQL by Korry Douglas")
–
Shoaib Mir
shoaibmir[@]gmail.com