About Me

My photo
Singapore, Singapore, Singapore
I am currently working as a DB Consultant with Ashnik. We are into Open Source and Cloud based enterprise class solutions. Prior to this I was working as a DBA @ Misys India (Banking Services). Working on IBM DB2, Oracle Microsoft SQL Server. Selected as IBM Champion for Data Management 2011. I like bit of photography, cycling, poetry and I just love gadgets!!

Connect to me

See my profile on LinkedIn Visit my Ashnik Website

Optimize your Tablespaces and Bufferpools-1: Tablespace Design design

hey people... I am back after a long time... Office had kept me busy, and it will continue to do so for next few quarters... After all we have promised ourselves to build a world class product, which got to be best!!!

Well so getting back to db2... In this post I will discussing little bit on the physical database design...

Lets discuss the tablesapces first...

So lets start with pagesize... If u are really concerned about growth of your data, you should better go for a 32K pagesize, offering maximum storage... For configuration data and static data though you can choose to go with a 4k/8k pagesize and you might see some improvement in bufferpool and disk utilization (especially if you have a table with random access for records)...

Next is file-system caching... DB2 always caches the data (and caches it in more usable format for it db-engine) in bufferpools... So needless to say, if your OS is also caching the data, it is wasting both time and memory... So you better keep filesystem caching off (thank god it is default in 9.7)...

Exten Size is something which should be wisely thought out (esp when using automatic storage)... It is the amount of data/index pages which will be written to a container before going to the next container... If it is too low, you have too much of scattered data whereas if it is too high you have too much of I/O load... In case you do not specify extent size while creating the tablespace it will take it from dft_extent_sz db cfg... A thumb of rule is to use following formula:
RAID disk drives * Disk strip size

Prefetch Size is the no of pages for pre-fetching... It should not be smaller then extent size in any case, other wise you end up wasting I/O cycles... Most optimum suggestion is to keep it same as extent size, but if you are using multiplexing of data pages, it might be a good idea to keep it in multiples of extent size...

Overhead and transfer rate are two mostly ignored aspects of tablespace design and people tend to go with defaults... See generally hardware performances changes much rapidly and moreover organizations tend to change the hardware more frequently compared to s/w (at times it is vice versa)... Hence you transfer rate (which conveys to optimizer the speed @ which data will be tranferred from underlying container) and overhead (which conveys the other disk seek and various latencies related to disk) should be set properly... If they are wrong it might impact as much as deciding for a table scan instead of an index scan just cause the optimizer thinks the seek time will be too much for index pages...

At last let's try to answer the question on "If DMS/SMS/ASM???"... Well I will not bore with what's pros n cons of each as you can get lot of content on internet on this, and I do not want to copy paste stuff... :)
Well I advise going for Automatic Storage, as it is more easy to manage and handles parallelism more efficiently... It can utilize RAW devices more efficiently...

Lastly you can enable I/O parallelism with-in a tablespace (if it is spread across multiple containers)... for example If you have RAID 7 array you can set the below registry parameter using db2set

DB2_PARALLEL_IO=*:7


I shall continue the topic in next post... Where I will discuss on sizing and increment for tablespaces and also the temporary tablespaces...

7 comments:

  1. Nice summary. I would like to include: 1) physical location of files e.g separate out data and log
    2)Write back cache settings

    ReplyDelete
  2. Thanks Jack...

    I had given a summary over here on separation of log and data:
    http://db2performance.blogspot.com/2011/02/well-performance-engineering-for.html

    But in one of coming threads to this series I might be talking more about separation of data and index data and config and transactional data. Also will share little more on bufferpool allocation. Stay tuned!!!

    I could not get your 2nd suggestion and I must admit it is first time I am hearing about "Write back cache setting" or may be I know it by some other name/concept...

    ReplyDelete
  3. A good post as ususal, Sameer! - Sathy

    ReplyDelete
  4. Hi, the write back cache control is the caching method where modifications to the data aren't moved to the source until absolutely necessary. As opposed to the write through method .
    Check http://www.dba-db2.com/2011/07/oltp-performance-checklist.html for a wider list of performance checklist

    ReplyDelete
  5. Overhead and transfer rate ... should be set properly... can you please show how? any rule of thrumb? thanks

    ReplyDelete
  6. Thanks Jack...
    I too got some info on this... I knew a a bit of it, I got the name for this concept/phenomenon... Oracle as well does that, it writes data from buffer to disk (data file) on a commit counter or on a time interval (whichever happens first)...
    I am still not sure if the same can be set (commit counter / interval) for db2 as well...

    ReplyDelete
  7. And hey, overhead and transfer rate can not be decide by thumb of rule... Infact the thumb keeps growing with advancements in technologies... So if you have the most advanced set of I/O devices, make sure you know the disk seek rate and data transfer rate for the disks... You can set the OVERHEAD (as roughly seektime in milliseconds) and TRANSFERRATE (time in milliseconds to read 1 page of data i.e. PAGESIZE/Data tranferrate) accordingly...

    ReplyDelete