Hi guys... so I am back with my blog sequence on reorgs and runstats... So just a recap my last two posts in this sequence had just explained "what is the importance of using RUNSTATs and REORGs" and then I discussed "How to 'effectively' automate RUNSTATS"...
So I will end the "saga" with REORG policy and it could not have come at a better time, when I had just fixed some major I/O anomalies with REORG (how many of you are thinking I am boasting?)...
Ok lets start with a problem statement...
I was told about about two problems, firstly Sudden Spikes in Reads at DB Server and Sudden Spikes in Writes at DB Server...
I really could not justify the Writes (but it struck me much later) but had an idea about reads... Also, at the design I had done for tablespace included some reserve for prefetching/blocked I/O... They were performing poor... Ooopss!!!
What helped me find it out about the bad performance of blocked I/O was this snapshot:
SELECT BP_NAME,
CAST(VECTORED_IOS as decimal(24,10)) / NULLIF((POOL_DATA_P_READS + POOL_INDEX_P_READS), 0) AS VECTORED_IO_FRACTION,
CAST (UNREAD_PREFETCH_PAGES as decimal(24,10)) / NULLIF(PAGES_FROM_BLOCK_IOS,0) AS WASTEFULL_PREFETCH_FRACTION, CAST((POOL_ASYNC_DATA_READ_REQS + POOL_ASYNC_INDEX_READ_REQS) as decimal(24,10)) / NULLIF((POOL_DATA_P_READS+POOL_INDEX_P_READS) - VECTORED_IOS,0) AS ANSYNC_PREFETCH_EFFICIENCY FROM SNAPBP;
So I had to do some analytic... First thing I did was running the runstats on tables and indexes and then I tried to find the amount of defragmentation:
SELECT I.TABNAME TABLE_NAME, I.INDNAME INDEX_NAME, REPLACE(SUBSTR(I.COLNAMES, 2) ,'+',',') AS MEM_COLS,
T.LASTUSED AS TAB_LASTUSED, I.LASTUSED AS INDX_LASTUSED,
I.NUM_EMPTY_LEAFS/NULLIF(I.NLEAF+NUM_EMPTY_LEAFS,0) INDX_EMPTY_LEAF_FRACTION,
T.FPAGES-T.NPAGES TAB_EMPTY_PAGES_CNT, I.NLEAF+NUM_EMPTY_LEAFS/(num_partitions*extent_size) AS INDX_REORG_NEEDED_IF_GT_1,
T.FPAGES/(num_partitions*extent_size) TAB_REORG_NEEDED_IF_GT_1
FROM SYSCAT.INDEXES I JOIN SYSCAT.TABLES T ON T.TABNAME=I.TABNAME AND T.TABSCHEMA=I.TABSCHEMA AND T.TABSCHEMA IN (schemalist) AND T.TYPE='T';
I guess the column names are pretty self explainatory and I need not explain what I got was huge no of index leaves scattered around and tables which were defragemented... So to explain the problems in terms of observation:
1) Too many disk seeks happening for read (even for index scans)
2) The sequential bloced IO fails miserably as the blocks pertaining to same table/index are scatterd
3) The writes, a record inserted/updated might have to re-balance the indexes, which is scattered so write cost increases
4) Finally the free leaves (for indexes)/blocks (for tables) are not available sequentially...
So I decided to do a REORG.. Most of the times people re-org with "REORG TABLE SCHEMA.TABLENAME" Guys it is hardly gonna help... Actually the one should arrange data in such fashion with will be usefull... So it is always good to reorg based on an index which can help you fetch data faster when you are going by that index (so choose the index you use more often)... Wait that is not enough... Before doing a REORG on your tables, you should have REORGed all your indexes on every table...
So I wrote a stored proc to dynamially select an index for every table and generate re-org statements for "REORG INDEXES ALL FOR TABLE" and "REORG TABLE USING INDEX"... Guess what!!! those momentary spikes were past and we had a good stable bufferpool utilization...
Many people while reading my last post would have thought AUTO_REORG as an equally useful method to do reorgs... i.e. to leave it to the database manager... But you might end up messing your OLTP performance and even your batch jobs which do bulk inserts/deletes... So best way is to keep checking weekly/fortnightly/monthly with the above queries on syscat tables to check if reorg is required or not... I really do not rely much on REORGCHCK utility from DB2... It is also important to run reorg on a regular interval based on about of data inflows and gets modified in your system...
So what next??? Probably some SQL Tuning tips!!! Till then HAPPY PREFETCHING!!!
I will be discussing my daily experiences and tips I have learned (from google, my seniors and various other blogs) for better db2 performance...
About Me
- Sameer
- Singapore, Singapore, Singapore
- I am currently working as a DB Consultant with Ashnik. We are into Open Source and Cloud based enterprise class solutions. Prior to this I was working as a DBA @ Misys India (Banking Services). Working on IBM DB2, Oracle Microsoft SQL Server. Selected as IBM Champion for Data Management 2011. I like bit of photography, cycling, poetry and I just love gadgets!!