About Me

My photo
Singapore, Singapore, Singapore
I am currently working as a DB Consultant with Ashnik. We are into Open Source and Cloud based enterprise class solutions. Prior to this I was working as a DBA @ Misys India (Banking Services). Working on IBM DB2, Oracle Microsoft SQL Server. Selected as IBM Champion for Data Management 2011. I like bit of photography, cycling, poetry and I just love gadgets!!

Connect to me

See my profile on LinkedIn Visit my Ashnik Website

RUNSTATS and REORG-3: An effective REORG policy

Hi guys... so I am back with my blog sequence on reorgs and runstats... So just a recap my last two posts in this sequence had just explained "what is the importance of using RUNSTATs and REORGs" and then I discussed "How to 'effectively' automate RUNSTATS"...

So I will end the "saga" with REORG policy and it could not have come at a better time, when I had just fixed some major I/O anomalies with REORG (how many of you are thinking I am boasting?)...
Ok lets start with a problem statement...

I was told about about two problems, firstly Sudden Spikes in Reads at DB Server and Sudden Spikes in Writes at DB Server...

I really could not justify the Writes (but it struck me much later) but had an idea about reads... Also, at the design I had done for tablespace included some reserve for prefetching/blocked I/O... They were performing poor... Ooopss!!!

What helped me find it out about the bad performance of blocked I/O was this snapshot:

SELECT BP_NAME,
CAST(VECTORED_IOS as decimal(24,10)) / NULLIF((POOL_DATA_P_READS + POOL_INDEX_P_READS), 0) AS VECTORED_IO_FRACTION,
CAST (UNREAD_PREFETCH_PAGES as decimal(24,10)) / NULLIF(PAGES_FROM_BLOCK_IOS,0) AS WASTEFULL_PREFETCH_FRACTION, CAST((POOL_ASYNC_DATA_READ_REQS + POOL_ASYNC_INDEX_READ_REQS) as decimal(24,10)) / NULLIF((POOL_DATA_P_READS+POOL_INDEX_P_READS) - VECTORED_IOS,0) AS ANSYNC_PREFETCH_EFFICIENCY FROM SNAPBP;


So I had to do some analytic... First thing I did was running the runstats on tables and indexes and then I tried to find the amount of defragmentation:

SELECT I.TABNAME TABLE_NAME, I.INDNAME INDEX_NAME, REPLACE(SUBSTR(I.COLNAMES, 2) ,'+',',') AS MEM_COLS,
T.LASTUSED AS TAB_LASTUSED, I.LASTUSED AS INDX_LASTUSED,
I.NUM_EMPTY_LEAFS/NULLIF(I.NLEAF+NUM_EMPTY_LEAFS,0) INDX_EMPTY_LEAF_FRACTION,
T.FPAGES-T.NPAGES TAB_EMPTY_PAGES_CNT, I.NLEAF+NUM_EMPTY_LEAFS/(num_partitions*extent_size) AS INDX_REORG_NEEDED_IF_GT_1,
T.FPAGES/(num_partitions*extent_size) TAB_REORG_NEEDED_IF_GT_1
FROM SYSCAT.INDEXES I JOIN SYSCAT.TABLES T ON T.TABNAME=I.TABNAME AND T.TABSCHEMA=I.TABSCHEMA AND T.TABSCHEMA IN (schemalist) AND T.TYPE='T';


I guess the column names are pretty self explainatory and I need not explain what I got was huge no of index leaves scattered around and tables which were defragemented... So to explain the problems in terms of observation:

1)
Too many disk seeks happening for read (even for index scans)
2) The sequential bloced IO fails miserably as the blocks pertaining to same table/index are scatterd
3) The writes, a record inserted/updated might have to re-balance the indexes, which is scattered so write cost increases
4) Finally the free leaves (for indexes)/blocks (for tables) are not available sequentially...

So I decided to do a REORG.. Most of the times people re-org with "REORG TABLE SCHEMA.TABLENAME" Guys it is hardly gonna help... Actually the one should arrange data in such fashion with will be usefull... So it is always good to reorg based on an index which can help you fetch data faster when you are going by that index (so choose the index you use more often)... Wait that is not enough... Before doing a REORG on your tables, you should have REORGed all your indexes on every table...

So I wrote a stored proc to dynamially select an index for every table and generate re-org statements for "REORG INDEXES ALL FOR TABLE" and "REORG TABLE USING INDEX"... Guess what!!! those momentary spikes were past and we had a good stable bufferpool utilization...

Many people while reading my last post would have thought AUTO_REORG as an equally useful method to do reorgs... i.e. to leave it to the database manager... But you might end up messing your OLTP performance and even your batch jobs which do bulk inserts/deletes... So best way is to keep checking weekly/fortnightly/monthly with the above queries on syscat tables to check if reorg is required or not... I really do not rely much on REORGCHCK utility from DB2... It is also important to run reorg on a regular interval based on about of data inflows and gets modified in your system...

So what next??? Probably some SQL Tuning tips!!! Till then HAPPY PREFETCHING!!!

8 comments:

  1. If you define your index as clustering, then it is the equivalent of REORG TABLE USING INDEX ...

    You would just run REORG TABLE SCHEMA.TABLENAME and it will order the data to match your clustering index.

    Norm

    ReplyDelete
  2. Agree... Missed the point!!! But I guess clustering has some insert/update overheads as well... Also, I have an option to choose and change the index for organizing records (I could do that with clustered index as well but involves re-creation of indexes) if my application logic changes...

    But yeah in general it might be beneficial to use one of the indexes as clustered ones... And in such scenario REORG TABLE is enough...

    Thanks Regards
    Sameer Kumar

    ReplyDelete
  3. "at the design I had done for tablespace included some reserve for prefetching/blocked I/O
    "
    you mean block-based bufferpool?
    -sathy

    ReplyDelete
  4. Sathy,

    Yup while designing we had reserved some amount of pages in bufferpool for blokced based IO, infact I was thinking of now increasing it. Why? any issues with it?

    ReplyDelete
  5. There's no column as "LASTUSED" in both SYSCAT.TABLES and SYSCAT.INDEXES. I have DB2 UDB V9.5

    ReplyDelete
  6. I guess it was introduced only in db2 v9.7:

    http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.mon.doc%2Fdoc%2Fc0056434.html&resultof=%22last%22%20%22used%22%20%22us%22%20

    ReplyDelete
  7. Thanks, Samkumar.

    So in your opinion, what's the best way to find "lastused" info for an index in UDB V9.5?

    ReplyDelete
  8. I guess only way is to create an event monitor for tables. I am not aware of any other way to achieve this in v9.5

    ReplyDelete