
db2exfmt

So in the last episode (#45) of The DB2Night Show, people presented db2look, db2move, db2this, db2that (sorry Scott, for stealing your words)... So I continue the thread of my pseudo-presentation... But again I emphasize: do view the actual videos... Finals video no. 3 is out now, and it has some really cool tips... Tomorrow is finals no. 4 (I know, I am lagging behind :))... Watch it live or catch up with the replay later... But do watch, and vote for whoever you find the best...


Well, getting back to my "would have been" presentation...

db2exfmt

A lot of times a query will be slapped back in your face with the complaint that it is non-performant... It is taking too much CPU, or maybe I/O, or maybe sorting (see the previous post to know how you can list such queries)...
In such cases you always wonder what it is that is causing the problem... Which part of the query is the culprit!!! :-o
More often than not, all that gets reported is a part of the application or a module which has a high response time...

So how do we proceed...
Here is a db2something to rescue you... db2exfmt

This tool will help you format the contents of the DB2 EXPLAIN tables. AIX users can locate it in DB2_HOME/sqllib/misc

Let's see how you can use this tool. We will discuss two scenarios: one where we know the badly performing query, and another where we only know the part of the application which has performance issues:

Prerequisite
-- You need to create the EXPLAIN tables using EXPLAIN.DDL, which lives in the above-mentioned directory
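
For example, from the CLP this can be as simple as (a sketch; I am assuming a database named SAMPLE here):

>>> db2 CONNECT TO SAMPLE
>>> cd DB2_HOME/sqllib/misc
>>> db2 -tf EXPLAIN.DDL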

For the first scenario:
-- Use the EXPLAIN statement to generate explain information for a query

Following are the valid DB2 statements which can be used in an EXPLAIN statement:
* CALL, Compound SQL (Dynamic), DELETE, INSERT
* MERGE
* REFRESH
* SELECT
* SELECT INTO
* SET INTEGRITY
* UPDATE
* VALUES
* VALUES INTO

Example:
>>> EXPLAIN ALL FOR SELECT Col1 FROM Test_Tab

In the EXPLAIN statement, use WITH SNAPSHOT to ensure that an explain snapshot is also recorded along with the explain info...
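
For example, the statement above becomes:

>>> EXPLAIN ALL WITH SNAPSHOT FOR SELECT Col1 FROM Test_Tab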

This step will capture the explain information in the explain tables.

-- In case of scenario 2, i.e. you want to find the performance hotspots in your application: use the CURRENT EXPLAIN MODE and CURRENT EXPLAIN SNAPSHOT special registers. This will enable capturing of the snapshot/explain info for all the queries fired by the application in that session...
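
A minimal sketch of such a session (setting the registers to YES captures the info while still executing the queries; EXPLAIN would capture without executing):

>>> db2 SET CURRENT EXPLAIN MODE YES
>>> db2 SET CURRENT EXPLAIN SNAPSHOT YES
... run the application's queries in this session ...
>>> db2 SET CURRENT EXPLAIN MODE NO
>>> db2 SET CURRENT EXPLAIN SNAPSHOT NO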

-- Once you have the EXPLAIN info captured, use db2exfmt to format the explain data

db2exfmt can be either invoked with arguments or can be invoked in interactive mode.
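
For instance, a typical invocation would be (a sketch, again assuming a database named SAMPLE; -w -1 means the latest explain timestamp, % is a wildcard for schema and name, and -# 0 means all sections):

>>> db2exfmt -d SAMPLE -g TIC -w -1 -n % -s % -# 0 -o query.exfmt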




Now let's explore the various bits of info you can get from the explain tables using db2exfmt:

>> You can use the -g option to generate a graph. There are various sub-options to include more detailed info in the graph:
-gT will include the cost for each operation in the graph
-gI will include the I/O cost for every operation in the graph
-gF will include the first tuple cost
-gC will include the expected cardinality
One can combine any of the above options, e.g. -gCI will give the cardinality and I/O cost for every operation in the graph

You can either choose to format the latest explain info, or use -w timestamp to format the explain info of a specific timestamp...

Or you can specify the source schema/source name to narrow down your search for hotspots, using -s or -n

-o / -t will let you redirect the output to an output file or to your terminal.



Now it is worth mentioning the other tools available in DB2 for similar usage

>> With db2expln you can get the explain plan in the form of a graph with less hassle, but you need to use it for individual queries. It is much better when you have a few queries and you know exactly which ones; it cannot be used to capture the explain info of all the queries being fired on the db by an application

>> With the db2caem (db2 Create Activity Event Monitor) tool you can do all of this with much more ease... But you need to be on DB2 9.7 Fix Pack 3... Trust me, this tool is the best you would have ever used for getting an explain plan... And do watch the replay of episode #45, where someone has explained this tool quite well...

Well next time I will be back with a success story... Probably I will share how we monitored and got rid of high I/O contention and memory usage issues...


How to convey performance metrics from SNAPSHOT views...

I had recently participated in DB2 Has Talent and was lucky enough to be selected for the finals... I missed the finals though, due to my travel schedule... But I will write my next few blog posts with the content I would have presented had I reached the last episode... And do watch the actual show (it has much cooler tips)...

To start with, I will write on performance metrics from snapshot info which can help you identify performance issues (especially before you hit any bottlenecks)...

Some time back I had been involved in a performance tuning exercise. We tuned all the top expensive SQLs, and we added a few new indexes as well (a lot gets identified during development itself, which is a good practice to have); we had fixed all the individual SQLs which were taking more time than the benchmark. Despite all this, the overall system performance still would not come up... I decided to take up the task of monitoring the database during the performance load tests... That was an adventurous experience... I had decided to track the following:

1) SQLs which are below the benchmark, but are taking more time compared to what they should be taking
2) SQLs which are spending too much of their time sorting the data
3) SQLs which are executing with a misguided plan (i.e. not getting proper statistics)
4) SQLs which are spending too much time on I/O while fetching fewer rows, i.e. which have a very high I/O time per row fetched
5) The number of data pages and index pages in the bufferpool that are read and written by a query (trust me, you cannot get a better metric for identifying table scans)
6) One thing which I had not thought of earlier, but which eventually became my concern: SQLs which get executed only once or twice
7) SQLs which are executed over and over, where the total execution time was too much
8) How efficient the log writes and log reads are
9) Whether there are too-long transactions, and what the frequency of commits is


After this I had to explore a lot to find out which snapshot view has what info that could help me arrive at the correct conclusions regarding the above points... Going through all the snapshot views and their columns helped me add a few more parameters (which I didn't have on my list when I started)

The following expressions helped me get the stats I was looking for.

First, some metrics for individual statements (from the statement snapshot, SNAPSTMT); a combined query sketch follows this list:

1) SUM(CAST(TOTAL_SORT_TIME as decimal(24,10)))/NULLIF(SUM((STMT_USR_CPU_TIME_S*1000000) + STMT_USR_CPU_TIME_MS) + SUM(TOTAL_SORT_TIME),0)
AS TOTAL_SORT_TIME_FRACTION


2) CAST(SUM((POOL_TEMP_INDEX_L_READS+POOL_TEMP_DATA_L_READS) - (POOL_TEMP_DATA_P_READS+POOL_TEMP_INDEX_P_READS)) as decimal(24,10))/NULLIF(SUM(POOL_TEMP_INDEX_L_READS+POOL_TEMP_DATA_L_READS),0)
AS TOTAL_TEMP_BP_UTIL


3) CAST(SUM(POOL_DATA_P_READS) as decimal(24,10))/NULLIF(SUM(POOL_DATA_L_READS),0) TOTAL_DATA_BP_REFRESH

4) QUERY_CARD_ESTIMATE AS EXPECTED_CARDINALITY,
ROWS_READ, filtered with WHERE QUERY_CARD_ESTIMATE/NULLIF(ROWS_READ,0) >= 1.5

5) SUM(CAST(TOTAL_SORT_TIME as decimal(24,10)))/NULLIF(SUM((STMT_USR_CPU_TIME_S*1000000) + STMT_USR_CPU_TIME_MS),0)
AS SORT_TO_CPU_TIME_RATIO
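
Putting a couple of these together, a minimal per-statement sketch (assuming the SYSIBMADM.SNAPSTMT admin view; STMT_TEXT is populated for dynamic statements only, I have dropped the SUMs that aggregated across rows, and the filter is rewritten to avoid integer division):

SELECT SUBSTR(STMT_TEXT, 1, 80) AS STMT,
CAST(TOTAL_SORT_TIME as decimal(24,10))/NULLIF((STMT_USR_CPU_TIME_S*1000000) + STMT_USR_CPU_TIME_MS + TOTAL_SORT_TIME,0) AS TOTAL_SORT_TIME_FRACTION,
QUERY_CARD_ESTIMATE AS EXPECTED_CARDINALITY,
ROWS_READ
FROM SYSIBMADM.SNAPSTMT
WHERE ROWS_READ > 0 AND QUERY_CARD_ESTIMATE >= 1.5 * ROWS_READ;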


The following metrics helped me check the dynamic SQLs (from the SNAPDYN_SQL snapshot view); again, a query sketch follows this list:

6) CAST(TOTAL_SORT_TIME*(NUM_EXECUTIONS)as decimal(24,10))/NULLIF(PREP_TIME_BEST * NUM_COMPILATIONS,0) AS SORT_TO_COMPILE_RATIO

7) CAST(((PREP_TIME_WORST + PREP_TIME_BEST)*NUM_COMPILATIONS)as decimal(24,10))/NULLIF(2*NUM_EXECUTIONS,0) COMPILE_EFFICIENCY,

8) cast(SUM((POOL_TEMP_INDEX_L_READS+POOL_TEMP_DATA_L_READS) - (POOL_TEMP_DATA_P_READS+POOL_TEMP_INDEX_P_READS)) as decimal(24,10))/NULLIF(SUM(POOL_TEMP_INDEX_L_READS+POOL_TEMP_DATA_L_READS),0)
AS TOTAL_TEMP_BP_UTIL
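
Put together, a minimal sketch over SYSIBMADM.SNAPDYN_SQL using metrics 6 and 7 (the SUBSTR on STMT_TEXT is just to keep the output readable):

SELECT SUBSTR(STMT_TEXT, 1, 80) AS STMT,
NUM_EXECUTIONS,
CAST(TOTAL_SORT_TIME*(NUM_EXECUTIONS) as decimal(24,10))/NULLIF(PREP_TIME_BEST*NUM_COMPILATIONS,0) AS SORT_TO_COMPILE_RATIO,
CAST(((PREP_TIME_WORST + PREP_TIME_BEST)*NUM_COMPILATIONS) as decimal(24,10))/NULLIF(2*NUM_EXECUTIONS,0) AS COMPILE_EFFICIENCY
FROM SYSIBMADM.SNAPDYN_SQL
ORDER BY NUM_EXECUTIONS DESC;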


The following are helpful metrics for monitoring locking performance (from the SNAPDB snapshot view); a combined sketch follows this list:

9) LOCK_ESCALS LOCK_ESCALATIONS,

10) DEADLOCKS/NULLIF(LOCK_WAITS,0) DEADLOCK_PER_LOCK,

11) X_LOCK_ESCALS EXCLUSIVE_LOCK_ESCAL,

12) LOCKS_WAITING/NULLIF(LOCKS_HELD,0) LOCK_ISSUES_PER_LOCK,

13) DEADLOCKS/NULLIF(COMMIT_SQL_STMTS,0) DEADLOCK_PER_STMNT
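
Since these all come from the database snapshot, they fit into one query against SYSIBMADM.SNAPDB (a sketch; SNAPDB returns one row per database partition):

SELECT DB_NAME,
LOCK_ESCALS LOCK_ESCALATIONS,
X_LOCK_ESCALS EXCLUSIVE_LOCK_ESCAL,
DEADLOCKS/NULLIF(LOCK_WAITS,0) DEADLOCK_PER_LOCK,
LOCKS_WAITING/NULLIF(LOCKS_HELD,0) LOCK_ISSUES_PER_LOCK,
DEADLOCKS/NULLIF(COMMIT_SQL_STMTS,0) DEADLOCK_PER_STMNT
FROM SYSIBMADM.SNAPDB;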

The following are some good metrics to look at while measuring I/O efficiency (from SNAPDB); a trimmed, combined sketch follows this list:

14) cast(POOL_DRTY_PG_STEAL_CLNS*1000 as decimal(24,10))/ NULLIF(COMMIT_SQL_STMTS,0)
DIRTY_PAGE_STEAL_FRACTION,


15) cast(DIRECT_READS as decimal(20,10)) / NULLIF(DIRECT_READ_REQS,0) DIRECT_READ_EFFICIENCY,


16) cast(float(NUM_LOG_WRITE_IO)/NULLIF(LOG_WRITES,0) as decimal(20,10)) LOG_IO_PER_WRITE,


17) cast(float(NUM_LOG_READ_IO)/NULLIF(LOG_READS,0) as decimal(20,10)) LOG_IO_PER_READ,


18) cast(NUM_LOG_BUFFER_FULL as decimal(20,10))/NULLIF(LOG_WRITES,0) LOG_BUFF_UTIL,


19) CAST(LOG_WRITES AS DECIMAL(20,10))/NULLIF(COMMIT_SQL_STMTS,0) LOG_WRITE_EFFICIENCY,


20) cast(NUM_LOG_DATA_FOUND_IN_BUFFER as decimal(20,10))/NULLIF(NUM_LOG_READ_IO+NUM_LOG_DATA_FOUND_IN_BUFFER,0) AS LOG_BUFF_READ_EFFICIENCY,

21) cast(LOG_HELD_BY_DIRTY_PAGES *100 as decimal(24,10))/(10240*4096) AS LOG_IOCLEANERS_EFFICIENCY,

22) cast(UNREAD_PREFETCH_PAGES as decimal(24,10))/NULLIF(PREFETCH_WAIT_TIME,0) AS WASTEFUL_PREFETCH_PER_WAIT_MSEC


23) cast(POOL_NO_VICTIM_BUFFER as decimal(24,10))/NULLIF((POOL_INDEX_P_READS+POOL_DATA_P_READS),0) AS BP_CLEANER_FAILURE_FRACTION
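
As with the locking metrics, these all come from SYSIBMADM.SNAPDB, so they can be combined into a single query; a trimmed sketch with a few of them:

SELECT DB_NAME,
cast(DIRECT_READS as decimal(20,10))/NULLIF(DIRECT_READ_REQS,0) DIRECT_READ_EFFICIENCY,
cast(float(NUM_LOG_WRITE_IO)/NULLIF(LOG_WRITES,0) as decimal(20,10)) LOG_IO_PER_WRITE,
cast(NUM_LOG_BUFFER_FULL as decimal(20,10))/NULLIF(LOG_WRITES,0) LOG_BUFF_UTIL,
cast(NUM_LOG_DATA_FOUND_IN_BUFFER as decimal(20,10))/NULLIF(NUM_LOG_READ_IO+NUM_LOG_DATA_FOUND_IN_BUFFER,0) AS LOG_BUFF_READ_EFFICIENCY
FROM SYSIBMADM.SNAPDB;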



24) --Overall Bufferpool Utilization per bufferpool
SELECT
SNAPSHOT_TIMESTAMP,
DB_NAME,
BP_NAME,
TOTAL_LOGICAL_READS,
TOTAL_PHYSICAL_READS,
TOTAL_HIT_RATIO_PERCENT,
INDEX_HIT_RATIO_PERCENT,
DATA_HIT_RATIO_PERCENT,
DATA_LOGICAL_READS,
DATA_PHYSICAL_READS,
INDEX_LOGICAL_READS,
INDEX_PHYSICAL_READS
FROM SYSIBMADM.BP_HITRATIO
WHERE DATA_HIT_RATIO_PERCENT < 90 OR INDEX_HIT_RATIO_PERCENT < 95;



25)
SELECT
SNAPSHOT_TIMESTAMP,
BP_NAME,
CAST(VECTORED_IOS as decimal(24,10))/NULLIF((POOL_DATA_P_READS+POOL_INDEX_P_READS),0) AS VECTORED_IO_FRACTION,
CAST(PAGES_FROM_VECTORED_IOS as decimal(24,10))/NULLIF((PAGES_FROM_VECTORED_IOS+PAGES_FROM_BLOCK_IOS),0) AS VECTORED_PAGE_READ_FRACTION,
CAST(UNREAD_PREFETCH_PAGES as decimal(24,10))/NULLIF(PAGES_FROM_BLOCK_IOS,0) AS WASTEFULL_PREFETCH_FRACTION,
CAST((POOL_ASYNC_DATA_READ_REQS + POOL_ASYNC_INDEX_READ_REQS) as decimal(24,10))/NULLIF((POOL_DATA_P_READS+POOL_INDEX_P_READS)-VECTORED_IOS,0) AS ASYNC_PREFETCH_EFFICIENCY,
CAST(POOL_NO_VICTIM_BUFFER as decimal(24,10))/NULLIF((POOL_INDEX_P_READS+POOL_DATA_P_READS),0) AS BP_CLEANER_FAILURE_FRACTION
FROM SYSIBMADM.SNAPBP;


The following parameters gave me some important pointers on how well the automatic maintenance features were working (these, too, are database-level elements, from SNAPDB); a combined sketch follows this list:

26)
cast(STATS_FABRICATIONS as decimal(24,10))/NULLIF((ASYNC_RUNSTATS+SYNC_RUNSTATS),0) AS STATS_USE_TO_GATHER_RATIO,


27)
cast(SYNC_RUNSTATS as decimal(24,10)) /NULLIF((SYNC_RUNSTATS+ASYNC_RUNSTATS),0) SYNC_RUNSTAT_FRACTION,


28)
cast(STATS_FABRICATE_TIME as decimal(24,10))/NULLIF(SYNC_RUNSTATS_TIME,0) AS STATS_SYNC_COMPUTE_UTIL,


29)
cast(SYNC_RUNSTATS_TIME*1000 as decimal(24,10))/NULLIF(ELAPSED_EXEC_TIME_S*1000000+ELAPSED_EXEC_TIME_MS,0) AS SYNC_STAT_COLLECT_TIME_FRACTION
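
Combined, a sketch against SYSIBMADM.SNAPDB (assuming your DB2 level exposes the real-time statistics elements, i.e. 9.5 or later):

SELECT DB_NAME,
cast(STATS_FABRICATIONS as decimal(24,10))/NULLIF((ASYNC_RUNSTATS+SYNC_RUNSTATS),0) AS STATS_USE_TO_GATHER_RATIO,
cast(SYNC_RUNSTATS as decimal(24,10))/NULLIF((SYNC_RUNSTATS+ASYNC_RUNSTATS),0) SYNC_RUNSTAT_FRACTION,
cast(STATS_FABRICATE_TIME as decimal(24,10))/NULLIF(SYNC_RUNSTATS_TIME,0) AS STATS_SYNC_COMPUTE_UTIL
FROM SYSIBMADM.SNAPDB;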

30) Two columns in SYSIBMADM.SNAPDYN_SQL which can help you see whether the queries provided by development are properly parameterized or not (a sketch follows):

NUM_EXECUTIONS and NUM_COMPILATIONS
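
The idea: a workload which is not using parameter markers shows up as many near-identical statements, each compiled once and executed only once or twice. A minimal sketch to surface them (the <= 2 threshold is my arbitrary choice):

SELECT NUM_EXECUTIONS,
NUM_COMPILATIONS,
SUBSTR(STMT_TEXT, 1, 80) AS STMT
FROM SYSIBMADM.SNAPDYN_SQL
WHERE NUM_EXECUTIONS <= 2
ORDER BY NUM_COMPILATIONS DESC;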


The following are some important monitor elements which can help you find dynamic SQL (from SNAPDYN_SQL) or static SQL (from SNAPSTMT) that has heavy table scan activity; a sketch follows this list:

31)
CAST((POOL_DATA_L_READS-POOL_DATA_P_READS) as decimal(24,10))/NULLIF(POOL_DATA_L_READS,0) DATA_BP_UTIL,


32)
CAST(POOL_DATA_P_READS as decimal(24,10))/NULLIF(POOL_DATA_L_READS,0) DATA_BP_REFRESH,


33)
CAST((POOL_INDEX_L_READS-POOL_INDEX_P_READS)as decimal(24,10))/NULLIF(POOL_INDEX_L_READS,0) INDX_BP_UTIL,


34)
CAST(POOL_INDEX_P_READS as decimal(24,10))/NULLIF(POOL_INDEX_L_READS,0) INDX_BP_REFRESH,


35)
CAST((POOL_TEMP_INDEX_L_READS+POOL_TEMP_DATA_L_READS) - (POOL_TEMP_DATA_P_READS+POOL_TEMP_INDEX_P_READS) as decimal(24,10))/NULLIF(POOL_TEMP_INDEX_L_READS+POOL_TEMP_DATA_L_READS,0)
AS TOTAL_TEMP_BP_UTIL,
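
For dynamic SQL, for instance, a couple of these can be combined into one query against SYSIBMADM.SNAPDYN_SQL (a sketch; a DATA_BP_UTIL close to zero together with high physical data reads is a good table scan suspect):

SELECT SUBSTR(STMT_TEXT, 1, 80) AS STMT,
CAST((POOL_DATA_L_READS-POOL_DATA_P_READS) as decimal(24,10))/NULLIF(POOL_DATA_L_READS,0) DATA_BP_UTIL,
CAST(POOL_INDEX_P_READS as decimal(24,10))/NULLIF(POOL_INDEX_L_READS,0) INDX_BP_REFRESH
FROM SYSIBMADM.SNAPDYN_SQL
ORDER BY DATA_BP_UTIL;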



I hope the above metrics and monitor elements help anyone who reads this blog. I have kept the column aliases as elaborate as possible (to convey the meaning they carry). I might have skipped over parameters like STMT_USR_CPU_TIME_S and STMT_USR_CPU_TIME_MS, but they are equally important. The snapshot elements mentioned here will help you super-tune your application, once you have already tuned the long-running SQLs and have indexed the database properly!!!

I would like to acknowledge the encouragement received from the readers of the previous posts (the increasing number of views every time I opened this page was a boost factor) and from Susan Visser (whose tweets inspired me to continue writing the blog)!!!

Will continue writing!!! db2expln is next to come!!!