Stuck with a packaged app???

So you are a DB2 DBA in a production environment and you're stuck with a packaged app... You cannot change the application, nor can you change the database schema... oops...
Maybe the application is using non-parameterized queries!!! Or maybe it sorts too much data just to fetch the first few records, say on the basis of some date column... Or too much data is kept in the same table, and the record count forces a table scan!!! :(
...and many more problems which could have been avoided at design time, but you cannot help it now... You have to pay for any changes... :-o

What to do???

a few tips... (* make sure you are on the latest DB2 version)

1) If you find that the application is not using parameterized queries and the same query is being compiled again and again, DB2 9.7 has something special for you: the STMT_CONC database configuration parameter. It makes the query compiler replace the literals with bind variables, so when a similar statement is found in the package cache it is reused instead of being compiled again...

update db cfg using STMT_CONC LITERALS

Those of you who are coming from an Oracle background can refer to CURSOR_SHARING.
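Here is roughly what the concentrator buys you (the EMPLOYEES table and the literals are made up for illustration):

-- with STMT_CONC set to LITERALS, these two statements...
SELECT NAME FROM EMPLOYEES WHERE EMPNO = 100
SELECT NAME FROM EMPLOYEES WHERE EMPNO = 200
-- ...share a single package-cache entry, both rewritten to roughly:
-- SELECT NAME FROM EMPLOYEES WHERE EMPNO = :L0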

2) If your application fetches only the first few rows but uses ROW_NUMBER for windowing and then filters the top result... please do not do that... Use OPTIMIZE FOR N ROWS or FETCH FIRST N ROWS, as sketched below... Those of you who do the windowing for ANSI compatibility, or are simply stuck with a packaged application, can go for setting OPTIMIZEFORNROWS in db2cli.ini
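Both options, sketched (the ORDERS table and its columns are hypothetical; the db2cli.ini section name is your database alias):

-- in SQL, when you can touch the statement:
SELECT ORDER_ID, ORDER_DATE
FROM ORDERS
ORDER BY ORDER_DATE DESC
FETCH FIRST 10 ROWS ONLY
OPTIMIZE FOR 10 ROWS

; in db2cli.ini, when you cannot:
[MYDB]
OPTIMIZEFORNROWS=10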

3) Well, this one is a bit of a trade-off... If you feel that the application is using queries which could have been optimized much better, and the vendor is not ready to accept a change without funds... set the optimization level a little higher to allow DB2 to use more brain on the queries...
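For instance, raising it from the default of 5 (whether a higher level actually pays off for your workload is something to measure, not assume; it also increases compile time):

-- per session:
SET CURRENT QUERY OPTIMIZATION = 7

-- or database wide:
update db cfg using DFT_QUERYOPT 7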

4) If you see that the issue is with only a few queries, do not change the optimization level; rather, use an optimization profile.
Well, it is something like Oracle hints, but not exactly the same... I cannot explain the whole concept here, but do go through the article on developerWorks...
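Just to give the flavour, a skeletal profile might look like the sketch below (statement text, table and index names are invented; the real thing must match your statement exactly and be registered in the SYSTOOLS.OPT_PROFILE table, as the developerWorks article explains):

<OPTPROFILE VERSION="9.7.0.0">
  <STMTPROFILE ID="force index scan on ORDERS">
    <STMTKEY><![CDATA[SELECT * FROM ORDERS WHERE ORDER_DATE > ?]]></STMTKEY>
    <OPTGUIDELINES>
      <IXSCAN TABLE="ORDERS" INDEX="IX_ORDER_DATE"/>
    </OPTGUIDELINES>
  </STMTPROFILE>
</OPTPROFILE>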

5) If you feel a table should have been broken down into several entities, say an ACCOUNTS table that could have been divided into CREDIT ACCOUNTS and DEBIT ACCOUNTS, but clubbing them together has caused an enormous row count... you can either actually break the table and create a view with the same (older) table name on top of it... or, a better way, use the DB2 range partitioning feature (oh baby, you gotta upgrade to v9)
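A minimal sketch of what that could look like (the ACCOUNTS layout is invented for the example):

CREATE TABLE ACCOUNTS (
  ACC_ID   BIGINT  NOT NULL,
  ACC_TYPE CHAR(1) NOT NULL,   -- 'C' = credit, 'D' = debit
  BALANCE  DECIMAL(15,2)
)
PARTITION BY RANGE (ACC_TYPE)
 (PARTITION P_CREDIT STARTING ('C') ENDING ('C'),
  PARTITION P_DEBIT  STARTING ('D') ENDING ('D'))

-- each account type now lives in its own data partition, so scans on
-- one type no longer pay for the rows of the other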

6) You see that a table grows to a high number of records within a single transaction, and the same part/module of the application then deletes all the rows. Now the problems with this are:
i) at any given point of time the table has 0 records, unless this module is running
ii) whenever RUNSTATS runs (either auto or your cron job), the table has 0 records
iii) when your module then populates data, the stats will not reflect it immediately, but a fetch (probably in the same module) may be immediate
iv) so the optimizer thinks the table is empty and generates a wrong (costly) plan that avoids any index scan

Well, this is a problem you will face with temporary processing tables. The actual business logic could have been achieved with global temp tables, or maybe by retaining the rows until the next run of the module (rows deleted at the start and not at the end, so your RUNSTATS always picks up a non-zero record count)... But that cannot be done now...
A trick... say:
alter table tabname volatile

This will enforce an index scan if the table is indexed.
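And for completeness, the design that would have avoided all this, if only it could be changed: a declared global temporary table, which never feeds misleading statistics to the optimizer (names are invented; a USER TEMPORARY tablespace must exist):

DECLARE GLOBAL TEMPORARY TABLE SESSION.WORK_ROWS (
  ID     BIGINT,
  AMOUNT DECIMAL(15,2)
) ON COMMIT PRESERVE ROWS NOT LOGGED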

7) Use the query compiler registry variables below to optimize anti-join and IN-list queries before they get executed (as always, verify the effect on your own workload):

db2set DB2_ANTIJOIN=EXTENDED

db2set DB2_INLIST_TO_NLJN=YES


8) Use DB2_WORKLOAD to set the proper category for your application. It is an aggregate registry variable: set once, it groups several registry variables to facilitate better performance with that application... e.g. it can be set to COGNOS_CS or SAP etc...
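For example (SAP is one of the documented values; check the Information Center for the list your version supports):

db2set DB2_WORKLOAD=SAP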


Well, now I'm feeling sleepy... I will update this post if I can recollect more tips for DBAs STUCK with packaged apps...

10 Things to have in DB2 which would improve performance...

Well, I support development of a product which "claims" to be RDBMS independent... I know what it takes to put forward a claim like that... More than me, my senior colleagues have had to break their heads every time we do a porting... I remember the things I have faced, right from my senior porting the ER design from one DB to another and then creating the physical design of the DB, and more than that the security architecture... And haash, how can I forget those incompatible functions and operators... No offence, Mr Gates... but SQL Server has some out-of-the-blue function argument orders (what would be the most logical sequence of arguments if you wanted to convert an expression to another data type? I bet most of you will miss the actual sequence used by SQL Server's CONVERT) and that concatenation operator X-( ... Well, all of them have something or the other different... I could go on and on in a blog post on that topic... Let's see instead what features would be good to have in DB2 (may or may not be inspired by other vendors), especially from the performance angle... Let's try to list 10 of them (it's gonna be tough for me, especially after the Viper and Cobra series)...

1) SQL Server 2008 has something called filtered indexes: many of you might have faced a situation where multiple statuses are stored in the same column, but you generally query on only one or two of them as a filter; the other statuses are basically for display in inquiries. So how good would it have been if you could index only those two statuses which are your filter criteria, and save indexing cost and space... Your queries would be optimized for those two statuses only (much like a CHECK constraint you declare but do not enforce, with queries still optimized for the valid values).
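For reference, this is the SQL Server 2008 syntax (table, columns and status values are hypothetical):

CREATE NONCLUSTERED INDEX IX_ORDERS_OPEN
ON ORDERS (STATUS, ORDER_DATE)
WHERE STATUS IN ('NEW', 'PENDING')
-- rows in any other status never enter the index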

2) Hierarchical queries: I have worked with Oracle hierarchical queries and have found them to perform better than the recursive SQL used to write pseudo-hierarchical queries (yes, that is what I will call them; for the record, recursive SQL is available in Oracle 11g onwards)
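To make the comparison concrete (a classic EMP-style table, names invented):

-- Oracle's hierarchical syntax:
SELECT EMPNO, MGR FROM EMP
START WITH MGR IS NULL
CONNECT BY PRIOR EMPNO = MGR

-- the DB2 workaround today, a recursive common table expression:
WITH H (EMPNO, MGR) AS (
  SELECT EMPNO, MGR FROM EMP WHERE MGR IS NULL
  UNION ALL
  SELECT E.EMPNO, E.MGR FROM H PARENT, EMP E WHERE E.MGR = PARENT.EMPNO
)
SELECT EMPNO, MGR FROM H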

3) Something as good as OEM in Oracle to monitor the database (and it should be free, hence Optim is ruled out :P)

4) The forced and offline REORG after a table/column ALTER should be fixed (no other RDBMS I have worked with requires this, and yes, I will count it as a performance issue in addition to being an availability issue)

5) Well, there are queries generated by Hibernate which use ROW_NUMBER to window a table and filter on rownum <= N to select the top N rows over some criteria, say last update timestamp. These kinds of statements are automatically converted to SELECT TOP N by SQL Server 2008. It would be nice if DB2 did the same, in an easier way (it is still possible with OPTIMIZEFORNROWS in db2cli.ini)

6) The optimizer should be able to detect when a column in a join predicate is also part of some filter (WHERE) predicate, and apply the filter twice (once on each side of the join) to get a better cardinality estimate. Well, I am not very sure if any RDBMS does this currently. But try it out and see the difference in estimated cardinality yourself...

7) It would be good to have better concurrency control and a locking mechanism that can compete with Oracle's

8) Query hints should be made more flexible and easier to use in prepared statements, on the lines of Oracle and SQL Server

9) There should be function-based indexes (currently the feature can be implemented with a little workaround: add a computed column and index it, as sketched below)
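The workaround, sketched (the CUSTOMERS table and the expression are examples; on an existing table a SET INTEGRITY ... FORCE GENERATED pass may be needed afterwards):

ALTER TABLE CUSTOMERS
  ADD COLUMN NAME_UPPER VARCHAR(100)
  GENERATED ALWAYS AS (UPPER(CUST_NAME))

CREATE INDEX IX_CUST_NAME_UPPER ON CUSTOMERS (NAME_UPPER)

-- the rewrite phase can now match UPPER(CUST_NAME) predicates to the index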

10) Some good way of indexing low-cardinality columns (like Oracle's bitmap indexes)


Well, let's see how much DB2 Galileo has to offer!!! Have a look:

DB2 Galileo Early Adoption Program
Well,
Performance Engineering for Databases... it is not just about identifying and tuning issues, or removing bottlenecks... More than that, it is about preventing those bottlenecks from occurring at all... more about anticipating issues beforehand than identifying them later...

So it involves the following...
Having a more scalable physical design...
A more robust logical schema design...
A better-performing application design...

Let me discuss each in brief.
To start with the physical design...

PHYSICAL DESIGN
With DB2 you need to ensure the following things in the physical design:
1) Have your tablespaces distributed over different disks
2) From my personal experience, I suggest using automatic storage spread over several disks
3) Always keep indexes and data on separate disks
4) It is good to categorize tables and indexes as below:
i) Online transaction data
ii) Configuration data
iii) Static data your application needs to work and load
Place the tables of these three categories in three different tablespaces, and their indexes in another three.
5) Always create separate bufferpools for configuration and static data, for transaction data, and for indexes
6) Always create tablespaces with the NO FILE SYSTEM CACHING option to avoid double caching
7) It is always better to use STMM (Self-Tuning Memory Manager) for memory management
8) Set proper values for TRANSFERRATE and OVERHEAD on your tablespaces, as they tell the optimizer the time taken to seek and transfer data from disk. A proper value can at times help the optimizer choose an index scan over a table scan (say, in a situation where the table data is not too scattered but the index leaves are). Points 4 to 8 are pulled together in the sketch below.
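A sketch of one of the three tablespaces (the names, sizes, OVERHEAD and TRANSFERRATE values are illustrative; derive the real numbers from your storage):

CREATE BUFFERPOOL BP_TXN SIZE AUTOMATIC

CREATE TABLESPACE TS_TXN_DATA
  MANAGED BY AUTOMATIC STORAGE
  BUFFERPOOL BP_TXN
  OVERHEAD 7.5            -- average seek + latency, in ms
  TRANSFERRATE 0.18       -- ms to read one page from disk
  NO FILE SYSTEM CACHING  -- avoid double caching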

SCHEMA DESIGN
1) Avoid having foreign keys if your application takes care of integrity
2) Always index the columns used in:
i) WHERE clause filters
ii) JOIN predicates
iii) SORT operations (ORDER BY, MAX(), MIN(), etc.)
3) Always keep the statistics updated for all the tables to encourage index scans
4) If the cardinality of any table changes abruptly, consider marking the table VOLATILE; this will help you ensure an index scan (both statements are sketched below)
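The statements behind points 3 and 4, for reference (schema and table names are placeholders):

RUNSTATS ON TABLE MYSCHEMA.ORDERS WITH DISTRIBUTION AND DETAILED INDEXES ALL

ALTER TABLE MYSCHEMA.WORK_QUEUE VOLATILE CARDINALITY
-- tells the optimizer to prefer index access regardless of the
-- (possibly stale) cardinality statistics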

APPLICATION DESIGN

1) Always use the ANSI JOIN operators for joining tables. This avoids any accidental cross product.
2) Try not to use too many sort operations in your query, e.g. multiple MAX() calls; instead, see if you can use ranking or windowing functions.
3) Avoid JOINs with views; join with the tables instead. This avoids joining with unnecessary underlying tables of the view.
4) Avoid self joins; prefer CASE expressions over self joins where possible (see the sketch after this list).
5) Use FETCH FIRST N ROWS to optimize selection of a few rows; if using a packaged application, set OPTIMIZEFORNROWS in db2cli.ini.
6) As a rule of thumb, use parameterized queries. When working with a packaged application, you can set the statement concentrator (STMT_CONC) db cfg parameter to force parameterization.
7) When working with reporting tools like Crystal Reports, use a single query to fetch the data. Never use the links and joins provided by the tool: it might be doing full fetches and joining locally, which hurts I/O and the network as well.
8) Design the application for better concurrency (next section).
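As promised in point 4, the CASE-instead-of-self-join rewrite (a hypothetical ACCOUNTS table holding both credit and debit rows):

-- self join: ACCOUNTS is scanned/probed twice
SELECT C.ACC_ID, C.BALANCE AS CREDIT_BAL, D.BALANCE AS DEBIT_BAL
FROM ACCOUNTS C JOIN ACCOUNTS D
  ON C.ACC_ID = D.ACC_ID AND C.ACC_TYPE = 'C' AND D.ACC_TYPE = 'D'

-- CASE version: a single pass (note it also returns accounts
-- that have only one of the two types, with NULL for the other)
SELECT ACC_ID,
       MAX(CASE WHEN ACC_TYPE = 'C' THEN BALANCE END) AS CREDIT_BAL,
       MAX(CASE WHEN ACC_TYPE = 'D' THEN BALANCE END) AS DEBIT_BAL
FROM ACCOUNTS
GROUP BY ACC_ID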

CONCURRENCY CONTROL
To improve the concurrency of the application, use the following:

1) Use the Cursor Stability isolation level
2) Use LOCKSIZE ROW for all the tables
3) Use optimistic locking by means of the ROW CHANGE TIMESTAMP expression (see the sketch after this list)
4) Set the following parameters:
i) db2set DB2_SKIPINSERTED=ON
ii) db2set DB2_SKIPDELETED=ON
iii) db2set DB2_EVALUNCOMMITTED=ON
iv) the CUR_COMMIT db cfg parameter (currently committed semantics, DB2 9.7+)
5) Avoid long transactions and ensure operations are committed regularly
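A minimal optimistic-locking sketch using the row change timestamp (DB2 9.5+; the table and values are invented):

CREATE TABLE ORDERS (
  ORDER_ID BIGINT NOT NULL PRIMARY KEY,
  STATUS   CHAR(1),
  RCT      TIMESTAMP NOT NULL
           GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP
)

-- the reader remembers the timestamp along with the data:
SELECT STATUS, ROW CHANGE TIMESTAMP FOR O
FROM ORDERS O WHERE ORDER_ID = 42

-- the update succeeds only if nobody touched the row in between
-- (? is the timestamp read above; zero rows updated means retry):
UPDATE ORDERS SET STATUS = 'P'
WHERE ORDER_ID = 42
  AND ROW CHANGE TIMESTAMP FOR ORDERS = ?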

I will wind up this post with a note: getting these things right at design time will always help you avoid performance issues in the future...


I will try to come up with some SQL Tuning Tips...