Wednesday, March 12, 2014

DB2 Performance Issues and Tuning - part 1

There are many reasons as to why DB2 might perform poorly. Some of the most obvious reasons are
1.Inadequate Index : There might be non-matching index as compared to the select statement in query.Explained below ,how placing proper columns in index avoids DB2 sort when ORDER BY clause is present. Later we will check here how to improve index effectively to optimize query.
2.Lock Waits : Application might be waiting for  long time before acquiring lock on a resource which is currently being held by other processes
3. Wrong Clustering Sequence:  A wrong index was chosen as a clustering sequence.As a result there are unnecessary sorts in application.
4.Environmental issues: The size of buffer pool, Disk cache,and others can also impact the query response time.
5. Runstats Out-of-Date:  Runstats generally gathers information about the table space,partitions,index, index spaces.We need these in designing the databases.These information should be updated so as to gain efficient usage of the DB2 resources while designing a new database.Optimizer uses this information in building efficient access path.

There can be many more reasons for poor DB2 performance.Lets see how DB2 performance tuning can be done step by step with simple examples.

1.Effective Index can upgrade the Query response time:

Suppose we have one table CUST having the columns CUSTNO, FNAME,LNAME,CITY.  We  defined index(X1) on the column CITY.
We run a simple query as below:
SELECT LNAME,CUSTNO from CUST 
where FNAME=:FNAME 
            and CITY=:CITY 
order by LNAME.

While executing this, DB2 first performs a SORT and materialize the whole query when the cursor is OPENED because ORDER BY clause is present with LNAME.This sort operation because of the 'order by' clause can take long time when data is huge.
To prevent DB2 from sorting the data, what we can do is declare a new index X2 on the columns CITY,LNAME or modify the current index and add the column LNAME to it, since cursor contains 'ORDER BY LNAME'. Now Optimizer will see the data in requested order without a sort.
So now,DB2 will materialize the query during the FETCH time, ie the Disk I/O s will take place during the fetch time  and not during Open time because results will be in requested order without a sort.

So,How long does it take to run a SQL Statement? How is the response time calculated?
There are many components that contribute to the response time like network time,Disk IO time,cpu time, lock wait etc....
The major components that constitute the Local Response time(LRT)  are the the time spent on performing the IO operations(from disk, disk cache, buffer pool ), processing time in CPU, and wait times.(certain other items like package loading, authorization are insignificant)
LRT can be calculated with VQUBE2 method.(Very quick upper bound estimate v2). We will check that in other posts.
So here we see that there are two indexes.Now the optimizer can choose one index and fix the access path. We can view which Access path is chosen by optimizer with the SQL EXPLAIN.

So,How to use SQL EXPLAIN and PLAN_TABLE to view the access path chosen by DB2?
We need to wrap the SQL EXPLAIN statement around the targeted SQL Query. The Optimizer  writes the access path to the PLAN_TABLE. This PLAN_TABLE is a shared table, or every user can have their own schema of the plan table.
Some of the important columns in the PLAN_TABLE which provides us the desired information are
1.QUERYNO
2.ACCESSTYPE
3.MATCHCOLS
4.ACCESSNAME
5.INDEXONLY
6.SORTx_ORDERBY
7.PREFETCH

Step 1. The SQL statement in blue is the main Query for which we want to know the access path.
So we wrap the statement with the 'EXPLAIN ALL SET QUERYNO = 1 FOR'  like below

EXPLAIN ALL SET QUERYNO = 1 FOR
SELECT CUSTNO, CUSTLNAME                           
FROM CUST                                                            
WHERE CUSTNO LIKE '%0A';

Once we execute the above query, optimizer first writes the access path onto the Plan Table and then gives the output. Step below depicts how we can get the information from plan table

Step 2. Below statement queries the plan table using the QUERYNO used above in step1.

SELECT QUERYNO,METHOD,ACCESSTYPE,MATCHCOLS,
ACCESSNAME,INDEXONLY,PREFETCH
FROM PLAN_TABLE
WHERE QUERYNO = 1;


Brief explanation of the output columns of the the PLAN Table with its possible values: 
QUERYNO:  Gives the Queryno which we set in step1
ACCESSTYPE:  This is of much importance. Value of  'I' indicates data fetching is done by using an Index.Value of 'R' indicates Table scan or Range Scan.
MATCHCOLS :  It indicates the number of matching columns in the index.
ACCESSNAME:  Indicates the Name of the index if index access is selected.
INDEXONLY: Can have value of 'Y' or 'N'.  Value of 'Y' indicates that all data requested in the select query is present in index.  Value of 'N' indicates table access is also needed.
PREFETCH: Value of 'D' is Dynamic prefetch. 'S' is sequential prefetch.

Continue reading the below links if this interests u..!
2. Db2 Prefetch
3.working towards a better index 

1 comment: