Wednesday, December 30, 2015

Passing Data from JCL to COBOL program

We pass data from JCL to cobol program in two ways:
1. Using the PARM keyword 
2. Using SYSIN DD in JCL
The basic difference between these two ways of passing are as follows:
A. Using 'PARM=' on the exec keyword, we can pass only 100 characters of data. Also in cobol we need linkage section to take the values in.
B. When we pass data through SYSIN, we need to have the accept keyword in COBOL. For each row in SYSIN, we need to have corresponding ACCEPT verb in cobol.

Sample cobol program accepting PARM value through linkage
...
......
LINKAGE SECTION.
01 LS-TEST-PARM.
    05 LS-TEST-LENGTH PIC S9(04) USAGE COMP.
    05 LS-VAR1 PIC 9(02).
    05 LS-VAR2 PIC 9(02).

PROCEDURE DIVISION USING LS-TEST-PARM.
DISPLAY LS-VAR1.
DISPLAY LS-VAR2.
DISPLAY LS-TEST-LENGTH.
STOP RUN.

Have a look into the corresponding JCL.
//TTYYTST  JOB(TEST),CLASS=I,MSGCLASS=X,MSGLEVEL=(1,1)
//                NOTIFY=&SYSUID
//*
//JSTEP01  EXEC PGM=SAMPLE3,PARM='1211212'
//STEPLIB  DD DSN=TEST.LOAD.LIB1,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSOUT   DD SYSOUT=*

RESULT:
12
11
007

Sample program accepting data from SYSIN .
...
....
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-TOTAL
    05  WS-VAR1  PIC X(02).
    05  WS-VAR2  PIC X(02).
PROCEDURE DIVISION.
INITIALIZE WS-TOTAL.
ACCEPT WS-VAR1.
ACCEPT WS-VAR2.
DISPLAY WS-VAR1.
DISPLAY WS-VA22 .

Corresponding JCL:
//TTYAATST  JOB(TEST),CLASS=I,MSGCLASS=X,MSGLEVEL=(1,1)
//                NOTIFY=&SYSUID
//*
//JSTEP01  EXEC PGM=SAMPLE1
//STEPLIB  DD DSN=TEST.LOAD.LIB1,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSOUT   DD SYSOUT=
//sysin dd*
12
31
/*
//*
Result:
12
31

If we do not have any data to be passed we can make SYSIN DD DUMMY. I tried passing blank values via SYSIN. Was Expecting an abend, but it did not. It displayed spaces in SYSOUT.

Wednesday, November 25, 2015

COBOL DB2 Program from scratch - step by step guide

This post is just to help/guide to a new programmer in writing a cobol db2 program from scratch.
Let us write  a cobol program  to read the EMPLOYEE table and get  the details of the employee with name 'RYAN'.
I am assuming the table is already created in the user database like below:.

EMPLOYEE
EMPID EMPNAME     DEPARTMENT          SALARY              DESIGNATION
1000     XXXXXXX       XX                     10000                  SE
1001     YYYYYYY          YY                      9000                    SE
1002     ZZZZZZZ       ZZ                     20000                   TL


STEP1:  We need to declare the  Table structure in the Working Storage section. Ideally we should DCLGEN Tool to generate the structure for our DB2 table. The option to go to our DCLGEN tool depends on the ISPF settings. Generally it can be invoked using 'D' option on the ispf menu to display DB2I Default pannel.
DCLGEN Screen 
On pressing ENTER, the DCLGEN will be generated and will look like below.

*****************************************************************
EXEC SQL DECLARE DEV.EMPLOYEE TABLE
 ( EMPID CHAR(10) NOT NULL,
   EMPNAME CHAR(30) NOT NULL,
   DEPARTMENT CHAR(2) NOT NULL,
  SALARY DECIMAL(10,2) NOT NULL,
  DESIGNATION CHAR(4) NOT NULL )
)
END-EXEC.
*************** COBOL DECLARATION FOR TABLE DEV.EMPLOYEE *********
01 EMPOYEE-RECORD.
    05 HV-EMPID PIC X(10).
    05 HV-EMPNAME PIC X(30).
    05 HV-DEPARTMENT PIC X(2).
    05 HV-SALARY PIC S9(8)V99 COMP-3.
    05 HV-DESIGNATION PIC CHAR(4).
*********** THE NUMBER OF COLUMNS IN THIS DECLARATION IS 5 *****
This DCLGEN  needs to be included into the Working Storage Section of our cobol program in the following way:
EXEC SQL
INCLUDE  EMPLOYEE
END-EXEC.
Also, the most important copybook SQLCA needs to be included . Apart from this we wont be able to capture the SQL Return codes

EXEC SQL
INCLUDE  SQLCA
END-EXEC.
Also since our query in the program might return more than single row, we need cursors in our program. Read About cursor programming here
I am not going into the details of cursor programming here, since those are there in other posts.
Once the program is ready and compiled , we need to bind it to a plan in the test region. Once the bind is successful, we can run the program using  the IKJEFT01 utility as below.

cobol db2 run jcl

Wednesday, November 18, 2015

Frequently used CA7 Commands

To Define Jobs in CA7:
Each Job in CA7 must first be defined in its database; the easiest way to do this is through the command 'JOB'. Just type the command JOB or 'DB' in the CA7 Scheduler and it will show the list of available options we have to add the jobs.

Lets have look into some of the day-to-day commands which we use frequently to check the job schedules and other details.

1. LJOB,JOB=jobname,LIST=ALL
It will show you all the details of the job once you press enter.

We can use the below values with the LIST parameter
 LIST=NODD.  It will show the TRIGGERED BY JOS and SUCCESSOR jobs
 LIST=RQMT   to get the Requirement and Network connections
 LIST=SCHD    to know many sch id are assigned to a particular job using the below command
 LIST=TRIG     to know only the triggering jobs

2. Get the list of jobs in tree structure. Use the command
   FRJOB,JOB=jobname

3. Get the tree structure of the downstream job. Use the command
    FSTRUC,JOB=jobname

4.  Get the JCL in the pannel using the below command
 LJCL,JOB=jobname

5.List the active jobs using the command
LACT

6. Want to see the member names in any PDS. Use the below command
LISTDIR,DSN=pds-name

CA7 Guide for beginner's

CA7 is a product of Computer Associates and is a scheduling tool for batch jobs in mainframe.When the number of jobs to be executed is huge in number and jobs need to be executed depending on several parameters (like time dependency,  availability of data sets from other jobs, successful completion of other jobs), a tool like CA7 comes into play. 
Once a job is scheduled /triggered or manually added in CA7, it will start a process of moving through various QUEUES in CA7. CA7 will manage it until its completion.
Below is the diagram which shows in brief the lifecyle of the job inside a CA7 Scheduler.


Lets have a quick look into the Queues:
Request Queue:  When a job enters CA7, it will reside in the request queue until all its requirements have been satisfied. Examples of requirements  can be cited like :
Predecessor job
User holds
Submit time
Data set requirements

Ready Queue:  When a job's requirement have been satisfied, it will move to the ready queue. It will wait until initiators are available and any virtual resource or workload manager requirement is satisfied to enable it to execute.

Active Queue: When job can run it is moved to the active Q where it will remain whilst the job is executing within JES2. Once the job  has completed successfully on JES2 the job will be marked completed within CA7 and is removed from the active Q.

Failure list: If the job fails the it will be moved back to the requirement queue but will need correction prior to rerun. JCL errors are returned to ready Q and other failures are moved to request Q.

How to login to CA7.
Differnet systems use different methods to get into CA7. Some system allow the users to access the CA7 from ISPF menu while others use varied methods. Once we go inside the interface all remains the same.

Refer to This link for list of CA7 Commands.

Thursday, October 29, 2015

REUSE option in VSAM and IDCAM. Drawbacks of Reuse.

I have seen the REUSE parameter in the REPRO statement using IDCAMS. Backtracking the purpose of using it, came to know several features of this keyword which i am sharing here.
Some applications might need temporary files, which can be deleted after the program run is over.You may need these files in the form of a KSDS,ESDS or RRDS. Thus VSAM allows to  create a Reusable cluster or files  using this keyword in the definition of the cluster.

Lets see  what happens if we do not use the REUSE parameter ? 
In this case, we can reload the VSAM only once.Only way to reload the data is to DELETE DEFINE the vsam file and repro the data.Should we reload it again, we need to delete/define it again.  In order to remove this repeated  DELETE DEFINE steps, we can define the vsam with REUSE parameter.
Do we remember the term HURBA ? Check the  listcat  section once how we can know about it.
HURBA stands for  --> Highly Used Relative Byte Area.
When we delete define a VSAM, this HURBA remains as zero. It increments as we add records. Ideally this HURBA indicates the offset of the last byte in the data set.
When we use reuse option, this HURBA is reset back to zero. Logically we are deleting all the records thereby.
Thus,the REUSE parameter allows us to reload the vsam as if it were empty.

Although it sounds simple, but there are certain drawbacks for using REUSE parameter.

We know VSAM performance depends on the CI/CA splits. With this resue option, the CI/CA splits remain in place. Over the period of time the performance degrades.

Few points to remember :
1. We cant use REUSE in the repro when the cluset is not defined with REUSE option.
     However we can Alter the VSAM definition to make it resuable.

  //STEPA    EXEC  PGM=IDCAMS
  //SYSPRINT DD    SYSOUT=A
  //SYSIN    DD    *
             ALTER -
                TEST.VSAMDATASET -
                REUSE
2.  Cluster defined with Reuse keyword can not have alternate index defined on it.

Saturday, September 26, 2015

Including Date field in the output file using SORT

Including Date field in the output file:

Many a times it is required to include the date in the output file. This can be done using DATE parameter in SORT.
There are 3 DATE parameter option available, DATEn, DATEn(c) and DATEnP where n=1, 2 or 3.

Consider the input file INFILE,
1111111111111111111111111111
1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1411111111111111111111111111
1511111111111111111111111111
1611111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111
The output date is in Zoned Decimal format.
The DATE1 occupies 10 bytes and gives the date in YYYYMMDD format.
The DATE2 occupies 6 bytes and gives the date in YYYYMM format.
The DATE3 occupies 7 bytes and gives year and Julian date (JDT) as YYYYJDT format.
The JCL’s below show the use of DATEn parameter. The current date August 26, 2010(Julian Date 238)
//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT DD DSN=OUTFILE1,DISP=SHR
//SYSIN DD *
SORT FIELDS=COPY
OUTREC FIELDS=(1,29,&DATE1)
//

The contents of OUTFILE are as below,
OUTFILE:
1111111111111111111111111111 20100826
1111111111111111111111111111 20100826
1211111111111111111111111111 20100826
............

On using DATE1 we have the current date in YYYYMMDD format.
//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT DD DSN=OUTFILE2,DISP=SHR
//SYSIN DD *
SORT FIELDS=COPY
OUTREC FIELDS=(1,29,&DATE2)
//
The contents of OUTFILE2 are as below,
1111111111111111111111111111 201008
1111111111111111111111111111 201008
1211111111111111111111111111 201008
.........
On using DATE2 we have the output date in YYYYJDT format

//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT DD DSN=OUTFILE2,DISP=SHR
//SYSIN DD *
SORT FIELDS=COPY
OUTREC FIELDS=(1,29,&DATE3)
//
The contents of the OUTFILE2 is as below,
1111111111111111111111111111 2010238

Using the DATEn(c) parameter:
On using the DATEn(C) parameter, the output date appears in formatted way wherein a character ‘/’ is placed between the year month and date fields.
DATE1(c) occupies 10 bytes and the format is YYYY/MM/DD.
DATE2(c) occupies 7 bytes and the format is YYYY/MM.
DATE3(c) occupies 7 bytes and the format is YYYY/JDT.

Using the DATEnP parameter:
On using DATEnP the output appears in Packed decimal format. So the number of bytes occupied is lesser than DATEn parameter. Other than this there is no difference between DATEn and DATEnP.
The DATE1 occupies 5 bytes and gives the date in YYYYMMDD format.
The DATE2 occupies 4 bytes and gives the date in YYYYMM format.
The DATE3 occupies 4 bytes and gives year and Julian date (JDT) as YYYYJDT format.
1111111111111111111111111111 2010238


Retrieving Information on records having older dates:
Consider a case wherein we need to retrieve records that have yesterday’s date.
The JCL is as below,
//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INFILE1,DISP=SHR
//SORTOUT DD DSN=OUTFILE3,DISP=SHR
//SYSIN DD *
SORT FIELDS=COPY
INCLUDE COND=(30,8,CH,EQ,&DATE1-1)
OUTREC FIELDS=(1,40)
//

Thursday, September 3, 2015

Splitting Input Files using Sort. Use of SPLIT ,SPLITBY,SPLIT1R commmands

SPLIT command spits the output records one record at a time among output datasets. This happens until all the output records are written. The split happens in rotation among the datasets mentioned in the OUTFIL.
The First record from the output records is written to first dataset mentioned in the OUTFIL group, the Second record from the output records gets written to the second dataset mentioned in the OUTFIL group and so on.
When each OUTFIL dataset has 1 record, the rotation starts again with the dataset mentioned first in the OUTFIL group.
The records are not contiguous in the OUTFIL datasets.
The Below JCL splits the data in INFILE and copies to OUTFILE1 and OUTFILE2 as mentioned above.

Consider the contents of Input File - INFILE as below:

1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1411111111111111111111111111
1511111111111111111111111111
1611111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111
Let us use the commands and see the outputs.

The Below JCL splits the data in INFILE and copies to OUTFILE1 and OUTFILE2 as mentioned above.
//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT1 DD DSN=OUTFILE1,DISP=SHR
//SORTOUT2 DD DSN=OUTFILE2,DISP=SHR
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL FNAMES=(SORTOUT1,SORTOUT2),SPLIT
/*

The contents of OUTFILE1 and OUTFILE2 would be as below,
OUTFILE1
1111111111111111111111111111
1311111111111111111111111111
1511111111111111111111111111
1711111111111111111111111111
1911111111111111111111111111
2111111111111111111111111111
OUTFILE2
1211111111111111111111111111
1411111111111111111111111111
1611111111111111111111111111
1811111111111111111111111111
2011111111111111111111111111
OUTFILE1 dataset contains records 1, 3, 5…so on.
OUTFILE2 dataset contains records 2, 4, 6…so on.
Note that the records in the output datasets are not contiguous.

SPLITBY Command:

SPLITBY splits the output records M records at a time in rotation among the datasets mentioned in the OUTFIL. This happens until all the output records are written.
The First Set of records from the output records gets written to first dataset mentioned in the OUTFIL group, the Second Set of records from the output records gets written to the second dataset mentioned in the OUTFIL group and so on.
When each OUTFIL dataset has the specified set of records, the rotation starts again with the dataset mentioned first in the OUTFIL group.
The syntax is SPLITBY=M, where M=1,2,3…so on
The records are not contiguous in the OUTFIL datasets.
SPLITBY=1 is equivalent to SPLIT.
The below JCL splits the data in INFILE and copies to OUTFILE3 and OUTFILE4 as mentioned above.
//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT1 DD DSN=OUTFILE3,DISP=SHR
//SORTOUT2 DD DSN=OUTFILE4,DISP=SHR
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL FNAMES=(SORTOUT1,SORTOUT2),SPLITBY=3
/*
The contents of OUTFILE3 and OUTFILE4 would be as below,
OUTFILE3
1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
OUTFILE4
1411111111111111111111111111
1511111111111111111111111111
1611111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111
OUTFILE3 contains records (1, 2, 3), (7, 8, 9).
OUTFILE4 contains records (4, 5, 6), (10, 11).
Note that the records in the output datasets are not contiguous.

SPLIT1R splits output records M records at a time in one rotation among the datasets mentioned in the OUTFIL. This happens until all the records are written. In SPLIT1R the rotation happens only once among the OUTFIL datasets.
If on reaching the last OUTFIL, more than M records from the output records is left, all of those would be move to last OUTFIL.
If the input has only M records, then all input records will get moved to the first OUTFIL. The remaining OUTFIL datasets will be empty.
The syntax is SPLIT1R=M, where M=1, 2, 3…so on.
The records are contiguous among the OUTFIL datasets.
The below JCL’s splits the data in INFILE,
JCL1:
//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT1 DD DSN=OUTFILE5,DISP=SHR
//SORTOUT2 DD DSN=OUTFILE6,DISP=SHR
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL FNAMES=(SORTOUT1,SORTOUT2),SPLIT1R=5

The output files contents are shown below,
OUTFILE5:
1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1411111111111111111111111111
1511111111111111111111111111
OUTFILE6:
1611111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111
There are two output files, and M=5. The input INFILE contains 11 records.
The OUTFILE5 contains records 1, 2, 3, 4, 5.
The dataset OUTFILE6 contains records 6, 7, 8, 9, 10, 11(i. e all the remaining records)

JCL2:

//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT1 DD DSN=OUTFILE7,DISP=SHR
//SORTOUT2 DD DSN=OUTFILE8,DISP=SHR
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL FNAMES=(SORTOUT1,SORTOUT2),SPLIT1R=11
//
The output file contents are shown below:

OUTFILE7:
1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1411111111111111111111111111
1511111111111111111111111111
1611111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111

OUTFILE8
empty as expected.

Convert VB file to FB and Convert FB file to VB using SORT

The below JCL copies the VB file to FB file.

//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INPUTVBFILE,DISP=SHR
//SORTOUT DD DSN=OUTPUTFBFILE,DISP=SHR
//*
//SORTWK01 DD SPACE=(CYL,10),UNIT=SYSDA
//SYSOUT DD SYSOUT=*
//SYSIN DD *
SORT FIELDS=(5,76,CH,A)
OUTFIL FNAMES=SORTOUT,VTOF,BUILD=(5,76)
/*

The INPUTVBFILE is a VB file with record length 80.
The OUTPUTFFBFILE is a FB file of record length 76.
Before executing the JCL it is assumed that both the SORTIN and SORTOUT datasets exists.
SORT FIELDS=(5,76,CH,A) sorts the input VB file.
VTOF will handle copying the VB file to FB file.
It is essential to give BUILD or OUTREC parameter when VTOF parameter is used.

Below JCL will Convert FB file to VB

//STEP01 EXEC PGM=SORT
//SORTIN DD DSN=INPUTFBFILE,DISP=SHR
//SORTOF01 DD DSN=OUTPUTVBFILE,
// DISP=(NEW,CATLG,DELETE),
// UNIT=SYSDA,
// DCB=(LRECL=80,RECFM=VB,BLKSIZE=84),
// SPACE=(TRK,(3000,2000),RLSE)
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL FNAMES=SORTOF01,FTOV

It is not essential to give BUILD or OUTREC parameter when FTOV parameter is used.

Tuesday, August 4, 2015

Can we delay the execution of a mainframe job?OPSWAIT is an option

Have you ever tried to keep a job or certain steps in a job to wait for sometime and then execute without using any scheduler ?

When would it be useful to ask mainframes to sleep like this?

Few days back ,came across a specific scenario when it was needed to put a job step on hold for sometime and then execute it.

Say for unit testing you want to execute two jobs in certain instance to check dataset conflict or table conflict.
Suppose we have a job with 3  steps. We want to execute step1 an then wait for sometime, say 1 minute, and then execute the remaining 2 steps. How do we do it ?

OPSWAIT is the option which can help us in this scenario.  (We need CA tool to be installed in our environment).
The OPSWAIT function suspends processing of the program or rule for a specified period in an OPS/REXX program or REQ rule.This  function can be used in OPS/REXX, AOF rules, TSO/E REXX, or CLIST.

//Jobcard..
//PSTEP01 EXEC PGM=TSTPGM
...
//PSTEP02  EXEC PGM=OPSWAIT,PARM='FOR(01:00)'
//PSTEP03  EXEC PGM=TSTPGM2
....
Once you submit this JCL, it till execute the step PSTEP01,then it will cause the mainframe to sleep for 1 min and then execute step 2.
We can try this command going into TSO command (option 6) and typing OPSWAIT 01:00.(01=mins,00=seconds)

We can also use this JCL to put the program on hold.
//SLEEP EXEC PGM=IKJEFT01 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD * 
OPSWAIT FOR(01:10) 
//* 

Saturday, August 1, 2015

Date functions in cobol. Why and when do we use IGZEDT4 ,CEELOCT , CURRENT-DATE ?

We come across certain instance when we need to get the time/date in certain formats in cobol.
We all know that we can get the dates in cobol using ACCEPT verb in the following way.
                        ACCEPT WS-DATE FROM DATE
But we would not get it in the format we prefer to use ,say in the format YYYYMMDD.
DATE has the picture clause of PIC 9(6). So when we use the above cobol statement, we would get the date in the format like below when read  sequentially from left to right:
First 2 digits for the year of the century, next 2 for the month and next 2 digits with the date.
Example: This if the current date is 25-JUL-1994, we would get  as 940725.

VS cobol II does not support the date format giving results in YYYY format. However the results can be interpolated in many ways to get our desired result.Some of the utilities are described below.

For NON LE Cobol Environment 
1.IBM has offered a non-LE callable interface known as  IGZEDT4 which can be used to achieve our
desired function.  When we call it , it will give the desired result in the YYYYMMDD format.

01  WS-YYYYMMDD                         PIC  9(08).
01  WS-DATE-IGZEDT4                     PIC  X(08) VALUE 'IGZEDT4'.
*
CALL WS-IGZEDT4      USING WS-YYYYMMDD

2.  Another way to do this is to use the LE callable function CEELOCT. We need to check with the mainframe infrastructure team to know if LE is optionally installed in our LPAR. If yes, then we can call this function which returns the result in the format of YYYYMMDD.  CEELOCT returns the result in 3 formats.
 a: Lilian  Date
 b.Lilian Seconds
 c. Gregorian  character strings.
It also returns the feedback code(CEE000 if call is successful)  which decides whether the function executed successfully or not.

WORKING-STORAGE SECTION.
01 WS-LILIAN PIC S9(9) COMP.
01 WS-XSECONDS PIC S9(18) COMP.
01 WS-GREGORN PIC X(17).
01 WS-FC.
    03 CEEIGZCT-RC PIC X.
        88 CEE000 VALUE LOW-VALUE.
PROCEDURE DIVISION.
       MAIN-SECTION.
           CALL 'CEELOCT' USING WS-LILIAN
                                WS-XSECONDS
                                WS-GREGORN
                                WS-FC
           IF CEE000 OF WS-FC
             DISPLAY 'Time: '    WS-GREGORN
           ELSE
             DISPLAY 'CEELOCT error' UPON CONSOLE
           END-IF

For LE enabled environment

3.  CURRENT-DATE  FUNCTION has been introduced after cobol II, which makes life much simpler. It is a 21 byte alphanumeric value which contains the below fields when read from left to right .

01  WS-CURRENT-DATE-FIELDS.
             05  WS-CURRENT-DATE.
                 10  WS-CURRENT-YEAR    PIC  9(4).
                 10  WS-CURRENT-MONTH   PIC  9(2).
                 10  WS-CURRENT-DAY     PIC  9(2).
             05  WS-CURRENT-TIME.
                 10  WS-CURRENT-HOUR    PIC  9(2).
                 10  WS-CURRENT-MINUTE  PIC  9(2).
                 10  WS-CURRENT-SECOND  PIC  9(2).
                 10  WS-CURRENT-MS      PIC  9(2).
             05  WS-DIFF-FROM-GMT       PIC S9(4).
 We need to use the cobol code like below and use the appropriate values as per our requirement.
 MOVE FUNCTION CURRENT-DATE TO WS-CURRENT-DATE-FIELDS

Wednesday, July 1, 2015

What is SMS in mainframe ? SMS stands for Storage Management Subsystem

Introduction:
The Storage Management Subsystem (SMS) automates the use of storage for data sets.  The z/OS ,  system is not aware of how much space a dataset will need, where it will be stored(Disk or Tape) ,what will be its record format and other details. Z/oS storage team need to decide whether to backup the large files and also need to recall it when necessary. All these data management activities can be done either manually or through the use of automated processes which is nothing but SMS.  With SMS activated , the z/OS  programmer or storage admin team  may, for example, create model data definitions for typical data sets, so that SMS automatically assigns those  attributes to data sets when they are created.
The data sets allocated through SMS are called system-managed data sets or SMS-managed data sets. For example, suppose you want to create a new data set named DATA.LIST. If SMS is active, you could use JCL like this:
//NEWDS DD DSN=myid.test.dataset1,
// DISP=(NEW,CATLG),
// DATACLAS=DCSL002,
// STORCLAS=SRTCL00

In this case, z/OS can use characteristics from predefined data and storage classes when it creates the DATA.LIST data set.

If SMS is not active or not in use, you need to manually specify the space requirements and storage location for the new data set, and your JCL would look like this :
//NEWDS DD DSN=myid.test.dataset1,
// DISP=(NEW,KEEP),
 // SPACE=(CYL,(1,1)),
 // UNIT=SYSDA,
// VOL=SER=SHARED  

The advantages associated with an SMS environment.
 • The user is relieved of making decisions about resource allocation of data sets, since it is done by SMS.
 • SMS provides the capability of concatenating data sets of unlike devices. For example, a tape data set can be concatenated with a disk data set. This capability is not available in a non-SMS environment.
 • SMS managed data sets cannot be deleted unless they are first uncataloged. Due to this extra step, erroneous deletion of data sets is minimized.
 • Additional features are available in the use of IDCAMS in an SMS environment. For example, the ALTER command can be used to increase the limit of the number of generation data sets in a GDG. This feature is not available in a non-SMS system.
 • VSAM data sets created in an SMS environment offer more flexibility that those created through JCL in a non-SMS environment.

Read about the  SMS Parameters in the next post

An overview of SMS parameters: STORCLAS,DATACLAS,MGMTCLAS,RECORG

1. The STORCLAS Parameter:
STORCLAS is a keyword parameter. It is used to assign a data set to an SMS defined class. 
These parameters discused below have significance only if SMS is active, otherwise it is ignored. 
Here’s the syntax: 
STORCLAS = class
Where class is an installation defined name and can be one to eight characters long. Use of this parameter results in the data set defined in the DD statement within that job being SMS-managed. The VOLUME and UNIT parameters can be omitted, since these values are now SMS supplied.
As we can see in the below example, VOLUME and UNIT parameters are omitted. These parameters will be installation defined for the class SMS1.
//JOB1 JOB (A123), ‘Ryan’
//STEP1 EXEC PGM=PROGRAM1
//DD1 DD DSN=myid.test.dataset1,
// DISP=(NEW, CATLG, DELETE),
// SPACE=(CYL, (1,1), RLSE),
// LRECL=80,
// RECFM=FB,
// STORCLAS=SMS1 

2. The DATACLAS Parameter:
DATACLAS is a keyword parameter. It is used to define any or all of the following parameters for a data set.(Marked in blue)
LRECL, RECORG , RECFM,  RETPD or EXPDT ,VOLCOUNT (coded on the VOL parameter) , 
SPACE , AVGREC
 Volcount is coded on VOL parameter. It is used to specify the number of tape volumes that can be mounted when a tape data set is being created or expanded.
For SMS managed VSAM data sets, DATACLAS can be used to define the following parameters: CISIZE, IMBED ,REPLICATE, SHAREOPTIONS ,FREESPACE 

Here’s the syntax of the DATACLAS parameter:
DATACLAS = class
Class is installation defined with predefined values for any of the above parameters

Suppose a class named GENERAL contains the following definitions:
SPACE = (CYL, (1,1), RLSE),
DCB= (RECFM=FB, LRECL=80)
My job uses the Class GENERAL in JCL

//JOB1 JOB (A123), ‘Ryan’ /
/STEP1 EXEC PGM=PROGRAM1
//DD1 DD DSN=myid.test.dataset1,
// DISP=NEW, CATLG, DELETE),
// UNIT=SYSDA,
// DATACLAS=GENERAL 

3. The MGMTCLAS Parameter 
This parameter is used to provide a management class for the associated data set coded in the DD statement. A management class is used to control the migration of data sets, the frequency of back ups, the number of backups versions, and the retention criteria of backup versions.
Here’s the syntax:
MGMTCLAS = class
It can be one to eight characters long. Parameters defined for this class can not be overridden. Example:
//JOB1 JOB (A123), 'Ryan’
//STEP1 EXEC PGM=PROGRAM1
//DD1 DD DSN=myid.test.dataset2
// DISP=(NEW, CATLG, DELETE),
// DATACLAS=GENERAL,
// MGMTCLAS=ARCHIVE

4. The LIKE Parameter:
 This parameter is used to copy attributes from an existing cataloged data set to a new data set. The following attributes can be copied over: SPACE RECFM AVGREC LRECL

Here’s the syntax:
LIKE = model.dataset
Where model.dataset identifies the name of the model data set.
Example:

//JOB1 JOB (A123), ‘Ryan’
//STEP1 EXEC PGM=PROGRAM1
//DD1 DD DSN=myid.test.dataset3,
// DISP=(NEW, CATLG, DELETE),
// LIKE=TEST.MODEL
 In this example, the  attributes of the data set TEST.MODEL is copied to the new data set called TEST.DATA2.

Wednesday, June 24, 2015

Identity Column in DB2

In order to maintain uniqueness in the values entered in db2, we use the UNIQUE constraint for one of the column in db 2. Creation of  unique index, primary key are also there in the list by which we can maintain data uniqueness. However there are methods to generate unique values in the columns.

1. Creating IDENTITY column while defining the table (introduced in DB2  version 7.1 and later)
2. SEQUENCE (from DB2 version  7.2 onward)

Lets check about IDENTITY column in this post.
Identity columns offer us the possibility to guarantee uniqueness of a column and to be able to automatically generate unique value. An identity column is a Numeric column, either SMALLINT, INTEGER, or DECIMAL with a scale of zero, or a user defined distinct type based on any of these data types, which is UNIQUE and NOT NULL by definition.
As such there are no rules when to use identity columns. Still we can consider the below scenario when we can use IDENTITY column.

When can we use IDENTITY COLUMN
Suppose we have  a multi-column primary key(composite key) and the table has several dependent tables, you have to ‘copy’ many columns to the dependent tables to build the foreign keys. This makes the keys very long. Having many columns in the index also makes the index grow quite large. Instead of using the long key as the primary and foreign key, you can use an artificial unique identifier for the parent table and use that generated value as a primary key and foreign key for the dependent table.
Another use for identity columns is when you just need a generated unique column for a table. If we do not need to reference the rows by this column, then there is no need to even create an index for it, and uniqueness is guaranteed by the system for generated values

The Identity Columns can be used in table in two ways :
    1. The value for identity column is always generated by the DB 2.(GENERATED ALWAYS keyword in the column declaration ; see below for syntax)
    2. The value is inserted explicitly by user. And if used don't specify any value then the value is generated by the DB2.(GENERATED BY DEFAULT)

Syntax :

CREATE TABLE <table_name>
(
    <column1> datatype
    GENERATED ALWAYS/GENERATED BY DEFAULT
    AS IDENTITY
    (
    START WITH <numeric constant>,
    INCREMENT BY <numeric constant>,
    NOMINVALUE / MINVALUE <integer constant>,
    NOMAXVALUE / MAXVALUE <integer constant> 
    NOCYCLE / CYCLE,
    NOCACHE / CACHE <integer constant>,
    NOORDER / ORDER
    ),

    <column2> datatype,
    <column3> datatype,
    ...............................
    ...............................
)

Running a query to create Identity column:

CREATE TABLE PROD_DEAL
( ROW_ID INT NOT NULL GENERATED ALWAYS AS IDENTITY 
                         (START WITH 1, INCREMENT BY 1, NO CACHE), 
 ORDER_NO CHAR(6), 
 INV_COUNT INT WITH DEFAULT 0 
);

Suppose we have run the INSERT QUERY 5 times for five different ORDER_NO.

INSERT INTO PROD_DEAL(ORDER_NO,INV_COUNT) VALUES('A00001',11);
INSERT INTO PROD_DEAL(ORDER_NO,INV_COUNT) VALUES('A00002',22);
INSERT INTO PROD_DEAL(ORDER_NO,INV_COUNT) VALUES('A00003',33);
INSERT INTO PROD_DEAL(ORDER_NO,INV_COUNT) VALUES('A00004',44);
INSERT INTO PROD_DEAL(ORDER_NO,INV_COUNT) VALUES('A00005',55);


Output:
ROW_ID   ORDER_NO  INV_COUNT
------------------------------------------------
11000001  A00001         11
11000002  A00002         22
11000003  A00003         33
11000004  A00004         44
11000005  A00005         55

We can see the ROW_Id generating unique values each time the insert query ran.

Wednesday, June 17, 2015

Copy empty Vsam file using SORT without error - Use of parameter VSAMEMT=YES in SORT

Once came across a scenerio where we had to copy a vsam file to a flat file and then process the flat file in subsequent steps.This is pretty simple and can be achived with a SORT step.But Once the same job abended when the VSAM file was emprty.
Came across this parameter VSAMEMT=YES which can be used with sort to handle this scenerio.

//STEP3 EXEC PGM=SORT,PARM=’VSAMEMT=YES’
//*
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=XXX.TEST.VSAM,DISP=SHR
//SORTOUT DD DSN=TEST.FLATFILE.COPY,DISP=MOD
//SYSIN DD *
SORT FIELDS=COPY
/*

Monday, April 20, 2015

COMP-3 variable in cobol. Calculate the number of bytes for COMP-3 variable.

COMP3 in cobol enables a computer to store 2 digits in each storage position, except for the rightmost position where the sign is stored. Each digit takes half a byte.

Point to remember:  Even if we specify 'S' or not in the variable declaration,  the sign is stored.

Suppose we move the digit 1265875 into a field 9(7). In display mode, cobol will occupy 7 bytes, ie, 7 storage positions.
If we use COMP-3 in the variable declaration,  it will take only 4 Bytes like below.

12
65
87
5C

Had it been a negative number, then The rightmost digit would store 'D' in place of  C.

How many bytes does COMP-3 take ?
To calculate the byte-length of a comp-3 field, start with the total number of digits and divide by 2 giving a result (discarding the remainder if any), then add 1 to the result.
  
Note: For "normal" processing the maximum number of digits is 18 . If we use compiler option as the ARITH(EXTEND) , the compiler will allow us to extend it to 31 bytes

Just to share, if we use Odd number of digits for packed decimal, the processing becomes 5-20 % faster than what the speed would be if we use even number of digits!. (Source : Some Informative journal in comp-3 in google )

Friday, April 10, 2015

Create Dataset name dynamically using system date and time or timestamp - using EZACFSM1

Many times we face the scenario, where we are in need to create a file suffixed with today's date ,time ,ie  the system date and time. It can be done in various ways. In this post we will see how to use one utility EZACSFM1 to achieve the same. It is a good utility and accepts many parameters which can be used tactfully to achieve good results.

//JOHNEZC JOB 1,'TEST', CLASS=I,NOTIFY=&SYSUID
//STEP1        EXEC PGM=EZACSFM1
//SYSOUT DD SYSOUT=(,INTRDR)   <= submit the below job via internal reader
//SYSIN DD DATA,DLM='..'
//TTEST JOB 1,'EZACSFM1',CLASS=I,NOTIFY=&SYSUID
//STEP2        EXEC PGM=IEFBR14
//MYDATA  DD DSN=TEST.DATA.D&DAY.M&LMON ,
//                          DISP=(,CATLG,DELETE),
//                         SPACE=(CYL,(10,10),RLSE),
//                         DCB=(RECFM=FB,LRECL=10,BLKSIZE=0)
(Tested code. Runs fine)
Explanation:
We are submitting the IEFBR14 job to create the dataset  TEST.DATA.DMON.M04 via internal reader as mentioned above.
If you see, i have  kept 'D&DAY' and  'M&LMON' in blue. Just to keep our attention there. The utility EZACSFM1 substitutes the value of day and month values there which will be created dynamically. &day, &lmon are the parameters of the utility.
If we submit the job, two jobs will be submitted. The first job will submit the second job creating the dataset with IEFBR14 dynamically with the date and time parameters.

To display the values in spool in sysout, try this piece of code.We will come across more options.
<Tested code >
//STEP1     EXEC PGM=EZACFSM1      
//SYSOUT    DD  SYSOUT=*          
//SYSIN     DD  *                
&YR.-&LMON.-&LDAY
&YR./&LMON./&LDAY
DATE IS : &YR.&LMON.&LDAY
/*
//
Some more parameters which EZACFSM1 takes. There are many more.

 '&DAY'                
 '&HHMMSS'              
 '&HR'                  
 '&JOBNAME'          
'&SEC'              
'&SEQ'                
'&YYMMDD'            
Test these small piece of code.M sure, you will find it very useful!!!

Thursday, April 9, 2015

USAGE Clause in cobol. COMP, COMP1, COMP2, COMP3 VARIABLES IN COBOL

When Defining the cobol variables, the first thing which we see around is the USAGE clause. It only signifies how the data item will be internally stored. What will be its size.? In most instances we would see that a variable has been declared as(just for example) PIC X(10), and there is no USAGE clause. When the usage clause is not specified ,then it is assumed by the compiler that the USAGE IS DISPLAY.This is the default.
The other kind of USAGE is COMPUTATIONAL. As the name suggests, this is used when we want to perform numeric operations on the data items. I found some of the points to be worth remembering. Will be highlighting them in Red in this post.

Can we use USAGE IS DISPLAY for NUMERIC items? 
For text items, or for numeric items that are not going to be used in a computation (like Phone Numbers,postal code), the USAGE IS DISPLAY will not be harmful; but for numeric items upon which some arithmetic operation will be performed,the default usage is definitely not the best way.
When we use numeric items (like pic 9(3) )with default usage, internally they will be stored in ASCII.
When performing operations on these items, computer need to convert them to binary.Once done, computer need to convert it back to ASCII. These conversions unnecessarily slows down performance. Hence COMP is preferable in these scenarios.

Further more, Computational Usage clause can be classified into the following fields depending on the numeric items we use in our program.
Binary (USAGE BINARY or COMP-4)
Internal floating-point (USAGE COMP-1, USAGE COMP-2)
Internal decimal (USAGE PACKED-DECIMAL or COMP-3)

The maximum length of a comp item is 18 digits.

Q.Why do we use COMP in usage clause. When to use COMP ?
We know computers are Binary machines. So binary numbers will be more favorite to computers for doing calculations.Hence always use COMP for the variables which will be involved in numeric calculations.
Also it is used  for defining internal counters, subscripts.

PIC Clause of a COMP data item will only contain 9 or S.
example:
 01 WS-LEN          PIC S9(02)  COMP.

Now, when we declare the variables in comp, the compiler allocates spaces to them in words.which can be a half word(2 bytes),  Full word(4 bytes), Double word(8 bytes)

So if the PIC specified is 9(1) COMP or 9(2) COMP or 9(3) or  COMP OR 9(4) COMP, the space allocated by the compiler will be half word (2 bytes).

From PIC 9(5) COMP to PIC 9(9) COMP, the space allocated will be 4 bytes

For a PIC clause with more than 9 bytes, compiler will allocate  double word(8 bytes).

In short,
S9(01) - S9(04)      Half word.
S9(05) - S9(09)      Full word.
S9(10) -  S9(18)     Double word.

Q.When to use COMP-1 and COMP-2 ? 
When we want to use Fractional numbers, numbers having decimal digits, then COMP-1 is the choice.Thus USAGE IS COMP-1 is used to store floating point numbers(real numbers)
COMP-1 uses 4 bytes of storage.
Exactly in the same way, we need to use COMP-2 for extremely big numbers requiring high precision.
COMP-2 occupies 8 bytes of storage. Hence no need to use PIC clause since size is already pre defined.
Remember: No picture clause is needed when defining COMP-1 and COMP-2 items.


Read COMP3 Variables Here.

Saturday, March 21, 2015

SEARCH and SEARCH ALL in COBOL with examples

SEARCH is a serial search.It can be used for both sorted and unsorted data inside the table.The table needs to be defined with an index .When we use the search function, the function starts with the current setting of the index and it goes on until to the end of the table.The starting value of the index is left upto the choice of the programmer. We can set the value of the index value just before the search starts using the SET keyword.

IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO.

DATA DIVISION.
WORKING-STORAGE SECTION.
   01 WS-TABLE.
          05 WS-TEST PIC X(1) OCCURS 10 TIMES INDEXED BY I.

   01 WS-SRCH-STRNG PIC A(1) VALUE 'R'.

PROCEDURE DIVISION.
   MOVE 'ABCDEFRJKL' TO WS-TABLE.
   SET I TO 1.
   SEARCH WS-TEST
     AT END DISPLAY   'R NOT FOUND IN TABLE'
     WHEN WS-TEST(I)=WS-SRCH
     DISPLAY    'WE FOUND YOU IN THE TABLE'
   END-SEARCH.  

STOP RUN.

This linear Search will iterate through the entire table until the search-string is found or the end of table is reached.
In this case, since our search string 'R' is there in the table, the SYSOUT will display 'WE Found.. '.

SEARCH ALL is a binary search and the table must be sorted .That is the the prerequisite to use search all. The outcome of SEARCH ALL is either a yes or a no. Hence we can not code multiple search conditions when using SEARCH ALL function.
Contrary to search function, there is no need to initialize the index .

Check the following code:

IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO.

DATA DIVISION.
   WORKING-STORAGE SECTION.
   01 WS-TABLE.
      05 WS-BIN-SRCH OCCURS 10 TIMES ASCENDING KEY IS WS-NUM INDEXED BY I.
      10 WS-NUM  PIC 9(2).
      10 WS-DEP   PIC A(3).

PROCEDURE DIVISION.
   MOVE '12ABC56DEF34GHI78JKL93MNO11PQR' TO WS-TABLE.
   SEARCH ALL WS-BIN-SRCH
     AT END DISPLAY 'BIN-SRCH NOT FOUND'
     WHEN WS-NUM (I)=93
     DISPLAY 'BIN-SRCH FOUND '
     DISPLAY WS-NUM(I)
     DISPLAY WS-DEP(I)

Just a point to note:  The table is searched in ASCENDING ORDER. If you see, the table is defined with ASCENDING key clause. In case we want, we can always define the table in DESCENDING KEY clause.
To use SEARCH ALL,a field must be in ASCENDING or DESCENDING key clause. This Identifier must be an entry within the table

When to use SEARCH and SEARCH ALL Function:
· For a table with less than 50 entries, go for SEARCH (Sequential Search)
· For a table with more than 50 entries go for SEARCH ALL (Binary Search)

Thursday, March 12, 2015

Difference between PDS and PDSE - Brief discussion

We all know that a PDS (partioned data set), commonly referred to as Library in Z/OS world is a collection of several data sets which we call as members. A PDS internally contains a directory which keeps track of the member names.
A PDS is created with a Data Set Organization of PO , (DSORG=PO), which stands for partitioned organization.

Creating PDS using 3.2 option
Directory blocks  . . 20
Data set name type  : PDS
Using JCL:
SPACE=(TRK,(50,10),20)

However a PDS have certain disadvantages:
1. When we delete a member from a PDS, the space remains unused. We need to compress the PDS using 'Z' at the command line , or we can use IBM utility  IEBCOPY to reclaim the unused space.

2. Also as we know, a PDS internally contains a directory. As the size of the PDS grows, ie, as we add more and more members into the pds, the directory size gets near to the threshold (This Directory size we give when we define a PDS), after which we can not add any more members in the PDS.
Then we need to copy all the members into a new PDS with increased Directory size.

PDSE ( partitioned data set extended ):  Exactly same as PDS in many respects.
However , PDSE  data sets can be  stored only on DASD, not on tape. Interesting thing is the directory can expand automatically  as needed. Additionally it has an index which helps to locate the members inside the PDSE faster . SPACE from deleted members are automatically reused in PDSE .

PDSE files have DSORG=PO and DSTYPE=LIBRARY.
JCL to create PDS
//ALLOC    EXEC  PGM=IDCAMS
//SYSPRINT DD    SYSOUT=A
//SYSIN    DD    *
    ALLOC -
    DSNAME(TEST.PDSE1.EXAMPLE1) -
    NEW -
    STORCLAS(RM06) -
    MGMTCLAS(RM06) -
    DSNTYPE(LIBRARY)

Thursday, February 5, 2015

Copy members of PDS using IEBPTPCH

This JCL will copy the members of the PDS into a PS file with the help of IEBPTPCH

//STEPNAME EXEC PGM=IEBPTPCH                 
//SYSPRINT DD SYSOUT=*                       
//SYSUT1   DD DSN=TEST.PDS1,DISP=SHR  
//SYSUT2   DD DSN=TEST.PS1,          
//     DISP=(NEW,CATLG),                     
//     SPACE=(CYL,(50,50,)),                 
//     DCB=(RECFM=FB,LRECL=133,BLKSIZE=1330),
//     UNIT=WORK                             
//SYSIN    DD *                              
    PUNCH TYPORG=PO                          

Tuesday, February 3, 2015

Sort JCL to split every alternate records

 This JCL will split the even and the odd number of records from the input file.

//STEP01   EXEC PGM=SORT                             
//SYSOUT   DD SYSOUT=*                               
//SORTWK01  DD UNIT=DISK,SPACE=(CYL,(100,100))       
//SORTIN    DD *                                     
1111111111111111111111111                            
2222222222222222222222222                            
3333333333333333333333333                            
4444444444444444444444444                            
5555555555555555555555555                            
6666666666666666666666666                            
//ODD       DD DSN=TEST.ODD.OP1,
//          DISP=(,CATLG),UNIT=TEST,                 
//          SPACE=(CYL,(50,50),RLSE)                 
//EVEN       DD DSN=TEST.EVEN.OP2,
//          DISP=(,CATLG),UNIT=TEST,                 
//          SPACE=(CYL,(50,50),RLSE)                 
//SYSIN     DD *                                     
  SORT FIELDS=COPY                                   
  OUTFIL FNAMES=(ODD,EVEN),SPLIT                     
//*   

Here is the output of the ODD file:

******************************
1111111111111111111111111    
3333333333333333333333333    
5555555555555555555555555    
******************************

Here is the output of the EVEN  file:

**************************
2222222222222222222222222
4444444444444444444444444
6666666666666666666666666
**************************
SPLIT parameter to put the first record into OUTPUT1, the second record into OUTPUT2, the third record into OUTPUT1, the fourth record into OUTPUT2, and so on until you run out of records. SPLIT splits the records one at a time among the data sets specified by FNAMES.
Other options SPLITBY and SPLIT1R are also available. Do check out the usage for further info.

Friday, January 30, 2015

SAS in Mainframes(z/Os) Tutorial with examples - Use of CARD and DATALINES in SAS

The CARD or the DATALINES  statement is valid in the data step and is used to Read the real data values into the program. Generally while running the sas programs, we use an input file(a physical DASD file)  and give a reference to the Data step using INFILE keyword.
DATALINE is nothing but Reading the Data inline. We can also use the keyword CARD. Both are equivalent in terms of function. CARDS were used long back when punch cards were used in mainframe system.
Have a look into the sas program below.
//SAS       EXEC SAS                     
//WORK      DD SPACE=(CYL,(50,10),RLSE)  
//SYSIN     DD *                         
                                         
OPTION NOCENTER;                         
OPTION SORTLIB='';                       
                                         
  DATA COV;                              
  INPUT @23  ACCTTYPE $CHAR01.           
    @10  NMBER  $CHAR02.                 
    @24  ACCTNO   $CHAR10.               
     ;                                   
 DATALINES;                              
  111111111111111111112222222222233333333
  111111111111111111112222222222233333333
  111111111111111111112222222222233333333
  111111111111111111112222222222233333333
    ;                                    
 PROC PRINT DATA=COV;                             
   
 A very small program, but there is no input file in the JCL. We are reading the input data using the DATALINES keyword. IF we print the data we will get the same expected output.

Wednesday, January 7, 2015

DB2 Explain and PLAN table column names in DB2

Whenever we run a query,DB2 creates an access plan that specifies how it will access the requested data. This is created whenevea the sql is compiled at bind time for static sql and before execution for dynamic sql.
DB2 bases the access paths on the SQL statements and also on the statistics and configuration parameters of the system.
Even when an sql is made efficient, it can become inefficient as data grows. So, we need to run DB2 runstats  so as to keep updated statistics . DB2 config and storage can change and plans can be rebound. Db2 explain gives us the info for the plan, package, or SQL statement when it is bound. The output of the EXPLAIN  is stored in user-created table called Plan table. Whenever we want to tune a query, we need to go and check the plan table so as to get an idea of the access path DB2 optimizer is using.

How can we populate the PLAN Table ?
EXPLAIN(YES) option on the BIND/REBIND plan/package command
EXPLAIN ALL keyword in SPUFI or while running the query in batch.

Step 1. The SQL statement in blue is the main Query for which we want to know the access path.
So we wrap the statement with the 'EXPLAIN ALL SET QUERYNO = 1 FOR'  like below

EXPLAIN ALL SET QUERYNO = 1 FOR
SELECT CUSTNO, CUSTLNAME                       
FROM CUST                                                        
WHERE CUSTNO LIKE '%0A';

Once we execute the above query, optimizer first writes the access path onto the Plan Table and then gives the output. Step below depicts how we can get the information from plan table


Now, Let us check few of the columns in the PLAN table and its significance: Given in Blue are the names of the table columns.
QUERYNO: Query number assigned by the user
QBLOCKNO:A number that identifies each query block within a query.
APPLNAME:The name of the application plan for the row.
PROGNAME:The name of the program or package containing the statement being explained.Applies for the explain as a result of SQL queries embeded in application program.
TSLOCKMODE: Identifes the Tablespace lock mode.

These columns relate to the index usage:
ACCESSTYPE:Type of table INDEX usage as as follows:
R -Full table scan (uses no index) when the query is executed
I -Use an index. Data will be retrieved from index and not from table,
I1 -one-fetch scan (MIN or MAX) functions
N -Index scan (predicate uses an IN )
M -Multi-index scan followed
   MX By an index scan on the index named in ACCESSNAME
   MI By an intersection of multiple indexes
   MU By a union of multiple indexes
MATCHCOLS: For ACCESSTYPE I, I1, N or MX, the number of index keys used in an index scan; otherwise, 0.
ACCESSCREATOR:For ACCESSTYPE I, I1, N, or MX, the creator of the index; otherwise, blank.
ACCESSNAME: For ACCESSTYPE I, I1, N, or MX, the name of the index; otherwise, blank.
INDEXONLY: Whether access to an index alone is enough to carry out the step, or whether data too must be
accessed. Y=Yes; N=No

The plan table columns that relate to SORT usgae are as follows:
METHOD:
A number (0, 1, 2, 3, or 4) that indicates the join method used for the step:
0 First table accessed, continuation of previous table accessed, or not used.
1 Nested loop join. For each row of the present composite table, matching rows of a new table are
found and joined.
2 Merge scan join. The present composite table and the new table are scanned in the order of the
join columns, and matching rows are joined.
3 Sorts needed by ORDER BY, GROUP BY, SELECT DISTINCT, UNION, a quantified predicate, or an
IN predicate. This step does not access a new table.
4 Hybrid join. The current composite table is scanned in the order of the join-column rows of the
new table. The new table is accessed using list prefetch.

SORTN_UNIQ: Whether the new table is sorted to remove duplicate rows. Y=Yes; N=No.
SORTN_JOIN: Whether the new table is sorted for join method 2 or 4. Y=Yes; N=No.
SORTN_ORDERBY: Whether the new table is sorted for ORDER BY. Y=Yes; N=No.
SORTN_GROUPBY: Whether the new table is sorted for GROUP BY. Y=Yes; N=No.
SORTC_UNIQ: Whether the composite table is sorted to remove duplicate rows. Y=Yes; N=No.
SORTC_JOIN: Whether the composite table is sorted for join method 1, 2 or 4. Y=Yes; N=No.
SORTC_ORDERBY: Whether the composite table is sorted for an ORDER BY clause or a quantified predicate. Y=Yes;
N=No.
SORTC_GROUPBY: Whether the composite table is sorted for a GROUP BY clause. Y=Yes; N=No.
PREFETCH : Whether data pages are to be read in advance by prefetch.  If we dont want to use the sequentail prefetch for a particualr query,we need to add the clause
OPTIMIZE FOR 1 ROW to it.

 Read about basic DB2 Prefetch  

What we should be looking at:

1. Indexes enhance performance and and reduce costs. We need to look the ACCESSTYPE to see if
an index is being used.. An ACCESSTYPE of "R" means all the data must be scanned. and no
indexes are being used.
2. Look for MATCHCOLS to see how many  index keys are being ueed. The more the better.
3. Check for the column INDEXONLY . Value of 'Y' means data being retrieved from index and no table is involved. This is no doubt good in terms of performance. Booster will be to have the columns used in 'Where predicate' as indexes.
4. Avoid unnecessary sorts as auch as possible.
5.PREFETCH is good and will be in action when mostly the table space scan is used. Very effective when the table data is in clustered sequence.

Some Learnings
1. Don't misuse select statements
2. Use IN instead of multiple ORs
3. Join with as any of the index columns as possible.
4. Avoid Arithmetic expressions in where clause
5. Use NOT EXISTS instead of NOT IN for a suvbquery