Mar 20

Object Conversion V6R1 and High Availability

We have been taking part in the V6R1 ESP with IBM, which is due to come to a close. As part of the testing we have been doing with RAP, we looked at the effect of the object conversion on the replication process. As you know, one of the mantras of the HA specialist is that you should always switch your environments on a regular basis. This is aimed at making sure your environments stay in sync at all times; if you switch regularly enough you should pick up minor problems before they become real issues!

With V6R1 IBM has added some extra information to the object structure, which means an object has to be converted before it will run under V6R1. If the object is restored to a previous release, say V5R4, that additional information has to be removed to allow the OS to see the object correctly. IBM has provided a PTF for previous releases to allow this activity to take place. I am not sure whether the object is converted back automatically on restore or on first run. As far as I can determine, the PTF doesn’t let you choose between convert on restore and convert on first run, as you can when the object is converted for V6R1.

The interesting part of this is what effect it will have on HA implementations. If the object is converted every time it is restored, the impact will fall on the HA replication process. This could be significant if the object changes often, because each time it is restored to the system it could be converted. The impact depends on when the object is converted (on restore or on first run). If it is only converted on first run, the replication product will trigger fewer conversions, but when you switch to the target system and run your application you will take the hit of all the objects being converted as they are needed! This applies in both directions; however, it could be more significant going back to the earlier release if the conversion there cannot be set to convert on first run.

We did quite a bit of testing and overall we didn’t see too much of a problem, but our test system is only used for testing and has very few objects being changed. If you are going to do this in a production environment with lots of object activity, you could end up with significant processor impact…

We suggest you consider just what impact this will have on your system. You may find that the impact is too significant to allow you to run different OS levels on your systems for very long. Once both systems are running V6R1 the problem will disappear.

We are continuing to test V6R1 and RAP, JobQGenie and eventually the other products we have developed. Our aim is to provide a V6R1 ready installation for all products in time to meet the demand from the customer base. Current levels can be installed on V6R1 but will require the object conversion to be carried out.

Chris…

Mar 17

New features adding value to RAP

We have been adding new features to RAP thick and fast over the last few months, with some interesting results and feedback. We have been looking at what most users want as a minimum when implementing an Availability solution within the IBM System i environment.

Monitoring
We had many requests to improve the monitoring functionality. The original screens provided the basic information but would not allow the user to drill down to find more in-depth information about the state of the apply process. With the new panels you will be able to see the apply processing at the journal level, with drill down to the individual apply state for each receiver. We have also included the Error review in the same screens and moved them to the Operations menu. Now you can see all of the associated information from a single panel group.

Alerting
A big problem with any process that runs 24×7 is the need to be alerted to problems as they occur without having an operator constantly watching the screens. RAP now adds email capabilities to the alerting functions: if it sees an error or status you are interested in, it will immediately send an email to your designated email address. This should give people a much better response time should any process start to fail.

Environment awareness
The product will now monitor the state of the receiver environment on the target system and clean up any receivers which are no longer required for processing. This will keep the DASD requirements on the target system to a minimum. We have also looked at the resource management within the product to ensure we remove any unused resources as soon as possible. This is a bit like the garbage collector used in many of the newer languages such as Java.

As always there are a lot of new ideas and suggestions to consider as we develop the product further, with many of those ideas coming from our customer base as it grows. Being small means we are very interested in your input and will always look at how we can add more value to your investment in our products.

Chris…

Mar 10

What a week that was!

We have been very busy in the last couple of weeks with new development on RAP plus testing V6R1.

V6R1 testing has been the biggest hold up, due to a major problem with the new IBM Director interface which is eventually to replace Ops Navigator. We went through the checklists supplied with the ESP and ensured we upgraded everything we needed to, and we diligently ran the object conversion routines after installing the V6R1 OS over an existing install. There were no problems at all with the install, and the CUM package which came with it also went in without problems. One very nice feature is the use of DVDs now instead of CDs, which meant we did not have to load lots of CDs to get the OS installed. The system came up with no problems at all and an initial review (Green Screen) showed no problems with anything running. One purpose of installing was to ensure the current products would run. RAP is the only product we have tested so far and it appears to work OK! Once testing is complete we will post a V6R1 ready copy of all the products.

Next we tried to run the new HTTP ADMIN interface as the write ups sounded very interesting, especially the ability to add your own web services into the interface. It looked like it was running OK but the server would not serve the interface; all we got was a 404 error stating /ibm/console was missing. We spent about 3-4 hours trying to find out what had gone wrong before sending in an error report to IBM. V6R1 ESP problems have to sit behind problems on the GA OS releases, so we had to wait some time before IBM got back to us. We spent another 2-3 days going back and forth with IBM trying to understand what had gone wrong. We sent megabytes of data to IBM, tried new fix packs and collected information on the configurations and logs. Thankfully IBM has a script which did all of the hard work for us, but the file was 20MB and it took a number of attempts to get it to the IBM FTP site. We also had to download fix packs at 615MB each, so our internet connection certainly got a workout. Eventually we had to wait for a new CUM from IBM. We also had to delete the 5761DG1 program and re-install it, but the product is now working and seems pretty impressive.

The development of our own products didn’t go so well either, mainly due to programming errors (features) which didn’t seem to tie up with the manuals! Thanks to the forums we did manage to get a number of API issues bedded down, and IBM has agreed to update the manual text for some of the APIs to show the way they work in reality. Another problem we seem to struggle with a lot is the passing of structures. One API required _Packed structures while another which uses the same List resource didn’t, and if we passed that one a _Packed structure it didn’t work. We now have a process of trying a _Packed structure first, and if that doesn’t work we try a non-packed one!

The Email process is now running well and should make the next release of RAP. We have also added a new status interface which will allow the user to see the last applied information without having to drill down through the error reports. This has created a few challenges with the Watch APIs, as they fall over regularly when we capture certain messages. It has taken hours of debugging and correspondence with IBM and is still not fixed. It seems IBM doesn’t work weekends like the rest of us, as despite lots of correspondence we have not heard back from IBM since Friday! Hopefully they can get back to it this week and get a solution for us.

RAP is now starting to have features which should be available in any Availability Software package. We don’t want to make it a High Availability solution as we still strongly feel it’s overkill for most companies. The majority of customers who install an HA solution just need Data Protection and a level of recovery in the event of a system loss; they will rarely, if ever, switch to the other system on a regular basis. Many customers simply let the environment deteriorate into a state where any kind of recovery is almost impossible. Even if they do a good job of keeping the environment updated and switching the users to the target on a regular basis, does this mean they can do so in an UNPLANNED event? RAP will provide simple tools to allow users to monitor and manage their environment. The new email process is aimed at giving the user text-based notification on any device which can receive email messages (as many people now carry email-capable phones, we feel this is important). Eventually we will strip down our JobQGenie product and provide a linkage between the job data we capture and the data stored in or applied from the receiver. This should allow users to make effective decisions about how the non-applied data in the receiver should be applied.

Still have lots of new development ideas to get into code so may not get back to the Blog as often as we would like. If we see interesting code snippets we will continue to post them as we develop more technology.

Chris…

Mar 05

QGYOLJBL API usage

We wrote a short test program to extract messages from the joblog of the job the API is called within. The code simply shows how to code up the API, not how to use the information it returns. We also only extract the messages from the returned resource one at a time using the QGYGTLE API; we could extract more on each call, which could improve performance. There are lots of other uses for this API, but our intention was just to browse the messages in our own joblog and take action dependent on the messages returned.
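The batching idea mentioned above comes down to some simple paging arithmetic. The sketch below is portable C and does not call any IBM API; `next_batch` and `calls_needed` are hypothetical helper names used only for illustration. On IBM i, the count and the starting record it works out would be the values passed to QGYGTLE in place of the fixed 1 and the loop counter, and the receiver buffer would need to be large enough for a whole batch.

```c
#include <assert.h>

/* Work out how many records to request in the next call.
   start      - first record of the batch (lists are 1-based)
   total      - total number of records in the open list
   batch_size - most records we are prepared to receive at once
   Returns the number of records to ask for, or 0 when nothing is left. */
int next_batch(int start, int total, int batch_size)
{
    int remaining;

    if (start < 1 || batch_size < 1 || start > total)
        return 0;
    remaining = total - start + 1;
    return remaining < batch_size ? remaining : batch_size;
}

/* Count the calls needed to read the whole list in batches. */
int calls_needed(int total, int batch_size)
{
    int start = 1;
    int calls = 0;
    int n;

    assert(batch_size > 0);
    while ((n = next_batch(start, total, batch_size)) > 0) {
        /* a real program would call the list API here with (n, start) */
        start += n;
        calls++;
    }
    return calls;
}
```

With 25 messages and a batch size of 10 this takes three calls (10, 10 and 5 records) rather than 25 single-record calls.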

Here is the code:


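/* headers required by the calls used below */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <qusec.h>                /* Error code structure */
#include <quscrtus.h>             /* Create User Space */
#include <qgyoljbl.h>             /* Open List of Job Log Messages */
#include <qgygtle.h>              /* Get List Entries */
#include <qgyclst.h>              /* Close List */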

typedef struct JobL_Msg_x {
        char ListDirection[10];
        char JobName[26];
        char IntJobId[16];
        char StartMsgKey[4];
        int  MaxMsgLen;
        int  MaxHlpLen;
        int  FieldOffset;
        int  NumFields;
        int  MsgQOffset;
        int  MsgQSize;
        int  Fields[1];
        char MsgQName[1];
        } JobL_Msg_t;



int main(int argc, char **argv) {
int Ret_Recs = -1;
int Num_Ents = 0;
int i;
char ListInfo[80];
char Buf[1024];
char Msg_Buf[4096];
char *tmp_ptr;
char LastKey[4] = {0x00};
JobL_Msg_t Sel_Info;
Qgy_Oljbl_ListInfo_t *ret_info;
Qgy_Oljbl_IDFieldInfo_t *Field_Info;
Qgy_Oljbl_RecVar_t *Buf_Ptr;
Qus_EC_t Error_Code = {0};

Error_Code.Bytes_Provided = sizeof(Error_Code);


memcpy(Sel_Info.ListDirection,"*NEXT     ",10);
memset(Sel_Info.JobName,' ',26);
memset(&Sel_Info.JobName,'*',1);
memset(&Sel_Info.StartMsgKey[0],0x00,4);
memset(Sel_Info.IntJobId,' ',16);
memset(Sel_Info.MsgQName,'*',1);
Sel_Info.MaxMsgLen = 132;
Sel_Info.MaxHlpLen = 3000;
Sel_Info.FieldOffset = 80;
Sel_Info.NumFields = 1;
Sel_Info.MsgQOffset = 84;
Sel_Info.MsgQSize = 1;
Sel_Info.Fields[0] = 201;

ret_info = (Qgy_Oljbl_ListInfo_t *)ListInfo;
Buf_Ptr = ( Qgy_Oljbl_RecVar_t *)Msg_Buf;
QGYOLJBL(Buf,
         sizeof(Buf),
         ListInfo,
         Ret_Recs,
         &Sel_Info,
         sizeof(Sel_Info),
         &Error_Code);
if(Error_Code.Bytes_Available > 0) {
   printf("Msg received LJBL %.7s\n",Error_Code.Exception_Id);
   return -1;
   }

for(i = 1;i <= ret_info->Total_Records; i++) {
   QGYGTLE(Msg_Buf,
           sizeof(Msg_Buf),
           ret_info->Request_Handle,
           &ListInfo,
           1,
           i,
           &Error_Code);
   if(Error_Code.Bytes_Available > 0) {
      printf("Msg received GTLE %.7s\n",Error_Code.Exception_Id);
      }
   else {
      printf("Message in JobLog %.7s\n",Buf_Ptr->Msg_ID);
      memcpy(LastKey,Buf_Ptr->Msg_Key,4);
      tmp_ptr = (char *)Buf_Ptr + Buf_Ptr->Offset_to_Fields_Retd;
      Field_Info = (Qgy_Oljbl_IDFieldInfo_t *)tmp_ptr;
      printf("Data Length = %d\n",Field_Info->Data_Length);
      }
   }
/* clean up the resources  */
QGYCLST(ret_info->Request_Handle,
        &Error_Code);
if(Error_Code.Bytes_Available > 0) {
   printf("Msg received CLST %.7s\n",Error_Code.Exception_Id);
   }
return 1;
}  

You will notice the API call is very similar to the QGYOLMSG API, but here the structure is not _Packed. When we defined the structure as _Packed the offsets got screwed up, which is different to the QGYOLMSG API, which required a _Packed structure! The pointers we have set up are enough to navigate the returned information and get at the data requested.

Hope it’s useful to someone!
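The pointer arithmetic used in the listing can be taken one step further to walk every field returned for a message. The sketch below is portable C written against a hypothetical, simplified header (`FieldHdr`); it is NOT the real Qgy_Oljbl_IDFieldInfo_t layout, and `find_field` is an illustrative name only. On IBM i you would use the IBM structure and its own offset fields in the same way.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified field header (an assumption for this sketch). */
typedef struct {
    int offset_next;   /* offset from the record start to the next header */
    int field_id;      /* identifier of this field                        */
    int data_length;   /* number of data bytes following the header       */
} FieldHdr;

/* Search a record for field_id and copy its data to out as a C string.
   Returns the data length, or -1 if the field is missing or the record
   looks damaged. */
int find_field(const char *rec, int rec_len, int first_offset,
               int num_fields, int field_id, char *out, int out_size)
{
    int off = first_offset;
    int hdr = (int)sizeof(FieldHdr);
    int i, n;

    assert(rec != NULL && out != NULL && out_size > 0);
    for (i = 0; i < num_fields; i++) {
        FieldHdr h;

        if (off < 0 || off > rec_len - hdr)
            return -1;                      /* header would overrun record */
        memcpy(&h, rec + off, sizeof h);    /* memcpy avoids alignment traps */
        if (h.field_id == field_id) {
            if (h.data_length < 0 || h.data_length > rec_len - hdr - off)
                return -1;                  /* data would overrun record */
            n = h.data_length < out_size - 1 ? h.data_length : out_size - 1;
            memcpy(out, rec + off + hdr, (size_t)n);
            out[n] = '\0';
            return h.data_length;
        }
        off = h.offset_next;                /* move to the next header */
    }
    return -1;
}
```

Copying each header out with memcpy rather than casting the buffer pointer keeps the sketch independent of how the data happens to be aligned in the receiver.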

Chris…

Mar 03

QGYOLMSG API C code sample

I was trying to help debug an RPG program where the developer wanted to use the QGYOLMSG API to extract messages sent to the QSYSOPR message queue when I became embroiled in my own problems with the API.

Here is the link to the RPG post. As you can see, the problems the RPG programmer encountered were all about the offsets to the passed parameters. I had a further problem because the structure being built had to be a _Packed structure. Once we had resolved this the program ran, but it returned a CPF240F message, which isn’t listed among the returned messages in the manual! I decided to post the code to the C/C++ forum to see if anyone could see what I was doing wrong. Here is the post.
Carsten Flensburg kindly ran the code through the debugger and pointed out that the key passed in the Selection Information was being corrupted! The API appeared to be taking the int value field and converting it to octal! This is not in the manual, but who is surprised any more? After changing the key passed in from ’0301′ to ’301′, as in the code below, the Error Code structure was sending back a strange message ID! So I decided to run the debugger myself to see what was going on. Carsten pointed me in the right direction: the address of the pointer passed and the address of the user space didn’t match! I had made the CLASSIC mistake of passing in the address of the pointer, not the address of the User Space object! This was throwing off the whole memory map within the API, and it didn’t complain!
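Both traps described above can be reproduced in portable C without the API. The sketch assumes the original code wrote the field ID as the numeric literal 0301; if so, the conversion is done by the C compiler, which treats any integer literal with a leading zero as octal, rather than by the API. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* A leading zero makes an integer literal octal in C:
   0301 is 3*64 + 0*8 + 1, not three hundred and one. */
int field_id_with_leading_zero(void) { return 0301; }
int field_id_plain(void)             { return 301; }

/* Returns 1 when the address handed over is the start of the buffer. */
int is_buffer_address(const void *passed, const void *buffer)
{
    return passed == buffer;
}

/* Shows the difference between passing space and passing &space. */
void check_traps(void)
{
    char storage[64];
    char *space = storage;   /* stands in for the pointer set by QUSPTRUS */

    assert(field_id_with_leading_zero() == 193);
    assert(field_id_plain() == 301);

    /* space holds the address of the buffer itself */
    assert(is_buffer_address(space, storage));
    /* &space is the address of the pointer variable, a different object */
    assert(!is_buffer_address((void *)&space, storage));
}
```

This is why, in the program below, QUSPTRUS is given `&space` (it has to fill in the pointer) while QGYOLMSG is given `space` (it has to write into the user space the pointer refers to).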

So in hindsight here are a couple of points to remember when coding the List API QGYOLMSG in C.

1. Make sure the selection Info is passed in a _Packed Structure.
2. Make sure you pass the address of the Buffer not its pointer!
3. The offsets are critical and the resulting messages are not always helpful.
4. Always delete the resource using the QGYCLST API to free up the List resource!
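Points 1 and 3 can be checked with offsetof before the API is ever called. The sketch below uses `#pragma pack(push, 1)` as a portable stand-in for ILE C’s _Packed and assumes a 4-byte int with 4-byte alignment. The struct names are hypothetical copies of the selection structure used in the program below. Under those assumptions the field ID array only lands at offset 58 when the structure is packed, because 58 is not a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>

/* Natural (unpacked) layout: the compiler may insert padding. */
struct SelInfNatural {
    char List_Direction[10];
    char Reserved[2];
    int  Severity_Criteria;
    int  Max_Msg_Length;
    int  Max_Help_Length;
    int  Sel_Criteria_Offset;
    int  Num_Sel_Criteria;
    int  Start_Msg_Keys_Offset;
    int  Retd_Fields_IDs_Offset;
    int  Num_Fields;
    char Sel_Cri[1][10];
    char Msg_Key[1][4];
    int  FieldID[2];
};

/* Same members with padding switched off. */
#pragma pack(push, 1)
struct SelInfPacked {
    char List_Direction[10];
    char Reserved[2];
    int  Severity_Criteria;
    int  Max_Msg_Length;
    int  Max_Help_Length;
    int  Sel_Criteria_Offset;
    int  Num_Sel_Criteria;
    int  Start_Msg_Keys_Offset;
    int  Retd_Fields_IDs_Offset;
    int  Num_Fields;
    char Sel_Cri[1][10];
    char Msg_Key[1][4];
    int  FieldID[2];
};
#pragma pack(pop)

size_t natural_fieldid_offset(void)
{
    return offsetof(struct SelInfNatural, FieldID);
}

size_t packed_fieldid_offset(void)
{
    return offsetof(struct SelInfPacked, FieldID);
}

/* Aborts if the packed layout does not match the offsets coded by hand. */
void check_packed_layout(void)
{
    assert(offsetof(struct SelInfPacked, Sel_Cri) == 44);
    assert(offsetof(struct SelInfPacked, Msg_Key) == 54);
    assert(offsetof(struct SelInfPacked, FieldID) == 58);
}
```

Setting the offset fields from offsetof instead of typing 44, 54 and 58 by hand would also keep them correct if a member is ever added or resized.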

Carsten did mention that the buffer passed in is not used for returning the messages if the QGYGTLE API is used, but my testing seems to show that the buffer is used to store the initial data at least. Perhaps the messages which can’t be stored there are held in another resource, which the QGYGTLE API uses once it has read through the initial store? I did not take the testing any further as it was only intended to be a quick trial of the code!

The program code below works. It provides limited information about the messages returned, which can be expanded to meet your needs! It’s not production code and needs more error checking to be added, so be warned!

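/* headers required by the calls used below */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <qusec.h>                /* Error code structure */
#include <quscrtus.h>             /* Create User Space */
#include <qusptrus.h>             /* Retrieve Pointer to User Space */
#include <qgyolmsg.h>             /* Open List of Messages */
#include <qgygtle.h>              /* Get List Entries */
#include <qgyclst.h>              /* Close List */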

typedef struct Msg_Ret_x {
        int Num;
        char Q_Inf[2][20];
        }Msg_Ret_t;

typedef char SelC[10];
typedef char MsgK[4];

typedef _Packed struct SelInf_x {
        char  List_Direction[10];
        char  Reserved[2];
        int   Severity_Criteria;
        int   Max_Msg_Length;
        int   Max_Help_Length;
        int   Sel_Criteria_Offset;
        int   Num_Sel_Criteria;
        int   Start_Msg_Keys_Offset;
        int   Retd_Fields_IDs_Offset;
        int   Num_Fields;
        char  Sel_Cri[1][10];
        char  Msg_Key[1][4];
        int   FieldID[2];
        }SelInf_t;

#define _16MB 16776704

int main(int argc, char **argv) {
int Ret_Recs = -1;
int Num_Ents = 0;
int i;
int Initial_Size = _16MB;
char Sort_Info = '0';
char ListInfo[80];
char QueueInfo[21] = "0QSYSOPR             ";
char Queue_List[44];
char SPC_Name[20] = "QGYOLMSG  QTEMP     ";
char Ext_Atr[10];
char Initial_Value = ' ';
char Auth[10] = "*CHANGE   ";
char SPC_Desc[50] = {' '};
char Replace[10] = "*YES      ";
char Msg_Buf[1024];
SelInf_t Sel_Info;
char *space;
Qgy_Olmsg_ListInfo_t *ret_info;
Qgy_Olmsg_RecVar_t *ret_msg;
Msg_Ret_t *Q_Info;
Qus_EC_t Error_Code = {0};

Error_Code.Bytes_Provided = sizeof(Error_Code);

Q_Info = (Msg_Ret_t *)Queue_List;
memset(Ext_Atr,' ',10);
memset(SPC_Desc,' ',50);        /* {' '} only blanks the first byte */

memcpy(Sel_Info.List_Direction,"*NEXT     ",10);
Sel_Info.Severity_Criteria = 0;
Sel_Info.Max_Msg_Length = 132;
Sel_Info.Max_Help_Length = 0;
Sel_Info.Sel_Criteria_Offset = 44;
Sel_Info.Num_Sel_Criteria = 1;
Sel_Info.Start_Msg_Keys_Offset = 54;
Sel_Info.Retd_Fields_IDs_Offset = 58;
Sel_Info.Num_Fields = 2;
memcpy(Sel_Info.Sel_Cri[0],"*ALL      ",10);
memset(Sel_Info.Msg_Key[0],0x00,4);
Sel_Info.FieldID[1] = 301;
Sel_Info.FieldID[0] = 1001;

QUSCRTUS(SPC_Name,
         Ext_Atr,
         Initial_Size,
         &Initial_Value,
         Auth,
         SPC_Desc,
         Replace,
         &Error_Code);
if(Error_Code.Bytes_Available > 0) {
   printf("Msg received CRTUS %.7s\n",Error_Code.Exception_Id);
   return -1;                   /* no user space, nothing more to do */
   }

QUSPTRUS(SPC_Name,              /* get pointer to USRSPC */
         &space,
         &Error_Code);
if(Error_Code.Bytes_Available > 0) {
   printf("Msg received PTRUS %.7s\n",Error_Code.Exception_Id);
   return -1;                   /* space pointer is not valid */
   }

QGYOLMSG(space,
         _16MB,
         ListInfo,
         Ret_Recs,
         &Sort_Info,
         &Sel_Info,
         sizeof(Sel_Info),
         QueueInfo,
         Queue_List,
         &Error_Code);
if(Error_Code.Bytes_Available > 0) {
   printf("Msg received LMSG %.7s\n",Error_Code.Exception_Id);
   return -1;
   }
ret_info = (Qgy_Olmsg_ListInfo_t *)ListInfo;
printf("Available %d Returned %d\n",ret_info->Total_Records,
       ret_info->Records_Retd);
Num_Ents = ret_info->Records_Retd;
/* Use QGYGTLE to return the messages to the program */
ret_msg = (Qgy_Olmsg_RecVar_t *)Msg_Buf;
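for(i = 1; i <= Num_Ents; i++) {
   QGYGTLE(Msg_Buf,
           sizeof(Msg_Buf),
           ret_info->Request_Handle,
           &ListInfo,
           1,
           i,
           &Error_Code);
   if(Error_Code.Bytes_Available > 0) {
      printf("Msg received GTLE %.7s\n",Error_Code.Exception_Id);
      }
   else {
      printf("Msg_ID GTLE %.7s\n",ret_msg->Msg_ID);
      }
   }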
/* clean up the resources  */
QGYCLST(ret_info->Request_Handle,
        &Error_Code);
if(Error_Code.Bytes_Available > 0) {
   printf("Msg received CLST %.7s\n",Error_Code.Exception_Id);
   }
return 1;
}   

Hope the program helps those who are trying to get the list APIs working.

Chris…