Mar 26

JobQGenie helps at Staples

We have been skirting around this news for some time, adding details in previous posts without mentioning the customer by name. Finally we can say it out loud! We had hoped the story would break many months ago, but we all know how slowly the publishing world works at times…

The story is about how JobQGenie fills an important gap in the HA solutions running at many IBM ‘i’ customers around the world. We have always said this is an important element of recovery that needs to be addressed, but it is not part of the HA sales pitch, where a successful planned role swap is used as the indicator of being able to switch. The truth is that when an unplanned role swap is required, or even a planned one under certain circumstances, all bets are off! Staples had seen this gap for many years and we stepped up to help them fill it. It was not as EASY as the Staples button would have you think: we spent many long hours looking at how to improve the collection process to cope with the high volumes they process. Eventually we had to start from scratch and approach the problem from a completely different angle, which proved to be a winner. The startling fact is that Staples purchased the product, after intensive testing, 3 years before we came up with the final solution, but by working closely with us they now have a top-class solution.

If you want to read the article, and I suggest you do if you are running an HA solution, here is a link to the online article.

Happy reading…

Chris…

Mar 25

Ops Console issues with new Power 6 box


We have been struggling with the Ops Console set up on the new Power 6 system we just purchased and thought it might be worth sharing the experience. We had set up a direct connect console, but as V7R1 will be dropping support for it we thought moving to a LAN Ops Console might be a worthwhile exercise.

The first system we moved (i515) went without a problem and we were able to connect the console straight away using OpsConsole running on Windows 7. That accomplished, we decided to move the recently purchased system to a LAN Ops Console as well. This is where we started to see some issues; in fact it took a lot of work to get it up and running successfully!

Just as a bit of background, we had the two LAN ports configured and the communications had been running without issue from the day we installed the system. One port was connected to our 1GB Ethernet switch and the other was connected to our 10/100 switch. The 10/100 switch is where we wanted to assign the console, as it gives us the ability to isolate the console connections.

To make sure the correct port would be used as the Ops Console port, we first had to move the line configurations around so that our 1GB Ethernet connection was on the 2nd port of the adapter. All that worked fine and everything came back up and was working. Next we removed the 10/100 LAN connection and its interface from the configuration before using SST to set up the adapter to run as the Ops Console port. Again following the instructions provided by IBM (even though they are very confusing, to say the least) we configured the port and issued a restart of the port through the OPSCONSOLE macro in SST. That's when everything went backwards! The OPSCONSOLE macro in SST reported a hardware error on the port.

The port would not light up at all; none of the lights on the port or the switch would illuminate, so we called IBM and logged a hardware problem. After a short call it was passed on to Software support, as it was felt the hardware was fine but the configuration was incorrect. For 2 days we conferred with IBM support and ran lots of tests and log dumps to see what was wrong. Eventually, after we had spent a whole day installing PTFs and re-IPLing the system numerous times, Software support said the card needed to be changed. We have a 3rd system which we had set up using the same process, and it had worked perfectly as well, which made us think that maybe this really was a hardware issue. The nagging thought we had was: why only one port? When hardware fails it is normally both ports which fail!

IBM was pretty quick at getting a new card out and the engineer installed it, but the problem remained. We kept on changing the cables to the switch and still nothing, then he decided to connect a crossover cable directly from his laptop to the port. While we could not attach to the system using the cable, it did have the effect of lighting the port up on both his laptop and the system. Moving the cable from his laptop back to the switch extinguished the lights again, and moving the cable to known good ports on the switch did not help. Then he connected a cable from the 1GB switch to the port and it lit up! Somehow the card would not connect to the 10/100 switch, yet his laptop (which had a 1GB port) and our 1GB switch both worked!

So, having determined that the problem was definitely not hardware, the engineer set off and I logged the problem back with IBM as a software error.

In the meantime I decided to try a few changes. First of all I removed the LAN Ops Console configuration completely and set up a normal LAN connection using the same port as used for the Ops Console. If I set the link speed to *AUTO the link would not activate, but when I configured the LAN connection as 100M and FULL duplex the connection worked, with the same cable as before and to the 10/100 switch. So my next test was to configure the Ops Console using the same parameters; it made sense that if I could configure the LAN connection that way, the Ops Console connection should configure in the same way. Unfortunately it failed to link and I had no lights on the system or the switch. I re-IPLed the system again in the hope that it would clear up any issues. Nothing! Oddly, while the system was powered off the link light would flash on the switch, but as soon as the IPL started it went out.

I knew configuring the port via the 1GB switch would work, so that is what I tried next. The Ops Console connected perfectly and I could now use a LAN Ops Console… That is only part of the fix. I then decided to reset the parameters on the Ops Console port to allow it to connect as a 100M link (AUTO link speed and FULL Duplex). Having changed the configuration I restarted the port using the OPSCONSOLE macro and it linked up again; I was now connected at 100M via the 1GB switch. Next I thought I would simply switch the cable from the 1GB switch to the 10/100 switch, and it worked! So I now have the exact set up I wanted. Yes, it is not really a solution, but at least I now have the Ops Console up and running to the Windows 7 PC via the 10/100 switch!

Now I need to get IBM to fix the LIC so that the *AUTO link speed works correctly with a 10/100 switch, and to explain why the roundabout process I went through gives the desired result while the direct route, even with the same parameters, appears to fail.

I hope others can make sense of what we did. Maybe IBM will fix the problem with a PTF, but in the meantime, if you are having similar problems, this post may help you get things working.

Chris…

Mar 10

PHP and DB2 with System *DTS columns


We have not posted much more on the C for ‘i’ or PHP for ‘i’ threads as we have been struggling with a problem within the files we use for our JobQGenie product. The problem is that the files store system date and time stamps (*DTS), which are 8 character fields. When we used the PHP functions to extract the data we would always end up with 8 characters of junk!

We had thought the PHP routines which deal with timestamps would be able to convert them, but unfortunately they only work with UNIX style time stamps! So we had to find a way to convert the values before they were received in the PHP script. We had been looking at User Defined Functions (UDFs) but never really understood what benefits they would bring for PHP. As usual we asked on the forums for suggestions on how best to manage these timestamps, and UDFs seemed to be the best solution. So we took the information from the manuals and wrote a UDF which simply converts the timestamp to a pre-formatted string. We already use the QWCCVTDT API in our UIM programs for displaying the dates, so we created a UDF which does the same thing.


#include <qusec.h>    /* Error Code Structs */
#include <stdio.h>    /* sprintf etc */
#include <string.h>   /* string functions */
#include <qwccvtdt.h> /* convert timestamp */

typedef _Packed struct EC_x {
    Qus_EC_t EC;
    char Exception_Data[1024];
} EC_t;

/* *YYMD output is 17 characters: YYYYMMDDHHMMSSmmm */
typedef struct DateTime_x {
    char Year[4];
    char Month[2];
    char Day[2];
    char Hour[2];
    char Minute[2];
    char Second[2];
    char Millisec[3];
} DateTime_t;

#define _ERR_REC sizeof(_Packed struct EC_x)

void UNSTAMP(char *timeStamp,
             char *cvtTimeStamp,
             short *inIndicator,
             short *outIndicator,
             char *sqlState,
             char *funcName,
             char *specName,
             char *msgText) {
    char Input_Fmt[10] = "*DTS      ";  /* input fmt, blank padded to 10 */
    char Output_Fmt[10] = "*YYMD     "; /* output fmt, blank padded to 10 */
    DateTime_t buf;
    EC_t Error_Code = {0};              /* err struct */

    Error_Code.EC.Bytes_Provided = _ERR_REC;

    QWCCVTDT(Input_Fmt,
             timeStamp,
             Output_Fmt,
             &buf,
             &Error_Code);
    if(Error_Code.EC.Bytes_Available > 0) {
        /* conversion failed: flag the error to SQL and stop here so */
        /* the uninitialised buffer is never formatted               */
        memset(cvtTimeStamp,'0',26);
        memcpy(sqlState,"38999",5);
        return;
    }
    sprintf(cvtTimeStamp,"%.4s/%.2s/%.2s %.2s:%.2s:%.2s",
            buf.Year,buf.Month,buf.Day,buf.Hour,buf.Minute,
            buf.Second);
    return;
}

This program will take the *DTS time stamp passed and convert it to a Character string via a predefined structure. We compiled this as a module and then as a service program with EXPORT *ALL.
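The formatting half of the UDF can be tried away from the IBM ‘i’ with plain C. The sketch below is not part of the service program: format_yymd is a hypothetical helper name, and it assumes the 17 character YYYYMMDDHHMMSSmmm layout that QWCCVTDT returns for *YYMD. It only shows how the fixed-width fields map onto the display string.

```c
#include <stdio.h>
#include <stddef.h>

/* Length of a *YYMD value: YYYYMMDDHHMMSSmmm, no terminator. */
#define YYMD_LEN 17

/* Hypothetical helper (not in the original UDF): formats a *YYMD
   value as "YYYY/MM/DD HH:MM:SS", dropping the milliseconds.
   Returns 0 on success, -1 if the input holds a non-digit or the
   output buffer is smaller than 20 bytes. */
int format_yymd(const char *in, char *out, size_t outSize) {
    size_t i;

    if (outSize < 20) {
        return -1;
    }
    for (i = 0; i < YYMD_LEN; i++) {
        if (in[i] < '0' || in[i] > '9') {
            return -1;
        }
    }
    /* Each %.Ns reads exactly N characters from its field offset. */
    snprintf(out, outSize, "%.4s/%.2s/%.2s %.2s:%.2s:%.2s",
             in, in + 4, in + 6, in + 8, in + 10, in + 12);
    return 0;
}
```

Rejecting non-digit input up front plays the same role as the error check in the UDF: a bad value is reported rather than formatted.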

Next we had to let SQL know we would be using it. The information on how to create the UDF can be found in the IBM Infocenter but here is our script for the function.


Drop Function UNSTAMP;

Create Function UNSTAMP(timestamp char(8))
returns char(26)
external name 'CHLIB/UNSTAMP(UNSTAMP)'
LANGUAGE C
PARAMETER STYLE SQL
NO SQL
DETERMINISTIC
DISALLOW PARALLEL;

To add the function to SQL we ran the following request
RUNSQLSTM SRCFILE(QSQLSRC) SRCMBR(UNSTAMP) COMMIT(*NONE) ERRLVL(20)

Running an interactive SQL session we were able to confirm that the code did in fact work and the returned string was formatted as we wanted. Next we had to add it to a PHP script, and after a few attempts we came up with a script that worked. Our biggest problem, for which we are still working on a solution, was how to limit the number of entries returned; our test file had approximately 25,000 records so the data build for the browser took some time! MySQL has LIMIT, which takes the start record and the number of records to fetch. DB2 requires the use of FETCH FIRST, which as far as we can tell only takes the number of rows and not a start position. A big thanks to Scott Klement who found a stupid error that was driving us nuts!

Here is the script we ran, you will notice we have restricted the number of records to fetch and display. Eventually we will use this with page limits to allow control over the returned data.

<?php
session_start();
// register the next record variable
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<style type="text/css">
tr.d0 td {
background-color: #CC9999;
}
tr.d1 td {
background-color: #9999CC;
}
</style>
<META content="text/html; charset=iso-8859-1" http-equiv=Content-Type></HEAD>
<BODY>
<?php
//include("i5toolkit/Toolkit_classes.php");
// include the file which holds the user info
include("../scripts/config.php");
// connect to the i5
$options = array("i5_naming"=>DB2_I5_NAMING_ON,"i5_lib"=>"CHLIB");
$conn = db2_connect("","","",$options);
if ($conn === FALSE) {
// no connection resource exists at this point, so ask for the last connect error
die ( "No Connection " .db2_conn_errormsg() );
}
// maximum page size
$size = 200;
// where to display records from
$start = 0;
if(isset($_SESSION['next']))
$next = (int)$_SESSION['next'];
else
$next = 0;
if(isset($_SESSION['previous']))
$previous = (int)$_SESSION['previous'];
else
$previous = 0;
$query = "select
jobname,
usrname,
jobid,
jobq,
jobqlib,
unstamp(ENTERTS) as TS,
unstamp(STARTTS) as STR,
unstamp(ENDEDTS) as END,
jobtype,
subtype,
endcde,
prcused,
jobstate from jq0
where JOBID > " .$next
." order by JOBID"
." FETCH FIRST " .$size ." ROWS ONLY";
$result = db2_exec($conn,$query);
if(!$result) {
die("db2_exec " .db2_stmt_errormsg());
}

?>
<a href="db2test1.php?id=<?php echo($start);?>">Next</a>
<table border="1">
<tr class="<?php echo("d1"); ?>">
<td><?php echo("Job ID"); ?></td> <!-- Jobid -->
<td><?php echo("Job Name");?></td> <!-- JobName -->
<td><?php echo("User Name");?></td> <!-- UserName -->
<td><?php echo("Job Queue");?></td> <!-- Job Queue -->
<td><?php echo("JobQ Library");?></td> <!-- JobQ Library -->
<td><?php echo("Entered Time");?></td> <!-- Entered TS -->
<td><?php echo("Started Time");?></td> <!-- Start TS -->
<td><?php echo("Ended Time");?></td> <!-- Ended TS -->
<td><?php echo("Job Type");?></td> <!-- Job Type -->
<td><?php echo("Sub Type");?></td> <!-- SubType -->
<td><?php echo("End Code");?></td> <!-- EndCode -->
<td><?php echo("Processor Used");?></td> <!-- Proc Used -->
<td><?php echo("Job State");?></td> <!-- JobState -->
</tr><?php
for($i = 0; $i < $size && ($rec = db2_fetch_both($result)); $i++) { ?>
<tr class="<?php echo("d" .($i & 1)); ?>">
<td><a href=""><?php echo($rec[2]); ?></a></td>
<td><?php echo($rec[0]);?></td>
<td><?php echo($rec[1]);?></td>
<td><?php echo($rec[3]);?></td>
<td><?php echo($rec[4]);?></td>
<td><?php echo($rec['TS']);?></td>
<td><?php echo($rec['STR']);?></td>
<td><?php echo($rec['END']);?></td>
<td><?php echo($rec[8]);?></td>
<td><?php echo($rec[9]);?></td>
<td><?php
if($rec[10] == 0) echo("Completed Normally");
else if ($rec[10] == 10) echo("Completed Normally During Controlled Ending");
else if ($rec[10] == 20) echo("Exceeded End Severity");
else if ($rec[10] == 30) echo("Ended Abnormally");
else if ($rec[10] == 40) echo("Ended before becoming Active");
else if ($rec[10] == 50) echo("Ended while Active");
else if ($rec[10] == 60) echo("Subsystem ended while job was Active");
else if ($rec[10] == 70) echo("System ended abnormally while job was Active");
else if ($rec[10] == 80) echo("Job ended (ENDJOBABN)");
else if ($rec[10] == 90) echo("Forced end after ENDJOBABN");
else if ($rec[10] == 999) echo("On Job Queue");
?></td>
<td><?php $prcused = $rec[11]/10000; echo(round((float)$prcused,2));?></td>
<td><?php
if($rec[12] == 0) echo("Ended");
else if($rec[12] == 1) echo("Failed");
else if($rec[12] == 2) echo("Job Queue");
else if($rec[12] == 3) echo("Unknown");?></td>
</tr><?php
}
db2_free_result($result);
db2_close($conn);
?>
</table>
</BODY>
</HTML>

This results in output similar to the following.

When you consider that the UDF is called 3 times for every row, yet all 25,000 rows still took only milliseconds to run locally on the IBM ‘i’ in an interactive SQL session, that's pretty impressive.

Chris…

Mar 09

Remote Journalling and TCP/IP


I had always believed the TCP stack is pretty fault tolerant, which would ensure data sent from one system always arrives at the target intact. It appears that is a myth in certain circumstances.

I recently had a note from Larry Youngren (Journal Guru) asking me to take a look at a journal IQ test of sorts which had been posted on the iSeries Network site. I posted the link in a previous blog entry, but here it is again for those who want to take a closer look: Journal IQ Test. A couple of the questions really intrigued me, especially Question 13. I knew about the new validation option in version 6.1 through my work with a customer but had no idea why it had been provided.

Here is the help text for the parameter.

Validity checking (VLDCHK) – Help

Specifies whether or not to use communications validity
checking. When communications validity checking is
enabled, the remote journal environment will provide
additional checking to verify that the data which is
received by the target system matches the data that was
sent from the source system. If the data does not match,
the data will not be written to the target system, the
remote journal environment will be inactivated, and
messages indicating the communications failure will be
issued to the journal message queue and QHST.

Note: This parameter is only valid when
JRNSTATE(*ACTIVE) is specified.

*SAME
The value does not change.

*DISABLED
Communications validity checking is disabled for this
remote journal environment.
*ENABLED
Communications validity checking is enabled for this
remote journal environment.

Note: Communications validity checking may impact
performance.

I had not really understood that data sent from one system to another over TCP/IP could be corrupted en route (bit flipping etc.) and still be delivered!

One thing I really like about Larry is his enthusiasm for the journal subject and his ability to make technical jargon readable, so I thought I would take a look at the articles he has provided on the iSeries Network site for further information. I came up with the following article: RJ Garble detection.

It makes for a good read, especially for those customers who are replicating their HA environment over the WAN or Internet. I would suggest any company that moves its data over a long distance (or over the internet, where traffic can travel round the country just to reach a site in the next town) looks at setting this parameter on for a few months until they are sure the comms links are stable enough. If you start to see problems in your replication environment and the RJ links start to show signs of failure, this could also protect you from other system issues. Turning it on and taking the additional 3% CPU may be more than worth it in the long run…
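One reason corrupted data can slip through, and there may well be others, is that the checksum TCP carries is only a 16-bit one's-complement sum. The sketch below is my own illustration rather than anything taken from IBM's implementation or Larry's article: inet_checksum is a hypothetical name for a routine written in the style of RFC 1071. Because addition does not care about order, two 16-bit words that swap places in transit produce exactly the same checksum, so the receiver has no way to notice.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement checksum in the style of RFC 1071.
   Hypothetical helper for illustration only. */
uint16_t inet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    size_t i;

    /* Add the data as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    }
    /* An odd trailing byte is padded with a zero low byte. */
    if (len & 1) {
        sum += (uint32_t)data[len - 1] << 8;
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }
    return (uint16_t)~sum;
}
```

Feeding it the bytes 12 34 AB CD and then AB CD 12 34 returns the same value both times, while changing a single bit does change the result. An end-to-end comparison of what was sent against what was received, which is what the VLDCHK help text describes, is a way to catch the cases a simple sum misses.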

Chris…