Jan 26

FTP Manager, first steps to Open Source project

We have a product which allows management of the FTP server and client on the IBM ‘i’, and we wondered if anyone would like to be involved in a project to review the source and help make a new product out of it. The goal is to take the product to open source, but I would like to see some improvements before we get that far! We have seen a number of downloads of the 5250 FTP client, which is a derivative of the FTP Manager client, and that made us wonder whether this would be a suitable Open Source project to start with. As the improved product will be published as open/free software, the existing product will be pulled from the site.

We are willing to share the source with those who would be willing to help clean it up and develop some new functionality. It is written entirely in C and uses UIM for the user interface. We would like to build the underlying functionality into a web based product where the interface can either be a 5250 or browser based session.

The people I am hoping to attract will need C, PHP, UIM and JavaScript skills. You will also need access to an IBM ‘i’ with the relevant compilers. The client side could possibly be written almost entirely in PHP, but the security management and internal logging will still require native programs.

The FTP Manager is listed on our website, where you can see the features we have built in already. The plan is to improve the product’s interfaces, add SSL support and develop the new browser based interface. Once finished we will provide downloadable source, and possibly compiled objects, for the community.

If you have the required skills, time and energy to get involved let us know. Once we have an understanding of the support we will get we can start the process of making the source available.

Chris…

Jan 25

phpMyAdmin returns 404 error, have to press refresh to get in!

I have been having problems with the phpMyAdmin installation on Gentoo Linux for some time now. When I tried to sign in, the page would not move on to the correct page; I had to press the refresh button on the browser to show the index.php page. Firefox would simply stay on the login page, while IE7 would give me a 404 error even though the correct URL was shown in the address bar.

I initially tried to update the installed version, which failed because the PHP installation had not been built with the correct ‘use’ flags. After updating PHP and running emerge --unmerge phpmyadmin, I still had the same problems. So I ran another emerge --unmerge phpmyadmin, went in and deleted all remaining objects in the phpmyadmin directory, and then ran emerge --newuse phpmyadmin again. This time I took the default config.sample.inc.php file and copied it to config.inc.php before restarting Apache.

Still the problem existed; I could not understand why the URL shown in the browser address bar would return a 404 error when the page was definitely in the directory and had the correct attributes. After some digging I found a forum post which mentioned a redirect issue with phpMyAdmin in certain installs, and it looked like it produced similar results.

So I took the advice of the poster and put $cfg['PmaAbsoluteUri'] = "http://" . $_SERVER['SERVER_NAME'] . dirname($_SERVER['SCRIPT_NAME']); into my config.inc.php file and it works!

I don’t know why the Gentoo installer didn’t add this to the config file. Normally with Gentoo the install takes care of all the basic configuration parameters, and the package itself installed OK; it just didn’t set up the config file or point me to any documentation which would suggest I needed to.

Hope this helps others who have the same issue!

Chris…

Jan 14

Audit process testing successful for RAP

The auditing functionality of RAP has now completed full testing. The test included polluting the target database with sporadic record level updates against a number of the files to show how the audit functions pick up the corruption between the systems. We also ran full object level Audits, which check the objects between systems, to ensure the code changes we made in the programs worked.

The data level tests consisted of an initial File/Mbr based check to ensure the functions return the correct CRC checksums for the source and target. Then we ran a program to create sporadic updates on the target before running the File/Mbr CRC checks again, to ensure the corruption is identified correctly. Once we knew this was working we ran a Record level Audit with the Repair option set to *NO; this showed us the RRNs of the records which were supposedly out of sync and printed them to an audit report. Then we ran the Record level Audit again with the repair option set to *YES. The report this generates shows the RRNs which were repaired, so we could match it against the initial report to ensure the same records had been identified and repaired. So far everything showed the process was working and the records were back in sync; all we had to do to verify this was run the File/Mbr level Audits again, which, as we expected, showed no errors between the systems.

We also ran a few more Record level Audits for other functionality checks, such as ending the Audits midstream to check the reporting messages and final report generation, all of which came back clean!

The Object Audits simply required changing a few attributes of objects between the systems to see if they were picked up by the process. This was eventually carried out on the target, because after we updated objects on the source and ran the Audits they came back clean! Unfortunately (or fortunately, depending on your perspective) the replication process had brought the objects back into sync, which returned an unexpected result. Other than that we had no more surprises with the tests, and everything we expected to be picked up was. After replicating the objects from the source to the target again, the Audits came back clean.

As this was the biggest change since the last unit tested code level, we feel pretty confident we can get the new release out before the end of the month again. The biggest challenge facing us now is the dreaded documentation!

If you would like to help us test the latest release before the GA release let us know.

Chris…

Jan 13

A stupid mistake which took hours to resolve

Now that the CrcBuilder technology has been added to the next version of RAP, the product has gone back into unit testing. Initially everything was going well, until we forced some data errors into the audit functions: we started to see deadlocks in the programs, where each was waiting for something from the other system.

After a lot of code reviews and hours of debug we finally found the culprits. Firstly, the target process runs through a switch loop to determine which part of the process it is in and which code segment to run. Somehow we had missed one of the stupid mistakes we had run into in the past: the atoi() function takes a character string and returns an integer based on its content, and we use the returned value to determine which key has been passed, the key being a 4 character value.
Here is a simplified version of what we were doing:


int a(int sock, char *ptr) {
char key[4];   /* no room for a null terminator */

do {
   memcpy(key, ptr, 4);
   switch(atoi(key)) {
      case 1:
           do_something();
           send(something_back);
           break;
      case 2:
          do_something_else();
          send(something_else_back);
          break;
      default:
          break;
      }
   recv(ptr);
   } while(atoi(key) != something);
}

This code worked fine until we changed the data passed from the source from pure character data to mixed data. Then we somehow started to slip past the expected case and, instead of hitting the send() functions, went straight through to the recv() function.

Here is the corrected code:


int a(int sock, char *ptr) {
char key[5];           /* larger buffer: one extra byte for the terminator */
memset(key, '\0', 5);  /* ensure the key is always null terminated */
do {
   memcpy(key, ptr, 4);
   switch(atoi(key)) {
      case 1:
           do_something();
           send(something_back);
           break;
      case 2:
          do_something_else();
          send(something_else_back);
          break;
      default:
          break;
      }
   recv(ptr);
   } while(atoi(key) != something);
}

Now it all works as it should. The problem is that atoi() works on null terminated strings; if you don’t terminate the string you can get the wrong return value sometimes and not others! It was a stupid mistake to make, especially as we had seen it before, but somehow we missed it this time. As to why changing the data passed affected this so dramatically when we had no null termination between the elements before either: with pure character data the bytes following the key were never digits, so atoi() stopped parsing at the end of the key anyway. Once the data became mixed, digit bytes could follow the key, and atoi() happily read past the 4 characters into them.

The next problem was easier to fix; well, we found it quicker anyhow!
We send data between the source and target in a similar way to the above, and we take the data passed and map it into a data structure before calling APIs with the content. We found that the target system program was passing garbage into the API, which of course resulted in the API failing, and error status messages were returned to the source system.

After some investigation we found the culprit: we have implemented Adler-32 CRC checking for the data, which resulted in changes to the structures we pass between systems.
Here are the original structures. (Yes, we only needed one, but somehow during development we didn’t do that; the module is called from many places, which created the mismatch.)

// source system 
typedef struct req_x {
         char key[4];
         char Obj_Name[10];
         char Obj_Lib[10];
         char Obj_Type[10];
         char CRC[16];
         } req_t;

int struct_size = 0;
req_t Request;
struct_size = sizeof(Request);

// target system
typedef _Packed struct resp_x {
         char key[4];
         char Obj_Name[10];
         char Obj_Lib[10];
         char Obj_Type[10];
         char CRC[16];
         } resp_t;

int struct_size = 0;
resp_t Response;
struct_size = sizeof(Response);

The functions worked because the _Packed and non _Packed structures were originally the same size (character arrays cannot be packed any further). However, once the structures held mixed types, they were not:

// source system 
typedef struct req_x {
         char key[4];
         char Obj_Name[10];
         char Obj_Lib[10];
         char Obj_Type[10];
         uLong CRC;
         } req_t;

int struct_size = 0;
req_t Request;
struct_size = sizeof(Request);

// target system
typedef _Packed struct resp_x {
         char key[4];
         char Obj_Name[10];
         char Obj_Lib[10];
         char Obj_Type[10];
         uLong CRC;
         } resp_t;

int struct_size = 0;
resp_t Response;
struct_size = sizeof(Response);

sizeof() now returned different values on each system, so when the data was moved into the structures from the recv buffer, the fields in the structure were offset every time due to the difference in structure size.

Now we have only one structure and it is _Packed…

Both of these problems are very stupid and should not have crept into the code, but with about 500,000 lines of code in RAP now, it’s sometimes easy to miss the obvious ones.

OK enough of the chatter, I have to get back to testing so we can get this release out to the customers!

Chris…

Jan 12

Better results using the new ADLER_32 CRC

The Adler-32 CRC functions are now fully integrated into RAP, and testing of the full product has started again. We had a few challenges along the way due to a stupid mistake, which we will post about later, but things are looking good for a release in the very near future.

As part of the test cycle we have to go through all of the module tests again, just to ensure we have not broken something else when coding up the new features. We had previously built some large files for the initial Audit testing and then the CrcBuilder testing, so we decided to see what improvements we had made using the new technology.

You will find details of the initial trials, which used the MD5 checksum for each record, in an earlier post; at the time we felt the results were pretty good! So we decided to run the same tests again to compare. This should be almost a like for like comparison, as the data and systems are the same now as they were then, except that we are going to use bigger files.

Here is an excerpt that shows the results we had then:

The first audit ran with 0 skipped records (basically I read every record, including deleted records). This resulted in CPU utilization of 167% on the source and 134% on the target. It read through 250,000 records and took from 18:19:37 to 18:24:24 to complete. The time difference between the systems is about 3 seconds, so it does not give much of a skew in the numbers. I don’t know why the CPU numbers are above 100%, but I was still able to get very quick response from the system and saw no degradation in system performance from a user perspective.

This time the audit again ran with 0 skipped records, and we saw CPU utilization of 45% on the source and 6% on the target! (Yes, we did press F10 and F5 to confirm.)
It read through 1.6 million records and took from 18:55:51 to 19:05:56, approximately ten minutes. That makes the process about 3 times faster per record while using far less system resource! (IBM, you should be looking at the QC3CALHA API; it’s terrible when called repetitively.)

Here is a screenshot of the results:

Display of audit results

The file and member level results are much more impressive with the new block mode when using the IBM CRCs, but at the record level the IBM CRCs just cannot touch Adler-32 for performance.

Chris…

Jan 11

IE 7 does not work like other Browsers!

We have been developing the new website for some time now, but had only ever looked at the results using the Firefox browser. This morning we decided to look at it using IE7, just to make sure the layouts were not ‘buggered up’ in other browsers. To our surprise the sign-in functions stopped working altogether and we were dumped back into a view of the directory!

After hours of reprogramming, pulling every piece of code apart and rebuilding it, we started to doubt the logic used. It’s the same logic we had used in other sites, so we pulled up the other sites and checked that the functions worked. They did, so we just couldn’t understand why it works in one site and not the other. The PHP logs had no errors, Apache reported no problems, and the system logs were devoid of any reason for the problem.

After hours of trawling through the web, we came across a couple of notes from other programmers saying they had seen similar problems associated with the use of session variables; but we could show the variables in the page, just not in the following pages. So we started to look closely at what we were doing and dumped all of the variables: no session variables were being stored in the following pages!

So we set about looking for similar problems on the web and, after a couple of red herrings, found that the problem was caused by the underscore ‘_’ we had put in the base URL (www.whatson_test.local). IE7 will not process any session variables if the URL has an underscore in it! Because we were running a test server we had appended ‘test’ with an underscore. (An underscore is not a legal hostname character, and IE silently refuses to store cookies for such hosts, which kills the cookie-based PHP session.)

So we renamed the site without the underscore (www.whatson-test.local), restarted everything, mapped the IP in the hosts files, and now it all works!

Why does MS have such a poor regard for the world we live in? If the other browsers all correctly support the URL forms why doesn’t IE???

If you come across the same problem, don’t spend the hours we did looking for what is really a stupid problem.

Chris…

Jan 10

CrcBuilder Technology integrated into RAP

What appeared to be a simple task of taking a product we had already developed and wrapping replication technology around it turned out to be more involved than we expected, but we feel the final result is worth it. The problem with the original file auditing routine was the time it took to process large files and the overhead associated with it. Now, with the ADLER32 CRC routine and a few file management changes, we have seen a marked improvement in both the speed of processing and the associated overhead.

The new file/member level data check is intended as a first pass; any problems found can then be subjected to the record level process, which has a built-in repair function. In most cases this should be more than enough to resolve any data corruption problems. Not that you should get any, unless you have errant programs running on the target system or users that accidentally cause corruption. Eventually the process could be changed to allow automated repair of a corrupt data segment, as the process works at the file level, but at this time the users remain in control of how the file data gets repaired.

We kept the basic principles of the CrcBuilder in terms of what data would be checked and how the CRC would be created (file or member level), but we have added generic support for the files and members to be checked, as well as library and library list support. This allows a user to request audits against files which begin with ‘file*’ and have members which begin with ‘member*’, etc. If we get requests to support this in the CrcBuilder product, it should be a fairly simple task to add! We have also kept all of the CRC support we had in the CrcBuilder product, but will use only the ADLER32 CRC function for the record level audits.

Here are a couple of screenshots of the RAP version of the product.

The new version of RAP will be available in the near future. We had hoped to announce availability by the end of this month, but with the new features we have been asked to provide, this date has slipped a bit! However, if you are interested in getting hold of the latest version before we make the final cut available, let us know and we will see if your environment would be suitable for our BETA program.

Now it’s time to get the testing underway and do some PHP development for the new What’s On website.

Chris…

Jan 08

CrcBuilder after the rush

A number of people have now downloaded the CrcBuilder program, and we wondered what kind of results people are seeing. The program was developed with another IBM ‘i’ user/customer who had some pretty large and complex files to check; their results showed a performance level markedly different from ours. They did in fact build their own version, which only used the Adler-32 CRC function but was written in RPG. Their program flew through the records much faster than our C program could using the Adler32 function, even with all of the tweaks and performance changes we tried! But the C program does provide more functionality than the RPG program, simply because of the capabilities of the C language on IBM ‘i’.

If you have run the program and would be willing to share your experiences, we would be very happy to see them. The program is not being enhanced at this time, as we don’t know what else to add. If you have ideas on what it could do better, or extra features it could offer, let us know and we will see if we can build in the changes.

While we don’t provide support for the product, if you have downloaded it and found problems, let us know and we will try to fix them.

Chris…

Jan 07

Working on new projects with PHP and new RAP features

While this is not being developed using IBM ‘i’, the code could be installed and run on the system! We have been asked to develop a new Event source website to replace the existing one we developed a long time ago for What’s On, which is run as a community service by Shield. The new site will be developed using CSS and PHP with a MySQL backend for the DB (sorry, DB2 would be possible but not necessary this time) and should be a major improvement over the current site!

We are still developing new features for the next release of RAP, even though the product went forward to unit testing, as we have had a number of requests from existing users which make sense to put in before we cut the final code for the release. One feature which seems to have attracted a lot of attention is the CrcBuilder technology we released as a free download, which allows a file’s data to be read and a CRC created for it which can be checked against a file on another system. Our implementation will run the required CrcBuilder functions on both systems and compare the results automatically, reporting any errors found and storing the results for user review.

While we are still busy with these projects, we have capacity for more, so if you need any HA or PHP related work undertaken, let us know; we will be more than happy to discuss our rates and capabilities with you.

Chris…