Oct 24

PowerHA and LVLT4i.

We have had a number of conversations about LVLT4i and what it offers to the Managed Service Provider (MSP). As part of those discussions the IBM solution PowerHA often comes up because it also uses iASP technology, but that is really where the similarity ends.

PowerHA uses the iASP to isolate the objects that are to be replicated to another system/storage device and it has an exact copy of the iASP from the source on the target. Changes are captured at the hardware level and are sent to the remote system as they occur.

LVLT4i only replicates objects to a remote iASP, using either Audit journal triggers or the Remote Journal technology to capture and send the data. The source object resides in *SYSBAS and the target object in an iASP; the iASP is used primarily to allow multiple copies of the same library/object combination to be stored on a single system. The remote iASP is always available to the user.

iASP is not widely implemented at customer sites. This is in part due to the lack of iASP support built into many of the applications that run on the IBM i today (many of the applications were built before iASP technology was available). For a customer to migrate an application to allow iASP use there are a number of constraints which have to be considered, plus each user’s environment has to be adjusted to allow the iASP content to be used (SETASPGRP etc). This has further limited the use of iASP as many do not feel the benefits of moving to the iASP model outweigh the cost of migration. Another issue is that you are now adding an additional storage management requirement: the iASP is disk based, which will require protection to be added in some form. With LVLT4i you can leave your system unchanged; only the target system is going to need iASP setup and that will be in the hands of your Managed Service Provider. The decision about what to replicate is yours, and with some professional help from a Managed Service Provider who knows your application it should be pretty bulletproof when it comes to recovery.

If you implement PowerHA you are probably going to need to set up an Admin Domain, which is where any *SYSBAS objects such as system values, profiles and configuration objects are managed. In LVLT4i we do not manage system values or configuration objects (configuration objects can be troublesome, especially with TCP/IP). We have however just built in a new profile and password process to allow the security aspects of an application to be managed across systems in real time. Simple scripts can capture configuration and system value settings, many of which are not important to your application, so LVLT4i has you covered. If we find a need to build in system value or configuration management we will do so fairly rapidly.

PowerHA is priced by core, so you license it for each active core on each system. Using CBU licensing, PowerHA can run with fewer active cores on the target and only activate the rest when the system is required. Unfortunately in an HA environment you are probably switching regularly so you will have the same number of active cores all the time. LVLT4i is priced by IBM tier regardless of the number of active cores. The target system license is included with the source system license regardless of the target system tier, so a Managed Service Provider who has a P30 to support many P05 clients is not penalized.

PowerHA also comes in a few flavors which are decided on by the type of set up you require. Some of the functionality such as Asynchronous mirroring is only available in the Enterprise edition, so if you need to ensure your application is not constrained by remote confirmation processing (waiting for the remote system to confirm it has the data) you are going to need the Enterprise edition, which costs more per core. LVLT4i comes in one flavor and is based on a rental model; the transport of data over Synchronous/Asynchronous remote journals is available to all, plus it supports any geographic model.

Because the iASP is always available, it is possible to back up at any time with LVLT4i. With PowerHA you have to use a FlashCopy to make another disk based copy of the iASP which can then be used for the backup to tape etc. That requires a duplicate set of disks to match the iASP content. With LVLT4i you can use Save While Active or suspend the apply process for point in time saves. The remote journal will still be receiving your application updates, which can be applied once the save has completed, so your data is not left exposed.

RPO is an important number which is regularly bandied around by the High Availability providers. PowerHA states it is 0 because everything is replicated at the hardware level. We believe LVLT4i is pretty close to the same but there are a couple of things to consider.

First of all, an RPO of 0 requires synchronous delivery of changes; if you use an asynchronous delivery method, queued changes will affect that for either solution. LVLT4i uses Remote journalling for data changes, so if you use Synchronous mode I feel the two are similar in effect.
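
To put a rough number on the effect of queued changes, the sketch below estimates how many changes would still be waiting on the source if it failed after a period of sustained load. It is only an illustration and does not describe how either product measures its backlog: the function name and the rates are hypothetical, and it ignores bursts, latency and compression.

```python
def backlog_at_failure(change_rate, link_rate, seconds, synchronous=False):
    """Estimate changes still queued on the source when it fails.

    change_rate and link_rate are changes per second (hypothetical figures);
    seconds is how long that load has been sustained.
    """
    if synchronous:
        # Synchronous delivery waits for the target to confirm each change,
        # so nothing is left queued (the application slows down instead).
        return 0
    # Asynchronous delivery: the queue grows whenever changes arrive
    # faster than the link can drain them.
    return max(0, (change_rate - link_rate) * seconds)


print(backlog_at_failure(500, 400, 60))        # asynchronous, link too slow
print(backlog_at_failure(500, 400, 60, True))  # synchronous
```

With a link that keeps up with the change rate the asynchronous figure also drops to zero, which is why bandwidth matters as much as the delivery mode.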

Because we use a different process for object changes, any object updates are going to be dependent on the level of change activity being processed by the object replication processes. The amount of data being replicated is also a factor as a single stream of object changes is used to transfer the updates. We have done a lot of work on minimizing the data which has to be sent over the wire, such as using commands instead of save/restore, pipelining changes so multiple updates to an object are optimized into a single action, and compression within the save process. This has greatly reduced the activity and therefore bandwidth requirements.
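
The idea of optimizing multiple updates to an object into a single action can be sketched as follows. This is a generic illustration of coalescing a change stream, not the LVLT4i implementation; the function name, the object names and the action labels are all made up for the example.

```python
from collections import OrderedDict


def coalesce(changes):
    """Collapse a stream of (object_name, action) changes so that each
    object is sent once, carrying only its most recent action."""
    pending = OrderedDict()
    for name, action in changes:
        # A later change to the same object replaces the earlier one;
        # re-inserting moves it to the end so order follows latest activity.
        pending.pop(name, None)
        pending[name] = action
    return list(pending.items())


# Hypothetical stream: ORDERS is changed three times before it is sent.
changes = [
    ("APPLIB/ORDERS", "update"),
    ("APPLIB/CUSTOMERS", "update"),
    ("APPLIB/ORDERS", "update"),
    ("APPLIB/ORDERS", "update"),
]
print(coalesce(changes))
```

Four queued changes become two transfers, which is where the bandwidth saving comes from when an object is updated repeatedly in a short window.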

PowerHA is probably better at object replication because of the technology IBM can access, plus it is going to be carried out in line with the data changes. The same constraints about using synchronous mode affect the object replication process so bandwidth is going to be a major factor in the speed of replication etc. Having said that, most of the smaller clients we have implemented any kind of availability for (HA4i/DR4i) do not see significant object activity and little to no backlogs in the object replication process.

The next recovery figure RTO talks about how long it will take from making the decision to switch, to actually switching. My initial findings about iASP tended to show a fairly long role-swap time because you had to vary off the iASP and then on again to make it available. We have never purchased PowerHA so our tests are based around how long it took to vary off and then on again a single iASP on our P05 system (approximately 20 minutes). I would suspect the newer and faster systems have reduced the time it takes but it is still a fairly long time. LVLT4i is not a contender in this role because we expect the role-swap times to be pretty extended (4 – 12 hours) even if you do a lot of automation and preparation.

One of the issues which affect all High Availability Solutions is the management of batch, if you have a batch process running at the time of failure it could affect the integrity of the application data on the target system. LVLT4i and PowerHA both have this limitation as the capture of job queue content is not possible even in an iASP, but we have a solution which when integrated with LVLT4i will allow you to reload job queues and identify orphaned data which has been applied by a batch process. Our JQG4i product captures all activity for specific job queues and will track each job from load to completion. This will allow you to recover the entire application environment to a known start point and thereby ensure your data integrity is maintained. Just being able to automatically reload jobs that did not run before the system failure is a big advantage that many current users benefit from.
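
As a rough picture of what tracking each job from load to completion makes possible, the sketch below sorts jobs by the last state recorded for them before a failure. It is not how JQG4i works internally; the state names and job identifiers are hypothetical.

```python
def classify_jobs(events):
    """Split jobs by the last state captured before the failure.

    events is an iterable of (job_id, state) pairs in capture order, where
    state is 'loaded', 'started' or 'completed' (hypothetical labels).
    Returns (jobs_to_reload, jobs_in_flight).
    """
    last_state = {}
    for job_id, state in events:
        last_state[job_id] = state
    # Loaded but never started: safe to put back on the job queue.
    to_reload = sorted(j for j, s in last_state.items() if s == "loaded")
    # Started but never completed: may have left partial (orphaned) data.
    in_flight = sorted(j for j, s in last_state.items() if s == "started")
    return to_reload, in_flight


events = [
    ("J1", "loaded"), ("J1", "started"), ("J1", "completed"),
    ("J2", "loaded"), ("J2", "started"),
    ("J3", "loaded"),
]
print(classify_jobs(events))
```

Jobs in the second list are the ones whose updates need checking on the target before the application is restarted.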

There are plenty of options out there to choose from but each has its own strengths and weaknesses. LVLT4i uses the same replication technology as our HA4i and DR4i products with enhancements to allow the use of iASP as the target disk. It is not designed to meet the same RTO expectations as PowerHA even though both make effective use of iASP technology. However, PowerHA is not necessarily the best option for everyone because it does have a number of dependencies that make it more difficult/costly to implement than a logical replication solution. You have to weigh up the pros and cons of each technology and make a decision about what is important.

If you are interested in knowing more or would like to see a demo of the LVLT4i product please let us know and we will be happy to schedule.

Chris…

Oct 23

SAVSECDTA timing?

We are looking at how to manage the recovery of profiles and passwords in an environment where the profiles cannot be managed constantly. When using our HA4i product we have the ability to constantly maintain the user profiles and passwords because the user profiles are allowed to exist on the target system. However, in an environment such as that required for the LVLT4i product, user profiles cannot exist on the target because they may conflict with profiles from other clients (all user profiles have to exist in *SYSBAS).

The process we have tested involves using the SAVSECDTA command to save the data to a save file, this save file can be automatically replicated to the iASP on the target system. The Profile information is captured in a file which is also replicated to the target iASP using normal replication processes (Remote Journals). When the system needs to be rebuilt for recovery the information collected in the SAVSECDTA file will be restored, the profiles will be updated using the profile data we have collected and then the RSTAUT command will be run. This will bring the system and profiles up to the latest content available.

While we were testing the processes we noticed a very strange thing. The first time we ran the request on a system it took a little while to complete (about 1 minute), but when we ran the request again it took only a couple of seconds. The content of the save file was the same (we even set the compression level to high with no significant impact), so why does it take so long the first time? We thought that maybe it was because the save file was already available (we put it in QTEMP), but signing off and on and then retrying gave us the same results: it now only took a few seconds to complete the save. Signing onto another system and doing the exact same process yielded the same results; the first time took about 1 minute while subsequent tries only took a few seconds.

We do not know what is going on under the covers but it certainly seems like something gets lined up after the first save, this leads us to believe that doing a SAVSECDTA on a regular basis (nightly?) may not be a bad thing. If you have any information as to why, let us know as we are very curious.

LVLT4i is new and while we feel the product should attract a number of Managed Service Providers we are interested in knowing what you think. Would you be interested in a solution that provides a very low RPO (close to zero data loss) with an RTO in the 4 – 12 hour time frame? If you are interested let us know, we will be happy to put you in touch with one of the MSPs we have been working with. If you are an MSP and would like to know more or even see a demo of the product let us know as well, we are excited by the opportunities this could bring.

Chris…

Oct 20

New Product Library Vault, Why?

We have just announced the availability of a new product, Library Vault for IBM i (LVLT4i), which is aimed primarily at Managed Service Providers. The product allows the replication of data and objects from *SYSBAS on a client’s system to an iASP on a target system.

The product evolved after a number of discussions with Managed Service Providers who were looking for something less than a full blown High Availability Product but more than a simple Disaster Recovery solution. It had to be flexible enough to be licensed by the replication content not the systems being used to run it on.

We looked at our existing products and how the licensing worked. It became very apparent that neither would fit the role as they were both licensed at the system level, plus HA4i was more than they needed because it had all the bells and whistles associated with a High Availability product while DR4i just didn’t have the object capabilities required. So we had to look at what we could do to build something that sits in the middle and license it in such a manner that would allow the price to be fair for all parties.

Originally the product was going to be used in a LPAR to LPAR scenario because the plan was to use the HA4i product with some removed functionality. However, one of the MSPs decided that managing lots of LPARs, even if they are hosted as VMs under an IBM i host, would entail too much management and effort. The RTO was not going to be the main driver here, only the RPO, so keeping the overhead of managing the solution low would be a deciding factor. We looked at how to implement the existing redirection process used for mapping libraries that HA4i and DR4i use, but it soon became very apparent to us that this would not be ideal as each transaction being processed would require a lot of effort to set the target object. So we decided to look at how we could take the iASP technology we had built many years ago for our RAP product and structure it in such a manner that it would meet all of the requirements.

After some discussion and trials we eventually had a working solution that would deliver an effective iASP based replication process. Next we needed to set the licensing to allow flexibility in how it could be deployed. The original concept would be to set the licensing at the library level as most clients would be basing their recovery on a number of libraries so adding the ability to manage the number of licenses against the number of libraries was started. What at first seemed to be a simple task soon threw up more questions than answers! The number of libraries even with a range was not going to be a fair practice for setting our price, some libraries would be larger than others and have more activity which would generate more activity for the replication process. Also the IFS would be totally outside of the licensing as it has no correlation with a library based object (nesting of directories) so it would need to be managed separately. We also recognized that the Data Apply was based solely on the Journal so library based licensing would not work for it either.

The key to getting this to work would be flexibility. We needed to understand this from the MSP’s position: the effort required to manage the set up and licensing had to be simple enough for the sales person to be able to go in and know what price he should set. So we eventually came back to the IBM tier based pricing, even though we have the ability to license all the way back to the object, CPU, LPAR, Journal etc. We needed to give the MSP flexibility to sell the solution at an affordable price without complex license charts. We also understand that an MSP would grow the business and probably have additional resources available for new clients in advance, so we decided that the price had to be based on the client’s system and not on the pair of systems being used.

LVLT4i is just getting started; its future will be defined by the MSP community who use it because they will drive the development of new features. We have always felt that Availability is best handled by professionals because Availability is not a one off project, it has to evolve as the client’s requirements evolve and develop. Our products hopefully give clients the ability to move through a natural progression from DR to HA. Just because you don’t need High Availability today doesn’t mean you won’t later, and we have yet to find anyone who doesn’t need to protect their data. Having that data protected to the nearest transaction at an affordable cost is something we want to provide.

If you feel LVLT4i is right for you let us know, we will be happy to put you in touch with one of the partners we are working with to discuss your needs. If you would like to discuss other opportunities for the product such as data aggregation or centralized storage let us know, we are always happy to see if the technology we have fits other interests.

Chris…

Jul 04

Who would have thought! I am starting to use RPG!

I have always said that I did not need to learn or use ‘RPG’ on the IBM i as I always found that ‘C’ could do all that I needed. Recently I was asked by a friend to help them with some RPG code to handle Java and the clean up of the objects it created (Java would not automatically clean up objects because they were effectively created by the RPG program and this program ran constantly, so temp storage just kept growing until it blew up). Not knowing RPG or understanding how the layout worked (I jumped straight into ‘/Free’!) I found this very difficult as ‘/Free’ is not really free format (there are still some column constraints) while ‘C’ really is free format. Still, after some research and a lot of head scratching I finally got some sample code working. We then built a Service program that could handle the Java clean up and built code into the existing RPG programs to call it. The solution works: the client’s systems are no longer blowing up with memory issues caused by Java objects not being cleaned up.

I thought OK, that’s the last time I will have to do that, and was happy that I could get back to good old ‘C’ programming. Unfortunately, I came across another issue which required me to pick up the RPG manuals and code up a test application.

We have a client who was experiencing problems with an application that uses commitment control and constraints, which required us to build a test that would emulate the problem on our systems. As usual the first thing I did was to write a ‘C’ based solution. I did find a Commitment control test which was written by Paul Tuohy here. This was all written using RPG so I thought I would just follow the program logic and write a ‘C’ version, which seemed the easiest option. While I could get the simple file update logic built and the program would work without Commitment Control, I found that as soon as Commitment Control was started the program would freeze on receipt of data from STDIN (I will have to ask IBM why when I have time). So I decided my best option was to take the code that Paul had provided and build my own interpretation of the program with some additional features I needed.

I wanted the program to accept multiple entries plus allow deletes by key before the commit of the data, so I had to make a few changes to the logic and add a new delete option. While the program is very clunky it does achieve what I needed it to do and I found out a lot about commitment control and constraints as a result. I am also unsure if the program is as efficient as it could be, but it works and for now that is all that’s needed.

Here is the code I ended up using.


H Option(*SrcStmt : *NoDebugIO) DftActGrp(*No) ActGrp('COMMITDEMO')
FHeader1 UF A E K Disk Commit(SomeTimes)
FDetails1 UF A E K Disk Commit(SomeTimes)

D Commit1 PR ExtPgm('COMMITRPG')
D SomeTimes n

D Commit1 PI
D SomeTimes n

D ToDo S 1a
D ToDel S 1a
D ToAdd S 1a
/Free
   // Add a header and three detail rows for each key entered
   ToAdd = *Blanks;
   Dow (ToAdd <> 'n');
      Dsply 'Enter a Key Value (2 long): ' ' ' Key;
      If (Key <> *Blanks);
         Text = 'Header for ' + Key;
         Write Header;
         For Sequence = 1 to 3;
            Text = 'Detail for ' + Key + ' ' + %Char(Sequence);
            Write Details;
            // Re-read the header and bump its detail row count
            Chain Key Header1;
            NumRows += 1;
            Update Header;
         EndFor;
         NumRows = 0;
      EndIf;
      Key = *Blanks;
      Dsply 'Enter more Keys y/n : ' ' ' ToAdd;
   EndDo;

   // Optionally delete headers by key before the commit decision;
   // the cascading delete constraint removes the matching details
   ToDel = *Blanks;
   Dsply 'Do you want to delete entries : ' ' ' ToDel;
   Dow (ToDel <> 'n');
      Key = *Blanks;
      Dsply 'Enter the Key to Delete : ' ' ' Key;
      If (Key <> *Blanks);
         Chain Key Header1;
         If %Found;
            Delete Header1;
         EndIf;
      EndIf;
      ToDel = *Blanks;
      Dsply 'Do you want to delete more entries : ' ' ' ToDel;
   EndDo;

   // Ask what to do with the pending changes
   ToDo = *Blanks;
   Dow (ToDo <> 'c' and ToDo <> 'r' and ToDo <> 'i');
      Dsply 'c - Commit, r - Rollback, i - Ignore ' ' ' ToDo;
   EndDo;

   // Only commit or roll back when the files were opened under commit
   If SomeTimes and (ToDo = 'c');
      Commit;
   ElseIf SomeTimes and (ToDo = 'r');
      RolBk;
   EndIf;

   *InLR = *On;
/End-Free

Note: the blog does not preserve the column positions of the fixed-format H, F and D specifications, so those lines are not shown as they were copied in and will need realigning before the source will compile.

The database was exactly the same as the one Paul had defined, including the cascading delete for the details file (I liked that bit), so when we delete the Header Record the matching records in the Details file are also deleted. That saved us having to chain (see, I can speak RPG) the details file and remove the entries. Now we can see the problem the client was experiencing and know how to resolve it.

As usual Google was our best friend, thanks to Paul Tuohy and ITJungle for providing the sample code we based the test application on. I am now a little less resistant to RPG and may delve a little more into its capabilities and how I can use it effectively, who knows I may even become good at it?? The point I am trying to make here is that while I still do not want to use RPG, I did what I keep telling others to do, I used the best tool for the job. Using any language just because it is all you know is not always the best option, sometimes you have to jump outside of your comfort zone and try something new.

Chris…

Jun 24

New RDX Drive not supported by the BACKUP menu commands on our V7R1 Power 720

When we read that IBM was recommending users who are currently using the DAT160GB tape system move to the RDX drive system for backup purposes, we decided we would give it a try. First of all we checked with a number of people that the RDX drives were a suitable backup device and that they would be supported on our small 720 system (IBM seemed to be positioning it at our size of company and hardware) before we placed the order. We placed the order over a week ago and the drive finally arrived today.

After finding out that our planned move to VIOS based partitioning was flawed and not possible with internal disk, we had high hopes that the support for RDX technology would be a big step forward from our current backup technology (tape is painfully slow and error prone), especially as we have multiple partitions. We made the decision to purchase the enclosure and 2x320GB drives at the same time we purchased the hardware/software to support the move to VIOS partitions. We understood the Ethernet card and PowerVM licenses were now surplus to requirements, but we still hoped that the investment in the new drive technology would be worthwhile.

All of the hardware and software turned up today so we unpacked the drives and enclosure and attached it to the IBM i via the front USB port. It was powered on and the drive inserted which showed all green lights on the front of the enclosure.

First problem we came across was the drive would not show up in the hardware configs, when we previously migrated back to i-hosting-i we did not allocate the USB adapter in the hardware profiles, so a quick configuration update was made and the drive finally showed up in the available resources on the partition. Next we decided to test moving the drive between partitions (systems) using the DLPAR options, it all seemed to work fine as long as we ensured the device was varied off prior to the DLPAR move request. It did take some time for the adapter to move and even longer for the device to show up in the partitions.

Once we were happy with the ability to move between partitions we formatted the drive in anticipation of using it for our daily/weekly backups. The format was very quick and showed the correct 320GB of available space on the drive. When we then tried to add it to the backup schedule in place of our tape device we came across the biggest problem: the IBM i BACKUP options provided with the OS only support tape drives!! So we are now faced with having to develop our own backup processes to allow us to use the drive for our backups.

We are still unsure how the backup will be stored on the drive, IBM has indicated that it should function in the same manner as a tape meaning that it will allow multiple saves to be carried out to the same device and write to the end of the last save. We now have to build the save processes and set them up to replace the OS based solution that we have today. Once we get the save processes developed we will report back just how good the drives are and how easy they will be to use for our simple backup requirements. Have to stick with Tape for now unless IBM adds support for the RMS devices in the OS BACKUP solution in the near future. Yet again our IBM connections really didn’t know the capabilities of the RDX drives in an IBM i environment, maybe we can come up with some answers…

Chris…

Jun 23

Annoying CPF9E7F message fixed

After the attempted migration from i-hosting-i to a VIOS based partition configuration and subsequent rebuild of the i-hosting-i partitions, we found that the QSYSOPR message queue was being sent CPF9E7F messages constantly. We checked the HMC configurations and everything looked OK because we had configured 4 partitions with a total of 2 processors out of the 4 we have available. We had upgraded the system to have 4 available processors ready for the VIOS configurations where we intended to use 2 for IBM i, 1 for AIX and 1 for Linux.

We asked our sales rep what the problem was, especially as we have a license for the additional AIX core which we wanted to implement as well. His response was to speak with support as it looked like we were exceeding our licenses. Eventually we raised a PMR and spoke with IBM, who informed us that while we were not technically exceeding our entitlement, the way the IBM i OS calculated the available CPU cores meant it saw a problem. The answer was pretty simple to implement: we had to set up Shared Processor Pools and allocate a maximum number of available cores to that pool. We then had to make each partition use that pool so that we could not exceed our entitlement. This was done using the Shared Processor Pool Management option in the HMC where we created the new pool and set the partitions to use that pool. That fixed the immediate problem, but the partition profiles also needed updating and the partitions needed to be re-booted for the changes to take permanent effect.
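
One way to picture why the pool stopped the messages is sketched below. This is a guess at the shape of the calculation, not IBM’s actual licensing logic: it assumes the OS counts the cores the partitions could potentially reach, and that a pool maximum caps that figure. The function name and the per-partition values are hypothetical.

```python
def potential_cores(virtual_processors, physical_cores, pool_max=None):
    """Cores a set of partitions could reach under the assumed model.

    virtual_processors: one (hypothetical) value per partition.
    pool_max: maximum cores assigned to the shared pool, or None when the
    partitions are not restricted by a pool maximum.
    """
    demand = sum(virtual_processors)
    # Without a pool maximum only the physical core count limits usage.
    ceiling = physical_cores if pool_max is None else min(pool_max, physical_cores)
    return min(demand, ceiling)


ENTITLED = 2                # cores licensed for IBM i in our case
partitions = [1, 1, 1, 1]   # hypothetical values for the 4 partitions

print(potential_cores(partitions, 4) > ENTITLED)              # no pool cap
print(potential_cores(partitions, 4, pool_max=2) > ENTITLED)  # pool capped at 2
```

Under this model the uncapped configuration can reach all 4 physical cores even though only 2 are configured, while the capped pool cannot go past the entitlement.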

When we created the IBM i shared pool we also took the opportunity to create an AIX pool and a Linux pool so that when we add those partitions to the system we can correctly allocate the additional processors to them.

We no longer see the CPF9E7F messages and everything runs just the same as it always did. We continue to learn just how capable the IBM i Power system can be, the downside to that is just how complex it can be as well. We hope to set up the AIX partition and Linux partitions in the near future, we will post our experiences as we go along.

Chris…

Jun 12

Issue with ‘restore 21’ resolved, everything running

The problems with the restore 21 of the partition data have been resolved and all of the partitions are now up and running.

The problem which gave us the most grief was the update to the content of the partition which was running V7R2. For some reason the restore operation kept hanging at different spots in the restore 21 process. One of the problems seemed to be with damaged objects on the system which caused the restore to hang and required a forced power off of the partition (SYSREQ 2 did nothing). We cleaned up the damaged objects and started the restore again, only for it to hang again while restoring the IFS, although this time we could end the restore operation with SYSREQ 2 and get back to a command line. There was nothing in the joblog to show why the restore was hanging so we eventually ran the command to restore the IFS manually. We then started the partition and everything looked OK, but when we tried to start the HTTP server (we like the mobile support so we needed it running) it kept ending abnormally. It turns out we forgot to run the RSTAUT command, which restore 21 does after the RST for the IFS completes. After we ran the RSTAUT the jobs all started up correctly and we had the partition up and running again.

The other problem we had was with a V6R1 partition, which refused to start, complaining about a lack of resource (B2008105 LP=00004). As this was a deployment of a running configuration we thought nothing had changed and wondered why it would no longer start up. In the back of our minds we had a vague recollection that setting up partitions for V6R1 on Power7+ systems required the RestrictedIO partition flag to be set, so we looked through the partition profile to find where it was set, without success. We discovered that it is not part of the profile; you have to set the flag in the properties for the partition. Once we had done this the partition came up without any further problems and we now had all of our original configuration up and running.

We made a couple of additional changes to the configs because one of the reasons we really liked the VIOS option was being able to start everything up at once. With our set up we were powering up the host partition and then powering up each of the clients manually. We wanted to be able to power on the system and have all of the partitions fire up automatically. Also, when we wanted to power down we just wanted to power down the host partition and have it take care of all the hosted partitions. The answer is the Power Controlling settings. We set up each of the NWSD objects in the hosting server to be Power Control *YES, then updated the profiles for the hosted partitions to be Power Controlled by the hosting partition. After initializing the profiles with the NWSD object varied off and shutting down the profiles we then varied on the NWSD objects and the partitions automatically started up. Now when we start the main partition the other partitions all start once the NWSD is activated (they are all set to vary on at IPL). We also set the hosting partition to power on when the server was powered on and the server to power off when all of the partitions were ended. We have not tested the power down sequence to make sure the guest partitions are ended normally when we PWRDWNSYS *IMMED on the hosting partition, but it should shut down each partition gracefully before shutting itself down.

Now it’s back to HA4i development and testing for the new release, manuals to write and a new PHP interface to design and code. Even though we like the Web Access for i interface it is not as comprehensive as the PHP interface in terms of being able to configure and manage the product.

If you are planning a move to partitioning your Power system we hope the documenting of our experiences is helpful.

Chris…

Jun 11

Rebuild of the i-hosting-i underway.

We have finally started the rebuild of the data for the i-hosting-i partitions and came across a few problems.

First problem was to do with the system plan. Before we started down the VIOS route we created a system plan from the existing partition and system information and checked it to make sure we had no errors logged. Nothing was shown as a problem so our plan was to use it to deploy again if we could not get the VIOS set up functioning. As it turns out we could not use the system plan, the deployment failed every time because of adapter issues which did not show up when we viewed the plan on the HMC.

This required us to edit the system plan, which meant using the System Planning Tool (SPT). We downloaded the SPT to a PC and installed it; a slight issue with Windows 8 meant we had to run the program in Windows 7 mode to get it to install, but once it was up and running we managed to import the original system plan. Even though the system plan was created from a running system with active partitions the planning tool threw up a lot of errors. We had problems with the addition of the internal SATA tape drive blocking the USB adapter and so on, which took a pretty long time to understand. In the end we just configured the few things we needed in order to export the plan, and exported it ready for import to the HMC. Eventually the plan did deploy on the HMC so it looked like we were ready to go.

We did an IPL D using the SAVSYS tape and all seemed to go well until we got to the DASD configuration in DST. We had the LIC installed on the first drive as the load source, but we needed to add all of the other drives and Raid protect them. As we progressed through the DST options we kept getting errors about connections being missing. A search using Google turned up nothing, so we decided to take the F10 option (ignore the message and continue). That turned out to be a mistake because we only had one of the Raid cards set up rather than both (I thought we only had one, but 2 show up in the hardware list), so when we took the option to add the drives to ASP1 and then started Raid protection it took hours. IBM support did try to help by DLPAR’ing the additional Raid card but we were too late to gain any benefit, so 6 hours later we had the drives set up and protected.

Because this is the hosting partition, the other partitions’ data was restored at the same time, which took about 5 hours to complete. We checked that the NWSD objects for the hosted partitions were restored correctly and configured. We saw that they were in a VARIED OFF state, so we VARIED them ON and watched as they became ACTIVE. So far so good.

At this point we thought OK, we are now ready to start the other partitions. We took the option to activate the first partition profile on the HMC but it quickly came to a grinding halt! The SRC displayed was B2004158 LP=0002. Not much information turned up with a Google search, so I tried to get a console up to see what was actually going on. It appears that the first time you start the partition you need to specifically set the advanced start up parameters (the normal setting is do not override the Mode and source settings). We just set it to B,N and the partition started up.

We still have one partition which fails to start. This is a V6R1 partition, and while we did see some reference in the VIOS configurations to dedicated IO for V6R1 on Power 7+, we know this was running before, so we think it may have been damaged on the restore of the NWSD. We have a full system save on tape for it, so as soon as everything else is fixed we will try an IPL D with the SAVSYS and rebuild the data.

After over a week of fighting with IBM to get the right hardware and software to run a VIOS based partitioned system, we have accepted that i-hosting-i will be the solution for now. We have already started to look at SAN in the hopes of one day having enough bandwidth to trek down this road again, and this time we know that internal disks are not for VIOS partitioning! Pity the IBM sales team didn’t know that before we ordered the additional hardware for Ethernet and the additional core activations for PowerVM. I am sure that with enough trial and error you could get a VIOS running with internal disk, but if the performance is degraded as IBM suggests (they don’t say by how much) I think it may be a futile exercise.

Hope you find the information useful, maybe it will help you avoid some of the pitfalls we came across and save you time and money :-).

Chris..

Jun 10

It’s a bust!

Finally we get the answer we have been looking for..

Generally we don’t recommend VIOS and virtualised partitions using internal disks.
Usually organisations are using VIOS with external storage.
There are many reasons – performance, benefits, etc.

Yep, mostly for performance reasons, it’s i-hosting-i on internal disk, VIOS for external disk…..The big problem with that is that very few people are crossing those boundaries.

So all of the work so far to get the VIOS set up has been in vain.. Well, not entirely, because we have learned a lot of very good lessons about the AIX/VIO interfaces and how to set up and install them. But for now we are just going to backpedal and use i-hosting-i until we can get a SAN to test out what a VIOS implementation can provide. I am also interested in how we could set up the internal disks to run IBM i hosting while having a single drive for VIOS that could manage the external drives (if that is in fact possible).

If we do actually get to the stage of implementing, we will again publish our experiences. It may take us a while to get back to this as we need to ensure the HA4i product release is put back on track.

Keep watching.

Chris…