R80.40 5400 Cluster - HA module not started

Hi,

We've configured a 5400 cluster (HA) with two gateways, and we have several VLAN subinterfaces both on a bond and on the physical interfaces. After rebooting the gateways, the cluster does not start and we get the following error message:

[Expert@CP-2:0]# cphaprob state

HA module not started.

(The same error message on both gateways)

We had to manually start the cluster with the command cphastart on both gateways after each reboot.

We found sk165073, which says that a large number of non-monitored interfaces in a cluster can cause this problem (by default, not all VLAN subinterfaces are monitored).
We configured the following fw kernel parameter so that all VLAN interfaces are monitored as well:

[Expert@CP-1:0]# cat $FWDIR/boot/modules/fwkern.conf
fwha_monitor_all_vlan = 1

on both our gateways, but the same issue remains: the cluster does not start automatically after a reboot and has to be started manually on both gateways.
Please let me know if you have any idea how to resolve this cluster issue.
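To double-check that the parameter above is really present in the boot-time file on each member, a small helper can parse the file instead of relying on a visual check. This is only a sketch: the function name fwkern_param_is is hypothetical, and it assumes the file holds simple "name = value" or "name=value" lines as shown in the post.

```shell
# Hypothetical helper (not a Check Point tool): succeeds if parameter $2
# is set to value $3 in the fwkern.conf-style file $1.
# Assumes "name = value" or "name=value" lines; the last match wins.
fwkern_param_is() {
    awk -F= -v n="$2" -v w="$3" '
        { gsub(/[ \t\r]/, "", $1); gsub(/[ \t\r]/, "", $2) }
        $1 == n { found = ($2 == w) }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example use on a gateway (path taken from the post):
#   fwkern_param_is "$FWDIR/boot/modules/fwkern.conf" fwha_monitor_all_vlan 1 \
#       && echo "parameter is set in fwkern.conf"
```

The running value can still be confirmed separately with fw ctl get int fwha_monitor_all_vlan, as in the output below.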


Here is the detailed command output:

[Expert@CP-2:0]# fw ctl get int fwha_monitor_all_vlan
fwha_monitor_all_vlan = 1
[Expert@CP-2:0]# cphaprob state

HA module not started.

[Expert@CP-2:0]# cphastart
[Expert@CP-2:0]# cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State          Name

1          10.255.253.1    100%            ACTIVE         CP1
2 (local)  10.255.253.2    0%              STANDBY        CP2


Active PNOTEs: None

Last member state change event:
Event Code: CLUS-114802
State change: DOWN -> STANDBY
Reason for state change: There is already an ACTIVE member in the cluster (member 1)
Event time: Fri Sep 18 13:27:35 2020

Cluster failover count:
Failover counter: 0
Time of counter reset: Fri Sep 18 13:21:46 2020 (reboot)

 

[Expert@CP-2:0]# cphaprob -a -m if

CCP mode: Manual (Unicast)
Required interfaces: 11
Required secured interfaces: 1


Interface Name:      Status:

eth5                 UP
bond1 (LS)           UP
bond5 (S-LS)         UP
bond4.172 (LS)       UP
bond4.20 (LS)        UP
bond4.60 (LS)        UP
eth6.164             UP
eth6.166             UP
bond4.113 (LS)       UP
eth6.165             UP
bond4.10 (LS)        UP

S - sync, LM - link monitor, HA/LS - bond type

Virtual cluster interfaces: 10

eth5
bond1
bond4.172
bond4.20
bond4.60
eth6.164
eth6.166
bond4.113
eth6.165
bond4.10


Monitoring mode is "Monitor all VLANs": All VLANs are monitored
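
Until the root cause is fixed, the manual step described above (check cphaprob state, then run cphastart) could be wrapped in a small script. This is a workaround sketch only, not a fix: the function names are hypothetical, the check relies solely on the "HA module not started" text shown in this post, and where to hook it into the boot sequence is not covered by the thread.

```shell
# Succeeds (exit 0) when the text on stdin does NOT contain the
# "HA module not started" message quoted in this thread.
# Note: empty or unrelated input also counts as "started".
ha_module_started() {
    ! grep -q 'HA module not started'
}

# Hypothetical wrapper: run cphastart only when the module is reported down.
# cphaprob and cphastart are the commands used in the post.
start_ha_if_needed() {
    if cphaprob state 2>&1 | ha_module_started; then
        echo "HA module already running"
    else
        echo "HA module not started, running cphastart"
        cphastart
    fi
}
```

Calling start_ha_if_needed in Expert mode on each member would reproduce the manual recovery shown in the output above.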


Accepted Solutions

I am very sorry for the inconvenience; it seems the latest hotfix solves this problem. I have just installed HOTFIX_R80_40_JUMBO_HF_MAIN Take: 78

and my cluster now works fine; it comes UP normally after a reboot.

Thank you all for supporting me.
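
For anyone comparing their own gateways against the take mentioned in this solution, the take number can be pulled out of a hotfix listing with a short filter. This is a sketch under assumptions: the function names are hypothetical, and the input is assumed to contain a "Take: <number>" field in the same form as the line quoted above (for example from cpinfo -y all; the exact output layout may differ between versions). The thread only reports that Take 78 resolved the issue for this poster.

```shell
# Reads text on stdin and prints the first number that follows "Take:".
jumbo_take() {
    sed -n 's/.*Take:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeeds if the take found on stdin is at least the minimum given in $1.
take_at_least() {
    local have
    have=$(jumbo_take)
    [ -n "$have" ] && [ "$have" -ge "$1" ]
}

# Example use on a gateway (output format is an assumption):
#   cpinfo -y all 2>/dev/null | grep JUMBO | take_at_least 78 \
#       && echo "Jumbo take is 78 or newer"
```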

Champion

Did you check the state of the cluster in cpconfig?

If the First Time Wizard was run on these systems before they were meant to be cluster members, and the cluster membership question was answered with No, then cluster membership is not turned on in cpconfig.

Regards, Maarten
Contributor

Hi Maarten,
Yes, I was very careful to declare my gateways as cluster members during the First Time Wizard, and I am pretty sure I enabled cluster membership. In any case, my cluster works if I manually run cphastart after the gateway reboots. I believe my cluster setup is fine, since I have done several cluster setups recently.

Champion

Maybe it is the issue from sk98977?

Contributor

Hi,

I see it now: the hostnames in SmartConsole did not match the hostnames on the corresponding gateways. I have now changed the hostnames on the gateways so that they are identical to the gateway names in SmartConsole, but the problem remains. I still get HA module not started after the gateways reboot and still have to run cphastart manually.


I can confirm exactly the same issue on R80.30 with the 3.10 kernel on 16000 appliances with the latest HFA.

After reboot/cprestart, the following services are not started (based on "cpwd_admin list" output):

cphamset -d

rtmd

CPUSE agent (DAservices)

 

Can you try to install policy once you get "HA not started"?

Did you try to disable CCP encryption on both gateways?

Kind regards,
Jozko Mrkvicka
Contributor

Hi Jozko Mrkvicka,

CCP encryption is disabled by default and I have not changed it, so it is already disabled on both my gateways:

cphaprob ccp_encrypt

OFF

I also compared the output of cpwd_admin list before and after a reboot, and in my case the cphamset -d service is missing after a reboot. So there is probably some issue that prevents cphamset from starting after a reboot.

One funny thing: I also tried a policy install after a reboot and it works. The cluster is even established when I install policy, so it has the same effect as running cphastart manually on my gateways.
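
The before/after comparison of cpwd_admin list mentioned above can be done mechanically rather than by eye. The sketch below is an illustration under assumptions: the function name and the file names are hypothetical, and it assumes the first whitespace-separated column of each saved line identifies the watched process.

```shell
# Prints every first-column name that appears in the "before" file ($1)
# but not in the "after" file ($2). Blank lines are ignored; a header
# line present in both files is not reported.
missing_after_reboot() {
    awk 'FILENAME == ARGV[1] { if (NF) seen[$1] = 1; next }
         NF && !($1 in seen) { print $1 }' "$2" "$1"
}

# Example use (file names are placeholders):
#   cpwd_admin list > /var/tmp/cpwd_before.txt    # while the cluster is healthy
#   ... reboot ...
#   cpwd_admin list > /var/tmp/cpwd_after.txt
#   missing_after_reboot /var/tmp/cpwd_before.txt /var/tmp/cpwd_after.txt
```

Any name printed was registered with the watchdog before the reboot and is absent afterwards, which is the kind of difference described in this reply.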

Best Regards,

Mladen

 


Hi @MladenAntesevic 

Same issue here with 6000 appliances and the 3.10 kernel.

Open a TAC case.

Contributor

Hi @HeikoAnkenbrand,

Thanks for the information; I am opening a TAC case.

Regards,

Mladen


Hi @MladenAntesevic 

Another idea!

Have a look if the following parameter is set:

/etc/fw.boot/ha_boot.conf
ha_installed 1
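
The check suggested here can also be scripted so it is easy to repeat on every cluster member. This is a minimal sketch: the function name is hypothetical, and it assumes the file uses "key value" lines as in the example above.

```shell
# Hypothetical helper: succeeds if the file given as $1 contains a
# "ha_installed" line whose value is exactly 1 (the last such line wins).
ha_installed_enabled() {
    awk '$1 == "ha_installed" { v = $2 }
         END { exit(v == "1" ? 0 : 1) }' "$1"
}

# Example use on a gateway (path taken from this post):
#   ha_installed_enabled /etc/fw.boot/ha_boot.conf \
#       && echo "ha_installed is 1" || echo "ha_installed is not set to 1"
```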

Contributor

Hi @HeikoAnkenbrand ,

Yes, this option is set on both my gateways. This is the complete output:

cat /etc/fw.boot/ha_boot.conf
ha_installed 1
fw1_build 994000685
release R80.40
take 294

 

Everything looks fine. Could you please explain what your idea is?

 

Regards,

Mladen
