Hello,
I would like to take advantage of this thread to describe my case.
I have a ClusterXL HA cluster that has "broken".
While reviewing /var/log/messages I found some entries that I do not understand, mainly "Cluster policy installation state freeze ON" and "Cluster policy installation state freeze OFF".
Does this mean that the gateway has "frozen", or am I misinterpreting it?
The cluster currently has only one member, and "cphaprob -a if" reports the Sync interface as DOWN (the log says "disconnected / link down").
I want to find the root cause of this problem with the cluster.
Thanks for your comments.
[Expert@fFW:0]# grep CLUS /var/log/messages
Nov 2 12:15:39 2023 fw1 kernel: [fw4_1];CLUS-120001-1: Cluster policy installation started (old/new Policy ID: 3899683516/1142038046)
Nov 2 12:15:39 2023 fw1 kernel: [fw4_1];CLUS-120008-1: Cluster policy installation state freeze ON (Time=83875441, Caller=fwha_set_conf, Type=0 State=ACTIVE)
Nov 2 12:15:39 2023 fw1 kernel: [fw4_1];CLUS-120009-1: Cluster policy installation state freeze OFF (Time=83875441, Caller=check_required_if_num)
Nov 2 12:15:39 2023 fw1 kernel: [fw4_1];CLUS-114904-1: State change: ACTIVE(!) -> ACTIVE | Reason: Reason for ACTIVE! alert has been resolved
Nov 2 12:15:39 2023 fw1 kernel: [fw4_1];CLUS-120207-1: Local Probing PNOTE OFF
Nov 2 12:15:39 2023 fw1 kernel: [fw4_1];CLUS-120002-1: Cluster policy installation completed successfully without negotiation (new Policy ID: 1142038046)
Nov 2 12:15:40 2023 fw1 kernel: [fw4_1];CLUS-110205-1: State change: ACTIVE -> ACTIVE(!) | Reason: Interface Sync is down (disconnected / link down)
Nov 2 12:15:46 2023 fw1 kernel: [fw4_1];CLUS-120207-1: Local probing has started on interface: eth8
Nov 2 12:15:46 2023 fw1 kernel: [fw4_1];CLUS-120207-1: Local Probing PNOTE ON
Nov 2 12:15:51 2023 fw1 kernel: [fw4_1];CLUS-216400-1: Remote member 2 (state DOWN -> LOST) | Reason: Timeout Control Protocol packet expired member declared as DEAD
Nov 2 23:01:09 2023 fw1 kernel: [fw4_1];CLUS-110405-1: State remains: ACTIVE! | Reason: Sync interface is down
Nov 2 23:08:30 2023 fw1 kernel: [fw4_1];CLUS-110205-1: State remains: ACTIVE! | Reason: Interface Sync is down (disconnected / link down)
Nov 3 14:38:43 2023 fw1 kernel: [fw4_1];CLUS-120001-1: Cluster policy installation started (old/new Policy ID: 1142038046/3231739712)
Nov 3 14:38:43 2023 fw1 kernel: [fw4_1];CLUS-120008-1: Cluster policy installation state freeze ON (Time=84824979, Caller=fwha_set_conf, Type=0 State=ACTIVE)
Nov 3 14:38:43 2023 fw1 kernel: [fw4_1];CLUS-120009-1: Cluster policy installation state freeze OFF (Time=84824979, Caller=check_required_if_num)
Nov 3 14:38:43 2023 fw1 kernel: [fw4_1];CLUS-114904-1: State change: ACTIVE(!) -> ACTIVE | Reason: Reason for ACTIVE! alert has been resolved
Nov 3 14:38:43 2023 fw1 kernel: [fw4_1];CLUS-120207-1: Local Probing PNOTE OFF
Nov 3 14:38:43 2023 fw1 kernel: [fw4_1];CLUS-120002-1: Cluster policy installation completed successfully without negotiation (new Policy ID: 3231739712)
Nov 3 14:38:43 2023 fw1 kernel: [fw4_1];CLUS-110205-1: State change: ACTIVE -> ACTIVE(!) | Reason: Interface Sync is down (disconnected / link down)
Nov 3 14:38:50 2023 fw1 kernel: [fw4_1];CLUS-120207-1: Local probing has started on interface: Mgmt
Nov 3 14:38:50 2023 fw1 kernel: [fw4_1];CLUS-120207-1: Local Probing PNOTE ON
Nov 3 14:38:55 2023 fw1 kernel: [fw4_1];CLUS-216400-1: Remote member 2 (state DOWN -> LOST) | Reason: Timeout Control Protocol packet expired member declared as DEAD
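To make the sequence of events in the grep output above easier to follow, here is a minimal Python sketch that pulls out only the state and freeze messages. It assumes every line has exactly the layout shown above (timestamp with year, host, "kernel:", kernel instance, CLUS code, message). The names CLUS_RE, parse_clus_line and summarize are hypothetical helpers, not part of any Check Point tool. In the output above both freeze messages carry the same timestamp and the same Time value, so listing them next to the state changes may help to judge whether the freeze is only a momentary step of the policy installation.

```python
import re
from datetime import datetime

# Layout assumed from the pasted lines, e.g.
# "Nov 2 12:15:40 2023 fw1 kernel: [fw4_1];CLUS-110205-1: State change: ..."
CLUS_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s+"
    r"(?P<host>\S+)\s+kernel:\s+\[(?P<instance>[^\]]+)\];"
    r"(?P<code>CLUS-\d+)-(?P<member>\d+):\s+(?P<message>.*)$"
)


def parse_clus_line(line):
    """Return a dict for one CLUS log line, or None if the line does not match."""
    match = CLUS_RE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # Collapse repeated spaces in the timestamp before parsing it.
    timestamp = " ".join(event["ts"].split())
    event["time"] = datetime.strptime(timestamp, "%b %d %H:%M:%S %Y")
    return event


def summarize(lines):
    """Keep only state change / state remains / freeze messages, in input order."""
    selected = []
    for line in lines:
        event = parse_clus_line(line)
        if event is None:
            continue
        message = event["message"]
        is_state = message.startswith(("State change", "State remains"))
        if is_state or "freeze" in message:
            selected.append((event["time"], event["code"], message))
    return selected
```

The list returned by summarize can be printed line by line, for example after reading the saved grep output with open(...).read().splitlines().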
[Expert@FW:0]# cphaprob -a if
CCP mode: Manual (Unicast)
Required interfaces: 6
Required secured interfaces: 1
Interface Name:       Status:
eth8 (P)              UP
Sync (S)              DOWN (4441.6 secs)
Mgmt (P)              UP
bond2.30 (LS-P)       UP
bond2.240 (LS-P)      UP
bond10.450 (LS-P)     UP
bond10.460 (LS-P)     UP
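To check the interface list without reading it by eye, a small Python sketch like the following could extract the interfaces that are DOWN from saved "cphaprob -a if" output. It assumes each interface line looks like the ones above: name, flags in parentheses, UP or DOWN, and an optional "(N secs)" suffix. The names IF_RE, parse_interfaces and down_interfaces are hypothetical, and the flags are kept as-is without interpreting them.

```python
import re

# Matches lines such as "Sync (S) DOWN (4441.6 secs)" or "bond2.30 (LS-P) UP".
IF_RE = re.compile(
    r"^(?P<name>\S+)\s+\((?P<flags>[^)]+)\)\s+(?P<status>UP|DOWN)"
    r"(?:\s+\((?P<secs>[\d.]+)\s+secs\))?\s*$"
)


def parse_interfaces(text):
    """Return one dict per interface line; header and summary lines are skipped."""
    interfaces = []
    for line in text.splitlines():
        match = IF_RE.match(line.strip())
        if match is None:
            continue
        secs = match.group("secs")
        interfaces.append({
            "name": match.group("name"),
            "flags": match.group("flags"),
            "status": match.group("status"),
            # Seconds value printed next to DOWN, if present.
            "down_secs": float(secs) if secs else None,
        })
    return interfaces


def down_interfaces(text):
    """Return only the interfaces whose status is DOWN."""
    return [item for item in parse_interfaces(text) if item["status"] == "DOWN"]
```

Applied to the output above, this should report only the Sync interface, which matches the "Interface Sync is down" reason in the log.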
[Expert@FW:0]# cphaprob state
Cluster Mode: High Availability (Active Up) with IGMP Membership
ID         Unique Address  Assigned Load   State          Name
1 (local)  20.6.5.5        100%            ACTIVE(!)      GW1
Active PNOTEs: LPRB, IAC
Last member state change event:
Event Code: CLUS-110205
State change: ACTIVE -> ACTIVE(!)
Reason for state change: Interface Sync is down (disconnected / link down)
Event time: Fri Nov 3 14:38:43 2023
Last cluster failover event:
Transition to new ACTIVE: Member 2 -> Member 1
Reason: ADMIN_DOWN PNOTE
Event time: Sat Aug 12 22:40:11 2023
Cluster failover count:
Failover counter: 115
Time of counter reset: Fri Jul 28 09:33:23 2023 (reboot)
Cheers. 🙂