I have a fully working cluster lab, so I can easily show you. The key point is that whatever you configure as cluster interfaces is monitored: if any one of them goes down, there will be a failover. I've included some examples from the lab below.
Andy
master fw:
[Expert@CP-FW-01:0]# cphaprob roles
ID         Role
1 (local)  Master
2          Non-Master
[Expert@CP-FW-01:0]# cphaprob state
Cluster Mode: High Availability (Active Up) with IGMP Membership
ID         Unique Address  Assigned Load   State     Name
1 (local)  169.254.0.112   100%            ACTIVE    CP-FW-01
2          169.254.0.111   0%              STANDBY   CP-FW-02
Active PNOTEs: None
Last member state change event:
Event Code: CLUS-114704
State change: STANDBY -> ACTIVE
Reason for state change: No other ACTIVE members have been found in the cluster
Event time: Wed Jul 3 08:34:59 2024
Last cluster failover event:
Transition to new ACTIVE: Member 2 -> Member 1
Reason: ADMIN_DOWN PNOTE
Event time: Wed Jul 3 08:34:59 2024
Cluster failover count:
Failover counter: 4
Time of counter reset: Thu Jun 27 20:23:48 2024 (reboot)
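As a side note, the member state shown above is easy to pick out in a script. A minimal sketch, assuming the standard `cphaprob state` layout where the local member's row is tagged "(local)" (the `local_state` helper is my own, not a Check Point tool):

```shell
#!/bin/bash
# local_state: extract the local member's state (ACTIVE/STANDBY/DOWN)
# from captured `cphaprob state` output. Illustrative helper only.
local_state() {
    # The member table row looks like:
    #   1 (local)  169.254.0.112  100%  ACTIVE  CP-FW-01
    # so the state is the second-to-last field on the "(local)" row.
    echo "$1" | awk '/\(local\)/ && NF >= 5 { print $(NF-1) }'
}

# Example against the lab output captured above:
sample='1 (local)  169.254.0.112   100%   ACTIVE   CP-FW-01
2          169.254.0.111   0%     STANDBY  CP-FW-02'
local_state "$sample"    # -> ACTIVE
```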
[Expert@CP-FW-01:0]# cphaprob -a if
CCP mode: Manual (Unicast)
Required interfaces: 4
Required secured interfaces: 1
Interface Name:      Status:
eth0 (LM)            UP
eth1 (LM)            UP
eth2 (LM)            UP
eth3 (S)             UP
S - sync, HA/LS - bond type, LM - link monitor, P - probing
Virtual cluster interfaces: 3
eth0 172.16.10.246
eth1 172.31.10.246
eth2 192.168.10.246
[Expert@CP-FW-01:0]#
[Expert@CP-FW-01:0]# cphaprob -i list
There are no pnotes in problem state
[Expert@CP-FW-01:0]# cphaprob -l list
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Device Name: Recovery Delay
Current state: OK
Device Name: CoreXL Configuration
Current state: OK
Registered Devices:
Device Name: Fullsync
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 9772.6 sec
Device Name: Policy
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 9771.4 sec
Device Name: routed
Registration number: 2
Timeout: none
Current state: OK
Time since last report: 96711.5 sec
Device Name: cxld
Registration number: 3
Timeout: 30 sec
Current state: OK
Time since last report: 238143 sec
Process Status: UP
Device Name: fwd
Registration number: 4
Timeout: 30 sec
Current state: OK
Time since last report: 238143 sec
Process Status: UP
Device Name: cphad
Registration number: 5
Timeout: 30 sec
Current state: OK
Time since last report: 238130 sec
Process Status: UP
Device Name: Init
Registration number: 6
Timeout: none
Current state: OK
Time since last report: 238125 sec
[Expert@CP-FW-01:0]#
*************************************************
backup fw:
[Expert@CP-FW-02:0]#
[Expert@CP-FW-02:0]# cphaprob roles
ID         Role
1          Master
2 (local)  Non-Master
[Expert@CP-FW-02:0]# cphaprob state
Cluster Mode: High Availability (Active Up) with IGMP Membership
ID         Unique Address  Assigned Load   State     Name
1          169.254.0.112   100%            ACTIVE    CP-FW-01
2 (local)  169.254.0.111   0%              STANDBY   CP-FW-02
Active PNOTEs: None
Last member state change event:
Event Code: CLUS-114802
State change: DOWN -> STANDBY
Reason for state change: There is already an ACTIVE member in the cluster (member 1)
Event time: Wed Jul 3 08:35:00 2024
Last cluster failover event:
Transition to new ACTIVE: Member 2 -> Member 1
Reason: ADMIN_DOWN PNOTE
Event time: Wed Jul 3 08:34:59 2024
Cluster failover count:
Failover counter: 4
Time of counter reset: Thu Jun 27 20:23:48 2024 (reboot)
[Expert@CP-FW-02:0]# cphaprob -a if
CCP mode: Manual (Unicast)
Required interfaces: 4
Required secured interfaces: 1
Interface Name:      Status:
eth0 (LM)            UP
eth1 (LM)            UP
eth2 (LM)            UP
eth3 (S)             UP
S - sync, HA/LS - bond type, LM - link monitor, P - probing
Virtual cluster interfaces: 3
eth0 172.16.10.246
eth1 172.31.10.246
eth2 192.168.10.246
[Expert@CP-FW-02:0]# cphaprob syncstat
Delta Sync Statistics
Sync status: OK
Drops:
Lost updates................................. 0
Lost bulk update events...................... 0
Oversized updates not sent................... 0
Sync at risk:
Sent reject notifications.................... 0
Received reject notifications................ 0
Sent messages:
Total generated sync messages................ 1736600
Sent retransmission requests................. 0
Sent retransmission updates.................. 0
Peak fragments per update.................... 2
Received messages:
Total received updates....................... 625084
Received retransmission requests............. 0
Sync Interface:
Name......................................... eth3
Link speed................................... 1000Mb/s
Rate......................................... 20190 [Bps]
Peak rate.................................... 236430 [Bps]
Link usage................................... 0%
Total........................................ 18331 [MB]
Queue sizes (num of updates):
Sending queue size........................... 512
Receiving queue size......................... 256
Fragments queue size......................... 50
Timers:
Delta Sync interval (ms)..................... 100
Reset on Mon Jul 1 16:46:36 2024 (triggered by fullsync).
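The two fields worth alerting on in that output are the sync status and the drop counters. A sketch of a check, again parsing captured `cphaprob syncstat` text rather than using anything official:

```shell
#!/bin/bash
# sync_healthy: succeed only if the captured `cphaprob syncstat` output
# reports "Sync status: OK" and zero lost updates. Illustrative only.
sync_healthy() {
    echo "$1" | grep -q 'Sync status: OK' || return 1
    # "Lost updates....... 0" -> the counter is the last field
    [ "$(echo "$1" | awk '/Lost updates/ { print $NF }')" -eq 0 ]
}

sample='Sync status: OK
Lost updates................................. 0'
sync_healthy "$sample" && echo "sync OK"    # -> sync OK
```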
[Expert@CP-FW-02:0]# cphaprob -i list
There are no pnotes in problem state
[Expert@CP-FW-02:0]# cphaprob -l list
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Device Name: Recovery Delay
Current state: OK
Device Name: CoreXL Configuration
Current state: OK
Registered Devices:
Device Name: Fullsync
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 9833.7 sec
Device Name: Policy
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 9832.4 sec
Device Name: routed
Registration number: 2
Timeout: none
Current state: OK
Time since last report: 239927 sec
Device Name: cxld
Registration number: 3
Timeout: 30 sec
Current state: OK
Time since last report: 240004 sec
Process Status: UP
Device Name: fwd
Registration number: 4
Timeout: 30 sec
Current state: OK
Time since last report: 240004 sec
Process Status: UP
Device Name: cphad
Registration number: 5
Timeout: 30 sec
Current state: OK
Time since last report: 239991 sec
Process Status: UP
Device Name: Init
Registration number: 6
Timeout: none
Current state: OK
Time since last report: 239986 sec
[Expert@CP-FW-02:0]#
To fail over gracefully, run this on the current ACTIVE (Master) member:
clusterXL_admin down; clusterXL_admin up
The "down" registers the ADMIN_DOWN pnote (the failover reason visible in the events above) and triggers the failover; the "up" clears it, and because the cluster runs in Active Up mode the member then stays in STANDBY rather than failing back. Note that plain "clusterXL_admin down" does not survive a reboot; use "clusterXL_admin down -p" if you need the state to persist.
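If you script that, it is worth confirming the peer really is a healthy STANDBY first, otherwise taking the local member down would drop traffic instead of failing it over. A sketch under that assumption (the `peer_is_standby` check is my own; only `cphaprob` and `clusterXL_admin` are real Check Point commands):

```shell
#!/bin/bash
# Guarded graceful failover, intended to run on the ACTIVE member.
# peer_is_standby: true if some non-local member row shows STANDBY
# in captured `cphaprob state` output. Illustrative logic only.
peer_is_standby() {
    echo "$1" | awk 'NF >= 5 && !/\(local\)/ && $(NF-1) == "STANDBY" { found=1 }
                     END { exit !found }'
}

# Only meaningful on a real gateway, so guard the invocation:
if command -v cphaprob >/dev/null 2>&1; then
    if peer_is_standby "$(cphaprob state)"; then
        clusterXL_admin down    # registers the ADMIN_DOWN pnote -> failover
        clusterXL_admin up      # clears it; in Active Up mode we stay STANDBY
    else
        echo "Peer is not STANDBY - refusing to fail over" >&2
    fi
fi
```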