I have a fully working cluster lab, so I can easily show you. The key point is that whatever you configure as cluster interfaces is monitored: if any one of them goes down, there will be a failover. I've included some examples from the lab below.
Andy
master fw:
[Expert@CP-FW-01:0]# cphaprob roles
ID         Role
1 (local)  Master
2          Non-Master
[Expert@CP-FW-01:0]# cphaprob state
Cluster Mode: High Availability (Active Up) with IGMP Membership
ID         Unique Address  Assigned Load   State     Name
1 (local)  169.254.0.112   100%            ACTIVE    CP-FW-01
2          169.254.0.111   0%              STANDBY   CP-FW-02
Active PNOTEs: None
Last member state change event:
Event Code: CLUS-114704
State change: STANDBY -> ACTIVE
Reason for state change: No other ACTIVE members have been found in the cluster
Event time: Wed Jul 3 08:34:59 2024
Last cluster failover event:
Transition to new ACTIVE: Member 2 -> Member 1
Reason: ADMIN_DOWN PNOTE
Event time: Wed Jul 3 08:34:59 2024
Cluster failover count:
Failover counter: 4
Time of counter reset: Thu Jun 27 20:23:48 2024 (reboot)
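As a side note, the member state shown above is easy to pick out in a script. A minimal sketch, assuming the standard `cphaprob state` layout where the local member's row is tagged "(local)" (the `local_state` helper is my own, not a Check Point tool):

```shell
#!/bin/bash
# local_state: extract the local member's state (ACTIVE/STANDBY/DOWN)
# from captured `cphaprob state` output. Illustrative helper only.
local_state() {
    # The member table row looks like:
    #   1 (local)  169.254.0.112  100%  ACTIVE  CP-FW-01
    # so the state is the second-to-last field on the "(local)" row.
    echo "$1" | awk '/\(local\)/ && NF >= 5 { print $(NF-1) }'
}

# Example against the lab output captured above:
sample='1 (local)  169.254.0.112   100%   ACTIVE   CP-FW-01
2          169.254.0.111   0%     STANDBY  CP-FW-02'
local_state "$sample"    # -> ACTIVE
```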
[Expert@CP-FW-01:0]# cphaprob -a if
CCP mode: Manual (Unicast)
Required interfaces: 4
Required secured interfaces: 1
Interface Name:      Status:
eth0 (LM)            UP
eth1 (LM)            UP
eth2 (LM)            UP
eth3 (S)             UP
S - sync, HA/LS - bond type, LM - link monitor, P - probing
Virtual cluster interfaces: 3
eth0 172.16.10.246
eth1 172.31.10.246
eth2 192.168.10.246
[Expert@CP-FW-01:0]#
[Expert@CP-FW-01:0]# cphaprob -i list
There are no pnotes in problem state
[Expert@CP-FW-01:0]# cphaprob -l list
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Device Name: Recovery Delay
Current state: OK
Device Name: CoreXL Configuration
Current state: OK
Registered Devices:
Device Name: Fullsync
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 9772.6 sec
Device Name: Policy
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 9771.4 sec
Device Name: routed
Registration number: 2
Timeout: none
Current state: OK
Time since last report: 96711.5 sec
Device Name: cxld
Registration number: 3
Timeout: 30 sec
Current state: OK
Time since last report: 238143 sec
Process Status: UP
Device Name: fwd
Registration number: 4
Timeout: 30 sec
Current state: OK
Time since last report: 238143 sec
Process Status: UP
Device Name: cphad
Registration number: 5
Timeout: 30 sec
Current state: OK
Time since last report: 238130 sec
Process Status: UP
Device Name: Init
Registration number: 6
Timeout: none
Current state: OK
Time since last report: 238125 sec
[Expert@CP-FW-01:0]#
*************************************************
backup fw:
[Expert@CP-FW-02:0]#
[Expert@CP-FW-02:0]# cphaprob roles
ID         Role
1          Master
2 (local)  Non-Master
[Expert@CP-FW-02:0]# cphaprob state
Cluster Mode: High Availability (Active Up) with IGMP Membership
ID         Unique Address  Assigned Load   State     Name
1          169.254.0.112   100%            ACTIVE    CP-FW-01
2 (local)  169.254.0.111   0%              STANDBY   CP-FW-02
Active PNOTEs: None
Last member state change event:
Event Code: CLUS-114802
State change: DOWN -> STANDBY
Reason for state change: There is already an ACTIVE member in the cluster (member 1)
Event time: Wed Jul 3 08:35:00 2024
Last cluster failover event:
Transition to new ACTIVE: Member 2 -> Member 1
Reason: ADMIN_DOWN PNOTE
Event time: Wed Jul 3 08:34:59 2024
Cluster failover count:
Failover counter: 4
Time of counter reset: Thu Jun 27 20:23:48 2024 (reboot)
[Expert@CP-FW-02:0]# cphaprob -a if
CCP mode: Manual (Unicast)
Required interfaces: 4
Required secured interfaces: 1
Interface Name:      Status:
eth0 (LM)            UP
eth1 (LM)            UP
eth2 (LM)            UP
eth3 (S)             UP
S - sync, HA/LS - bond type, LM - link monitor, P - probing
Virtual cluster interfaces: 3
eth0 172.16.10.246
eth1 172.31.10.246
eth2 192.168.10.246
[Expert@CP-FW-02:0]# cphaprob syncstat
Delta Sync Statistics
Sync status: OK
Drops:
Lost updates................................. 0
Lost bulk update events...................... 0
Oversized updates not sent................... 0
Sync at risk:
Sent reject notifications.................... 0
Received reject notifications................ 0
Sent messages:
Total generated sync messages................ 1736600
Sent retransmission requests................. 0
Sent retransmission updates.................. 0
Peak fragments per update.................... 2
Received messages:
Total received updates....................... 625084
Received retransmission requests............. 0
Sync Interface:
Name......................................... eth3
Link speed................................... 1000Mb/s
Rate......................................... 20190 [Bps]
Peak rate.................................... 236430 [Bps]
Link usage................................... 0%
Total........................................ 18331 [MB]
Queue sizes (num of updates):
Sending queue size........................... 512
Receiving queue size......................... 256
Fragments queue size......................... 50
Timers:
Delta Sync interval (ms)..................... 100
Reset on Mon Jul 1 16:46:36 2024 (triggered by fullsync).
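The two fields worth alerting on in that output are the sync status and the drop counters. A sketch of a check, again parsing captured `cphaprob syncstat` text rather than using anything official:

```shell
#!/bin/bash
# sync_healthy: succeed only if the captured `cphaprob syncstat` output
# reports "Sync status: OK" and zero lost updates. Illustrative only.
sync_healthy() {
    echo "$1" | grep -q 'Sync status: OK' || return 1
    # "Lost updates....... 0" -> the counter is the last field
    [ "$(echo "$1" | awk '/Lost updates/ { print $NF }')" -eq 0 ]
}

sample='Sync status: OK
Lost updates................................. 0'
sync_healthy "$sample" && echo "sync OK"    # -> sync OK
```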
[Expert@CP-FW-02:0]# cphaprob -i list
There are no pnotes in problem state
[Expert@CP-FW-02:0]# cphaprob -l list
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Device Name: Recovery Delay
Current state: OK
Device Name: CoreXL Configuration
Current state: OK
Registered Devices:
Device Name: Fullsync
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 9833.7 sec
Device Name: Policy
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 9832.4 sec
Device Name: routed
Registration number: 2
Timeout: none
Current state: OK
Time since last report: 239927 sec
Device Name: cxld
Registration number: 3
Timeout: 30 sec
Current state: OK
Time since last report: 240004 sec
Process Status: UP
Device Name: fwd
Registration number: 4
Timeout: 30 sec
Current state: OK
Time since last report: 240004 sec
Process Status: UP
Device Name: cphad
Registration number: 5
Timeout: 30 sec
Current state: OK
Time since last report: 239991 sec
Process Status: UP
Device Name: Init
Registration number: 6
Timeout: none
Current state: OK
Time since last report: 239986 sec
[Expert@CP-FW-02:0]#
To fail over gracefully, run this on the current ACTIVE (Master) member:
clusterXL_admin down; clusterXL_admin up
The "down" registers the ADMIN_DOWN pnote (the failover reason visible in the events above) and triggers the failover; the "up" clears it, and because the cluster runs in Active Up mode the member then stays in STANDBY rather than failing back. Note that plain "clusterXL_admin down" does not survive a reboot; use "clusterXL_admin down -p" if you need the state to persist.
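If you script that, it is worth confirming the peer really is a healthy STANDBY first, otherwise taking the local member down would drop traffic instead of failing it over. A sketch under that assumption (the `peer_is_standby` check is my own; only `cphaprob` and `clusterXL_admin` are real Check Point commands):

```shell
#!/bin/bash
# Guarded graceful failover, intended to run on the ACTIVE member.
# peer_is_standby: true if some non-local member row shows STANDBY
# in captured `cphaprob state` output. Illustrative logic only.
peer_is_standby() {
    echo "$1" | awk 'NF >= 5 && !/\(local\)/ && $(NF-1) == "STANDBY" { found=1 }
                     END { exit !found }'
}

# Only meaningful on a real gateway, so guard the invocation:
if command -v cphaprob >/dev/null 2>&1; then
    if peer_is_standby "$(cphaprob state)"; then
        clusterXL_admin down    # registers the ADMIN_DOWN pnote -> failover
        clusterXL_admin up      # clears it; in Active Up mode we stay STANDBY
    else
        echo "Peer is not STANDBY - refusing to fail over" >&2
    fi
fi
```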