cezar_varlan1
Collaborator

How to resize a Check Point CloudGuard NGFW High Availability Cluster in Azure

Procedure to resize Azure Check Point CloudGuard firewalls (tested on R80.40)

 

[1] Connect via SSH to each of the firewalls in the cluster

                [a] Verify that the cluster is working and take note of the current primary member

                # cphaprob state

                # cphaprob roles

ID         Role
1 (local)  Master
2          Non-Master

                [b] Check current core count and distribution

                # fw ctl get int fwlic_num_of_allowed_cores

                # fw ctl multik stat
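
On the original 4-core size these return output similar to the following (illustrative values; the real capture from the resized 16-core member appears further below):

[Expert@ckpupgrademe1:0]# fw ctl get int fwlic_num_of_allowed_cores
fwlic_num_of_allowed_cores = 4
[Expert@ckpupgrademe1:0]# fw ctl multik stat
ID | Active  | CPU    | Connections | Peak
----------------------------------------------
 0 | Yes     | 3      |          27 |       40
 1 | Yes     | 2      |          30 |       42
 2 | Yes     | 1      |          33 |       41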

                [c] Check current contents of boot.conf file

                # cat /var/opt/fw.boot/boot.conf

[Expert@ckpupgrademe1:0]# cat /var/opt/fw.boot/boot.conf
CTL_IPFORWARDING        1
DEFAULT_FILTER_PATH     /etc/fw.boot/default.bin
KERN_INSTANCE_NUM       3
COREXL_INSTALLED        1
KERN6_INSTANCE_NUM      2
IPV6_INSTALLED  0
CORE_OVERRIDE   4

[2] Connect to portal.azure.com and locate the firewall VMs. Proceed to stop (deallocate) the standby member identified at step [1][a] as Non-Master / Standby.

[3] In the firewall VM's properties, browse to Settings > Size.

Pick the new VM size of choice. In our case we need 16 cores.

Reference: https://support.checkpoint.com/results/sk/sk109360#Pricing%20in%20Azure%20Marketplace

[4] Press the Resize button

[5] Once the machine is resized you can start the VM and check the cluster status.

Browse to Overview and press the Start button.
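
If you prefer the Azure CLI to the portal, steps [2] to [5] map to the following sketch (resource group, VM name, and size are placeholders for your environment):

az vm deallocate --resource-group rg-ckp-cluster --name ckpupgrademe2
az vm resize --resource-group rg-ckp-cluster --name ckpupgrademe2 --size Standard_D16s_v3
az vm start --resource-group rg-ckp-cluster --name ckpupgrademe2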

Once the VM has started, check the cluster status and core distribution:

[Expert@ckpupgrademe2:0]# cphaprob role
ID         Role

1          Master
2 (local)  Non-Master

[Expert@ckpupgrademe2:0]# cphaprob state

Cluster Mode:   High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State          Name

1          10.8.1.5        100%            ACTIVE         CKPUPGRADEME1
2 (local)  10.8.1.6        0%              STANDBY        CKPUPGRADEME2

Active PNOTEs: None

Last member state change event:
   Event Code:                 CLUS-114802
   State change:               DOWN -> STANDBY
   Reason for state change:    There is already an ACTIVE member in the cluster (member 1)
   Event time:                 Tue Nov  7 11:15:18 2023

Cluster failover count:
   Failover counter:           0
   Time of counter reset:      Tue Nov  7 03:07:17 2023 (reboot)

[Expert@ckpupgrademe2:0]#
[Expert@ckpupgrademe2:0]# fw ctl multik stat
ID | Active  | CPU    | Connections | Peak
----------------------------------------------
 0 | Yes     | 15     |          22 |       39
 1 | Yes     | 14     |          31 |       41
 2 | Yes     | 13     |          34 |       41
[Expert@ckpupgrademe2:0]#

The second member still has only 3 firewall kernel instances, but it now sees all 16 cores:

[Expert@ckpupgrademe2:0]# cat /proc/cpuinfo  | grep proc
processor       : 0
processor       : 1
processor       : 2
processor       : 3
processor       : 4
processor       : 5
processor       : 6
processor       : 7
processor       : 8
processor       : 9
processor       : 10
processor       : 11
processor       : 12
processor       : 13
processor       : 14
processor       : 15
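
To confirm the core count without scanning the full list, the processor entries can simply be counted:

[Expert@ckpupgrademe2:0]# grep -c ^processor /proc/cpuinfo
16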
[6] Check the contents of the boot.conf file:

[Expert@ckpupgrademe2:0]# cat /var/opt/fw.boot/boot.conf

CTL_IPFORWARDING        1
DEFAULT_FILTER_PATH     /etc/fw.boot/default.bin
KERN_INSTANCE_NUM       3
COREXL_INSTALLED        1
KERN6_INSTANCE_NUM      2
IPV6_INSTALLED  0
CORE_OVERRIDE   16

This means the member sees all 16 cores but still limits the number of kernel instances to 3 (KERN_INSTANCE_NUM). That is deliberate for now: if the two cluster members ran a different number of kernel instances, cluster sync would break. Because the instance counts still match, we keep full cluster functionality even though the VMs are temporarily different sizes, so we can switch this member to active and perform the same steps on the peer before finally editing the boot.conf file.
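
Before failing over you can also confirm that state synchronization is healthy. cphaprob syncstat prints the delta-sync statistics (output omitted here; the exact counters vary by version, but there should be no lost updates):

[Expert@ckpupgrademe2:0]# cphaprob syncstat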

 

[7] Switch the cluster over from the primary to the standby member.

[a] Connect via SSH to the current Primary [Active] member

[b] Run the “# clusterXL_admin down” command

[c] Confirm that the cluster has switched over, as in the sketch below
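
A typical switchover sequence looks like this (a sketch; the local member should report DOWN via the Problem Notification the script registers, and the peer should go ACTIVE). Note that without the -p flag the administrative down state does not persist across reboot, which suits this procedure since the member is rebooted during the resize:

[Expert@ckpupgrademe1:0]# clusterXL_admin down
[Expert@ckpupgrademe1:0]# cphaprob state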

[8] Perform steps [1] -> [6] on the new Standby member (the former primary).

 

[9] Edit boot.conf and make sure to correct KERN_INSTANCE_NUM. With 16 cores the usual split is 14 firewall kernel instances, leaving 2 cores for the SNDs.
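
One way to make the change, taking a backup first (a sketch; you can equally edit the file in vi):

[Expert@ckpupgrademe2:0]# cp /var/opt/fw.boot/boot.conf /var/opt/fw.boot/boot.conf.bak
[Expert@ckpupgrademe2:0]# sed -i 's/^KERN_INSTANCE_NUM.*/KERN_INSTANCE_NUM\t14/' /var/opt/fw.boot/boot.conf

After the edit the file should read as follows: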

[Expert@ckpupgrademe2:0]# cat /var/opt/fw.boot/boot.conf
CTL_IPFORWARDING        1
DEFAULT_FILTER_PATH     /etc/fw.boot/default.bin
KERN_INSTANCE_NUM       14
COREXL_INSTALLED        1
KERN6_INSTANCE_NUM      2
IPV6_INSTALLED  0
CORE_OVERRIDE   16

[10] Reboot the firewall to apply the changes. Once it is back up, check the core distribution:

[Expert@ckpupgrademe1:0]# fw ctl multik stat
Kernel fw_0: CPU 15
Kernel fw_1: CPU 14
Kernel fw_2: CPU 13
Kernel fw_3: CPU 12
Kernel fw_4: CPU 11
Kernel fw_5: CPU 10
Kernel fw_6: CPU 9
Kernel fw_7: CPU 8
Kernel fw_8: CPU 7
Kernel fw_9: CPU 6
Kernel fw_10: CPU 5
Kernel fw_11: CPU 4
Kernel fw_12: CPU 3
Kernel fw_13: CPU 2
Daemon cprid: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Daemon mpdaemon: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Daemon fwd: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Daemon in.asessiond: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Daemon lpd: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Daemon core_uploader: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Daemon cprid: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Daemon cpd: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Interface enP38308p0s2: has multi queue enabled
Interface enP47606p0s2: has multi queue enabled

[11] At this point the cluster would not work correctly, as one member has more FW instances than the other. Connect to the primary member, perform the same edit on boot.conf, and reboot. While the primary reboots, the remaining cluster member with the modified core count becomes primary, and once the second member comes back online the cluster will function normally again.

[12] Push policy and check logs to confirm normal operation.
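
If you prefer the management API over SmartConsole, the push can also be scripted on the management server (a sketch; the policy package and target names are placeholders):

mgmt_cli -r true install-policy policy-package "Standard" targets.1 "ckp-azure-cluster"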

[13] Further optimization

[a] Edit the affinity file $FWDIR/conf/fwaffinity.conf to allocate cores to specific interfaces. Also keep in mind that heavy logging warrants allocating at least one core to FWD.
Note: You should not go over the total VM core limit. If you allocate cores to SND and FWD, decrease the kernel instance number accordingly so that the cores are not oversubscribed.
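
A minimal sketch of such an affinity file (interface names and CPU IDs are placeholders; lines take the form <type> <name> <cpu>, and the change takes effect after a reboot):

# $FWDIR/conf/fwaffinity.conf - illustrative only
i eth0 0
i eth1 1
n fwd 2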

 

3 Replies
Shay_Levin
Admin

Nice

Thank you for sharing

Chris_Atkinson
Employee
cezar_varlan1
Collaborator

Yes, this one works too. I tried to remember why that was not enough: on my last production upgrade I hit an issue where cpconfig was unable to write to the file. Going back through my notes, here is what happened:

# We tried to edit the /etc/fw.boot/boot.conf file to change KERN_INSTANCE_NUM to 6, but we kept getting a permission error.

# When the CoreXL settings are changed, the changes are written to boot.conf; because of the permission issue the writes failed, which is why the core count in CoreXL was not increasing.

# The boot.conf file was locked with the immutable attribute; to release the lock we ran the command below:

chattr -i boot.conf

# After unlocking the file we made the changes through cpconfig and enabled the 6 fw workers; after rebooting, CoreXL was enabled with 6 cores on both gateways. We also enabled Multi-Queue.
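
For reference, the immutable attribute can be inspected with lsattr before and after clearing it (illustrative output; the hostname is a placeholder):

[Expert@gw:0]# lsattr /etc/fw.boot/boot.conf
----i----------- /etc/fw.boot/boot.conf
[Expert@gw:0]# chattr -i /etc/fw.boot/boot.conf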


