pepso100
Explorer

CoreXL doesn't see one core

Hi guys, 

 

I have just converted a standalone FW to a cluster, but I can't sync the cluster because of the error "Mismatch in the number of CoreXL FW instances has been detected".

 

 

[Expert@gw_b:0]# cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State          Name

1          169.254.0.2     100%            ACTIVE(!)      gwA
2 (local)  169.254.0.1     0%              DOWN           gwB


Active PNOTEs: COREXL

Last member state change event:
Event Code: CLUS-113905
State change: ACTIVE(!) -> DOWN
Reason for state change: Mismatch in the number of CoreXL FW instances has been detected
Event time: Thu Nov 24 14:55:43 2022

Last cluster failover event:
Transition to new ACTIVE: Member 2 -> Member 1
Reason: Mismatch in the number of CoreXL FW instances has been detected
Event time: Thu Nov 24 14:55:43 2022

Cluster failover count:
Failover counter: 1
Time of counter reset: Thu Nov 24 14:55:07 2022 (reboot)
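
For anyone who wants to check this state from a script, here is a minimal Python sketch that pulls the relevant fields out of saved `cphaprob state` text. The function name `summarize_cphaprob` is hypothetical, and the comma-separated handling of multiple PNOTEs is an assumption, since the output above lists only one.

```python
import re

# Excerpt of the `cphaprob state` output posted above.
SAMPLE = """\
Cluster Mode: High Availability (Active Up) with IGMP Membership

1 169.254.0.2 100% ACTIVE(!) gwA
2 (local) 169.254.0.1 0% DOWN gwB

Active PNOTEs: COREXL

Last member state change event:
Event Code: CLUS-113905
State change: ACTIVE(!) -> DOWN
Reason for state change: Mismatch in the number of CoreXL FW instances has been detected
"""


def summarize_cphaprob(text):
    """Extract the fields that explain why a member is DOWN."""

    def field(label):
        # Match "<label>: <value>" at the start of a line.
        m = re.search(rf"^{re.escape(label)}:[ \t]*(.+)$", text, re.MULTILINE)
        return m.group(1).strip() if m else None

    pnotes = field("Active PNOTEs")
    return {
        # Assumption: several PNOTEs would be separated by commas.
        "pnotes": [] if pnotes in (None, "None") else [p.strip() for p in pnotes.split(",")],
        "event_code": field("Event Code"),
        "state_change": field("State change"),
        "reason": field("Reason for state change"),
    }


summary = summarize_cphaprob(SAMPLE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With the output above, the sketch reports the COREXL PNOTE together with the mismatch reason, which is the pair of fields to watch after each change.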

 

1. Licenses should be fine on both members of the cluster.

 

[Expert@gw_a:0]# cplic print
Host Expiration Features
192.168.1.100 16Dec2022 CPSG-C-8-U CPSB-FW CPSB-VPN CPSB-IPSA CPSB-DL

 

[Expert@gw_b:0]# cplic print
Host Expiration Features
192.168.1.100 24Dec2022 CPSG-C-8-U CPSB-FW CPSB-VPN CPSB-IPSA CPSB-DLP
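
A quick way to compare the two `cplic print` lines is to diff the feature sets. This is only a sketch with a hypothetical helper name, `parse_cplic`, and it assumes the one-line-per-license layout pasted above. The single difference it finds (CPSB-DL versus CPSB-DLP) may simply be a truncated paste.

```python
def parse_cplic(text):
    """Return {host: (expiration, set_of_features)} from `cplic print` text."""
    licenses = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the "Host Expiration Features" header.
        if len(parts) < 3 or parts[0] == "Host":
            continue
        licenses[parts[0]] = (parts[1], set(parts[2:]))
    return licenses


GW_A = """\
Host Expiration Features
192.168.1.100 16Dec2022 CPSG-C-8-U CPSB-FW CPSB-VPN CPSB-IPSA CPSB-DL
"""
GW_B = """\
Host Expiration Features
192.168.1.100 24Dec2022 CPSG-C-8-U CPSB-FW CPSB-VPN CPSB-IPSA CPSB-DLP
"""

exp_a, feat_a = parse_cplic(GW_A)["192.168.1.100"]
exp_b, feat_b = parse_cplic(GW_B)["192.168.1.100"]
print("expiration:", exp_a, "vs", exp_b)
print("only on gw_a:", sorted(feat_a - feat_b))
print("only on gw_b:", sorted(feat_b - feat_a))
```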

 

2. The number of cores is the same on both members of the cluster.

[Expert@gw_a:0]# cpstat -f cpu os | grep -i CPUs
    CPUs Number: 6

[Expert@gw_b:0]# cpstat -f cpu os | grep -i CPUs
    CPUs Number: 6

 

3. Here is the difference.

GW_A has only:

fw_0: CPU 5
fw_1: CPU 2

and GW_B has:
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4

 

[Expert@gw_a:0]# fw ctl affinity -l
Interface eth0: CPU 0
Interface eth1: CPU 0
Interface eth2: CPU 0
Interface eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 2
Daemon mpdaemon: CPU 2 5
Daemon fwd: CPU 2 5
Daemon scrubd: CPU 2 5
Daemon core_uploader: CPU 2 5
Daemon rtmd: CPU 2 5
Daemon rad: CPU 2 5
Daemon in.msd: CPU 2 5
Daemon usrchkd: CPU 2 5
Daemon pdpd: CPU 2 5
Daemon cp_file_convertd: CPU 2 5
Daemon pepd: CPU 2 5
Daemon lpd: CPU 2 5
Daemon in.asessiond: CPU 2 5
Daemon scanengine_b: CPU 2 5
Daemon watermark_cp_file_convertd: CPU 2 5
Daemon vpnd: CPU 2 5
Daemon in.acapd: CPU 2 5
Daemon scrub_cp_file_convertd: CPU 2 5
Daemon cprid: CPU 2 5
Daemon cpd: CPU 2 5

 

 

[Expert@gw_b:0]# fw ctl affinity -l
Interface eth0: CPU 0
Interface eth1: CPU 0
Interface eth2: CPU 0
Interface eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4
Daemon mpdaemon: CPU 2 4 5
Daemon fwd: CPU 2 4 5
Daemon pdpd: CPU 2 4 5
Daemon lpd: CPU 2 4 5
Daemon cp_file_convertd: CPU 2 4 5
Daemon vpnd: CPU 2 4 5
Daemon scrub_cp_file_convertd: CPU 2 4 5
Daemon pepd: CPU 2 4 5
Daemon rad: CPU 2 4 5
Daemon rtmd: CPU 2 4 5
Daemon in.asessiond: CPU 2 4 5
Daemon usrchkd: CPU 2 4 5
Daemon in.acapd: CPU 2 4 5
Daemon watermark_cp_file_convertd: CPU 2 4 5
Daemon in.msd: CPU 2 4 5
Daemon scrubd: CPU 2 4 5
Daemon cprid: CPU 2 4 5
Daemon cpd: CPU 2 4 5
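
The comparison in point 3 can be sketched in Python as follows. The helper `fw_instances` is a hypothetical name, and the sketch assumes the `fw ctl affinity -l` output of each member has been saved as text; the samples are shortened excerpts of the outputs above.

```python
import re

GW_A = """\
Interface eth0: CPU 0
fw_0: CPU 5
fw_1: CPU 2
Daemon fwd: CPU 2 5
"""
GW_B = """\
Interface eth0: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4
Daemon fwd: CPU 2 4 5
"""


def fw_instances(affinity_text):
    """Map each CoreXL instance id (the N in fw_N) to the CPU ids it is pinned to."""
    instances = {}
    for line in affinity_text.splitlines():
        m = re.match(r"\s*fw_(\d+):\s*CPU\s+([\d ]+)$", line)
        if m:
            instances[int(m.group(1))] = [int(c) for c in m.group(2).split()]
    return instances


a, b = fw_instances(GW_A), fw_instances(GW_B)
print(f"gw_a: {len(a)} instances, gw_b: {len(b)} instances")
if len(a) != len(b):
    # Instance ids that exist on one member only.
    print("mismatch, ids on one member only:", sorted(set(a) ^ set(b)))
```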

 

Does anyone have an idea what the issue could be?

Thank you very much for any advice...

 

 

 

0 Kudos
18 Replies

Have you checked cpconfig > corexl?

Could you please share the version / jumbo take information for the cluster?

0 Kudos
pepso100
Explorer

GW_B

[Expert@gw_b:0]# cpinfo -y all

This is Check Point CPinfo Build 914000214 for GAIA
[IDA]
No hotfixes..

[MGMT]
No hotfixes..

[CPFC]
HOTFIX_TEX_ENGINE_R81_AUTOUPDATE

[FW1]
HOTFIX_GOT_TPCONF_AUTOUPDATE
HOTFIX_TEX_ENGINE_R81_AUTOUPDATE

FW1 build number:
This is Check Point's software version R81 - Build 959
kernel: R81 - Build 813

[SecurePlatform]
No hotfixes..

[PPACK]
No hotfixes..

[CPinfo]
No hotfixes..

[AutoUpdater]
No hotfixes..

[DIAG]
No hotfixes..

[CVPN]
No hotfixes..

[CPUpdates]
BUNDLE_GOT_TPCONF_AUTOUPDATE Take: 63
BUNDLE_TEX_ENGINE_R81_AUTOUPDATE Take: 10

 

GW_A

gw_a> cpinfo -y all

This is Check Point CPinfo Build 914000231 for GAIA
[IDA]
No hotfixes..
[MGMT]
No hotfixes..
[CPFC]
HOTFIX_TEX_ENGINE_R81_AUTOUPDATE
[FW1]
HOTFIX_TEX_ENGINE_R81_AUTOUPDATE
HOTFIX_R80_40_MAAS_TUNNEL_AUTOUPDATE
HOTFIX_GOT_TPCONF_AUTOUPDATE

FW1 build number:
This is Check Point's software version R81 - Build 959
kernel: R81 - Build 813
[SecurePlatform]
No hotfixes..
[PPACK]
No hotfixes..
[CPinfo]
No hotfixes..
[AutoUpdater]
No hotfixes..
[DIAG]
No hotfixes..
[CVPN]
No hotfixes..
[CPUpdates]
BUNDLE_CPSDC_AUTOUPDATE Take: 21
BUNDLE_HCP_AUTOUPDATE Take: 58
BUNDLE_CORE_FILE_UPLOADER_AUTOUPDATE Take: 17
BUNDLE_R80_40_MAAS_TUNNEL_AUTOUPDATE Take: 49
BUNDLE_INFRA_AUTOUPDATE Take: 55
BUNDLE_DEP_INSTALLER_AUTOUPDATE Take: 25
BUNDLE_GENERAL_AUTOUPDATE Take: 13
BUNDLE_GOT_TPCONF_AUTOUPDATE Take: 111
BUNDLE_TEX_ENGINE_R81_AUTOUPDATE Take: 12
[CPDepInst]
No hotfixes..
[core_uploader]
HOTFIX_CHARON_HF
[hcp_wrapper]
HOTFIX_HCP_AUTOUPDATE
[cpsdc_wrapper]
HOTFIX_CPSDC_AUTOUPDATE
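
The two `cpinfo -y all` outputs are long, so a small diff of the `[CPUpdates]` bundles may help. This is a sketch only: `parse_bundles` and `diff_bundles` are hypothetical helper names, and the regular expression assumes the `BUNDLE_... Take: N` layout shown above.

```python
import re

GW_B = """\
BUNDLE_GOT_TPCONF_AUTOUPDATE Take: 63
BUNDLE_TEX_ENGINE_R81_AUTOUPDATE Take: 10
"""
GW_A = """\
BUNDLE_CPSDC_AUTOUPDATE Take: 21
BUNDLE_HCP_AUTOUPDATE Take: 58
BUNDLE_CORE_FILE_UPLOADER_AUTOUPDATE Take: 17
BUNDLE_R80_40_MAAS_TUNNEL_AUTOUPDATE Take: 49
BUNDLE_INFRA_AUTOUPDATE Take: 55
BUNDLE_DEP_INSTALLER_AUTOUPDATE Take: 25
BUNDLE_GENERAL_AUTOUPDATE Take: 13
BUNDLE_GOT_TPCONF_AUTOUPDATE Take: 111
BUNDLE_TEX_ENGINE_R81_AUTOUPDATE Take: 12
"""


def parse_bundles(cpinfo_text):
    """Return {bundle_name: take_number} for every BUNDLE line."""
    bundles = {}
    for line in cpinfo_text.splitlines():
        m = re.match(r"\s*(BUNDLE_\w+)\s+Take:\s*(\d+)", line)
        if m:
            bundles[m.group(1)] = int(m.group(2))
    return bundles


def diff_bundles(a, b):
    """Return {name: (take_on_a, take_on_b)} where the members differ."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


diff = diff_bundles(parse_bundles(GW_A), parse_bundles(GW_B))
for name, (take_a, take_b) in diff.items():
    print(f"{name}: gw_a={take_a} gw_b={take_b}")
```

A `None` on one side means the bundle is not listed on that member at all.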

0 Kudos
Lesley
Contributor

One gateway has 2 FW workers and the other one has 3. I would pick a number and use it on both members 🙂 

After the change, I think you need to reboot for it to take effect. 

0 Kudos
pepso100
Explorer

How can I do that? 

The number of workers should always be the number of cores minus 1 (for the SND), no?

 

0 Kudos
_Val_
Admin

Reading the comments here, it seems your boxes have a different number of CoreXL instances configured. Run cpconfig and check whether any manual modifications were made. Set up the same amount of instances and reboot the members. This should clear up the issue.

0 Kudos
pepso100
Explorer

Hi Val,

thx for advice.

 

I did exactly what you said (I set up 6 instances on both nodes via cpconfig and rebooted them), and now the number of CoreXL instances looks the same, but the output is slightly different on GW_A and GW_B:

[Expert@gw_a:0]# fw ctl affinity -l
Interface eth0: CPU 0
Interface eth1: CPU 0
Interface eth2: CPU 0
Interface eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 2
fw_3: CPU 2
fw_4: CPU 2
fw_5: CPU 2
Daemon mpdaemon: CPU 2 5
Daemon fwd: CPU 2 5
Daemon scrub_cp_file_convertd: CPU 2 5
Daemon core_uploader: CPU 2 5
Daemon rtmd: CPU 2 5
Daemon rad: CPU 2 5
Daemon in.msd: CPU 2 5
Daemon pdpd: CPU 2 5
Daemon watermark_cp_file_convertd: CPU 2 5
Daemon lpd: CPU 2 5
Daemon scrubd: CPU 2 5
Daemon in.asessiond: CPU 2 5
Daemon scanengine_b: CPU 2 5
Daemon vpnd: CPU 2 5
Daemon cp_file_convertd: CPU 2 5
Daemon usrchkd: CPU 2 5
Daemon in.acapd: CPU 2 5
Daemon pepd: CPU 2 5
Daemon cprid: CPU 2 5
Daemon cpd: CPU 2 5

 

 

[Expert@gw_b:0]# fw ctl affinity -l
Interface eth0: CPU 0
Interface eth1: CPU 0
Interface eth2: CPU 0
Interface eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4
fw_3: CPU 1
fw_4: CPU 3
fw_5: CPU 0
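
The difference between the two outputs above is that on GW_A five instances share CPU 2, while on GW_B each instance has its own CPU. A minimal Python sketch for spotting that, assuming the `fw ctl affinity -l` text is saved per member (`shared_cpus` is a hypothetical helper name):

```python
import re
from collections import defaultdict

GW_A = """\
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 2
fw_3: CPU 2
fw_4: CPU 2
fw_5: CPU 2
"""
GW_B = """\
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4
fw_3: CPU 1
fw_4: CPU 3
fw_5: CPU 0
"""


def shared_cpus(affinity_text):
    """Return {cpu_id: [instance names]} for CPUs used by more than one fw instance."""
    by_cpu = defaultdict(list)
    for line in affinity_text.splitlines():
        m = re.match(r"\s*(fw_\d+):\s*CPU\s+([\d ]+)$", line)
        if m:
            for cpu in m.group(2).split():
                by_cpu[int(cpu)].append(m.group(1))
    return {cpu: names for cpu, names in sorted(by_cpu.items()) if len(names) > 1}


print("gw_a:", shared_cpus(GW_A))
print("gw_b:", shared_cpus(GW_B))
```

The sketch only reports the overlap; it does not say why the affinity on GW_A was laid out that way.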

But I still have the same issue.

[Expert@gw_a:0]# cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State          Name

1 (local)  169.254.0.2     100%            ACTIVE(!)      gwA
2          169.254.0.1     0%              DOWN           gwB


Active PNOTEs: COREXL

Last member state change event:
Event Code: CLUS-113905
State change: STANDBY -> ACTIVE(!)
Reason for state change: Mismatch in the number of CoreXL FW instances has been detected
Event time: Thu Nov 24 15:06:52 2022

Last cluster failover event:
Transition to new ACTIVE: Member 2 -> Member 1
Reason: Mismatch in the number of CoreXL FW instances has been detected
Event time: Thu Nov 24 15:06:52 2022

Cluster failover count:
Failover counter: 1
Time of counter reset: Thu Nov 24 15:06:17 2022 (reboot)

0 Kudos
Ilya_Yusupov
Employee

Hi @pepso100 ,

 

Based on your last output, I can only guess that one member is running in user space mode while the second one is in kernel mode.

You can verify this in cpconfig, under the CoreXL section.

0 Kudos
pepso100
Explorer

Hi Ilya,

 

Where in cpconfig can I see whether I am in kernel mode or user space mode?

Thank you.

gw_b> cpconfig
This program will let you re-configure
your Check Point products configuration.


Configuration Options:
----------------------
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable cluster membership for this gateway
(7) Enable Check Point Per Virtual System State
(8) Enable Check Point ClusterXL for Bridge Active/Standby
(9) Check Point CoreXL
(10) Automatic start of Check Point Products

(11) Exit

Enter your choice (1-11) :9

 

Configuring Check Point CoreXL...
=================================


CoreXL is currently enabled with 6 IPv4 firewall instances.

(1) Change the number of firewall instances
(2) Disable Check Point CoreXL

0 Kudos
Ilya_Yusupov
Employee

Sorry, I forgot to ask first which version you are running, as the option is there in R81.10.

0 Kudos
pepso100
Explorer

Yes, you are right, I am running R81.10.

0 Kudos
_Val_
Admin

This is not what I asked you to do. Run cpconfig and show us what you have under CoreXL on both members.

0 Kudos
pepso100
Explorer

Hi Valeri,

 

1.

In your original post you wrote: "Set up the same amount of instances and reboot the members", so I did that.

So now I am a little bit confused by your reply "this is not what I asked you to do".

 

2. This morning, when I booted up both FWs, the cluster status was OK:

[Expert@gw_b:0]# cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State          Name

1          169.254.0.2     0%              STANDBY        gwA
2 (local)  169.254.0.1     100%            ACTIVE         gwB


Active PNOTEs: None

Last member state change event:
Event Code: CLUS-114904
State change: ACTIVE(!) -> ACTIVE
Reason for state change: Reason for ACTIVE! alert has been resolved
Event time: Fri Nov 25 10:26:29 2022

Cluster failover count:
Failover counter: 0
Time of counter reset: Fri Nov 25 10:25:17 2022 (reboot)

 

[Expert@gw_a:0]# cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State          Name

1 (local)  169.254.0.2     0%              STANDBY        gwA
2          169.254.0.1     100%            ACTIVE         gwB


Active PNOTEs: None

Last member state change event:
Event Code: CLUS-114802
State change: INIT -> STANDBY
Reason for state change: There is already an ACTIVE member in the cluster (member 2)
Event time: Fri Nov 25 10:26:30 2022

Cluster failover count:
Failover counter: 0
Time of counter reset: Fri Nov 25 10:25:17 2022 (reboot)

 

3. 

[Expert@gw_a:0]# fw ctl affinity -l
Interface eth0: CPU 0
Interface eth1: CPU 0
Interface eth2: CPU 0
Interface eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4
fw_3: CPU 1
fw_4: CPU 3
fw_5: CPU 0
[Expert@gw_a:0]# cpconfig
This program will let you re-configure
your Check Point products configuration

 

[Expert@gw_b:0]# fw ctl affinity -l
Interface eth0: CPU 0
Interface eth1: CPU 0
Interface eth2: CPU 0
Interface eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4
fw_3: CPU 1
fw_4: CPU 3
fw_5: CPU 0
[Expert@gw_b:0]# cpconfig
This program will let you re-configure
your Check Point products configuration.

 

4. Here is the CoreXL output from cpconfig:

 

[Expert@gw_a:0]# cpconfig
This program will let you re-configure
your Check Point products configuration.


Configuration Options:
----------------------
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable cluster membership for this gateway
(7) Enable Check Point Per Virtual System State
(8) Enable Check Point ClusterXL for Bridge Active/Standby
(9) Check Point CoreXL
(10) Automatic start of Check Point Products

(11) Exit

Enter your choice (1-11) :9

 

Configuring Check Point CoreXL...
=================================


CoreXL is currently enabled with 6 IPv4 firewall instances.

(1) Change the number of firewall instances
(2) Disable Check Point CoreXL

(3) Exit
Enter your choice (1-3) : 1

This machine has 6 CPUs.

Note: All cluster members must have the same number of firewall instances
enabled.

How many IPv4 firewall instances would you like to enable (2 to 6) [4] ? ^C

 

 

[Expert@gw_b:0]# cpconfig
This program will let you re-configure
your Check Point products configuration.


Configuration Options:
----------------------
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable cluster membership for this gateway
(7) Enable Check Point Per Virtual System State
(8) Enable Check Point ClusterXL for Bridge Active/Standby
(9) Check Point CoreXL
(10) Automatic start of Check Point Products

(11) Exit

Enter your choice (1-11) :9

 

Configuring Check Point CoreXL...
=================================


CoreXL is currently enabled with 6 IPv4 firewall instances.

(1) Change the number of firewall instances
(2) Disable Check Point CoreXL

(3) Exit
Enter your choice (1-3) : 1

This machine has 6 CPUs.

Note: All cluster members must have the same number of firewall instances
enabled.

How many IPv4 firewall instances would you like to enable (2 to 6) [4] ? ^C

I have no idea what fixed the issue...

I would really like to understand the background: what caused it and what fixed it. 

Anyway thank you for all your help!

 

0 Kudos
_Val_
Admin

On one of the members you had 3, not 4, CoreXL instances available. Now you have 4 cores, which is the default for 6 cores in total, on both cluster members. This is what fixed your issue.
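
As a rough illustration of the default mentioned here, the following Python sketch maps a core count to a default number of CoreXL firewall instances. Only the 6-core case (default 4) is confirmed by the cpconfig prompt earlier in this thread; the other values are an assumption recalled from the Check Point performance tuning documentation and should be verified for your version. The function name `default_fw_instances` is hypothetical.

```python
def default_fw_instances(cpu_cores):
    """Default CoreXL firewall instance count for a given number of CPU cores.

    Assumption: apart from 6 -> 4 (shown by the cpconfig prompt in this
    thread), the values are recalled from vendor documentation and are not
    taken from this thread. Returns None for core counts not covered.
    """
    small = {1: 1, 2: 2, 4: 3}
    if cpu_cores in small:
        return small[cpu_cores]
    if 6 <= cpu_cores <= 20:
        return cpu_cores - 2
    if cpu_cores > 20:
        # Assumed cap of 40 instances.
        return min(cpu_cores - 4, 40)
    return None


print(default_fw_instances(6))
```

Under that assumption the default on a 6-core box is not "cores minus 1": two cores are left outside the firewall instances.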

0 Kudos
pepso100
Explorer

I knew that I had a different number of CoreXL instances. 

The thing is, I set the same number of CoreXL instances on both members (6, via cpconfig), and despite that, "fw ctl affinity -l" showed me a different number of CoreXL instances per cluster member.

As you can see from the output above, I have 6 CoreXL instances: "CoreXL is currently enabled with 6 IPv4 firewall instances."

So after my change in cpconfig yesterday, nothing changed.

Today, after booting up both FWs, the number of CoreXL instances is the same.

0 Kudos
_Val_
Admin

A reboot is required after changing the number of CoreXL instances. Once you rebooted, everything went green.
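
One way to see whether a reboot is still pending is to compare the configured count reported by cpconfig with the number of fw_N lines in `fw ctl affinity -l`. The Python sketch below assumes both outputs are saved as text; the helper names are hypothetical, and the three-instance affinity sample is a made-up pre-reboot example, not output from this thread.

```python
import re

CPCONFIG = "CoreXL is currently enabled with 6 IPv4 firewall instances."

# Hypothetical pre-reboot state: the system still runs 3 instances.
AFFINITY = """\
Interface eth0: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4
"""


def configured_instances(cpconfig_text):
    """Read the configured IPv4 instance count from the cpconfig status line."""
    m = re.search(
        r"CoreXL is currently enabled with (\d+) IPv4 firewall instances",
        cpconfig_text,
    )
    return int(m.group(1)) if m else None


def running_instances(affinity_text):
    """Count the fw_N lines listed by `fw ctl affinity -l`."""
    return len(re.findall(r"^\s*fw_\d+:", affinity_text, re.MULTILINE))


configured = configured_instances(CPCONFIG)
running = running_instances(AFFINITY)
print(f"configured={configured} running={running}")
if configured != running:
    print("counts differ: the new setting may not be active yet")
```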

0 Kudos
pepso100
Explorer

1. Yesterday I changed the number of CoreXL instances to 6 via cpconfig.

2. Then I rebooted both FWs.

3. After they booted up, both members finally had the same number of CoreXL instances, but:

 3-A) the output of fw ctl affinity -l was different, as you can see below:

[Expert@gw_a:0]# fw ctl affinity -l
Interface eth0: CPU 0
Interface eth1: CPU 0
Interface eth2: CPU 0
Interface eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 2
fw_3: CPU 2
fw_4: CPU 2
fw_5: CPU 2
Daemon mpdaemon: CPU 2 5
Daemon fwd: CPU 2 5
Daemon scrub_cp_file_convertd: CPU 2 5
Daemon core_uploader: CPU 2 5
Daemon rtmd: CPU 2 5
Daemon rad: CPU 2 5
Daemon in.msd: CPU 2 5
Daemon pdpd: CPU 2 5
Daemon watermark_cp_file_convertd: CPU 2 5
Daemon lpd: CPU 2 5
Daemon scrubd: CPU 2 5
Daemon in.asessiond: CPU 2 5
Daemon scanengine_b: CPU 2 5
Daemon vpnd: CPU 2 5
Daemon cp_file_convertd: CPU 2 5
Daemon usrchkd: CPU 2 5
Daemon in.acapd: CPU 2 5
Daemon pepd: CPU 2 5
Daemon cprid: CPU 2 5
Daemon cpd: CPU 2 5

 

 

[Expert@gw_b:0]# fw ctl affinity -l
Interface eth0: CPU 0
Interface eth1: CPU 0
Interface eth2: CPU 0
Interface eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 2
fw_2: CPU 4
fw_3: CPU 1
fw_4: CPU 3
fw_5: CPU 0

 

3-B) the output of cphaprob state showed that the cluster was not in sync:

 

[Expert@gw_a:0]# cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State          Name

1 (local)  169.254.0.2     100%            ACTIVE(!)      gwA
2          169.254.0.1     0%              DOWN           gwB


Active PNOTEs: COREXL

Last member state change event:
Event Code: CLUS-113905
State change: STANDBY -> ACTIVE(!)
Reason for state change: Mismatch in the number of CoreXL FW instances has been detected
Event time: Thu Nov 24 15:06:52 2022

Last cluster failover event:
Transition to new ACTIVE: Member 2 -> Member 1
Reason: Mismatch in the number of CoreXL FW instances has been detected
Event time: Thu Nov 24 15:06:52 2022

Cluster failover count:
Failover counter: 1
Time of counter reset: Thu Nov 24 15:06:17 2022 (reboot)

 

 

Summary:

Yesterday: after changing the number of CoreXL instances and rebooting, only the number of CoreXL instances changed (it was the same on both nodes), but the cluster was still not in sync.

Today: when I booted up both FWs, everything was fine (the number of CoreXL instances matches and the cluster is in sync).

0 Kudos

If the problem persists, please capture the CPinfo and hwdiag outputs and contact support to investigate further.

_Val_
Admin

Just to be clear: configuring 6 CoreXL instances on a system where only 6 cores are present is wrong.

0 Kudos