RPawar
Contributor

Check Point Maestro SGM2 showing down in cluster.

We have two MHO-140 Orchestrators and two Quantum Force 9300 appliances in the Security Group.

All devices are connected in a single-site architecture.

The MHO-140 Orchestrators are connected to their respective uplinks, and the two Quantum Force appliances are connected on downlink ports dl27 and dl28 respectively.

When we check the HA status, it shows "Down" for the second SGM.

We tried rebooting the device and reinstalling the policy, but the issue remains. The appliance does inherit the last installed policy, as verified with "cpstat fw".

Looking further at the fabric layer, I found that MHO1 is connected to SGM1 with DAC and MHO2 to SGM1 with fiber on port dl27; similarly, MHO1 is connected to SGM2 with fiber and MHO2 to SGM2 with DAC on port dl28. This mix was chosen because of a cable-length constraint between the devices (the run was longer than 3 meters, so fiber was used).

I would like to know whether this type of mixed DAC/fiber connectivity is supported in a Maestro setup, and whether it could be the root cause of the issue.

It would be very helpful if someone could clarify this, ideally with official documentation.

Thank You.

12 Replies
Lloyd_Braun
Advisor

I'm not sure whether that is supported in a Maestro setup, but running "cphaprob list" and identifying which pnote(s) are not healthy may help point to whether this is the root cause of the issue.
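
For reference, a minimal sketch of the checks being suggested, assuming a standard R81.20 Maestro SGM shell (output formats vary by version):

# Security Group overview, including per-SGM state
asg stat -v

# Cluster state of the local member
cphaprob state

# Pnotes currently reporting a problem
cphaprob list

# All registered pnotes, including healthy ones
cphaprob -l list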

RPawar
Contributor

Thanks for your reply, Lloyd.

I did check the pnotes as well, and I can see that there is one problem pnote, number 11, named "Configuration".

Can you assist with how I can further isolate exactly which configuration is triggering this problem pnote?

Lloyd_Braun
Advisor

There is some good info on the Configuration pnote indicating /config/active mismatches in sk181887 - Configuration mismatch in a Maestro environment.

emmap
Employee

The pull_config command from that article is the way to go here for sure. You'll need to add the Sync IP of the active SGM at the end of it; for example, if SGM1 is active and SGM2 is down, run cpha_blade_config pull_config 192.0.2.1 on SGM2. This will sync the config and likely reboot the SGM. If that doesn't resolve the issue, you can add a 'force' switch to the command; run just 'cpha_blade_config' to check the usage.
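
A sketch of that sequence, run from the down SGM (the IPs match the examples used in this thread; run cpha_blade_config with no arguments to confirm the exact usage on your version):

# On SGM2 (down), pull the configuration from the active SGM's Sync IP
cpha_blade_config pull_config 192.0.2.1

# If the mismatch persists, the forced variant used later in this thread
cpha_blade_config pull_config all -force 192.0.2.1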

RPawar
Contributor

Hello emmap,

Thanks for your response. I tried the pull config method on the problem SGM and rebooted it manually; it is still in the down state.

RPawar
Contributor

Hello Lloyd,

As suggested in the document, I checked the /var/log/configuration_reboot_reason file; below is the output from both the problem SGM and the Active SGM.

Problem SGM Logs

Reboot was performed due to configuration mismatch:

- /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install

Remote file '/opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install' does not exist


===========================================================[ Thu Jun 26 13:50:14 IST 2025 ]===========================================================
Reboot was performed due to configuration mismatch:

- /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install

Remote file '/opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install' does not exist


===========================================================[ Thu Jun 26 13:54:44 IST 2025 ]===========================================================
Reboot was performed due to configuration mismatch:

- /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install

Remote file '/opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install' does not exist


===========================================================[ Thu Jun 26 13:59:14 IST 2025 ]===========================================================
Reboot was performed due to configuration mismatch:

- /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install

Remote file '/opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install' does not exist

 

------------------------------------------------------------------------------------------------------------------------------------------------------

Active SGM Logs

===========================================================[ Tue Apr 22 09:32:30 EDT 2025 ]===========================================================
Reboot was performed due to configuration mismatch:

- /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install

Remote file '/opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install' does not exist


===========================================================[ Tue Apr 22 09:37:18 EDT 2025 ]===========================================================
Reboot was performed due to configuration mismatch:

- /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install

Remote file '/opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install' does not exist


===========================================================[ Fri Jun 20 12:56:06 EDT 2025 ]===========================================================
Reboot was performed due to configuration mismatch:

- /config/active

- /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install


Displaying differences between /config/active file on local (1_01) and remote (1_02) members:

Lines included in the file on remote member only:
- interface:bond1:state on
- interface:bond2:state on
- interface:bond4:state on
- routed:instance:default:static:default:gateway:address:103.148.166.1 t
- routed:instance:default:static:default:gateway:address:103.148.166.1:preference 1

Lines included in the file on local member only:
- interface:bond1:state off
- interface:bond2:state off
- interface:bond4:state off

Remote file '/opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install' does not exist

 

It would be a great help if you could assist here.
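
A quick check that follows directly from these logs, assuming shell access on both members, is to confirm whether the flagged file actually exists on each SGM:

# Run on 1_01 and 1_02 (path taken from the reboot-reason logs above)
ls -l /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install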

emmap
Employee

It looks like someone has done some configuration in local clish instead of gclish. Try the pull config on SGM2 with the force switch and see what the output says, and whether it has synced anything over.
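
A sketch of the clish vs. gclish distinction being made here, using bond1 from the /config/active diff above purely as an illustration:

# In local clish on a single SGM - the change applies to that member only,
# which is how the /config/active files drift apart
set interface bond1 state on
save config

# In gclish (global clish) - the same change is replicated to every SGM in
# the Security Group, keeping /config/active in sync across members
set interface bond1 state on
save config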

RPawar
Contributor

Hello emmap,

Below is the output that was generated after running the command:

[Fri Jul 4 12:31:21 IST 2025 | 192.0.2.2] Hop: skiping blade_config_is_part_of_pull_conf_group
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Acquiring lock, waiting 0.1 seconds if needed
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Lock acquired! (file: /opt/CPsuite-R81.20/fw1/tmp/cpha_blade_config_lock)
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Dst ip changed from: 0 to: 192.0.2.1
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Executing "blade_config_pull_update_done_actions 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0" in index 0
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Executing "run_pre_pull_functions 192.0.2.1" in index 0
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Executing "validate_pull_sync_ip" in index 0
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Executing "pull_information_file_from_remote_blade_wrapper" in index 1
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] pull_information_file_from_remote_blade has been called
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] pull_information_file_from_remote_blade: Requesting information file from remote blade (192.0.2.1).
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] pull_information_file_from_remote_blade: Pulling information file from remote blade (192.0.2.1).
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] rgcopy_wrapper: executing command: rgcopy -b 192.0.2.1 /tmp/cpha_blade_config_info_caller_1_02_vs0 /tmp/cpha_blade_config_info_caller_1_02_vs0
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] rgcopy_wrapper: rgcopy succeeded
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] pull_information_file_from_remote_blade: '/tmp/cpha_blade_config_info_caller_1_02_vs0' md5sum on local file is: d62c5655c5b33e9c22d45b9565920fe1 and on remote blade is: d62c5655c5b33e9c22d45b9565920fe1
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] pull_information_file_from_remote_blade: '/tmp/cpha_blade_config_info_caller_1_02_vs0' was pulled from 192.0.2.1
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Executing "blade_config_pull_update_upgrade_info 192.0.2.1" in index 2
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] My ver: MV="R81.20", 192.0.2.1 ver: MV="R81.20"
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Executing "blade_config_pull_update_necessary_actions 192.0.2.1" in index 3
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] My relative time: 1751612482, 192.0.2.1 relative time: 1751612483, diff: -1 seconds
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] Pulling time isn't needed!
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] My policy signature: 185331029, 192.0.2.1 policy signature: 185331029
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] My PS signature: No PS installed, 192.0.2.1 PS signature: No PS installed
[Fri Jul 4 12:31:22 IST 2025 | 192.0.2.2] My FG1 signature: No FG installed, 192.0.2.1 FG1 signature: No FG installed
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] My policy name: Check-Point-Maestro140, 192.0.2.1 policy name: Check-Point-Maestro140
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] My policy load status: 1, 192.0.2.1 policy load status: 1
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Device Filter state is 'OK'
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] My policy time: 1751612200, 192.0.2.1 policy time: 1751612200
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] My local.arp md5sum: , 192.0.2.1 local.arp md5sum:
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] My ICAP configuration md5sum: 1775f7b43aae76a02bd5d3420a330d20, 192.0.2.1 ICAP configuration md5sum: 1775f7b43aae76a02bd5d3420a330d20
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Applying policy isn't needed!
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Executing "blade_config_pull 192.0.2.1" in index 1
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Pulling blade configuration from: 192.0.2.1
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Executing "pull_vsx_mode_wrapper" in index 0
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] pull_vsx_mode: Starting...
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] pull_vsx_mode: Failed to get '/bin/dbget vsx' result on local sgm.
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Executing "pull_ipv6_configuration_wrapper" in index 1
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Pulling IPV6 configuration from: 192.0.2.1
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] 192.0.2.1 IPV6 state is: 0, my IPV6 status is: 0
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Executing "pull_database_wrapper_allow_reboot" in index 2
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Pulling database from: 192.0.2.1
[Fri Jul 4 12:31:23 IST 2025 | 192.0.2.2] Setting cached smo IP : 192.0.2.1
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Deleting SMO cached_smo_ip
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "load_chassis_priority" in index 3
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] load_chassis_priority: load old vs 0
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "compare_certs_wrapper" in index 4
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Comparing remote and local certificates, please wait...
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Certificates OK
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "pull_time_wrapper" in index 5
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Skipping pulling time (already done)
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "pull_instances_wrapper" in index 6
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "pull_policy" in index 7
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Acquiring lock, waiting 0.1 seconds if needed
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Lock acquired! (file: /opt/CPsuite-R81.20/fw1/tmp/cpha_blade_config_lock_fw)
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Skipping pulling policy files (already done)
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "save_old_sic_name" in index 0
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "pull_registry" in index 1
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "pull_license_from_remote_sgm 192.0.2.1" in index 0
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] rgcopy_wrapper: executing command: rgcopy -b 192.0.2.1 /opt/CPshrd-R81.20/conf/cp.license.smo /opt/CPshrd-R81.20/conf/cp.license.smo
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] rgcopy_wrapper: rgcopy succeeded
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] Executing "pull_registry_from_remote_sgm 192.0.2.1" in index 1
[Fri Jul 4 12:31:28 IST 2025 | 192.0.2.2] rgcopy_wrapper: executing command: rgcopy -b 192.0.2.1 /opt/CPshrd-R81.20/registry/HKLM_registry.data /opt/CPshrd-R81.20/registry/HKLM_registry.data.temp
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] rgcopy_wrapper: rgcopy succeeded
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] Executing "apply_registry_to_temp_file /opt/CPshrd-R81.20/registry/HKLM_registry.data.temp" in index 2
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] Applying registry changes to /opt/CPshrd-R81.20/registry/HKLM_registry.data.temp
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] Executing "apply_registry_wd_pid /opt/CPshrd-R81.20/registry/HKLM_registry.data.temp" in index 0
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] Failed to retrieve Wd_Pid
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] Saving must save local values to remote registry ended with return code 0
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] Executing "commit_license /opt/CPshrd-R81.20/conf/cp.license.temp" in index 3
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] Executing "commit_registry /opt/CPshrd-R81.20/registry/HKLM_registry.data.temp" in index 4
[Fri Jul 4 12:31:29 IST 2025 | 192.0.2.2] Executing command: /opt/CPsmo-R81.20/bin/copy_reg_keep_local_vals.tcl /opt/CPshrd-R81.20/registry/HKLM_registry.data.temp /opt/CPshrd-R81.20/registry/HKLM_registry.data
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] copy_reg_keep_local_vals: commit registry succeeded
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Pull registry succeeded
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Executing "save_new_sic_name" in index 2
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Executing "refresh_sic_name" in index 3
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Executing "pull_amw 192.0.2.1 0" in index 8
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] AMW pull function. called with args: 192.0.2.1 0
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] No AMW installed on SMO (192.0.2.1)
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] No need to pull AMW policy
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Executing "pull_swb_perf 192.0.2.1" in index 9
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Pulling swb_perf configuration from: 192.0.2.1 (boot or cpstart event)
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] swb_perf state local = 0; swb_perf state remote = 0
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Executing "handle_sxl_mode_change" in index 10
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Executing "pull_mq_configurations" in index 11
[Fri Jul 4 12:31:35 IST 2025 | 192.0.2.2] Pulling MQ configurations from smo
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Executing "compare_snapshot_current" in index 12
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Comparing remote and local running snapshot, please wait...
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Snapshot comparison OK
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Executing "remove_reboot_retry_file" in index 13
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Acquiring lock, waiting 0.1 seconds if needed
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Lock acquired! (file: /tmp/cpha_blade_config_boot_retries_lock)
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Vs 0 removed reboot retry file
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Lock released! (file: /tmp/cpha_blade_config_boot_retries_lock)
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] pull returned: 0
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Acquiring lock, waiting 0.1 seconds if needed
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Lock acquired! (file: /tmp/cpha_blade_config_boot_retries_lock)
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] Lock released! (file: /tmp/cpha_blade_config_boot_retries_lock)
[Fri Jul 4 12:31:50 IST 2025 | 192.0.2.2] pull configuration completed successfully on try number: 1

RPawar
Contributor

It shows the pull configuration was successful, but the SGM is still not participating in the HA cluster.

This is the issue we have been facing from the very beginning of the implementation. Initially we were not able to move (m 1 2) to the second member; we thought it was because no policy was installed on the members. After policy installation we were able to move between SGMs, but the HA "Down" status for the second member remained.

emmap
Employee

Did you use the 'force' flag when you pulled the config? Try: cpha_blade_config pull_config all -force 192.0.2.1 - and if that doesn't work, maybe remove the SGM from the Security Group, let it reboot and settle, then add it back in.

RPawar
Contributor

Yes, the above output was the result of using -force in the command.

RPawar
Contributor

Hello emmap & Lloyd,

I would like to thank you both for your assistance.

The issue is now solved.

After reviewing the logs, I found that a file named "fwaccel_dos_rate_on_install" was missing on both SGMs, which was what caused problem pnote 11 (Configuration). I manually created the file with the same name and the same permissions on both SGMs. After a reboot of the Active SGM, the second SGM automatically became Active and the first SGM went down. I then manually rebooted the first SGM, and after that reboot the cluster was up and in an Active-Active state.
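
For anyone hitting the same pnote, a minimal sketch of the fix described above. The path comes from the reboot-reason logs earlier in the thread; the mode shown is only an example - match whatever ownership and permissions exist on a member that already has the file, and involve TAC if in doubt:

# On each SGM where the file is missing, create it (touch creates an empty
# file; the post above only mentions matching the name and permissions)
touch /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install

# Match the ownership and mode of the existing copy, checked with:
#   ls -l /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install
chmod 644 /opt/CPsuite-R81.20/fw1/conf/fwaccel_dos_rate_on_install   # example mode only

# Reboot one member at a time, starting with the Active SGM, as described above
reboot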