razotevsSVR
Explorer

CloudGuard on Azure: Management Server High Availability failing

Hi, 
 

I would appreciate any ideas regarding Management HA in Azure, or CloudGuard in a cloud environment in general. Thanks in advance!

 

I have two management servers, both on Azure and both running R81.10. They were deployed straight from the marketplace and, for the sake of testing, placed in the very same VNet and subnet to avoid routing and firewall issues. After connecting to the primary and following sk54160, I get:

 

SIC Status for azurecpmngmhubneu: Not Communicating

Peer sent wrong DN: cn=cp_mgmt,o=azurecpmngmhubneu..krdwxk** Reset SIC from peer, and establish trust again. **

I opened TAC service request "6-0003465014" ten days ago, and they suggested sk110514. That article, however, relies on the mdsenv command, which is not available:

Primary Management Server:

[Expert@azurecpmngmhubweu:0]# mdsenv azurecpmngmhubweu

-bash: mdsenv: command not found

 

Secondary Management Server:

[Expert@azurecpmngmhubneu:0]# mdsenv

-bash: mdsenv: command not found

 

The second suggestion was to check the MTU, but the MTU is at its default value, and according to Microsoft changing it can cause even more issues, so I am not willing to play with that at the VM level.

Deploying a security gateway from the marketplace and trying to activate it as a secondary management server also fails, with a different error that is more or less expected. (I did it only to test the initial SIC trust establishment, which seems to work just fine.)

Error: 'Security Management Server' is not responding. Verify that 'Security Management Server' is installed on the gateway. If 'Security Management Server' should not be installed verify that it is not selected in the Products List of the gateway (SmartDashboard > Security Gateway > General Properties > Software Blades List).

_Val_
Admin

Could you please share the content of your /etc/profile.d/CP.sh file?


razotevsSVR
Explorer

Sure Val, thanks for the reply

 

if [ -r /opt/CPshrd-R81.10/tmp/.CPprofile.sh ]; then
. /opt/CPshrd-R81.10/tmp/.CPprofile.sh
fi

test
Explorer

Ooookay, then /opt/CPshrd-R81.10/tmp/.CPprofile.sh, please 🙂

razotevsSVR
Explorer

Yep, this one is holding a bit more 😀

====================================================================

# vi /opt/CPshrd-R81.10/tmp/.CPprofile.sh

. /opt/CPshrd-R81.10/scripts/cpprofile_functions.sh
_cpprof_add CPDIR /opt/CPshrd-R81.10 1 1
_cpprof_dir PATH $CPDIR/util 1
_cpprof_add CPAPACHEDIR "/opt/CPshrd-R81.10/web/Apache" 1 1
SAMLPORTAL_HOME=/opt/CPSamlPortal ; export SAMLPORTAL_HOME ; hash 1>/dev/null 2>&1
#CPPostgreSQL Start DON'T REMOVE MANUALLY
PG_LIB_PATH=$CPDIR/database/postgresql/lib
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PG_LIB_PATH ; export LD_LIBRARY_PATH
#CPPostgreSQL End DON'T REMOVE MANUALLY
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${CPDIR}/lib64 ; export LD_LIBRARY_PATH
_cpprof_add FWDIR "/opt/CPsuite-R81.10/fw1" 1 1
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${FWDIR}/lib64 ; export LD_LIBRARY_PATH
_cpprof_add MDS_FWDIR "/opt/CPsuite-R81.10/fw1" 0 0
_cpprof_add CPMDIR "/opt/CPsuite-R81.10/fw1" 0 0
_cpprof_add SUDIR "/opt/CPsuite-R81.10/fw1/sup" 0 0
_cpprof_add SUROOT "/var/log/cpupgrade/suroot" 0 0
_cpprof_add FW_BOOT_DIR "/etc/fw.boot" 0 0
_cpprof_add NGM_SOLR_LOCAL_PATH "/opt/CPsuite-R81.10/fw1/Solr" 0 0
_cpprof_add JAVA_HOME "/opt/CPsuite-R81.10/fw1/jre" 1 0
_cpprof_add NGM_MEM "2048" 0 0
_cpprof_add PGDIR "/opt/CPshrd-R81.10/database/postgresql" 0 0
_cpprof_add PGDATA "/opt/CPshrd-R81.10/database/postgresql/data" 0 0
_cpprof_add DONT_LOAD_FWM_OBJECTS "1" 0 0
_cpprof_add DC_DIR "/opt/CPDynamicContent" 0 0
_cpprof_add ITP_DIR "/opt/CPInfinityTp" 0 0
_cpprof_add CLASSPATH "/opt/CPsuite-R81.10/fw1/ngm" 0 0
LD_LIBRARY_PATH=/opt/uf/SecureComputing/lib:${LD_LIBRARY_PATH} ; export LD_LIBRARY_PATH ; hash 1>/dev/null 2>&1
UCPORTALDIR_HOME=/opt/CPUserCheckPortal ; export UCPORTALDIR_HOME ; hash 1>/dev/null 2>&1
DLPDIR=/opt/CPsuite-R81.10/fw1/dlp ; export DLPDIR
PATH=${PATH}:${FWDIR}/oracle_oi/sdk ; export PATH ; hash 1>/dev/null 2>&1
LD_LIBRARY_PATH=${FWDIR}/oracle_oi/sdk:${LD_LIBRARY_PATH} ; export LD_LIBRARY_PATH ; hash 1>/dev/null 2>&1
POSTFIX_DIR=/opt/postfix ; export POSTFIX_DIR ; hash 1>/dev/null 2>&1
MAIL_CONFIG=/opt/postfix/etc/postfix ; export MAIL_CONFIG ; hash 1>/dev/null 2>&1
_cpprof_add JAVA_HOME "/opt/CPshrd-R81.10/jre_32" 1 0
_cpprof_add JAVA_HOME_32 "/opt/CPshrd-R81.10/jre_32" 0 0
_cpprof_add JAVA_HOME_64 "/opt/CPshrd-R81.10/jre_64" 0 0
_cpprof_add JETTY_HOME "/opt/CPshrd-R81.10/jetty" 0 0
_cpprof_add FGDIR "/opt/CPsuite-R81.10/fg1" 1 1
_cpprof_add ZETCDIR "/opt/CPzetc" 1 0
_cpprof_add DADIR "/opt/CPda" 1 0
START_AUTO_UPDATER=1 ; export START_AUTO_UPDATER
AUTOUPDATERDIR=/opt/AutoUpdater ; export AUTOUPDATERDIR
PATH=${PATH}:${AUTOUPDATERDIR}/latest/bin ; export PATH ; hash 1>/dev/null 2>&1
_cpprof_add INFODIR "/opt/CPinfo-10" -1 0
_cpprof_add PATH_DIR "/opt/CPDepInst/latest" 1 0
_cpprof_add DDRDIR "/opt/DDR" 1 0
NACPORTAL_HOME=/opt/CPNacPortal ; export NACPORTAL_HOME ; hash 1>/dev/null 2>&1
_cpprof_add DIAGDIR "/opt/CPdiag" 1 1
_cpprof_add RTDIR "/opt/CPrt-R81.10" 1 1
_cpprof_dir PATH "/opt/CPrt-R81.10/log_indexer" 1
_cpprof_dir PATH "/opt/CPrt-R81.10/log_exporter" 1
_cpprof_add INDEXERDIR "/opt/CPrt-R81.10/log_indexer" 1 1
_cpprof_add EXPORTERDIR "/opt/CPrt-R81.10/log_exporter" 0 0
_cpprof_add UEPMDIR "/opt/CPuepm-R81.10" 1 1
_cpprof_dir PATH "${CPDIR}/database/postgresql/bin" 1
_cpprof_dir PATH "${UEPMDIR}/engine/scripts" 1
_cpprof_dir LD_LIBRARY_PATH "${CPDIR}/database/postgresql/lib" 1
_cpprof_add VSECDIR "/opt/CPvsec-R81.10" 1 1
_cpprof_add CDTDIR "/opt/CPcdt" 1 0
LD_LIBRARY_PATH=/opt/CPcdt/lib:${LD_LIBRARY_PATH} ; export LD_LIBRARY_PATH ; hash 1>/dev/null 2>&1
_cpprof_add DEPCONDIR "/opt/CPDepCon-R81.10" 1 0
_cpprof_add REPMANDIR "/opt/CPRepMan-R81.10" 1 0
_cpprof_add SMARTLOGDIR "/opt/CPSmartLog-R81.10" 1 0
_cpprof_add CPM_DOCTOR "ON" 0 0
_cpprof_add DYNAMICCONTENTDIR "/opt/CPDynamicContent" 1 0
_cpprof_add CMEDIR "/opt/CPcme" 1 0
_cpprof_add OPENSSL_CONF "/opt/CPshrd-R81.10/conf/openssl.cnf" 0 0
_cpprof_add CPOTELCOL_DIR "/opt/CPotelcol" 1 1
_cpprof_add CPVIEWEXPORTER_DIR "/opt/CPviewExporter" 1 1

_Val_
Admin

Ok, this one looks okay. Try checking if $MDSDIR is resolved for you. If it is, run mdsenv as $MDSDIR/bin/mdsenv. If it is not, this is something for TAC to look into.

About your bash shell: did you set up bash for the user manually, or do you reach it with the expert command after first logging in to cpshell?

  

razotevsSVR
Explorer

During the ARM template deployment from the marketplace, there are a few options for "Default shell for admin user".

The options are (I chose /bin/bash):

/etc/cli.sh

/bin/bash

/bin/csh

/bin/tcsh

 

That's all I got:

# echo $MDSDIR

# $MDSDIR/bin/mdsenv
-bash: /bin/mdsenv: No such file or directory
# echo $MDS_FWDIR
/opt/CPsuite-R81.10/fw1
# $MDS_FWDIR/bin/mdsenv
-bash: /opt/CPsuite-R81.10/fw1/bin/mdsenv: No such file or directory
# cd $MDS_FWDIR/bin/
# mds
mds_backup_start mds_uncheck_IPSEventManager mdsstart_eventia mdsstop_eventia
mds_check_IPSEventManager mdscmd_start mdsstart_start mdsstop_start
mds_restore_start mdsconfig_start mdsstat_start
mds_restored.sh mdsstart_customer_start mdsstop_customer_start

 

Thanks

_Val_
Admin

Okay, thanks for your patience. The suggested SK is for MDS environments only, and your installation is clearly an SMS, not an MDS pair.

Is your secondary MGMT actually set as secondary? You might have deployed two primary ones...
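
The MDS-versus-SMS distinction can be checked from the shell. What follows is only a minimal sketch: it assumes that an MDS exports a non-empty $MDSDIR and provides mdsenv, which is consistent with the empty $MDSDIR output shown earlier in the thread. The function name guess_mgmt_type is made up for illustration and is not a Check Point tool.

```shell
#!/bin/bash
# Sketch: guess whether this host is a Multi-Domain Server (MDS) or a
# single-domain Security Management Server (SMS).
# Assumption: only an MDS has a non-empty $MDSDIR and an mdsenv command.
# guess_mgmt_type is a hypothetical helper name, not a Check Point tool.
guess_mgmt_type() {
    if [ -n "${MDSDIR:-}" ] && command -v mdsenv >/dev/null 2>&1; then
        echo "mds"
    else
        echo "sms-or-unknown"
    fi
}

guess_mgmt_type
```

If it prints sms-or-unknown, MDS-only articles such as the one TAC suggested most likely do not apply to the host.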

razotevsSVR
Explorer

Yes, it seems I have two primary management servers, but the ARM deployment does not offer the usual option to choose primary or secondary.

I was wondering whether the marketplace image is somehow different, but that doesn't make much sense. It should be pretty much the standard image; the only difference is that the initial setup is not done in Check Point but through variables supplied to the ARM template.

A single domain is enough for us at this point. I just want automatic sync to another management server in a second region for disaster recovery. I could use the Azure DR options, but they involve downtime, which is why I prefer the Check Point native approach.

Thanks

razotevsSVR
Explorer

Quick Update: 

1. After a manual initial configuration (skipping the ARM template variables) and selecting Secondary management in Gaia, trust was established. SIC Status for Secondary: Communicating

2. Ping from 1 to 2 works, and ping from 2 to 1 works. They are in the same subnet, so routing or other blockers are ruled out; they reach one another directly.

3. Following sk54160, everything is fine up to the moment the host is added, but then the status is "Machine Status is not available".

4. Publish or Install Database fails with "Publish Failed - Action Failed due to an Internal Error"

5. The only choice to continue is to Discard the changes.

6. Installed the latest Check_Point_R81_10_JUMBO_HF_MAIN_Bundle_T79_FULL.tgz successfully. No change, same error!
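
Since ping only proves that ICMP gets through, a TCP check between the two servers can complement item 2. This is a rough sketch under several assumptions: the port list is a guess at commonly cited Check Point management ports and should be confirmed against Check Point's own port reference, the bash build must support /dev/tcp, and port_open is a hypothetical helper name. The peer address is passed as the first argument.

```shell
#!/bin/bash
# Sketch: ping only proves ICMP works; check TCP reachability as well.
# The port list is an assumption (commonly cited Check Point management
# ports); confirm it against Check Point's own port reference.
# port_open is a hypothetical helper; it relies on bash's /dev/tcp.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

PEER="${1:-}"   # IP of the other management server, passed as argument
if [ -n "$PEER" ]; then
    for port in 18191 18192 18210 18211 18221 19009; do
        if port_open "$PEER" "$port"; then
            echo "tcp/$port open"
        else
            echo "tcp/$port closed or filtered"
        fi
    done
fi
```

Running it in both directions (from each server against the other) would show whether something other than routing is blocking the traffic.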

 

Any ideas are welcome.

 

razotevsSVR
Explorer

Both servers seem to be up and running properly:

 

# $MDS_FWDIR/scripts/cpm_status.sh
Check Point Security Management Server is running and ready

# $MDS_FWDIR/scripts/cpm_status.sh
Check Point Security Management Server is running and ready

_Val_
Admin

OK, so this is the cause. You cannot set up SIC between two primary SMSs. Please explain this to your TAC engineer and ask for guidance. You need to demote one to secondary, and only TAC can help you with this.

razotevsSVR
Explorer

They are responding extremely slowly. The case has been open for 11 days, and there is still not a single viable solution. I am not sure they are reading my emails at all.

Instead, I deleted the secondary and deployed a new one, running the First Time Wizard manually and choosing secondary (a much faster solution). Trust is established, but for some reason they are still not communicating properly. See my post above for details. Same subnet, no NSGs, no firewall, no UDRs or anything else. Direct communication. There is something I am missing on the Check Point side itself.

 

I saw sk39345 and its restrictions, which are not mentioned in the admin guide. I even switched off the SmartEvent blade, but the result is still the same.

Duane_Toler
Advisor

If nothing is working correctly, you will need to debug CPD and review $FWDIR/log/cpd.elg to see where it fails. If your CloudGuard management servers are in different VNets, then your management server is being NATed by the Azure VNet, and this won't work for management HA: CPD needs to see the packets arriving from the native IP. This may require a change with GuiDBedit to allow the two hosts to communicate over SIC despite the IP changes (just like $FWDIR/conf/masters on a gateway, although I don't think this applies to management). You can try it, though: either follow sk102712, or manually edit the file and then run "chattr +i $FWDIR/conf/masters" to make it immutable.
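
For the log review, something along these lines may help narrow down where SIC fails. It is a sketch, not a TAC procedure: the search pattern is a guess based on the error text quoted in this thread, sic_lines is a hypothetical helper name, and because cpd.elg is often found under $CPDIR/log rather than $FWDIR/log, the loop simply checks both locations (an assumption on my part).

```shell
#!/bin/bash
# Sketch: pull SIC-related lines out of cpd.elg.
# Assumptions: the log sits under $FWDIR/log (as stated above) or under
# $CPDIR/log; the search pattern is a guess based on the error text quoted
# in this thread. sic_lines is a hypothetical helper name.
sic_lines() {
    grep -E 'SIC|[Ww]rong DN' "$1" | tail -n 50
}

for log in "${FWDIR:-}/log/cpd.elg" "${CPDIR:-}/log/cpd.elg"; do
    if [ -r "$log" ]; then
        echo "== $log"
        sic_lines "$log"
    fi
done
```

Comparing the output on the primary and the secondary right after a failed trust attempt should show which side rejects the peer.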

 

Here's an SK on doing manual/quasi-emergency Active/Primary management changes when you lost the Active+Primary management: https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

 

Likewise, if your HA management goes dual Active, here's how to quell one:

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

 

If your host came up as primary management, you can verify everything with the internal variable configs:

# cpprod_util FwIsHAManagement

This will return 0 if this is not a management HA.  Obviously, 1 if so.

 

You can change its configuration to HA instead with:

cpprod_util  FwSetHAManagement 1 1 1

 

That will set it as Management HA.  cpstop/cpstart, then check the first one again, and you should be good to go.  Feel free to check this with TAC, tho.
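
To make the flag check repeatable before and after the change, a small wrapper could look like this. It is a sketch only: the 0/1 meaning is taken from the description above, describe_ha_flag is a hypothetical helper name, and the real tool is queried only when it exists on the host.

```shell
#!/bin/bash
# Sketch: turn the FwIsHAManagement flag into a readable message.
# The 0/1 meaning is taken from the post above; describe_ha_flag is a
# hypothetical helper name.
describe_ha_flag() {
    case "$1" in
        1) echo "configured for Management HA" ;;
        0) echo "not configured for Management HA" ;;
        *) echo "unexpected value: $1" ;;
    esac
}

# Only query the real tool when it exists on this host.
if command -v cpprod_util >/dev/null 2>&1; then
    describe_ha_flag "$(cpprod_util FwIsHAManagement)"
fi
```

Running it once before the change and once after cpstop/cpstart shows whether the setting took effect.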

 

A neat trick:  You can build a gateway+management host, then later "turn off" the management server with cpprod_util and reboot.  That will disable the local management and let you re-do SIC to control the gateway from another management server.  I've had to do this a time or two for customers doing acquisitions where we needed to take over a gateway until we could get it rebuilt.  YES, re-building the gateway is the proper way, but if you're pushed into a deadline.... you do what you gotta do. 🙂  (and yes, we did rebuild those hosts later).  This trick actually came directly from TAC long ago.  (Yes, you can also do the opposite, and "turn off" the gateway instead).

 

Good luck!
