Hello
Today we tried to upgrade a two-member 5600 cluster running OSPF from R80.40 to R81.20. It ended in total disaster.
The standby member was dead after the reboot, with these messages:
Nov 11 16:58:24 2023 ctsmdpc01fw routed[27361]: [routed] ERROR: cpcl_cxl_runtime_status(1216): HA mode not started
Nov 11 16:58:25 2023 ctsmdpc01fw routed[27361]: [routed] ERROR: cpcl_cxl_runtime_status(1216): HA mode not started
Nov 11 16:58:25 2023 ctsmdpc01fw routed[27361]: [routed] ERROR: cpcl_cxl_runtime_status(1216): HA mode not started
It seems the cluster membership was deleted: the member reports Standalone, has lost its sync IP, etc.
gateway01fw> show routed cluster-state
Cluster: Standalone
Master/Slave: Master
Sync IP: N/A
Cluster Sync: N/A
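For anyone who wants to check this state from a script, here is a minimal Python sketch that parses the output above and flags a member that has dropped to Standalone. The function names are hypothetical, and the parsing only assumes the `Key: Value` layout shown in this post; I have not verified what a healthy member prints.

```python
def parse_cluster_state(text):
    """Parse 'Key: Value' lines from 'show routed cluster-state' output into a dict.

    Lines without a colon (for example the CLI prompt line) are skipped.
    """
    state = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            state[key.strip()] = value.strip()
    return state


def looks_standalone(state):
    """Return True if the member reports Standalone or has no sync IP."""
    return state.get("Cluster") == "Standalone" or state.get("Sync IP") == "N/A"


# Sample copied from the output posted above.
SAMPLE = """\
gateway01fw> show routed cluster-state
Cluster: Standalone
Master/Slave: Master
Sync IP: N/A
Cluster Sync: N/A
"""

if __name__ == "__main__":
    state = parse_cluster_state(SAMPLE)
    print(state)
    print("Cluster membership lost:", looks_standalone(state))
```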
We are going to open an SR with Check Point, but I would like to know if anyone has run into a similar problem.
thanks
Can you check if ClusterXL is enabled in cpconfig?
Yes, it was enabled. We tried disabling it, rebooting, and enabling it again, but the result was the same.
In the end we reverted to a snapshot.
Just in case: did you change the cluster object version and then compile and install a new policy? Was it installed successfully? An R80.40 policy version will not work on R81.20. It seems the gateway loaded the default policy, where clustering is not present, hence the HA error.
Please post the output from "fw stat"
After the upgrade, the gateway lost its connection to management, so there was nothing we could do. We reverted to a snapshot.
But your observation about modifying the object version in management is absolutely right. I think the object version was not modified initially.
We'll try again in a few days.
thanks
That definitely could have been part of the problem @Maller
Andy
That should not happen. SIC did not work at all?
@_Val_ makes a very good point actually... did you change the cluster object to R81.20 in the General Properties tab?
Best regards,
Andy
Which method or process did you use for the upgrade (e.g. MVC), and was a policy install performed successfully afterwards?
Hello
No, I didn't have the opportunity to enable MVC, install... nothing.
Steps followed:
1- Verify the applicable CPUSE Software Packages.
2- Download the applicable CPUSE Software Packages.
3- Install the applicable CPUSE Software Packages.
After step 3 the gateway rebooted and crashed.
That's very unfortunate. I always follow the zero downtime upgrade method and have never had an issue. Hope TAC can check this further for you.
Andy
Hello,
I went through a similar event.
Did you have any CT scan results after the event happened to you?
What was the root cause of the problem?
Can you update this post with your comments, please?
Hello Matlu
Yes, as Val indicated, it seems to be related to the cluster object version change on the management server. Our team followed an old procedure from R80.x upgrades, where the standby node was upgraded before the cluster object version was modified on the management server. To upgrade to R81.x, the object version must be changed in management first. These are the mistakes you get when nobody reads the upgrade guide 😞
thanks
Hello,
Did you use the CPUSE package or the Blink Image package?
After downloading the package (either CPUSE or Blink Image), and before installing it, must you change the cluster object version in SmartConsole?
Is this a required step before installing the package on the passive member?
I have one doubt: if you change the version of the cluster object before installing the package on the passive member, do you have to install policy? Or do you just change it?
Wouldn't this give more errors?
Hi Matlu
Answering your questions:
Did you use the CPUSE package or the Blink Image package? CPUSE package
** ************************************************************************* **
** Majors **
** ************************************************************************* **
Display name Status
R81.20 Gaia Fresh Install and upgrade Downloaded <--
After downloading the package (either CPUSE or Blink Image), and before installing it, must you change the cluster object version in SmartConsole? YES
Is this a required step before installing the package on the passive member? YES
If you change the version of the cluster object before installing the package on the passive member, do you have to install policy, or do you just change it? Just change it. After the standby node is upgraded, you then have to install policy.
Wouldn't this give more errors? Yes, the policy install will finish OK on the R81.20 node and fail on the node that is not yet upgraded. So you have to deselect the option "For gateway clusters, if installation on a cluster member fails, do not install on that cluster".
When all members in the cluster are upgraded, select this option again.
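The ordering described in these answers can be written down as a small checklist, which makes it easy to see where our first attempt went wrong. This is only a Python sketch of the sequence as stated in this thread; the step labels and the function name are my own paraphrase, not anything from the official upgrade guide.

```python
# Step labels paraphrase the answers above; they are illustrative, not official.
RECOMMENDED_ORDER = [
    "change cluster object version in SmartConsole",
    "install CPUSE package on standby member",
    "install policy with 'do not install on that cluster' option deselected",
    "upgrade remaining cluster members",
    "re-select the 'do not install on that cluster' option",
]


def first_out_of_order(performed):
    """Return the first performed step that skips ahead of the order above.

    Returns None if the performed steps follow the order from the start.
    Raises ValueError if a step is not one of the known labels.
    """
    expected = 0
    for step in performed:
        if RECOMMENDED_ORDER.index(step) != expected:
            return step
        expected += 1
    return None


if __name__ == "__main__":
    # Our failed attempt: the package was installed before the object was changed.
    attempt = ["install CPUSE package on standby member"]
    print("First out-of-order step:", first_out_of_order(attempt))
```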
This is part of the reason why I never do or recommend this method. I know it probably goes without saying that changing the cluster version has to be done when upgrading, but doing a zero downtime upgrade seems more "natural" to me, if you will.
I have done it that way for years and never had an issue. Besides, literally every customer I have ever done this for doesn't care if they lose a handful of pings or the connection is down for a minute, which is why this is all done after hours anyway.
Just my 2 cents...
Andy
Hello, friend
When you upgrade from an old version, for example from R80.30 to R81.20,
do you have to go through intermediate versions to avoid breaking the cluster?
Or can you jump directly?
When you change the cluster object in SmartConsole, do you just change it to the new version (in this case R81.20) and "publish", or is it not necessary to "publish"?
Hello,
My name is Naama Specktor and I am a Check Point employee.
I would appreciate it if you could share the SR #, here or in a PM.
Thanks in advance,
Naama
Just to confirm, I also get this on standby cluster members when installing hotfixes.
I went through an R80.40 ClusterXL gateway upgrade tonight. /var/log/messages gets spammed with "[routed] ERROR: cpcl_cxl_runtime_status(1216): HA mode not started" messages every second. I went from base to T41 and T53 just to check. It does it regardless of the version.
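To put a number on the spam, a rough Python sketch like the one below counts the error per timestamp. It assumes the log lines look like the ones quoted at the top of this thread (timestamp in the first four fields); the function name is hypothetical and the sample lines are copied from the original post.

```python
from collections import Counter

# Substring of the error exactly as it appears in the posted log lines.
NEEDLE = "cpcl_cxl_runtime_status(1216): HA mode not started"


def count_ha_errors(lines):
    """Count 'HA mode not started' errors per timestamp.

    Assumes the first four whitespace-separated fields are the timestamp,
    as in the lines posted in this thread (e.g. 'Nov 11 16:58:24 2023').
    """
    counts = Counter()
    for line in lines:
        if NEEDLE in line:
            stamp = " ".join(line.split()[:4])
            counts[stamp] += 1
    return counts


# Sample lines copied from the first post.
SAMPLE = [
    "Nov 11 16:58:24 2023 ctsmdpc01fw routed[27361]: [routed] ERROR: cpcl_cxl_runtime_status(1216): HA mode not started",
    "Nov 11 16:58:25 2023 ctsmdpc01fw routed[27361]: [routed] ERROR: cpcl_cxl_runtime_status(1216): HA mode not started",
    "Nov 11 16:58:25 2023 ctsmdpc01fw routed[27361]: [routed] ERROR: cpcl_cxl_runtime_status(1216): HA mode not started",
]

if __name__ == "__main__":
    for stamp, n in sorted(count_ha_errors(SAMPLE).items()):
        print(stamp, n)
```

To run it against a real file, pass an open file object instead of `SAMPLE`, since iterating a file yields its lines.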
And if you try cphastop; cphastart ... any change? Reboot?
Andy
Is there a solution yet? I have the same problem.
Can someone post the SR?
Thanks in advance
Alex
What does cphaprob -a if show?
Andy
Hi Andy,
Thank you for the response.
cphaprob state shows "HA module not started." I didn't run cphaprob -a if, but I think I would get the same output there.
I upgraded the standby FW to R81.20 with the image proposed by CPUSE, but after the upgrade I had a broken cluster with the alarm "Identity Awareness is not responding". Then I installed the latest HF, also proposed by CPUSE, and the alarm was still there.
The alarm description mentioned that the alarm can be caused by different image versions and should disappear after the upgrade of the primary FW. But after that upgrade the cluster was offline and broken. In /var/log/messages I saw "[routed] ERROR: cpcl_cxl_runtime_status(1216): HA mode not started".
Because it's a production system, I was forced to run "set snapshot revert AutoSnap..." to restore the snapshot that was created automatically before the upgrade. So currently the FW is running the R81 release again. I think I will reproduce the problem in a lab environment to have more time for troubleshooting.
Do you have any idea?
Kind Regards,
Alex
Hey
Is it normal for this kind of event to also happen during a JHF upgrade?
We did a JHF upgrade on our VSX cluster, starting with the STANDBY box.
Suddenly the cluster was broken, and this famous ‘HA not ......’ message appeared.
Is this normal, or is it a bad procedure or a bug in the solution?