If my active device hangs, I assume it stops sending CCP packets, so the standby device should start initiating CCP and take over the active state. That is not what normally happens: even when the active device hangs, the secondary remains in standby.
Could you please explain why this strange behavior happens, and correct me if my understanding is wrong?
What do you see in CLI when issuing
cphaprob state
cphaprob -l list
cphaprob -a if
on the nodes?
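To make the output easy to paste into a reply, the three commands above could be collected in one go. This is only a sketch: it assumes a Python 3 interpreter is available on the node, and the helper names `run_command` and `collect_cluster_status` are hypothetical, not part of any Check Point tooling. Only the `cphaprob` invocations themselves come from this thread.

```python
import subprocess
from typing import Callable, Dict, List

# The three commands requested in this thread, as argument vectors.
CLUSTER_COMMANDS: List[List[str]] = [
    ["cphaprob", "state"],
    ["cphaprob", "-l", "list"],
    ["cphaprob", "-a", "if"],
]


def run_command(argv: List[str]) -> str:
    """Run one command and return its combined stdout and stderr."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


def collect_cluster_status(
    runner: Callable[[List[str]], str] = run_command,
) -> Dict[str, str]:
    """Return a mapping of command line -> captured output (hypothetical helper)."""
    return {" ".join(argv): runner(argv) for argv in CLUSTER_COMMANDS}


if __name__ == "__main__":
    for command, output in collect_cluster_status().items():
        print(f"### {command}\n{output}")
```

Run it on each cluster member and post both outputs, so the two states can be compared side by side.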
If the standby node does not get an update from the active node, it will switch to active after a timeout is reached. See Advanced Technical Reference Guide (ATRG) for ClusterXL - R6x, R7x and R8x for details on how a failover can be triggered.
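The timeout behavior described above can be illustrated with a generic heartbeat model. This is a simplified sketch and not ClusterXL's actual CCP implementation: the class name `StandbyMember`, the `dead_timeout` parameter, and the 3-second value are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StandbyMember:
    """Toy model of a standby node that watches for peer updates."""

    dead_timeout: float           # assumed: seconds of silence before takeover
    last_heartbeat: float = 0.0   # time the last update from the peer arrived
    state: str = "STANDBY"

    def on_heartbeat(self, now: float) -> None:
        """Record that an update from the active peer arrived at time `now`."""
        self.last_heartbeat = now

    def tick(self, now: float) -> str:
        """Re-evaluate state; become ACTIVE once the peer has been silent too long."""
        if self.state == "STANDBY" and now - self.last_heartbeat > self.dead_timeout:
            self.state = "ACTIVE"
        return self.state


if __name__ == "__main__":
    member = StandbyMember(dead_timeout=3.0)
    member.on_heartbeat(10.0)
    print(member.tick(12.0))  # 2.0 s of silence, still within the timeout
    print(member.tick(13.5))  # 3.5 s of silence, timeout exceeded
```

In this model a takeover only happens when updates stop arriving. A peer that is unresponsive in other ways but still emits updates would never trigger it, which is why the exact kind of hang matters.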
Let me brief you on two issues that I have faced.
1. I ran a debug on my active device and it hung, but failover didn't happen automatically. It happened only after a hard reboot.
2. Whenever I try to install policy on my 15600 box during peak working hours, the active firewall hangs and failover doesn't happen automatically, even after the freeze timeout. It happens only after a hard reboot.
My thinking is that if the firewall does not pass any traffic, it will send CCP packets, and in that case a failover must happen.
What do the three commands show in that case? As we have no details on how ClusterXL is configured, it is hard to say anything here.
If you unplug a monitored network cable, failover should occur immediately...
Please define exactly what you mean by "hang". Is it:
1) Hard hang - kernel unresponsive, no traffic flowing, interfaces do not answer ping, console port stuck. Generally caused by a hardware interrupt or other atomic kernel operation never completing, or by bad hardware; must power cycle to recover.
2) Process hang - Kernel is consuming all available CPU and/or memory, interfaces still answer pings, basic firewall traffic not requiring process space interaction still flows through the firewall, and the console is stuck as the servicing process cannot get any CPU slices or sufficient memory. Will probably require a power cycle to recover from, or when enough processes die off from lack of memory the console port may become available to initiate a reboot. Incidentally if the hard drive on a security gateway permanently dies (or just stops responding until the whole system is power cycled), the gateway will exhibit all of these "process hang" attributes. The kernel preloads all needed code and data into RAM at startup and can continue to operate, while most processes require interaction with the hard drive for paging and such and will eventually die.
3) NIC hang - All NICs appear to be operating normally based on the LEDs, but no network traffic of any kind (including pinging firewall interfaces) functions. Console port is still responsive to initiate a reboot.
4) Interaction hang - System has crashed and is trying to boot back up, but is waiting for some user interaction on the console before proceeding. Typically this is to approve some action such as fsck repairing disk corruption, or acknowledging some kind of failure. Console is responsive, but network interfaces may not respond at all depending on how far the system has managed to boot up.
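The four cases above can be told apart with a few quick observations. The sketch below encodes that decision as a small function. The function name `classify_hang` and its three boolean inputs are hypothetical, and it reduces each case to its most distinctive symptoms, so treat the result as a starting point rather than a diagnosis.

```python
def classify_hang(ping_ok: bool, console_ok: bool, boot_prompt: bool = False) -> str:
    """Map simple observations to one of the four hang types listed above.

    ping_ok     -- firewall interfaces answer ping
    console_ok  -- console port responds to input
    boot_prompt -- console shows a prompt waiting for user action during boot
    """
    if console_ok:
        if boot_prompt:
            # Console works but boot is blocked on user input (e.g. fsck approval).
            return "interaction hang"
        if not ping_ok:
            # Console works, yet no network traffic of any kind functions.
            return "NIC hang"
        return "no hang matched"
    # Console is stuck: ping still answering points to a process hang,
    # total silence points to a hard hang.
    return "process hang" if ping_ok else "hard hang"


if __name__ == "__main__":
    print(classify_hang(ping_ok=True, console_ok=False))
```

For example, a gateway that still answers ping but has a stuck console would be reported as a process hang under these assumptions.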
--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com
I would involve TAC - hanging / freezing is not a feature but a bug.
There is a new article about ClusterXL and failovers here: Switch or Cabel between ClusterXL members for the sync network?