I have the problem that ClusterXL changes state several times, with a member going from active to standby. How can I analyze this issue? How can I find out when and why the change happened?
A typical cause is ClusterXL going "under freeze". A ClusterXL administrator may also want to suppress the messages printed by the Cluster Under Load (CUL) mechanism (see sk92723) in /var/log/messages and in dmesg. I always enable this setting on the cluster to solve the "under freeze" issue.
1) Open the file in vi and add the following setting:
# vi $FWDIR/boot/modules/fwkern.conf
Add the line:
2) Reboot all gateways.
If that is not the issue, please send me a message. Then I can give you further debugging information.
- check cluster state (cphaprob stat)
- check interface errors (cphaprob -a if)
- check change time (clish -c "show routed cluster-state detailed")
- check /var/log/messages
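The checks listed above can be collected in one pass so that the output from each failover can be compared later. This is only a rough sketch: the helper names run_check and collect_cluster_state are made up for illustration, the 50-line tail is an arbitrary choice, and commands that are not installed on the host are skipped rather than treated as errors.

```shell
#!/bin/bash
# Sketch: run the checks from the list above and save their output to one file.
# run_check and collect_cluster_state are hypothetical helper names.

run_check() {
    # Print a header, then run the command only if its binary exists here.
    echo "=== $* ==="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(skipped: $1 not found on this host)"
    fi
    echo
}

collect_cluster_state() {
    local out_file="$1"
    local log_file="${2:-/var/log/messages}"
    {
        echo "Collected at: $(date)"
        run_check cphaprob stat
        run_check cphaprob -a if
        run_check clish -c "show routed cluster-state detailed"
        # Assumption: the last 50 lines are enough context; adjust as needed.
        echo "=== last 50 lines of $log_file ==="
        if [ -r "$log_file" ]; then
            tail -n 50 "$log_file"
        else
            echo "(skipped: $log_file not readable)"
        fi
    } > "$out_file"
}

# Example call on a cluster member:
# collect_cluster_state "/var/tmp/cluster_state_$(date +%Y%m%d_%H%M%S).txt"
```

Running it on each cluster member right after a flap gives you a snapshot per member to compare.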
In your firewall logs, look for "Control" log entries (the associated icon is a wrench), as these will tell you exactly why the cluster failed over. The filter "type:Control" finds these log entries in the R77.30 SmartLog GUI or the R80+ SmartConsole.
I can see cluster flapping during policy installation.
Sorry, configure this on the gateway and then reboot the gateway.
Is this setting permanent after reboot?
Yes, it is permanent.
It works perfectly. The gateway is no longer flapping during policy installation.
Thanks for the help.
If you add parameters to the $FWDIR/boot/modules/fwkern.conf file, they survive a reboot, but they are applied only after a reboot. This is mentioned in the provided sk92723, so please read it carefully first. You can additionally read the "Changing kernel global parameters" article.
You can also try to change a parameter with the following commands (applied on the fly, but the change does not survive a reboot):
fw ctl get int <parameter>
fw ctl set int <parameter> <value>
Example for the CUL mechanism parameter that Heiko provided:
fw ctl get int fwha_freez_state_machine_timeout - prints current value of the parameter
fw ctl set int fwha_freez_state_machine_timeout 0 - sets value for the parameter
I would recommend trying it on the fly first to see if it helps.
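If the on-the-fly change helps, the value can then be made persistent in fwkern.conf. Below is a minimal sketch of doing that without creating duplicate entries. It assumes that fwkern.conf entries have the form parameter=value with one entry per line (please verify the exact line against sk92723), and set_kernel_param is a hypothetical helper, not a Check Point tool.

```shell
#!/bin/bash
# Sketch: add or update a "parameter=value" line in a fwkern.conf-style file.
# set_kernel_param is a hypothetical helper; the parameter=value line format
# is an assumption that should be checked against sk92723.

set_kernel_param() {
    local conf_file="$1" param="$2" value="$3"
    local tmp

    # Create the file if it does not exist yet.
    touch "$conf_file" || return 1

    if grep -q "^${param}=" "$conf_file"; then
        # Parameter already present: rewrite its line via a temporary copy.
        tmp="$(mktemp)" || return 1
        sed "s|^${param}=.*|${param}=${value}|" "$conf_file" > "$tmp" &&
            cat "$tmp" > "$conf_file"
        rm -f "$tmp"
    else
        # Make sure the file ends with a newline before appending.
        if [ -s "$conf_file" ] && [ -n "$(tail -c 1 "$conf_file")" ]; then
            echo >> "$conf_file"
        fi
        printf '%s=%s\n' "$param" "$value" >> "$conf_file"
    fi
}

# Example call on a gateway (takes effect only after a reboot):
# set_kernel_param "$FWDIR/boot/modules/fwkern.conf" fwha_freez_state_machine_timeout 0
```

Take a backup copy of fwkern.conf before changing it, since the file is read at boot.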