HA Failover in Azure
Hello Team
I would like to ask Check Point to publish more SKs covering different scenarios, especially for HA in Azure.
The only SK most people point to is "How to deploy checkpoint cluster in Azure". It is a good starting point and covers most of the setup (I see a lot of people running into issues with credentials or the service account), but there are some scenarios it does not cover.
Example: I deployed a vSEC cluster in Azure according to the SK, and my HA test script came back with "All tests are successful". One day the service account used for HA suddenly made an API call to Azure that pointed all the routes to the standby node, while the standby node was still in standby according to cphaprob state. As a result, all traffic stopped passing through the firewall. I don't know of a command like clusterxl_admin up for an Azure environment, so I had to change the priority in the dashboard and push policy.
My questions are:
1) Why was the API call triggered automatically? What caused it?
2) Why did the failover fail even though the tests were successful?
3) Is there a command to trigger a failover on Azure gateways (other than shutting down an interface)?
Please correct me if I am wrong.
Thanks.
1) API calls are triggered when ClusterXL detects a change in the active/standby state.
2) I've never seen a case where the test passed at 100% and ClusterXL then failed to trigger the API calls. Probably something changed between the time you ran the test and the time the failover happened.
3) In Expert mode:
clusterXL_admin down -p;sleep 2;cphaprob stat;clusterXL_admin up -p;sleep 2;cphaprob stat
That will trigger a failover by moving the member to the "down" state, wait 2 seconds, show the status after the "down" registration, then put the member back into "standby" and show the cphaprob status again.
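To check the result of each step in that one-liner without reading the table by eye, the state of the local member can be pulled out of the cphaprob stat output. This is only a minimal sketch: the function names local_member_state and expect_local_state are hypothetical, and it assumes the R77.30 output layout shown later in this thread, where the local member's row contains "(local)" and the state is the last field on that row.

```shell
#!/bin/bash
# Hypothetical helpers (not part of any Check Point tooling).
# Both read `cphaprob stat` output from stdin.

# Print the state of the local member.
# Assumption: the local row contains "(local)" and the state is its last field.
local_member_state() {
    awk '/\(local\)/ { print $NF }'
}

# Succeed only if the local member reports the expected state.
expect_local_state() {
    local expected="$1"
    local actual
    actual="$(local_member_state)"
    [ "$actual" = "$expected" ]
}
```

On a gateway in Expert mode this could be used as `cphaprob stat | local_member_state`, or after the "down" step as `cphaprob stat | expect_local_state Down || echo "member did not go down"`. Other versions may print multi-word states, in which case taking only the last field would not be enough.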
Thanks for the info Martin!
1) I want to know what change ClusterXL saw that triggered the API call.
2) It did trigger the API call successfully, and the API call did change the routes to the standby node, but the standby node did not go from standby to active. That is my question.
3) I tried clusterxl_admin up on the standby and it looks like there is no such command. I also heard from other sources that shutting down an interface is the only option in Azure, with no command available yet.
Thanks.
1) It's based on the cluster member state. There is a daemon that monitors the cluster state, and if it sees that a failover occurred, it initiates the API calls.
2) That should not happen.
3) You must do it in Expert mode. From clish you can only do "set interface eth1 state off/on".
1) By checking the azure_had.elg file?
3) Yes, I did that from Expert mode. The command does not exist.
1) The elg file contains a dump of the JSON used for the API calls and shows all errors related to those calls. You can enable debug in the azure_ha.json file to get more detail in that log file.
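A quick way to narrow that log down to the API errors is a case-insensitive search with line numbers. This is a rough sketch only: the function name elg_errors is hypothetical, the exact wording of error lines in azure_had.elg is not shown in this thread, and matching on the word "error" is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: print lines mentioning "error" (any case) from log text
# on stdin, each prefixed with its line number.
# Assumption: error entries in the log contain the word "error".
elg_errors() {
    grep -in 'error'
}
```

It would be fed the log on stdin, for example `elg_errors < "$FWDIR/log/azure_had.elg"`, where the log location is assumed and should be checked on your gateway. The line numbers make it easier to find the surrounding JSON dump for each failed call.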
3)
[Expert@gateway1:0]# fw ver
This is Check Point's software version R77.30 - Build 024
[Expert@gateway1:0]# cphaprob stat
Cluster Mode:   High Availability (Active Up) with IGMP Membership

Number     Unique Address  Assigned Load   State

1 (local)  10.8.104.196    100%            Active
2          10.8.104.197    0%              Standby
[Expert@gateway1:0]# clusterXL_admin down -p
Setting member to administratively down state ...
Member current state is Down
[Expert@gateway1:0]# cphaprob stat
Cluster Mode:   High Availability (Active Up) with IGMP Membership

Number     Unique Address  Assigned Load   State

1 (local)  10.8.104.196    0%              Down
2          10.8.104.197    100%            Active
[Expert@gateway1:0]# cat /etc/in-azure
gey_hvm-48-205.vhd
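For the situation described at the top of the thread, where the Azure routes pointed at a member that was still standby, it helps to know which address ClusterXL itself considers active so it can be compared against the next hop in the Azure route table. A minimal sketch, assuming the cphaprob stat layout in the transcript above; the function name active_member_address is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read `cphaprob stat` output from stdin and print the
# unique address of the member whose state is "Active".
# Assumption: the state is the last field of a member row and the address is
# two fields before it, which holds for both the "(local)" row and peer rows
# in the R77.30 output shown in this thread.
active_member_address() {
    awk '$NF == "Active" { print $(NF - 2) }'
}
```

Run as `cphaprob stat | active_member_address` on either member. With the two outputs in the transcript it gives 10.8.104.196 before the failover and 10.8.104.197 after it. The comparison against the Azure routes still has to be done separately, since this only reports the ClusterXL view.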