CloudGuard Azure HA Failover
Hi Mates,
I inherited an old cluster running R80.40. At some point it developed a problem with HA. Long story, but I needed to rebuild it anyway, so I did. I built a new cluster side by side on R81.20 but found that it too had a problem with HA (the secondary device doesn't pass traffic, so you need to fail back).
I ran azure_ha_test.py and found our error message in the sk175023 ATRG: "[Forbidden] Error: HTTP/1.1 403 Forbidden".
We got the permissions updated using the gateway managed identities, and the test script now runs clean. Yay! Now the actual problem.
The HA template in Azure Marketplace creates 3 public IPs (PIPs), one of which is the "cluster-vip", which gets attached to the active gateway in Azure. As part of the rebuild and migration, I changed this in Azure to our established egress PIP, which is whitelisted by many external services.
Now when failover occurs, the cluster-vip changes back to the PIP that was created by the template, removing the one I selected in Azure, and I don't know why.
I found a reference to the old PIP in $FWDIR/conf/azure-ha.json (caps below replace actual IPs):
"name": "cluster-vip"
"addr": "PRIVATE IP"
"pub": "TEMPLATE_PIP"
So I changed this to:
"name": "cluster-vip"
"addr": "PRIVATE IP"
"pub": "ESTABLISHED_EGRESS_PIP"
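The edit above can be sketched in Python so that it is applied the same way on both members. This is only a rough sketch: the thread shows just the "name", "addr" and "pub" keys, so the surrounding layout of the file is unknown and the function simply walks whatever JSON it is given. The function name and the "hypotheticalWrapper" key in the sample are made up for illustration.

```python
import json


def set_vip_public_ip(node, vip_name, new_pub):
    """Set "pub" on every JSON object whose "name" equals vip_name.

    Walks nested dicts and lists, because the real layout of the file
    is not shown in the thread. Returns the number of objects changed.
    """
    changed = 0
    if isinstance(node, dict):
        if node.get("name") == vip_name and "pub" in node:
            if node["pub"] != new_pub:
                node["pub"] = new_pub
                changed += 1
        for value in node.values():
            changed += set_vip_public_ip(value, vip_name, new_pub)
    elif isinstance(node, list):
        for item in node:
            changed += set_vip_public_ip(item, vip_name, new_pub)
    return changed


# Hypothetical sample: only the three keys come from the thread,
# the wrapper around them is an assumption.
sample = json.loads("""
{
  "hypotheticalWrapper": [
    {"name": "cluster-vip", "addr": "PRIVATE IP", "pub": "TEMPLATE_PIP"}
  ]
}
""")

count = set_vip_public_ip(sample, "cluster-vip", "ESTABLISHED_EGRESS_PIP")
print(count, json.dumps(sample, indent=2))
```

Against a real file you would load it with json.load, call the function, and write the result back after taking a backup. A return value of 0 means nothing matched, which would suggest the file uses a different layout than the fragment shown here.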
I tested the failover again but the same thing is happening. Does anyone know where the command to use the template cluster-vip is coming from?
Thanks in advance.
Accepted Solutions
Hi @wanartisan
The admin guide's upgrade section explains how to use the old cluster-vip, but you are trying to use a different public IP as the VIP.
In this case, here’s what you need to do:
- Go to the active member’s ETH-0 NIC resource in the Azure portal.
- Navigate to Settings > IP configurations and replace the Public IP address of the cluster-vip with the new address (see attached screenshot).
- Edit the azure-ha.json file on both members (as you’ve done with "pub": "ESTABLISHED_EGRESS_PIP").
- Run the following command on both members to apply the updated configuration from azure-ha.json: "$FWDIR/scripts/azure_ha_cli.py restart"
That's it. Failover should work with the new public IP.
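The last step could be wrapped in a small Python helper if you want to run it the same way on both members. This is a sketch, not a Check Point tool: the helper names are hypothetical, the only subcommands it accepts are the two mentioned in this thread ("restart" and "reconf"), and it defaults to a dry run that only prints the command.

```python
import os
import subprocess

# Subcommands taken from this thread only; the script may support others.
KNOWN_ACTIONS = ("restart", "reconf")


def azure_ha_cli_command(action, fwdir=None):
    """Build the argument list for $FWDIR/scripts/azure_ha_cli.py <action>."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unexpected action: {action!r}")
    fwdir = fwdir or os.environ.get("FWDIR")
    if not fwdir:
        raise RuntimeError("FWDIR is not set; run this on the gateway itself")
    return [f"{fwdir}/scripts/azure_ha_cli.py", action]


def apply_ha_config(action="restart", fwdir=None, dry_run=True):
    """Print the command (dry run) or execute it and return its exit code."""
    cmd = azure_ha_cli_command(action, fwdir)
    if dry_run:
        print("would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True).returncode


# Dry run with a placeholder directory; on a member, omit fwdir so that
# the real $FWDIR from the environment is used.
apply_ha_config("restart", fwdir="/path/to/fwdir")
```

Pass dry_run=False on each member once the printed command looks right.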
Follow the steps in the UPGRADE section of the Azure HA admin guide:
This makes me think of the HCP tool, and whether enhancements to it, specifically for CloudGuard, could help resolve issues like this in the complex web that is the public cloud.
https://community.checkpoint.com/t5/General-Topics/HCP-roadmap-question/m-p/229324#M38304
CC @Amir_Senn
That looks perfect. I hadn't seen that. I'll try again the "new" way and report back.
sk175023 ATRG suggests
"$FWDIR/scripts/azure_ha_cli.py reconf"
Is this correct? I will be testing later.
Yes, do that.
The problem has been fixed. FYI, I did the following:
1. "$FWDIR/scripts/azure_ha_cli.py reconf" - no change
2. Moved the PIP from the old resource group to the new one - no change
3. Changed to the "new" way in $FWDIR/conf/azure-ha.json - no change
4. Ran "$FWDIR/scripts/azure_ha_cli.py restart" - fixed
It might have been a combination of the above, but 'reconf' didn't appear to do anything, whereas 'restart' paused as if it was doing something before giving me the cursor back.
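To confirm after a test failover that the intended PIP is the one attached, a check along these lines might help. It is a sketch under assumptions: it expects the NIC description as a parsed JSON object with an "ipConfigurations" list whose entries carry a "name" and a public IP resource ID (roughly the shape Azure tooling reports, but verify against your own output), and it assumes the IP configuration is named "cluster-vip". The function name and all sample values are hypothetical.

```python
def attached_public_ip_name(nic, ip_config_name="cluster-vip"):
    """Return the name of the public IP resource on one IP configuration.

    Returns None when the configuration is missing or has no public IP.
    """
    for cfg in nic.get("ipConfigurations", []):
        if cfg.get("name") != ip_config_name:
            continue
        # Accept either key casing, since the exact spelling is an assumption.
        pip = cfg.get("publicIPAddress") or cfg.get("publicIpAddress")
        if not pip or not pip.get("id"):
            return None
        # The resource name is the last segment of the resource ID.
        return pip["id"].rstrip("/").split("/")[-1]
    return None


# Hypothetical NIC description; every value here is a placeholder.
sample_nic = {
    "name": "member1-eth0",
    "ipConfigurations": [
        {"name": "ipconfig1", "publicIPAddress": {"id": "/placeholder/publicIPAddresses/member1-pip"}},
        {"name": "cluster-vip", "publicIPAddress": {"id": "/placeholder/publicIPAddresses/established-egress-pip"}},
    ],
}

print(attached_public_ip_name(sample_nic))
```

Running it against the active member's NIC before and after a failover would show whether the cluster-vip configuration kept the established egress PIP or reverted to the template one.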
The upgrade guide talks about an "image build number". I'm not sure which build number it is referring to. There seem to be at least a couple. Can anyone clarify?
Thanks for all your help.