wanartisan
Participant

CloudGuard Azure HA Failover

Hi Mates,

I inherited an old cluster running R80.40. At some point it developed a problem with HA. Long story, but I needed to rebuild it anyway, so I did: a new side-by-side build on R81.20. I found that it too had a problem with HA (the secondary device doesn't pass traffic, so you need to fail back).

I ran azure_ha_test.py and found our error message in the sk175023 ATRG: "[Forbidden] Error: HTTP/1.1 403 Forbidden".

We got the permissions updated using the gateways' managed identities, and the test script now runs clean. Yay! Now the actual problem.

The HA template in Azure Marketplace creates 3 public IPs (PIPs), one of which is the "cluster-vip", which gets attached to the active gateway in Azure. As part of the rebuild and migration, I changed this in Azure to our established egress PIP, which is whitelisted by many external services.

Now when failover occurs, the cluster-vip changes back to the PIP that was created by the template, removing the one I selected in Azure, and I don't know why.

I found a reference to the old PIP in $FWDIR/conf/azure-ha.json (caps below replace actual IPs)

"name": "cluster-vip",
"addr": "PRIVATE IP",
"pub": "TEMPLATE_PIP"

So I changed this to

"name": "cluster-vip",
"addr": "PRIVATE IP",
"pub": "ESTABLISHED_EGRESS_PIP"

I tested the failover again, but the same thing happens. Does anyone know where the instruction to reattach the template's cluster-vip PIP is coming from?

Thanks in advance. 

 


Accepted Solutions
yairra
Employee

Hi @wanartisan 
The admin guide's upgrade section explains how to keep using the old cluster-vip, whereas you are trying to use a different public IP as the VIP.

In this case, here’s what you need to do:

  1. Go to the active member’s ETH-0 NIC resource in the Azure portal.
  2. Navigate to Settings > IP configurations and replace the Public IP address of the cluster-vip with the new address (see attached screenshot).
  3. Edit the $FWDIR/conf/azure-ha.json file on both members (as you’ve done with "pub": "ESTABLISHED_EGRESS_PIP").
  4. Run the following command on both members to apply the updated configuration from azure-ha.json: "$FWDIR/scripts/azure_ha_cli.py restart"

That's it. Failover should now work with the new public IP.
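
For anyone who wants to script step 3 rather than edit the file by hand, here is a minimal Python sketch. It assumes the file still uses the "name" / "addr" / "pub" layout quoted in the question; the rest of the file's structure is not shown in this thread, so the helper walks the whole JSON tree instead of relying on a fixed path. The function name `set_vip_public_ip` and the `"interfaces"` key in the sample are hypothetical, used only for illustration.

```python
import json


def set_vip_public_ip(node, vip_name, new_pub):
    """Set "pub" on every object whose "name" equals vip_name.

    Walks dicts and lists recursively and returns the number of objects
    that were actually changed. The walk is structure-agnostic because
    the full layout of azure-ha.json is an unknown here.
    """
    changed = 0
    if isinstance(node, dict):
        if node.get("name") == vip_name and "pub" in node:
            if node["pub"] != new_pub:
                node["pub"] = new_pub
                changed += 1
        for value in node.values():
            changed += set_vip_public_ip(value, vip_name, new_pub)
    elif isinstance(node, list):
        for item in node:
            changed += set_vip_public_ip(item, vip_name, new_pub)
    return changed


# Hypothetical fragment shaped like the snippet quoted in the question;
# the "interfaces" wrapper key is an assumption, not the real file layout.
sample = json.loads(
    '{"interfaces": [{"name": "cluster-vip",'
    ' "addr": "PRIVATE IP", "pub": "TEMPLATE_PIP"}]}'
)

changed = set_vip_public_ip(sample, "cluster-vip", "ESTABLISHED_EGRESS_PIP")
print(changed)
print(json.dumps(sample, indent=2))
```

On a real member you would load the file with `json.load`, apply the helper, keep a backup copy, and write the result back with `json.dump`. The change only takes effect after the restart command from step 4 has been run on both members.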


8 Replies
Nir_Shamir
Employee

Follow the steps in the UPGRADE section of the Azure HA admin guide:

https://sc1.checkpoint.com/documents/IaaS/WebAdminGuides/EN/CP_CloudGuard_Network_for_Azure_HA_Clust...

 

Don_Paterson
Advisor

This makes me think of the HCP tool, and whether enhancements to it, specifically for CloudGuard, could help resolve issues like this in the complex web that is the public cloud.

@Tal_Paz-Fridman 

https://community.checkpoint.com/t5/General-Topics/HCP-roadmap-question/m-p/229324#M38304

 

Tal_Paz-Fridman
Employee

CC @Amir_Senn 

wanartisan
Participant

That looks perfect. I hadn't seen that. I'll try again using the "new" way and report back.


wanartisan
Participant

sk175023 ATRG suggests

"$FWDIR/scripts/azure_ha_cli.py reconf"

Is this correct? I will be testing later. 

Nir_Shamir
Employee

Yes, do that.

wanartisan
Participant

The problem has been fixed. FYI, I did the following:

1. Ran "$FWDIR/scripts/azure_ha_cli.py reconf" - no change
2. Moved the PIP from the old resource group to the new one - no change
3. Changed to the "new" way in $FWDIR/conf/azure-ha.json - no change
4. Ran "$FWDIR/scripts/azure_ha_cli.py restart" - fixed

It might have been a combination of the above, but 'reconf' didn't appear to do anything, whereas 'restart' paused as if it was doing something before giving me the cursor back.

The upgrade guide talks about an "image build number". I'm not sure which build number it is referring to, as there seem to be at least a couple. Can anyone clarify?

Thanks for all your help. 

