FWIW, I ran my query through MS Copilot AI and below is what it gave me. I verified the CP side and it looks good, but I sent the rest to the client to verify on the Azure side.
*******************
Hey — this symptom pattern in Azure HA is super telling:
- Forwarded traffic from workload VMs fails only when member2 is Active
- But member2 itself can browse/ping out just fine
That almost always means Azure is dropping transit/forwarded packets (or the ILB/UDR is still sending flows to the wrong place) rather than a pure policy issue.
Below are the top 3 root causes I’d chase, in the order I’d chase them, with exact checks and what “good” looks like.
1) Azure IP Forwarding is not enabled on all NICs of member2 (classic)
If Azure NIC “IP forwarding” is off, the VM can still originate traffic (so fw2 itself can ping 8.8.8.8), but Azure will not allow it to forward traffic for other VMs. This matches your symptoms perfectly.
Check Point’s own Azure HA tester explicitly validates: “IP forwarding is enabled on all network interfaces of the Cluster Member.” [sc1.checkpoint.com]
What to do (fast)
On each cluster member, run:
expert
$FWDIR/scripts/azure_ha_test.py
Review the output; any failed check, especially around IP forwarding on the NICs, points straight at this cause.
In Azure Portal (member2)
For each NIC attached to fw2 (external + internal, and any additional):
- Networking → NIC → IP forwarding → Enabled
If member1 was deployed/configured earlier (or via a different template pass), it’s not uncommon for member2 NIC settings to be missed.
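If you have Azure CLI access, a quick way to verify this without clicking through every NIC is something like the below (resource group and NIC names are placeholders to substitute; the property name casing can vary slightly between CLI versions, so if the column comes back blank just run az network nic show on each NIC and eyeball it):

# list every NIC in the resource group with its IP-forwarding state
az network nic list -g <rg> --query "[].{nic:name, ipForwarding:enableIpForwarding}" -o table

# enable it on any fw2 NIC that shows false
az network nic update -g <rg> -n <fw2-nic-name> --ip-forwarding true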
2) Internal LB health probe is wrong → ILB keeps sending flows to the Standby member
In Azure HA, load balancers are what steer inbound / east-west / outbound inspection, and they should forward only to the Active member.
If the probe is configured to “is the VM alive?” instead of “is the VM Active?”, the ILB can happily send workload traffic to the Standby gateway, which will drop it. [supportcen...kpoint.com], [sc1.checkpoint.com]
What to verify in Azure
In the Internal Load Balancer:
- Health probe is TCP
- Port is 8117
- Backend pool contains both members’ relevant NICs
- Your LB rule(s) reference that probe
Check Point’s Azure HA ATRG also calls out: ensure health probes using port 8117 are configured, otherwise gateways won’t respond correctly. [supportcen...kpoint.com]
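If it's easier than the portal, the probe and rule config can also be dumped with Azure CLI (resource group and LB name are placeholders):

# probes on the internal LB - you're looking for Tcp / 8117
az network lb probe list -g <rg> --lb-name <internal-lb> -o table

# rules, to confirm they reference that probe and the right backend pool
az network lb rule list -g <rg> --lb-name <internal-lb> -o table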
Quick on-gateway validation
On fw2 (when it is Active), confirm it is answering the probe:
expert
fw ctl zdebug -m cluster cloud
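If you'd rather not run kernel debug, a lower-risk way to see the probe exchange on the wire is a plain capture on the probe port (interface name is a placeholder; 8117 is the probe port from the ATRG):

# you should see the ILB probe SYNs arriving and fw2 answering them
tcpdump -eni <internal_if> port 8117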
3) Cluster VIP / egress IP attachment isn’t moving to member2 (AZURE_HAD / permissions)
In CloudGuard Azure HA, during failover the newly Active member associates the cluster private/public IPs to itself using Azure API calls.
If that mechanism is broken on member2 (permissions, AZURE_HAD down, API access blocked), you can end up with: [supportcen...kpoint.com]
- fw2 becomes Active (ClusterXL shows fine)
- fw2 can go out using its own interface IPs
- but forwarded traffic that NATs behind the cluster egress IP/VIP gets dropped by Azure because that VIP is still attached to member1
The ATRG even notes that expected outbound failover timing depends on the LB probe interval plus Azure API calls, and can take longer than east-west failover.
But in your case it never recovers, so we’re looking for a VIP that never moved. [supportcen...kpoint.com]
What to check
On fw2:
expert
cpwd_admin getpid -name AZURE_HAD
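You can also confirm the daemon is actually up under the Check Point WatchDog (adjust the grep if your version labels the process differently):

# STAT "E" means the process is executing; "T" means it was terminated
cpwd_admin list | grep -i azure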
Then check the daemon’s logs for Azure API or permission errors.
On the Azure side, during/after failover, verify:
- The cluster private IP configuration (and public IP if used) is attached to fw2’s external NIC IP configurations
- It is not still attached to fw1
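If CLI is handy, the attachment is quick to check (resource group, NIC and public IP names are placeholders):

# list the ip-configs on each member's external NIC - the cluster private IP should sit on fw2's NIC while fw2 is Active
az network nic ip-config list -g <rg> --nic-name <fw1-external-nic> -o table
az network nic ip-config list -g <rg> --nic-name <fw2-external-nic> -o table

# if a cluster public IP is used, this shows which NIC ip-config it is bound to
az network public-ip show -g <rg> -n <cluster-public-ip> --query ipConfiguration.id -o tsv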
A short, practical “do this now” runbook (10–15 minutes)
Step A — Prove whether traffic reaches fw2 and whether fw2 forwards it
When fw2 is Active, from a workload VM try:
ping 8.8.8.8
tracert 8.8.8.8 (Windows) / traceroute 8.8.8.8 (Linux)
On fw2:
expert
tcpdump -eni <inside_if> host <workload_vm_ip>
tcpdump -eni <outside_if> host 8.8.8.8
Interpretation:
- See packets on inside but nothing on outside → forwarding/NAT path issue (often Azure IP forwarding or policy/NAT)
- See packets on outside but no replies → SNAT/VIP attachment/return path issue
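If packets show up on the inside interface but never leave the outside one, it is also worth ruling out the gateway dropping them itself before blaming Azure (workload IP is a placeholder; this is a kernel debug, so run it briefly):

# prints drop reasons in real time for traffic matching that IP
fw ctl zdebug + drop | grep <workload_vm_ip>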
Step B — Run the Check Point Azure HA tester on fw2
expert
$FWDIR/scripts/azure_ha_test.py
This script checks the exact Azure HA prerequisites, including IP forwarding, and is the fastest way to catch mismatches between members. [sc1.checkpoint.com]
Step C — Validate ILB probe correctness (per item 2 above)
Step D — Validate AZURE_HAD + VIP movement (per item 3 above)
Most likely answer (based on your exact symptoms)
If I had to bet: Azure NIC IP forwarding is disabled on fw2 (or on one of fw2’s NICs). It’s the cleanest explanation for “fw2 can reach the internet but doesn’t forward for other VMs,” and Check Point explicitly calls out verifying/enabling IP forwarding on all interfaces as part of Azure HA validation. [sc1.checkpoint.com]
A couple quick questions (so I can pinpoint it faster)
- Are your workload subnets’ UDRs pointing to an Internal Load Balancer frontend IP as next hop, or directly to fw1’s internal IP? (A quick way to check is sketched after this list.)
- When fw2 is Active, does the cluster VIP / private cluster IP show attached to fw2’s NIC IP configurations in Azure?
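If you're not sure about the UDR, this dumps the workload subnet's route table so you can see the next hop type and IP (route table name is a placeholder):

# next hop should be VirtualAppliance pointing at the ILB frontend IP (or the cluster VIP, depending on the design)
az network route-table route list -g <rg> --route-table-name <workload-rt> -o table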
If you paste the output of:
- azure_ha_test.py (from fw2)
- cpwd_admin getpid -name AZURE_HAD
- a screenshot/text of the ILB probe config (protocol/port)
…I can tell you exactly which of the above it is.
Best,
Andy
"Have a great day and if its not, change it"