VPN tunnel from CheckPoint OnPrem to Cloudguard Azure went down and won't come back
So we have two sites plus one Azure environment.
We have two site-to-site tunnels, one from each site to Azure. Everything had been working just fine for quite a while until today.
Both tunnels went down and they won't come back up.
One is giving this in the SmartConsole log:
Ike: Initial exchange: Exchange failed: timeout reached.
The other is giving this:
Encryption Failure: no response from peer.
TAC asked for logs and, after reviewing them, replied:
It looks like we are not getting response from Azure gateway. Could you please check on Azure side if they have changed anything on configurations.
Nothing has been changed in the Azure environment.
The issue appeared after a policy was pushed to the gateways. It was a minor change unrelated to VPN. We reverted to the previous policy but it didn't make a difference.
On the Azure side we tried the connection troubleshoot utility and it shows "Connectivity is allowed".
Those error messages do say there is a connectivity issue with Azure GW.
Have you tried running tcpdump filtered on the Azure GW IP address to see if this is really the case, i.e. that you really don't see any traffic coming from the Azure side?
Yes, I agree there is certainly a connectivity issue with the Azure GW. I admit I have limited knowledge of Azure, and I don't know what in Azure could cause this VPN issue. Everything else works fine with that gateway.
From the OnPrem gateway, with tcpdump I see connection attempts to the Azure GW (UDP 500 and 4500). On the Azure GW, with tcpdump I don't see these incoming connections. The same thing happens from Azure to OnPrem: I see the outgoing connections to the OnPrem gateway, but I don't see them arriving on the OnPrem gateway.
Both OnPrem and Azure are clusters.
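The comparison described above (packets seen leaving one gateway but never arriving on the other) can be made a little more systematic. The following is only a minimal sketch, assuming each side's capture was saved as text from something like `tcpdump -nn` filtered on UDP 500 and 4500. The function names and the sample lines are hypothetical and use documentation IP ranges; they are not taken from the actual captures. The destination address is deliberately ignored, because Azure translates a public IP to the NIC's private address before the VM sees the packet. The sketch also assumes the sender's source address is the same in both captures, which will not hold if the sender itself sits behind NAT.

```python
import re
from collections import Counter

# Matches the address part of an IPv4 line printed by `tcpdump -nn`, e.g.
# "12:00:01.000001 IP 203.0.113.10.500 > 198.51.100.20.500: isakmp: ..."
LINE_RE = re.compile(
    r"\bIP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)

IKE_PORTS = {500, 4500}  # IKE and NAT-T (assumes the capture filter was UDP only)


def count_ike_packets(capture_text, src_ip):
    """Count packets sent by src_ip to port 500 or 4500, keyed by destination port."""
    counts = Counter()
    for line in capture_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # not an IPv4 line in the expected format
        src, _src_port, _dst, dst_port = match.groups()
        if src == src_ip and int(dst_port) in IKE_PORTS:
            counts[int(dst_port)] += 1
    return counts


def ports_missing_on_receiver(sender_text, receiver_text, src_ip):
    """Return destination ports seen in the sender capture but absent on the receiver."""
    sent = count_ike_packets(sender_text, src_ip)
    received = count_ike_packets(receiver_text, src_ip)
    return sorted(port for port in sent if received[port] == 0)


if __name__ == "__main__":
    # Hypothetical sample data: the receiver sees port 500 but never port 4500.
    sender_capture = (
        "12:00:01.000001 IP 203.0.113.10.500 > 198.51.100.20.500: isakmp: parent_sa ikev2_init[I]\n"
        "12:00:03.000001 IP 203.0.113.10.4500 > 198.51.100.20.4500: NONESP-encap: isakmp: child_sa\n"
    )
    receiver_capture = (
        "12:00:01.050000 IP 203.0.113.10.500 > 10.0.1.4.500: isakmp: parent_sa ikev2_init[I]\n"
    )
    print(ports_missing_on_receiver(sender_capture, receiver_capture, "203.0.113.10"))
```

If every port the sender used comes back as missing, as in the situation described here, the packets are being dropped somewhere between the two capture points rather than by the VPN configuration on the receiving gateway.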
I re-checked the UDR and NSG for the GW frontend and backend subnets and they're all the same. I've looked at the Activity log on these and there's no activity or changes.
Also in Azure, for these 4 resources (UDR and NSG for the GW frontend and backend subnets), the option "Diagnose and solve problems" shows that there have been no changes to these resources in the last 72 hours.
Same thing on both gateways: I see outgoing connections but nothing incoming on the other side. On the Azure gateway I sometimes see this: [vs_0][fw_2] eth0:Oe:
What does the last e mean in eth0:Oe?
When we deployed the Azure cluster it automatically created a frontend_lb. Is it involved at all in communications between the gateways?
I'm trying to understand the flow from OnPrem to Azure and how it can be that I don't see any incoming traffic.
Also, on the OnPrem gateway, if I run a traceroute to a random public IP I get results, but if I run a traceroute to the public VIP of the Azure cluster I get nothing. I verified that the public IPs are not in the encryption domain.
The public IPs are not in the encryption domain, but traffic between them will be encrypted.
It would be better to continue this issue with Azure and Check Point Support together, so they can work jointly to find the cause.