Good day,
I am having an issue with a VPN tunnel between two Check Point clusters dropping when both of my service provider's links are available. The tunnel establishes and works when only one ISP path is connected.
The cluster at Site A is a pair of open server systems running R81.10. The cluster at Site B is a pair of 3800 appliances, also running R81.10.
Our local service provider has provisioned two distinct L2 circuits for us with unique VLANs within their network, and handed off to us at each location on access ports.
This configuration has been in place since earlier this year, but we found out recently that the provider's secondary circuit (shown in red in the attached image) has not been functioning correctly and was not passing traffic. Our tunnel has worked properly and failover between the cluster nodes has worked correctly as well, but that was only when the provider's primary path was available. When that path went down, the tunnel went down and didn't re-establish over the secondary path.
When the provider fixed the secondary path and it became active (the paths are active-active), the tunnel went down and had not come back after 5 minutes. We tried resetting the tunnel and that did not help. When we physically disconnected the secondary circuit, the tunnel re-established. Further tests have shown that when both circuits are available, the tunnel drops.
In our design, the FW interfaces are connected to a pair of our switches (themselves clustered/stacked), with each switch uplinked to one of the ISP switches. The firewalls are on the same internal VLAN (53 or 24) as the access ports to the ISP switches. The intention here is to merge the provider VLANs on our switches so that each FW node can communicate over either ISP path. The problem seems to be that when both paths are up, something gets confused and the traffic doesn't go through to establish the tunnel.
Does this design make sense, and *should* it work, or are we missing something that would cause it not to function as intended?
Thanks
Chris