We are running an active/standby cluster on R81.10 Take 87 and have set up a site-to-site VPN with a third party. The tunnel establishes but drops at random: I have a continuous ping running, and it can be down for up to 90 seconds. For comparison, I ran the same test on our 5 other site-to-site tunnels and none of them drops a ping.
We are using IKEv2 and have also tried it without PFS. The encryption domain (ED) is 192.168.199.48 - 192.168.199.63 on the local side and 21x.xxx.xxx.57/32 on the remote side (both are network objects). We also NAT our traffic behind 192.168.199.49 towards their endpoint.
The community is configured with the local gateway using a VPN domain group that contains our local internal subnet and the 192.168.199.48/28 subnet. The remote gateway is configured with a group (we will eventually add another IP address to it) containing the network object 21x.xxx.xxx.57/32.
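As a sanity check on the addressing above, here is a minimal sketch using Python's standard ipaddress module. It only confirms that the 192.168.199.48 - 192.168.199.63 range is exactly the /28 in the VPN domain group and that the NAT address sits inside it. It says nothing about how the gateway itself evaluates the encryption domain.

```python
import ipaddress

# Local encryption domain range as quoted in the post.
first = ipaddress.ip_address("192.168.199.48")
last = ipaddress.ip_address("192.168.199.63")

# summarize_address_range yields the smallest set of CIDR blocks covering the range.
blocks = list(ipaddress.summarize_address_range(first, last))
print(blocks)  # [IPv4Network('192.168.199.48/28')]

# The hide-NAT address should fall inside the local encryption domain.
nat_ip = ipaddress.ip_address("192.168.199.49")
print(nat_ip in blocks[0])  # True
```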
One thing I have observed is that their end sends a lot of IPsec SA rekeys, sometimes more than 20 in 5 minutes. I have seen their config and Phase 2 is definitely set to 3600 seconds, the same as on our side. We never have an issue with the IKE SA.
I have a couple of questions:
Their remote gateway is 21x.xxx.xxx.56. When I look at the VPN routing table, it supernets the peer and the endpoint together. Could this be an issue?
I asked them to use a 10.x.x.x address as the remote endpoint as a test, and we still get drops. The SAs rekey every 46 minutes, initiated from their end. Is using a 'real-world' IP as the remote endpoint an issue for the site-to-site?
When we tested with the 10.x.x.x IP, we noticed that the firewall uses the IPsec SA agreed in the Auth stage of the tunnel. When we use the external IP as the remote endpoint, the firewall creates 4 to 5 additional Child SAs, all exactly the same, before traffic will pass over the tunnel.
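To illustrate the supernetting question, here is a small sketch of why a .56 peer and a .57 endpoint can collapse into a single route. The real addresses are redacted, so 203.0.113.56 and 203.0.113.57 (documentation range) are stand-ins I have assumed purely for the arithmetic. This shows only how two adjacent /32s merge, not how the gateway actually builds its VPN routing table.

```python
import ipaddress

# Stand-in addresses (assumption): the real 21x.xxx.xxx.56 / .57 are redacted.
peer = ipaddress.ip_network("203.0.113.56/32")      # remote gateway
endpoint = ipaddress.ip_network("203.0.113.57/32")  # remote encryption domain host

# collapse_addresses merges adjacent networks into the smallest covering set.
merged = list(ipaddress.collapse_addresses([peer, endpoint]))
print(merged)  # [IPv4Network('203.0.113.56/31')]
```

Because .56 is even and .57 is the next address, the two /32s fit exactly into one /31, which would match a single combined entry for the peer and the endpoint.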