MTU and MSS Clamping on gateways in Azure
We have some CloudGuard gateways running in Azure, and some Asian sites have issues reaching them, or the other way around.
We can ping from appliances such as the 730 models to servers in the encryption domain of the vSEC gateways, but not the other way around. Then it suddenly works, and then it stops again. We tried permanent tunnels, but they don't seem to help much.
I'm starting to look at MTU and MSS clamping, but I wonder how you can detect whether they are needed.
We sometimes see drops because of "SYN retransmit with different window scale" being logged.
Some sites are DAIP sites and others have a fixed IP, but most lines seem to be of poor quality. Should we set those variables on both the 730 models and the R80.10 CloudGuard gateway in Azure?
What are your experiences here?
Thanks for feedback,
I've been hitting the MTU issue with Azure VPN over ExpressRoute, and the only solution was to lower the MTU on the VPN interface to 1400, as recommended by one of the Azure tech support guys.
Setting ipsec_dont_fragment did not work, nor did sim_keep_DF_flag=0 (which might not be needed), and MSS clamping doesn't apply (see sk98074).
See also sk120122 - I have to try getting that hotfix.
The thing with MSS and MTU is that it does not make sense to lower the MTU of the interface your VPN runs on, as that would lower the actual MSS even further.
MSS = MTU - (40 bytes IP/TCP header + IPsec header size)
So lowering the MTU would make the MSS even lower, unless the Azure gateway does not really care about the MTU setting and still lowers the MSS to 1360, which is 100 bytes below the default value of 1460.
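To make the arithmetic concrete, here is a minimal shell sketch of the formula above. The function name mss_for_mtu is made up for illustration, and the 50-byte IPsec overhead in the last call is only an assumed example value; the real overhead depends on the cipher and encapsulation in use.

```shell
#!/bin/sh
# mss_for_mtu MTU [IPSEC_OVERHEAD]
# Prints MTU - (40 + IPSEC_OVERHEAD), i.e. the formula from the post.
# 40 = 20-byte IPv4 header + 20-byte TCP header (no options).
# IPSEC_OVERHEAD defaults to 0 (no tunnel).
mss_for_mtu() {
    mtu=$1
    overhead=${2:-0}
    echo $(( mtu - 40 - overhead ))
}

mss_for_mtu 1500      # standard Ethernet MTU, no IPsec -> 1460
mss_for_mtu 1400      # MTU lowered to 1400, no IPsec   -> 1360
mss_for_mtu 1400 50   # assumed 50-byte IPsec overhead  -> 1310
```

Note that 1360 is exactly what an MTU of 1400 gives before any IPsec overhead is subtracted, which matches the 100-byte drop from 1460 mentioned above.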
I hope this is not going to be a long post, but I will try to explain this. There are indeed some MTU details you need to be aware of when working in Azure. By default, the Azure Virtual Network stack will attempt to fragment your packets at 1,400 bytes. This behavior is very well documented:
The default MTU for Azure VMs is 1,500 bytes. The Azure Virtual Network stack will attempt to fragment a packet at 1,400 bytes.
Note that the Virtual Network stack isn't inherently inefficient because it fragments packets at 1,400 bytes even though VMs have an MTU of 1,500. A large percentage of network packets are much smaller than 1,400 or 1,500 bytes.
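One rough way to see whether a path is affected is to ping with the Don't Fragment bit set at different packet sizes. The sketch below assumes a Linux iputils ping (where -M do prohibits fragmentation); the function names and the 192.0.2.10 address are placeholders, not anything from this thread.

```shell
#!/bin/sh
# icmp_payload_for SIZE
# Prints the ICMP payload size that yields an IPv4 packet of SIZE bytes
# (20-byte IPv4 header without options + 8-byte ICMP header = 28 bytes).
icmp_payload_for() {
    echo $(( $1 - 28 ))
}

# probe_path HOST SIZE
# Sends 3 echo requests of total IP size SIZE with DF set.
# Assumes Linux iputils ping; other ping implementations use different flags.
probe_path() {
    ping -M do -c 3 -s "$(icmp_payload_for "$2")" "$1"
}

# Example usage (placeholder address):
#   probe_path 192.0.2.10 1500   # full-size packet
#   probe_path 192.0.2.10 1400   # size at which Azure starts fragmenting
```

If the smaller size gets replies and the larger one does not, that is a hint that something on the path is dropping or failing to fragment large packets. It is not proof on its own, since ICMP may be filtered somewhere along the way.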
From my experience and knowledge of MTU and Azure (trust me, I've been on that platform since 2011), you shouldn't change the MTU of the interfaces, as it has broader impacts. I have followed several guides on this matter, trying to reach some sort of conclusion for my own environment as well. My scenario was very specific: one VPN tunnel with main connectivity over ExpressRoute plus failover on the Internet IP, and a second VPN tunnel over the Internet with another gateway.
Due to the nature of the network, VTIs over IPsec were used to exchange routing information with BGP, so naturally we had to adjust the MSS. The first thing to do on the Check Point gateways is to set the MTU to 1400 on your tunnel interfaces, not on the "physical" Ethernet interface. After this has been set, check that the following parameters are set to 1 (or enabled, as they say):
fw ctl get int sim_ipsec_dont_fragment -a
fw ctl get int sim_clamp_vpn_mss -a
fw ctl get int fw_clamp_vpn_mss -a
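If you need to repeat this check on several gateways or cluster members, the three commands above can be wrapped in a small loop. This is only a convenience sketch: the function name check_clamp_params is made up, and it does nothing beyond running the same fw ctl get int commands in sequence.

```shell
#!/bin/sh
# check_clamp_params
# Runs the three "fw ctl get int" queries listed above, one per parameter.
# Must be run in expert mode on the gateway, where the fw command exists.
check_clamp_params() {
    for param in sim_ipsec_dont_fragment sim_clamp_vpn_mss fw_clamp_vpn_mss; do
        fw ctl get int "$param" -a
    done
}

# Example usage:
#   check_clamp_params
```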
Once this has been verified, open GuiDBedit, select the network objects, and fine-tune the mss_value for the VTI interface on both cluster members and on the gateway cluster where the VTIs are created. The value I used is 1350, as described in the Microsoft article.
Hope this is clear and helps someone in setting up their environment.