Hey guys,
Apologies if I posted this in the wrong section. Just wanted to get some thoughts/opinions on this subject. First off, sorry for not having a diagram, which would probably explain things a bit better, but I will do my best to make it as clear as I can.
So we have a client that already has 4 route-based tunnels with Azure and they all work fine, no issues. We had a requirement to set up a brand new one, which we did a couple of months ago, but we have been struggling ever since to get traffic through it working properly.
Here is the catch though... Not sure how many of you are familiar with the ExpressRoute mechanism (in layman's terms, it's essentially a way of extending your internal networks to the cloud via a private connection), and that's what comes into play here. The customer has BGP configured and LOTS of routes are learned dynamically.
What they wanted to do with the brand new tunnel is this: have, say, the 10.199.0.0/16 subnet go through ExpressRoute and then have a smaller subset of that (10.199.4.0/24) go through the S2S tunnel. OK, great... We had calls with Azure support a few months ago and they confirmed it would work. Then we set up the tunnel and realized, guess what, it does not work :-(
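For anyone not used to this kind of split, here is a rough sketch of the idea on the gateway side: generic longest-prefix matching, where the more specific /24 wins over the /16. This is only an illustration in Python. The route table and the next-hop labels (`expressroute`, `s2s-vti`) are made up, and it says nothing about how Azure picks the return path, which is exactly where we got bitten.

```python
import ipaddress

# Hypothetical route table: (prefix, next-hop label).
# The labels are illustrative only, not real interface names.
ROUTES = [
    ("10.199.0.0/16", "expressroute"),
    ("10.199.4.0/24", "s2s-vti"),
]

def pick_route(dst, routes=ROUTES):
    """Return the next-hop label of the most specific prefix containing dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, hop in routes:
        net = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix length.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None

if __name__ == "__main__":
    for dst in ("10.199.4.10", "10.199.7.1", "192.168.1.1"):
        print(dst, "->", pick_route(dst))
```

With this model, a host in 10.199.4.0/24 is reached via the tunnel while the rest of the /16 stays on ExpressRoute. The other side has to make the matching decision for return traffic, otherwise the flow is asymmetric.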
At this point, they opened an Azure support case and I also had a ticket open with TAC that went to the escalation team. The guy was excellent; we did a bunch of checks and confirmed that traffic going to Azure was going through the tunnel, but on the way back it was hitting the ExpressRoute interface and NOT the desired VTI interface. With that proven, the customer had to go back to Azure support and take this further with them, which they did, and they were advised to modify some VNet routing via the portal to try to make this work.
Again, apologies as I don't have exact info on what they changed on the Azure side (will try to get that), BUT here is where I need some thoughts. Last night we tested the connection, and pings worked fine from the Azure host to the inside host behind the CP gateway, but RDP was failing. We did some debugging and it looked like an IPS issue, but even after adding a global exception, it still failed. TAC joined the call and as soon as we disabled SecureXL (sxl), RDP started working, which I guess makes sense, since ICMP would not be subject to the SecureXL mechanism, so it would take the F2F path, if that's the right term.
Now, here is the "kicker". Before the TAC guy joined, we saw a bunch of anti-spoofing drops, and when we added the 10.199.4.0/24 subnet to the anti-spoofing exempt group on the external interface, that worked as well.
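To make it clearer why the exemption could help, here is a very simplified model of how I understand the anti-spoofing check on an external interface: a source address that the topology places behind some other interface gets dropped unless it is in the exempt group. This is just a Python sketch under that assumption. The interface names and the 192.168.10.0/24 network are invented, and the real gateway logic is certainly more involved.

```python
import ipaddress

# Hypothetical topology: networks defined behind non-external interfaces.
# Interface names and the internal subnet are made up for illustration.
NETS_BEHIND_OTHER_INTERFACES = {
    "vti-azure": ["10.199.4.0/24"],
    "eth1-internal": ["192.168.10.0/24"],
}

def external_antispoof_verdict(src_ip, exempt=()):
    """Simplified verdict for a packet arriving on the external interface."""
    addr = ipaddress.ip_address(src_ip)
    # Sources in the exempt group skip the check entirely.
    if any(addr in ipaddress.ip_network(n) for n in exempt):
        return "accept"
    # A source that belongs behind another interface looks spoofed here.
    for nets in NETS_BEHIND_OTHER_INTERFACES.values():
        if any(addr in ipaddress.ip_network(n) for n in nets):
            return "drop"
    return "accept"

if __name__ == "__main__":
    print(external_antispoof_verdict("10.199.4.10"))
    print(external_antispoof_verdict("10.199.4.10", exempt=["10.199.4.0/24"]))
```

In this model, return traffic from 10.199.4.0/24 that shows up on the wrong interface is dropped until the subnet is exempted, which matches what we saw.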
Here is where I'm sort of confused... If this is indeed a SecureXL issue, then how come those other 4 tunnels work fine and there are no traffic issues with any of them? TAC suggested we also consider the steps in the SK below, but personally I don't feel too "cozy" about it, especially considering it's an S1C instance, so we don't even have SSH access to the mgmt server.
https://support.checkpoint.com/results/sk/sk104468
The customer wanted to know if it's safe to leave that Azure subnet in the anti-spoofing exempt group. I agree with TAC on this one: I don't see why not, as the traffic comes over a VPN tunnel anyway, so it's not as if you are exempting known malicious external IP ranges.
Anyway, sorry for rambling on too much about this, but it would be nice to hear what you guys think. Should we suggest making the SecureXL modifications, or simply leave that subnet in the anti-spoofing exempt group on the external interface?
In the meantime, I will confirm with the customer the exact change they made on the Azure side.
Thanks as always! 🙌🙌