AWS CloudGuard Multiple Static NAT rules
All, please assist with this.
I am about 90% there with my CloudGuard configuration and seem to be stuck at the last hurdle.
Here's what I have; I am sure it's something straightforward for one of the gurus on here.
Internal network 10.99.1.0/24 - private
External network 10.99.0.0/24 - public
Checkpoint eth0 has primary and secondary IPs 10.99.0.230 & 10.99.0.235
each has an EIP (elastic IP address) associated with it.
Checkpoint has eth1 assigned single IP address 10.99.1.230 in private
Route tables are set:
Public 0.0.0.0/0 through the AWS Internet Gateway
Private 0.0.0.0/0 through eth1 of Checkpoint
I have Hide behind Gateway set as NAT for Checkpoint gateway object
I have a manual static NAT rule for an internal Host 10.99.1.x to NAT to a cloned host object with the secondary EIP (which is assigned to eth0 of checkpoint) set as the translate address.
I have the opposite rule set to translate back from the public secondary IP to the internal host.
I have a policy rule which Accepts traffic from the secondary external IP address to Any.
When I delete the NAT rule I can access the internet from the internal host (NATed through the Gateway Public IP address). With the Static NAT rule active it's not returning anything, although I see the traffic from the internal host hitting the firewall and an Accept entry in the Log - just nothing seems to come back.
What have I missed?
So I needed to set the NAT to the internal secondary address of the eth0 interface and it worked.
AWS converts this to the external Public IP when it leaves the checkpoint destined for the Interwebs.
The problem I have now, though, is a VPN tunnel I need to create between my gateway and the destination gateway, with a public IP address used inside the tunnel for the host traffic that gets encrypted over it.
So, I NAT 10.99.1.10 to 10.99.0.235 --> this is sent out from the eth0 as 10.99.0.235 - which is mapped by AWS to 52.56.x.x --> traffic leaves via the AWS Internet Gateway and heads out. Traffic comes back to 52.56.x.x and gets sent to 10.99.0.235 (the checkpoint) where it gets translated (NAT) to 10.99.1.10 - Success.
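To make the two translation stages above easier to follow, here is a minimal Python sketch of the path. It is only a toy model, not anything Check Point or AWS provides: the dictionaries and function names are made up for illustration, and 203.0.113.10 is a documentation address standing in for the redacted 52.56.x.x EIP.

```python
# Assumption: 203.0.113.10 is a documentation address standing in for 52.56.x.x.
SECONDARY_EIP = "203.0.113.10"

# Static NAT configured on the gateway: internal host -> secondary private IP on eth0.
GATEWAY_STATIC_NAT = {"10.99.1.10": "10.99.0.235"}
# 1:1 mapping applied on the AWS side: private IP -> Elastic IP.
AWS_EIP_MAP = {"10.99.0.235": SECONDARY_EIP}


def outbound_source(src):
    """Return the source address as seen at each hop on the way out."""
    after_gateway = GATEWAY_STATIC_NAT.get(src, src)
    after_aws = AWS_EIP_MAP.get(after_gateway, after_gateway)
    return [src, after_gateway, after_aws]


def inbound_destination(dst):
    """Return the destination address as seen at each hop on the way back in."""
    aws_reverse = {eip: private for private, eip in AWS_EIP_MAP.items()}
    nat_reverse = {natted: host for host, natted in GATEWAY_STATIC_NAT.items()}
    after_aws = aws_reverse.get(dst, dst)
    after_gateway = nat_reverse.get(after_aws, after_aws)
    return [dst, after_aws, after_gateway]


if __name__ == "__main__":
    print(" -> ".join(outbound_source("10.99.1.10")))
    print(" -> ".join(inbound_destination(SECONDARY_EIP)))
```

The point of the model is that the firewall itself only ever sees 10.99.0.235; the public address exists only on the AWS side of the interface.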
Now, sending traffic over the IPsec VPN using Phase I and Phase II IKE v2: the tunnel is established between 35.177.x.x (my GW) and 198.x.x.x (destination gateway), and then the traffic is encrypted at the firewall with 10.99.0.235 (10.99.1.10 is NATed) and sent across the tunnel as 52.56.x.x. The destination sees this and expects 52.56.x.x to have encrypted the traffic, but it was encrypted using 10.99.0.235 and only afterwards does AWS map 52.56.x.x to that packet. 52.56.x.x is in the destination encryption domain but 10.99.0.235 is not (we did not use these addresses due to a possible CIDR range conflict at the destination network, so we want to use the public address).
How to encrypt the traffic at the firewall as 52.56.x.x instead of 10.99.0.235? In AWS the checkpoint never sees traffic destined for 52.56.x.x because it gets translated from that elastic IP (EIP) before hitting the firewall - send as 52.56.x.x receive as 10.99.0.235.
it's got me stumped.
Sorry, it is difficult to keep track of all the permutations without a diagram, or maybe it's just Friday :)
Anyway, you should, theoretically, be able to include a fake public IP address in your encryption domain and NAT it to a legitimate 10.X.X.X address.
If you are trying to use the actual EIP attached to the ENI, this will likely fail, as the firewall will not be able to perceive it as being both the external and the internal address in its topology.
can you explain a little more about the fake IP? I will try to explain more clearly how this "seems" to be working.
The two gateways (AWS and Prem) can communicate and establish tunnel. The IP address of AWS gw is 10.99.0.230 which is how checkpoint and AWS recommend configuring it - the external IP is added at the interface by AWS so 10.99.0.230 becomes 35.177.x.x. NAT on this gateway is set to Hide behind Gateway external IP.
Link Selection for the AWS gw is set to Always use this IP - 35.177.x.x
The internal host 10.99.1.10 is NATed to a second IP address on the same interface (AWS allows you to add multiple network interfaces to the same "physical" device). So the EC2 instance (VM) has two IP addresses on its eth0 interface - 10.99.0.230 (mapped/EIP to 35.177.x.x) and 10.99.0.235 (mapped/EIP to 52.56.x.x). On the gateway this is set up as an alias on eth0, so it appears as eth0:1 in the interface table of the gateway. So 10.99.1.10 is getting NATed to 10.99.0.235, which in turn is associated with 52.56.x.x when it leaves the interface. AWS does the association of the EIP to the internal IP.
Now, I can ssh to this internal host 10.99.1.10 using the EIP 52.56.x.x (I have allowed this in Firewall Policy and NAT is obviously configured correctly).
When I add this external IP 52.56.x.x to the encryption domain on the remote gateway, the negotiation and encryption never completes. What I am trying to do is map 1:1 the internal host to a public IP so devices on a conflicting CIDR range at the remote site, can see and connect to this host using the public IP - and the local host can communicate to devices with a similar public 1:1 NATing on that network too.
What I have been told by Checkpoint support is that traffic from the local host 10.99.1.10 (which is being NATed to 10.99.0.235 : 52.56.x.x) is encrypted outgoing with the 10.99.0.235 address at the checkpoint, and then 52.56.x.x is associated by AWS when it leaves the interface. The remote side receives the packet from 52.56.x.x over the tunnel to their remote/prem gateway, then tries to decrypt the packet, which was encrypted with the 10.99.0.235 address, and that address is not in their encryption domain.
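The mismatch support describes can be sketched as a simple membership test. This is a rough illustration only, assuming the remote peer lists just the public address as what it accepts from this side; 203.0.113.10 is a documentation address standing in for the redacted 52.56.x.x, and the names below are hypothetical.

```python
import ipaddress

# Assumption: documentation address standing in for the redacted 52.56.x.x.
PEER_ACCEPTED_NETWORKS = [ipaddress.ip_network("203.0.113.10/32")]


def peer_accepts(source_ip):
    """True if the address the packet was encrypted with is in the peer's list."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in PEER_ACCEPTED_NETWORKS)


if __name__ == "__main__":
    # Address the firewall actually encrypts with after the static NAT.
    print("10.99.0.235 accepted:", peer_accepts("10.99.0.235"))
    # Address the peer was told to expect.
    print("203.0.113.10 accepted:", peer_accepts("203.0.113.10"))
```

In other words, the address that matters for the encryption domain is the one the firewall holds at encryption time, not the one AWS substitutes afterwards.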
Like I said above, if I send packets from internet to that host 10.99.1.10, it works, because the NAT/EIP association is happening at the interface on the way in. But on the way out, it doesn't get associated until it leaves the interface (or on the interface - it's not clear in AWS documentation I read exactly how that works).
I tried NATing 10.99.1.10 to 10.99.0.235 and then adding a static NAT from 10.99.0.235 to 52.56.x.x but this didn't help. I have a host configured with 10.99.0.235 which I use for the NAT and also a host with 52.56.x.x which I tried using for the NAT - but I think this fails because the return packet gets converted at AWS side to 10.99.0.235 before it hits the firewall on the way back, something like this:
10.99.1.10 to gw --> 10.99.0.235 NAT --> 52.56.x.x NAT on gw --> (packet leaves gw as 52.56.x.x) ->destination
destination --> 52.56.x.x (aws) --> by AWS to 10.99.0.235 (packet arrives back at gw as 10.99.0.235) -X-> 10.99.1.10
Does that make any more sense? So what I need is a method of adding that second public IP 52.56.x.x and receiving it back on that address and not 10.99.0.235. I am not sure if this is just a quirk of AWS and is not going to work with this use case.
"So the EC2 instance (VM) has two IP addresses on its eth0 interface - 10.99.0.230 (mapped/EIP to 35.177.x.x) and 10.99.0.235 (mapped/EIP to 52.56.x.x).", then it is using AWS' IGW for the gateway, and not Check Point vSEC.
There should not be any EIPs attached to the server you are trying to secure (the exception being a temporary public IP that you can use for out-of-band console access during preliminary configuration and testing of concepts).
If you want to use a public IP for your server as well as have it addressed by a public IP via VPN, you should use an additional public IP.
In the diagram below, I am using "fake_IP" and "Fake_IP_Net", which could be any public IPs that do not belong to you and will not interfere with normal Internet access. I.e., I would not use 184.108.40.206, 220.127.116.11, etc.
Should you want to play it safe, you can reserve an EIP from AWS but not actually attach it to anything, and use it for both "Fake_IP_Net/32" and the "fake_IP".
52.56.x.x is only used for accessing the server from the Internet.
Fake_IP_Net could be any public IP network (maybe with a /30 mask) that you are reasonably sure will not conflict with your normal Internet usage.
fake_IP is a single IP from the Fake_IP_Net.
Local ED is the vSEC's Encryption Domain containing only Fake_IP_Net.
Remote ED is your Peer Gateway's Encryption Domain
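As a sanity check on the definitions above, the following Python sketch validates a candidate fake_IP / Fake_IP_Net pair before it goes into the encryption domain. It is an illustration under assumptions: the function name is hypothetical, and 203.0.113.8/30 and 198.51.100.0/24 are documentation ranges used as placeholders for whatever public range and remote encryption domain you actually pick.

```python
import ipaddress

# Private ranges that the fake network must stay clear of (RFC 1918).
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def check_fake_ip_plan(fake_ip, fake_ip_net, remote_ed):
    """Return a list of problems found with the proposed fake IP plan."""
    ip = ipaddress.ip_address(fake_ip)
    net = ipaddress.ip_network(fake_ip_net)
    remote = [ipaddress.ip_network(n) for n in remote_ed]
    problems = []
    if ip not in net:
        problems.append("fake_IP is not inside Fake_IP_Net")
    if any(net.overlaps(private) for private in RFC1918):
        problems.append("Fake_IP_Net overlaps a private range")
    if any(net.overlaps(r) for r in remote):
        problems.append("Fake_IP_Net overlaps the remote encryption domain")
    return problems


if __name__ == "__main__":
    # Placeholder values only; substitute your own ranges.
    print(check_fake_ip_plan("203.0.113.10", "203.0.113.8/30", ["198.51.100.0/24"]))
    print(check_fake_ip_plan("10.99.0.235", "10.99.0.232/30", ["198.51.100.0/24"]))
```

An empty list means the plan passes these three checks; it does not prove the range is free of conflicts with your normal Internet usage, which still needs a human decision.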
I think I may have confused the issue with my terminology. Just for clarification, when I mentioned two interfaces attached to the EC2 instance, I meant the EC2 instance on which the Checkpoint GAIA vSec (Cloudguard) is installed - not the internal host.
So, I followed your instructions and disassociated the EIP from the secondary interface on the checkpoint. I also created a host on the Checkpoint with the public IP address 52.56.x.x (now not associated with anything, so I guess it fits your fake_IP requirement), set up a NAT rule from 10.99.1.10 to 52.56.x.x, and created an automatic NAT rule for the Internal Network 10.99.0.0/16 NATed behind the firewall primary IP.
I can ping the internet from the internal host 10.99.1.10 (to 18.104.22.168) successfully.
When I ping a device on the other side of the VPN tunnel, it starts the tunnel and encrypts a packet - sends the packet but nothing ever gets returned. I look at the encrypt log entry and it seems to be encrypting with 10.99.1.10 and not 52.56.x.x - I am guessing this packet is dropped at the destination because it isn't in the encryption domain there (although 52.56.x.x is in the encryption domain).
Checkpoint engineers seem lost and out of their element. I am going to raise another ticket with AWS and try to get both parties on the line to work out how this should be configured. I can't be the only person in the world trying to do this on AWS ... surely?
Let's try to figure it out.
Instead of "a host on the Checkpoint with the public IP address 52.56.x.x", please create a network with a /32 mask. The reason is that CP is by default configured for subnet pairs in VPN communities, which is the way you want it to be:
Add this network to the vSEC's Encryption Domain, not the host.
And when you are saying that "and set up a NAT rule from 10.99.1.10 to 52.56.x.x", do those look like:
And are not superseded by any other rules higher-up that may contradict this behavior?
Once you have it set up in this fashion, please kill the tunnel (using the vpn tu command) and try to re-establish it.
Check the logs pertaining to the key exchange and tunnel setup and see what subnets are being listed there.
Additionally, on the remote peer, please define the same subnet as part of the vSEC's encryption domain, and not the 52.56.x.x host.
As to ping in particular: it's fine for the purpose of monitoring the establishment of a tunnel, but you have to be sure that ICMP is permitted all the way if it is your primary diagnostics tool.
thank you for taking the time to look into this. I have created the /32 network with the fake IP and added this to the encryption domain. I need to contact remote end to have them add as network and not host.
For clarification, on the NAT rule, are you saying change the NAT 10.99.1.10 to the Fake_IP /32 network or still to the fake_IP_host? And the reverse rule. (both outgoing and incoming NAT rules are at the top of the NAT rules table).
I am getting the illustrated behaviour (see images) when I set the FAKE_IP_NET on the NAT. Which looks like it's almost there?
It shows the source as 10.99.1.10 but in the XlateSrc it now shows the 52.56.x.x address. I presume this is now just being dropped at the remote end due to the fake_host_IP rather than fake_network being added? I'll contact the remote end and see what they see on their logs.
Also, the host I am pinging - I have been informed by remote end that this host will respond to the ping and traffic will traverse their firewall correctly.
Now, this looks better:)
In regards to "For clarification, on the NAT rule, are you saying change the NAT 10.99.1.10 to the Fake_IP /32 network or still to the fake_IP_host? And the reverse rule. (both outgoing and incoming NAT rules are at the top of the NAT rules table).":
In the NAT rules, you should use Fake_IP_Host object, not the network.
Network goes into the Encryption domain only.
Please let me know when you can test with the remote peer configured for the network in your vSEC's encryption domain.
we have just spent two hours trying to resolve this and work out what the issue is.
I am pasting a debug from the vSEC console so you can see the errors we are getting.
The encrypted packet is being dropped with the following error message:
;[cpu_3];[fw4_0];fw_log_drop_ex: Packet proto=1 52.56.x.x:2048 -> 148.168.x.x:49900 dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;
;[cpu_3];[fw4_0];fw_log_drop_ex: Packet proto=1 52.56.x.x:2048 -> 148.168.x.x:8846 dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;
;[cpu_3];[fw4_0];fw_log_drop_ex: Packet proto=1 52.56.x.x7:2048 -> 148.168.x.x:27951 dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;
;[cpu_3];[fw4_0];fw_log_drop_ex: Packet proto=1 52.56.x.x:2048 -> 148.168.x.x:34512 dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;
;[cpu_3];[fw4_0];fw_log_drop_ex: Packet proto=1 52.56.x.x:2048 -> 148.168.x.x:55153 dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;
;[cpu_3];[fw4_0];fw_log_drop_ex: Packet proto=1 52.56.x.x:2048 -> 148.168.x.x:12819 dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;
So the packet is being encrypted "correctly"? but the firewall does not think it has a route to destination 148.168.x.x (which should be inside the tunnel and is included in the VPN Domain of the interoperable device - the on-prem gateway - as a network range in the topology settings).
The checkpoint knows it is in the tunnel, and the encryption domain, which is why it's encrypting the traffic, then it seems to lose the plot and forget where it has to send the encrypted packet.
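For anyone collecting a lot of these drop lines, a small Python sketch like the one below can group them by chain and reason. The regular expression is an assumption based only on the lines pasted above, not on any documented log format, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the pasted drop lines; treat it as a best guess.
DROP_RE = re.compile(
    r"fw_log_drop_ex: Packet proto=(?P<proto>\d+) "
    r"(?P<src>\S+):(?P<sport>\d+) -> (?P<dst>\S+):(?P<dport>\d+) "
    r"dropped by (?P<chain>\S+) Reason: (?P<reason>.+?);\s*$"
)

SAMPLE = [
    ";[cpu_3];[fw4_0];fw_log_drop_ex: Packet proto=1 52.56.x.x:2048 -> 148.168.x.x:49900 dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;",
    ";[cpu_3];[fw4_0];fw_log_drop_ex: Packet proto=1 52.56.x.x:2048 -> 148.168.x.x:8846 dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;",
]


def summarize_drops(lines):
    """Count drop lines per (chain, reason) pair, ignoring lines that do not match."""
    counts = Counter()
    for line in lines:
        match = DROP_RE.search(line)
        if match:
            counts[(match["chain"], match["reason"])] += 1
    return counts


if __name__ == "__main__":
    for (chain, reason), count in summarize_drops(SAMPLE).items():
        print(f"{count} x {chain}: {reason}")
```

A summary like this makes it quick to confirm that every drop shares the same reason, which is useful to hand to support alongside the raw output.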
Could this be something quirky on the AWS side or entirely on the Checkpoint config? Did I miss something somewhere?
I do not believe this has anything to do with AWS. This looks to me like a strictly Check Point-related problem.
The only mention of this error though is in the EndPoint VPN scenarios: "dropped by vpn_encrypt_chain Reason: encryption failure: could not get route params;" error message...
If you are working with TAC, let them know about this sk; it may help them pin down the problem.
Do you perhaps have the topology of the Interoperable device defined in its object properties on the Check Point side (i.e. actual interfaces)?
If the answer to the above is "Yes", please clear it out. The only properties required to be defined for it are its external address, VPN domain containing the network in question and VPN community it belongs to.
Remember to clear the tunnel each time you are making changes (using vpn tu).
I would like to see which subnet pairs are being listed in the logs during VPN initialization.
I think we finally have it sorted out! After running through your recommendations and making all the changes, we still could not get the final piece tied up. Then a checkpoint engineer suggested un-checking the NAT Traversal (Support NAT Traversal) setting on our checkpoint gateway and everything started flowing!
We can now ping and send/receive https requests from the AWS network into the on-prem network using the public IP addresses.
Now we just need to test the final piece which is to get the incoming traffic (from on-prem users) contacting the devices inside our AWS network using the public IP addresses.
Thank you so much for the help - all of this is now being documented so we can replicate it in production.
Glad to be of help and am happy you are getting closer to nailing it:)
If I may make a suggestion: include in the documentation filtered logs depicting the normal tunnel setup process (if possible, from both sides), as well as output from tcpdump and fw monitor (with fwaccel off) from the vSEC, and tcpdump from the server behind the vSEC.
Hopefully you will never have to use it, but if you do, having a baseline for comparison is always nice.