Re: Can't monitor secondary node over IPSec tunnel

Johannes_Schoen · ‎2019-02-21

Dear Community,

As an ISP, we monitor our customer environments through IPSec tunnels from our datacenter.

I don't know why, but two of our Check Point installations behave strangely: I cannot access the secondary node through IPSec, while other sites with the same design work well. One troublemaker runs an old VRRP cluster (R77.30), the other one is a ClusterXL cluster (R80.20).

This is the general setup:

The monitoring server is able to contact the MGMT VIP and node one, but obviously we need to monitor the second node as well.

The kernel parameter "fwha_forw_packet_to_not_active" is set to yes on both nodes, but packets are getting dropped as "received unencrypted packet...should be encrypted". I also tried a hide NAT with a dummy IP to masquerade the access to the second node, as if it were sourced from that dummy IP, but that didn't work either.

I can't find the point I'm missing here - hopefully the community can help?

Best Regards

Johannes

Maarten_Sjouw · ‎2019-02-21

For VRRP there are 2 settings in the dash, on the cluster object, forward to cluster member and hide behind cluster IP:

Turn them both off and these types of issues no longer happen.

For the ClusterXL, double-check that the correct member is dropping the traffic.

Regards, Maarten

Johannes_Schoen · ‎2019-02-21

Dear Maarten,

many thanks for your response, I wasn't aware of that VRRP setting.

But unfortunately that doesn't work; I still get this error (on the second node):

Maarten_Sjouw · ‎2019-02-21

Then you still have a problem with either that community (maybe an excluded service) or the VPN topology is not correct.

Regards, Maarten

Johannes_Schoen · ‎2019-02-21

Doublechecked that.

VPN domain monitoring site: <monitoring-dmz-net>

VPN domain production site: <management-net>

No natting between both of them

No excluded service in the domain.

The strange thing is that we have working sites with the same configuration, so I'm wondering why that "unencrypted received...expected encrypted" occurs.

G_W_Albrecht · ‎2019-02-21

This shows that the ping is not sent through the VPN tunnel...

CCSP - CCSE / CCTE / CTPS / CCME / CCSM Elite / SMB Specialist

Johannes_Schoen · ‎2019-02-21

I can see on the interoperable IPSec device that the monitoring traffic for the primary and secondary node is sent into the tunnel.

Communication to the primary node is working well, so I guess it is not a routing issue.

I think the primary Check Point node terminates the IPSec tunnel, removes the encryption (as expected) and sends the data to the secondary node over the nearest interface (in this case the management interface). Somehow the secondary node expects the traffic to be encrypted anyway.

Or do you mean I need two VPN tunnels to the Check Point nodes? (That is possible thanks to the "we-need-3-public-ips-even-if-we-use-only-one-thing".)

Maarten_Sjouw · ‎2019-02-21

Indeed, there is no way to do this. As you say, the traffic is decrypted on the active node, forwarded to the backup, and dropped there because it is cleartext.

You cannot build separate tunnels to the different members.

Regards, Maarten

G_W_Albrecht · ‎2019-02-21

Maybe the workaround from sk106425 can help?


Johannes_Schoen · ‎2019-02-21

Thanks for the suggestion, but the SK describes drops due to address spoofing. I think I have a different problem here.

Maarten_Sjouw · ‎2019-02-21

The spoofing error is due to the fact that the traffic comes from the other member. As I said before, look at the incoming interface on which the traffic is being dropped.

Regards, Maarten

Johannes_Schoen · ‎2019-02-21

To be honest, I don't understand what you mean or are referring to.

The incoming interface for the monitoring traffic on the second node (which drops that traffic) is the management interface with IP <mgmt-subnet>.3. The topology is defined by IP and subnet, and anti-spoofing is set to detect for that interface.

The firewall detects address spoofing, but the drop reason is the one shown in the screenshot above.

If you really cannot monitor the hardware through VPN, we cannot recommend Check Point products to any customer.

I don't know of any other relevant firewall vendor where that issue is expected behavior (without setting options that decrease security). Please correct me if I'm wrong.

G_W_Albrecht · ‎2019-02-21

Why not open a ticket with TAC? Maybe we have different "# fw ctl get int fw_allow_simultaneous_ping" settings here, or just different routing! Monitoring of hardware mostly uses SNMP, not ICMP...


Johannes_Schoen · ‎2019-02-21

Yeah, I guess that is the next step.

It's the same when using SNMP, but a continuous ping from the command line is less distracting for our colleagues than having our monitoring system send SNMP requests that result in an error state.

Maarten_Sjouw · ‎2019-02-21

What I mean by the interface is this: when the traffic comes through the VPN tunnel to FW1 on its way to FW2, it is sent from FW1 to FW2 over a specific interface. So look at the details of the log entry: on which interface does the traffic come in on FW2, where it is being dropped? The spoofing could be because the traffic is coming in over an interface where it should not normally enter.

Let's say you use the eth6 interface for monitoring and the interface that the VPN terminates on is eth1. You do not have the monitoring system as a network you would expect to enter on eth6; you expect it to enter on eth1. But as FW1 decrypts the traffic and forwards it to FW2's eth6, FW2 would not accept this traffic there and would drop it on spoofing.
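To make the eth1/eth6 example concrete, here is a minimal Python sketch of an anti-spoofing style check. It is only an illustration: the topology table, the addresses and the function name are made up for this example and do not reflect how the gateway implements the check internally.

```python
import ipaddress

# Hypothetical per-interface topology on FW2 (all addresses are invented).
# Each interface lists the source networks that are expected to arrive on it.
TOPOLOGY = {
    "eth1": ["198.51.100.0/24"],  # side the VPN terminates on; monitoring net expected here
    "eth6": ["10.0.6.0/24"],      # monitoring/management link between the members
}


def passes_anti_spoofing(src_ip, in_interface):
    """Return True if src_ip belongs to a network expected on in_interface."""
    src = ipaddress.ip_address(src_ip)
    return any(src in ipaddress.ip_network(net) for net in TOPOLOGY[in_interface])


MONITORING_SERVER = "198.51.100.10"  # hypothetical monitoring server address

# Arriving where the topology expects it:
print(passes_anti_spoofing(MONITORING_SERVER, "eth1"))  # True
# Decrypted by FW1 and forwarded to FW2 over eth6:
print(passes_anti_spoofing(MONITORING_SERVER, "eth6"))  # False
```

The second call is the case from the example: the source address is legitimate, but it shows up on an interface whose topology does not include it.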

Regards, Maarten

Johannes_Schoen · ‎2019-02-22

Ah, alright - I understand what you mean.

Right, the src:<monitoring-server> was received on the <mgmt> interface on the secondary node, so that could have been a problem.

I set a static route on the primary host saying dst:<secondary-mgmt>/32 interface <external>

Now the traffic is received on the secondary firewall on the <external> interface, but the error message stays the same.

I will keep you updated on how the ticket ends.

Johannes_Schoen · ‎2019-02-21

So it's not possible to monitor two hardware nodes through an IPSec tunnel with Check Point?!

Maarten_Sjouw · ‎2019-02-21

Not if they are in a cluster and the tunnel terminates on the cluster. Been there.

Regards, Maarten

Koby_Kagan · ‎2019-02-21

Hi Johannes,

I know there was an issue with "fwha_forw_packet_to_not_active" in R80.20.

A hotfix was released today on top of Jumbo Take 43.

I suggest opening a ticket with TAC and requesting this hotfix:

In the description, mention that you have R80.20 and that fwha_forw_packet_to_not_active is not working as before, and request the hotfix R80_20_t43_jhf_218_main.

I suspect it will solve your issue.

Thanks,

Koby

Norbert_Bohusch · ‎2019-02-21

Cool, we also encountered issues with this kernel parameter not working at a customer with R80.20.

Koby_Kagan · ‎2019-02-21

sk147493

G_W_Albrecht · ‎2019-02-22

This is the answer to the second part of the issue, the ClusterXL R80.20. Why this also occurs on the VRRP cluster R77.30 is still the question. Does this work on other VRRP R77.30 clusters?


Johannes_Schoen · ‎2019-02-22

I can't tell; we only have one VRRP cluster with R77.30 under our support.

Johannes_Schoen · ‎2019-02-22

Dear Koby,

thanks for that suggestion. I'm with the customer in two weeks, and we will try that out.

I'll keep you updated.

Johannes

Koby_Kagan · ‎2019-03-09

Hi Johannes,

You will need to use the kernel parameter "fwha_forw_packet_to_not_active=1", and in case of IPS drops (I would expect drops from IPS, as the cluster is not updating it about this traffic, per the info below) we need to set an exception.

So after enabling the parameter, we should look in the IPS logs for drops and set an exception.

As per R&D:

Currently APPI and IPS are not supported in R80.20 with these types of connections:
- Connections from/to Standby via IPSec tunnel
- Connections to Standby that are routed via the Active member

This limitation will be lifted in a future update. Till then the workaround is to add an IPS exception for connections to/from the gateway’s private IPs.

I hope this info is helpful.

Thanks,
Koby

JozkoMrkvicka · ‎2019-02-21

... or downgrade to R77.30 where "fwha_forw_packet_to_not_active" is working well (solution in place within our environment).

Kind regards,
Jozko Mrkvicka

LadislavNemecek · ‎2019-02-21

I would like to share our experience with this topic. The result is that we decided (were forced) to disable this "feature".

By default, as per design, it is not possible to access the standby node via IPSec, but CP implemented "fwha_forw_packet_to_not_active", which allows this so that you can monitor/access the standby node.

But since then we have experienced quite big issues (sporadically!) on VPN sites that implemented this. Symptoms: lost connections and dropped packets of regular traffic, with no explanation.

Long story short: the VPN sequence numbers get mixed up, and the gateway drops packets as an out-of-sequence VPN attack. Regular traffic and standby-node traffic use different index (sequence) numbers, and frames that arrived out of sequence were dropped. Quite funny: the more we focused on troubleshooting or monitoring standby/primary, the more out-of-sequence packets and the more drops we got...
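The effect can be sketched with a simplified sliding anti-replay window, in the spirit of the IPsec ESP sequence number check. This is only a toy model under stated assumptions: the class name, window size and sequence values are invented for illustration, and it does not claim to reproduce Check Point's actual implementation.

```python
class ReplayWindow:
    """Simplified sliding anti-replay window (illustrative only)."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.seen = set()  # accepted sequence numbers still inside the window

    def accept(self, seq):
        if seq > self.highest:
            # Newer than anything seen: slide the window forward.
            self.highest = seq
            self.seen.add(seq)
            self.seen = {s for s in self.seen if s > self.highest - self.size}
            return True
        if seq <= self.highest - self.size or seq in self.seen:
            return False   # too old for the window, or a duplicate
        self.seen.add(seq)
        return True


window = ReplayWindow()

# Regular traffic: the sender's counter is already at 1000.
active = [window.accept(seq) for seq in range(1000, 1005)]
# Standby-node traffic arriving with its own, much lower counter.
standby = [window.accept(seq) for seq in range(1, 4)]

print(active)   # [True, True, True, True, True]
print(standby)  # [False, False, False]
```

Packets numbered from a second, independent counter land far outside the receiver's window and are rejected, which matches the out-of-sequence drops described above.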

Check your logs; I bet you will find out-of-sequence drops as well.

Outcome of the discussions with CP:

- First of all, monitoring of the standby node is not supported.
- If you want, you can implement fwha_forw_packet_to_not_active, but it can cause this behavior.
- It works as designed.
- It can be resolved by disabling the "vpn out of sequence" attack prevention (verified).

Koby_Kagan · ‎2019-03-09

Regarding fwha_forw_packet_to_not_active in R80.20

This is the reason why the "vpn out of sequence" attack protection is preventing the traffic (below).

An update from R&D (they are planning to fix it):

Currently APPI and IPS are not supported in R80.20 with these types of connections:
- Connections from/to Standby via IPSec tunnel
- Connections to Standby that are routed via the Active member

This limitation will be lifted in a future update. Till then the workaround is to add IPS exception for connections to/from gateway’s private IPs.

JozkoMrkvicka · ‎2019-02-22

Or, if possible, change monitoring port to something very special (like tcp_43634) in order to exclude this port from VPN.

Kind regards,
Jozko Mrkvicka

Olavi_Lentso · ‎2019-03-01

Ladislav explained the topic very well, but I would like to add a few words.

This not-so-elegant design isn't specific to R80.x at all. We had similar sporadic VPN disruptions under R77.x, and CP support explained that the standby cluster member is not supposed to send any traffic, including response packets, to the VPN. At first we couldn't believe that this is considered acceptable design, but you learn something new every day.

Such a limitation should be mentioned in the VPN admin guide in capital letters and with many exclamation marks!

Any packets from the standby node addressed to any remote VPN encryption domain will trigger the standby firewall to start speaking directly to the remote VPN peer, and this activity can break existing tunnels between the remote Check Point VPN peer and the active node. The size of the impact is determined by tunnel granularity: the impact is biggest when a single tunnel is defined between the gateways and smallest when there is an SA pair for each host pair.
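The granularity point can be illustrated with a small Python sketch. It assumes, purely for illustration, that an SA is identified by a key derived from the tunnel granularity (gateway pair, subnet pair or host pair) and that existing flows sharing the key of the standby node's traffic are the ones at risk. All addresses and function names are hypothetical.

```python
import ipaddress


def sa_key(src, dst, granularity, prefix=24):
    """Hypothetical SA identifier for a flow under a given tunnel granularity."""
    if granularity == "gateway":
        return ("gateway-pair",)  # one tunnel for everything between the gateways
    if granularity == "subnet":
        def net(ip):
            return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
        return (net(src), net(dst))
    if granularity == "host":
        return (src, dst)
    raise ValueError(f"unknown granularity: {granularity}")


# Existing flows through the active node (invented addresses).
FLOWS = [
    ("10.1.1.10", "10.2.1.20"),
    ("10.1.1.11", "10.2.1.21"),
    ("10.1.2.10", "10.2.2.20"),
]
# Traffic the standby node sends from its own address to the remote site.
STANDBY_FLOW = ("10.1.1.3", "10.2.1.20")


def affected(granularity):
    """Flows that share an SA key with the standby node's traffic."""
    key = sa_key(*STANDBY_FLOW, granularity)
    return [flow for flow in FLOWS if sa_key(*flow, granularity) == key]


for mode in ("gateway", "subnet", "host"):
    print(mode, len(affected(mode)))
# gateway 3
# subnet 2
# host 0
```

In this toy model the standby node's negotiation collides with every flow when there is one tunnel per gateway pair, and with none when each host pair has its own SA.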

In an ideal world the standby node would route any traffic to remote encryption domains via the active node, as it knows it is in standby state, but the existing design is different and the standby node just tries to negotiate IKE and IPSec SAs in parallel.

As mentioned above, IPSec negotiation traffic from the standby node is detected as an IPSec replay attack by the remote CP peer, and the VPN will be down for some time.
