Vladimir
Champion

Problem accessing standby cluster member from non-local network

The log shows accepted traffic on SSH and 443; the cluster members are connected to a number of Cisco switches with VLANs in L2 mode.

There is no problem accessing both members from a directly connected network.

vMAC in the cluster object IS ENABLED.

Any suggestions will be appreciated.

Thank you.

27 Replies
AlekseiShelepov
Advisor

How about fwha_forw_packet_to_not_active? It helps in similar situations with ping as well.

Cluster debug shows "FW-1: fwha_forw_ssl_handler: Rejecting ssl packets to a non-active member" 

Try just entering # fw ctl set int fwha_forw_packet_to_not_active 1, and if it works, enable it on a permanent basis in fwkern.conf.

I hope there is still fwkern.conf on R80.10
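
For reference, a minimal sketch of what that looks like on the gateway (standard Gaia paths assumed - verify the fwkern.conf location on your version):

# apply on the fly (does not survive reboot)
fw ctl set int fwha_forw_packet_to_not_active 1
# confirm the current value
fw ctl get int fwha_forw_packet_to_not_active
# to persist across reboots, add this line to $FWDIR/boot/modules/fwkern.conf
fwha_forw_packet_to_not_active=1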

Vladimir
Champion

I'll give it a shot shortly and let you know if it works.

Thank you.

Vladimir
Champion

Kaspars,

The situation is as you described, i.e. accessing from the "other" side.

However, ssh and https traffic to the management interface of the standby member is logged as "accepted".

That being said, I'll try adding the route and see if it does the trick.

Thanks!

Kaspars_Zibarts
Employee

This is what happens if you don't have the specific /32 route to the standby. The packet arrives on the active firewall, gets accepted and needs to be forwarded to the standby box. It will do so based on topology, that is, via the outside interface. But when this packet arrives at the standby on the outside interface with a source IP from the inside network, it will get dropped as spoofed. Unless anti-spoofing is off :)

Vladimir
Champion

Ah, but I've run the infamous fw ctl zdebug drop on the standby member (not yet in production), and have not seen the drops there.
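
Roughly along these lines (the client IP below is just a placeholder):

# on the standby member, watch the drop debug for the client's address
fw ctl zdebug drop | grep 10.10.10.50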

Kaspars_Zibarts
Employee

Will try tomorrow in one of my clusters. Interesting. I just recall seeing spoofing drops in the logs. I'll let you know.

Kaspars_Zibarts
Employee

Here you go: fw monitor shows the packet being accepted on the incoming interface of the active cluster member, but no outgoing packet there.

fw ctl zdebug shows the drop on the active member.

Note that this is without enabling fwha_forw_packet_to_not_active. After I enabled it, everything started working immediately, without static routes. :)
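
For anyone repeating the check, a filter along these lines does it (a sketch - the client IP is a placeholder):

# on the active member, capture traffic to/from the client while it connects to the standby's real IP
fw monitor -e "accept src=10.10.10.50 or dst=10.10.10.50;"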

But I had seen your case not that long ago, and that's why I remembered the /32 option.

Sorry, I won't have much time to dig into my past notes on what happened there.

Vladimir
Champion

Thank you again for looking into it: I have not seen the drops because I was looking for them on the standby member, thinking that if I see "accepts" in the log, the primary was not dropping them.

Apparently it drops them on egress.

Of the two solutions available, which one do you like better?

AlekseiShelepov
Advisor

Hehe, I think I found exactly your situation :)

Connection from one side of the ClusterXL destined to the physical IP address of a non-Active cluster member on the other side of the ClusterXL fails - sk42733 

I just wanted to mention that this fix with fwha_forw_packet_to_not_active is also connected to many other possible issues:

Simultaneously pinging the cluster members and the VIP address...  

"Contract entitlement check failed" error on policy installation failure 

Cluster debug shows "FW-1: fwha_forw_ssl_handler: Rejecting ssl packets to a non-active member" 

"ERR_CONNECTION_REFUSED" error is displayed in web browser when connecting to Gaia Portal 

Updates For Anti-Virus/Anti-Bot/Application Control/URLF blades are not working on standby ClusterXL... 

So it might even be a "best practice" :) But I only have experience with versions before R80, so I don't know how it is there. Also, these "strange" routes might be a bit confusing for a new administrator, for example.

Vladimir
Champion

I agree that, given how many issues this setting resolves, it may make more sense to have it on by default, with an option to comment it out.

I wonder what the reason is for it not being the default, and what the possible side effects of setting it are.

Vladimir Yakovlev

973.558.2738

vlad@eversecgroup.com

Vladimir
Champion

In the process of deploying a new cluster of 15400s, I ended up using this suggestion.

Just wanted to tip my hat to you.

Thanks,

Vladimir

Huseyin_Rencber
Collaborator

I had a similar problem; it was solved after setting fwha_forw_packet_to_not_active to 1 with "fw ctl set int fwha_forw_packet_to_not_active 1". sk42733 was helpful.

Kaspars_Zibarts
Employee

You are probably trying to connect to an IP that's on the "other" side of the firewall and not a locally connected IP. If you have only one route to the firewall VIP, then the traffic will get dropped as spoofed between the firewalls. You can create a manual static route for the IP on the other side, pointing to the standby member's locally connected interface IP address.

Hope it makes sense :)

In other words, if the inside has member-act 1.1.1.1 and member-stb 1.1.1.2 with VIP 1.1.1.3, and the outside has IPs 2.2.2.1, .2 and .3, then to connect to 2.2.2.2 via the inside interface you will need to add a static /32 on the router: 2.2.2.2 next-hop 1.1.1.2.
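
On a Cisco IOS router, for example, that /32 would look roughly like this (just an illustration using the addresses above - use whatever device actually routes the inside segment):

! send traffic for the standby's far-side address straight to its near-side interface, bypassing the active member
ip route 2.2.2.2 255.255.255.255 1.1.1.2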

Or maybe I misunderstood the problem?

Vladimir
Champion

Kaspars and Alexei, thank you guys!

I ended up using /32 routes as per Kaspars's suggestion, since I recalled being in a similar situation before and this being the less intrusive solution.

I will keep Alexei's solution in my toolbox.

That said, I am still baffled as to why zdebug drop did not yield anything on the standby member when it was failing.

Regards,

Vladimir 

Kaspars_Zibarts
Employee

Jerry
Mentor

Got a similar story, where:

the standby device in an A/S HA cluster is accessible by IP, but the Gaia Portal (httpd2 daemon) isn't working at all.

I've regenerated the self-signed certs.

Still no go.

httpd won't start.

I can SSH to the standby device, but one thing isn't accessible: https/http, on 443 or any other port - the HTTP daemon simply won't start on that device (there have been no changes to that cluster for nearly a year in terms of PKI/FQDN etc.).

Any ideas, folks? Already got an SR with TAC :) just from yesterday.

Cheers

Jerry

Kaspars_Zibarts
Employee

I would say it's a separate issue, as you can SSH to the standby cluster member, so L3 routing is working end to end. There are multiple SKs regarding a dead httpd; it really depends on the SW version you are running and the actual errors you see. You might want to try this:

How to debug the Gaia Portal 
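
Basic first checks on the box itself before diving into the SK (a rough sketch - exact log locations vary by version):

# is the portal daemon running at all?
ps -ef | grep -i httpd
# look for recent httpd errors around the time it tries to start
grep -i httpd /var/log/messages | tail -n 50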

Jerry
Mentor

I did debug it, even with the Ottawa TAC engineer - still no joy :(

Cheers Kaspars, I do appreciate what you've written just now, but all of this is already known; we know it isn't a routing or access-list issue, but a dead httpd, as you called it.

No clue what the problem is, but that HA pair runs pretty much the latest "takes"...

Jerry
Kaspars_Zibarts
Employee

I just meant to say that this thread is probably irrelevant to your case, so it would be best to start a new one.

Jerry
Mentor

I did - nobody has replied yet...

Thanks. EOT.

Jerry
Johannes_Schoen
Collaborator

Hi all,

I found this SK, but the fwkern.conf option is not helping.
I'm on R80.20.

As soon as the cluster fails over, connecting to the standby node is not possible (neither ping, Gaia SSH, nor Gaia web).

I'm on a Windows host connected to a server segment (default route to the VIP), and behind the Check Point is the management network with both Check Point nodes (VIP + primary and secondary node addresses). So: <myserver> <servernet> <checkpoint> <mgmt-net>.

The Gaia management interface is set to the interface connected to the mgmt-net.

Any ideas?
The traffic log says accept for SSH/HTTPS, but unfortunately I cannot take tcpdumps because the device is not reachable.
(Physical console access is very complicated here.)

Best Regards
Johannes
Kim_Moberg
Advisor

Hi Johannes

You need to open a TAC case to get an accpack fix for this. I was told it will be included in the next ongoing take above take 50.

My TAC case is not closed yet because I am still missing another fw_wrapper.

For R80.30 take 8 I received both the accl pack and the fw HF to solve it.

I will keep you updated.

Best Regards
Kim
Johannes_Schoen
Collaborator

Hi Kim,

We are on R80.20 JHF Take 91 GA - should this be included?
Is it a SecureXL problem, judging by the accpack? It's a 3200 ClusterXL.

R80.30 will need to earn trust over the next months - currently I'm not happy with what I see on the community regarding R80.30.

Best Regards
Johannes
Kim_Moberg
Advisor

Hi Johannes,
I actually got it fixed on R80.20 and then lifted the version to R80.30.
I am not sure whether R80.20 JHF take 91 includes it; if not, you will need to raise a TAC case asking for a fix for "problem accessing standby cluster member from non-local network".

For me the problem is via site-to-site VPN, where I cannot access the standby member's SSH or WebUI. Ping works. Meanwhile I use the active member as a jump host, but this should be fixed. As far as I know R&D fixed it, but I don't know whether they will port it to the R80.20 GA version yet.

The issue is not related to the appliance type; it is ClusterXL in general. It worked in R80.10, but something changed in R80.20 and also in R80.30.

I was very interested in using Threat Extraction while users surf the internet. Very nice feature.
What issues are you concerned about in R80.30?
I am very satisfied with the version, and for me it is quite stable, so I am interested to hear your concerns about moving to R80.30.
Best Regards
Kim
Johannes_Schoen
Collaborator

Hi Kim,

The problem with accessing the standby over VPN is known, and I stopped looking for a solution after hearing several times that it's expected behavior. All my customers with IPsec VPNs between corporate sites will get another vendor, whose firewalls can be monitored by a remote monitoring system across the VPN. I found a temporary workaround using a dummy-NAT construction, but from time to time the workaround stops working and we receive monitoring alerts again... So it's expected behavior, and Check Point will not be sold to these kinds of customers.
In this case, the problem occurs without VPN: two networks are directly connected to the cluster, the jump host is in Net-A, and the CP management interface is in Net-B. The cluster members can ping each other over the sync and monitored interfaces, but SSH is not possible. We have several customers with ClusterXL, and I have never experienced that before.

As soon as you reboot the active node, you need to wait ~15 seconds, and then the secondary is reachable.

And regarding R80.30: we had so many issues when going to R80.20 that I will use the oldest still-supported software to avoid problems. E.g. for Endpoint on R80.30 there were a few interesting posts on CheckMates, which don't inspire trust in the version yet.

Best Regards
Johannes
Kim_Moberg
Advisor

A short update

Yesterday I deployed R80.30 ongoing take 71, which seems to have solved the issue, even though PTJ-1657 is not listed as an issue fixed in that take.

Best Regards
Kim
artem_kruhlyi
Contributor

Hi all,

I got the same issue accessing the standby cluster member from a non-local network. I did "fw ctl set int fwha_forw_packet_to_not_active 1", and now "fw ctl zdebug drop" shows "dropped by fwha_forw_flush_callback Reason: Successfully forwarded to other member" when I'm trying to reach the second node via SSH. However, I see no packets at the second node.
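
I'm watching on the second node with something along these lines (interface name and client IP are placeholders):

# on the standby member
tcpdump -ni eth0 host 10.10.10.50 and port 22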

 

Any ideas?

 
