RobJames
Explorer

Connection terminated before the gateway could make a decision

I have two 9300 gateways on R82 Take 44 in an Elastic XL active-active setup, and I'm having an odd issue with them.

When trying to connect to certain websites from inside the network, the connections just time out. Tested from home or on mobile, away from the network, the sites work fine, so I know it's a local issue.

In the logs I see an Accept log entry, but it also says "Connection terminated before the Security Gateway was able to make a decision: Insufficient data passed. To learn more see sk113479.". I've read that KB, which basically seems to say there's just not enough data to classify the site. Funnily enough, one of the affected sites was this community.checkpoint.com until a few days ago, when the site changed IP address; now it works. But it's also giving us grief with all kinds of random applications; even signing in to Gmail/Yahoo etc. fails now. One site we know fails is www.pingman.com, but it works externally just fine.

I logged a TAC case, and after doing some packet captures, we've confirmed the request/connection to the site(s) does leave the firewall, it shows:

  • TCP 3-way handshake completes successfully (SYN → SYN/ACK → ACK)
  • Client sends TLS Client Hello (SNI = community.checkpoint.com)
  • After 30 seconds (timeout), the client sends FIN, ACK
  • No Server Hello or TLS response is seen
  • This means the connection is closed very early, right after the Client Hello.
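
The stall described in the bullets above can also be reproduced from an inside client without a packet capture. Below is a minimal Python sketch, assuming only the standard library; the function name `probe_tls` and its return labels are made up for illustration, and certificate verification is switched off because only the handshake stage matters here.

```python
import socket
import ssl


def probe_tls(ip, sni, port=443, timeout=30.0):
    """Report which stage of a TLS connection fails.

    The return labels are hypothetical names chosen for this sketch.
    """
    try:
        # Stage 1: TCP three-way handshake (SYN, SYN/ACK, ACK).
        sock = socket.create_connection((ip, port), timeout=timeout)
    except OSError:
        return "tcp_failed"

    ctx = ssl.create_default_context()
    # Verification is disabled: we only care whether a Server Hello arrives.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        # Stage 2: wrap_socket sends the Client Hello with the given SNI
        # and waits for the server's side of the handshake.
        with ctx.wrap_socket(sock, server_hostname=sni) as tls:
            return "tls_ok " + str(tls.version())
    except socket.timeout:
        # TCP connected, but nothing came back before the timeout.
        return "tcp_ok_tls_timeout"
    except (ssl.SSLError, OSError):
        return "tcp_ok_tls_error"
    finally:
        sock.close()
```

Calling `probe_tls("108.139.10.75", "community.checkpoint.com")` from inside and outside the network would show whether the failure is the same `tcp_ok_tls_timeout` pattern seen in the capture.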

So TAC have basically said it's not a Check Point issue; they've said it's an upstream issue, and I have a case logged with our provider. It kind of feels like a routing issue where the initial connection goes through, but the Server Hello or TLS response goes who knows where. But I'm wondering if somewhere the connection is going to the other node in the Elastic XL cluster, and since that node has no record of the "conversation", it drops it. Maybe. Or it's an external routing issue.

It's worth noting that traceroutes to the affected sites all stop at one of the upstream provider's devices.

I'm at a loss and could do with any suggestions anyone might have.

Thanks

Bob.

12 Replies
the_rock
MVP Diamond

Hey Bob,

In simple terms, all that means is that somewhere along the way the 3-way handshake is NOT completing, but the problem is not on the firewall side.

Best,
Andy
"Have a great day and if its not, change it"
the_rock
MVP Diamond

Btw, check out the link below and all the sub-links it includes; I believe it fully explains this behavior.

 

https://community.checkpoint.com/t5/Security-Gateways/quot-CPNotEnoughDataForRuleMatch-quot-and-quot...

Best,
Andy
"Have a great day and if its not, change it"
the_rock
MVP Diamond

@RobJames 

Hope that helped?

Best,
Andy
"Have a great day and if its not, change it"
RobJames
Explorer

It sort of helped, but it's basically what I already knew. Mostly I'm trying to work out why we're not getting the completed handshake.

the_rock
MVP Diamond

You can always do fw monitor on the fw and see what you get. 

This site, which my colleague made a while ago, is a good reference.

https://tcpdump101.com/#

I feel that if you get the other side to do a remote session and verify, things should become clearer.

Best,
Andy
"Have a great day and if its not, change it"
Bob_Zimmerman
MVP Gold

What kind of captures did you run? Specifically, do you have an fw monitor proving the TLS Client Hello actually makes it out the firewall's Internet-facing interface?

RobJames
Explorer

We ran captures directly on the firewall from the command line using the following on both the ingress and egress interfaces:

tcpdump -nnei bond9.704 host 108.139.10.129 -s 0 -w /var/log/tcpdump_ingress.pcap
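
To check whether the Client Hello shows up in the egress capture and whether any Server Hello comes back, the .pcap files written by a command like the one above can be scanned offline. The following is a rough Python sketch, not a replacement for Wireshark: it assumes a classic microsecond-resolution pcap with Ethernet framing, only recognises a hello that starts at the beginning of a TCP segment, and the names `iter_tcp_payloads` and `count_tls_hellos` are invented for this example.

```python
import socket
import struct

TLS_HELLO_TYPES = {1: "ClientHello", 2: "ServerHello"}


def iter_tcp_payloads(pcap_bytes):
    """Yield (src_ip, dst_ip, payload) for IPv4/TCP packets in a classic pcap."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    if struct.unpack(endian + "I", pcap_bytes[20:24])[0] != 1:
        raise ValueError("only Ethernet (linktype 1) captures are handled")
    offset = 24  # skip the global header
    while offset + 16 <= len(pcap_bytes):
        incl_len = struct.unpack(endian + "I", pcap_bytes[offset + 8:offset + 12])[0]
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        eth_len = 14
        if frame[12:14] == b"\x81\x00":  # skip one 802.1Q VLAN tag
            eth_len = 18
        if frame[eth_len - 2:eth_len] != b"\x08\x00":  # not IPv4
            continue
        ip = frame[eth_len:]
        if len(ip) < 20 or ip[0] >> 4 != 4 or ip[9] != 6:  # not TCP
            continue
        total_len = struct.unpack("!H", ip[2:4])[0]
        if total_len:
            ip = ip[:total_len]  # drop Ethernet padding
        tcp = ip[(ip[0] & 0x0F) * 4:]
        if len(tcp) < 20:
            continue
        payload = tcp[(tcp[12] >> 4) * 4:]
        yield socket.inet_ntoa(ip[12:16]), socket.inet_ntoa(ip[16:20]), payload


def count_tls_hellos(pcap_bytes, server_ip):
    """Count Client Hellos sent to server_ip and Server Hellos sent from it."""
    counts = {"ClientHello": 0, "ServerHello": 0}
    for src, dst, payload in iter_tcp_payloads(pcap_bytes):
        # TLS record: type 0x16 (handshake), version 0x03xx, 2-byte length,
        # then the handshake message type.
        if len(payload) < 6 or payload[0] != 0x16 or payload[1] != 0x03:
            continue
        hello = TLS_HELLO_TYPES.get(payload[5])
        if hello == "ClientHello" and dst == server_ip:
            counts[hello] += 1
        elif hello == "ServerHello" and src == server_ip:
            counts[hello] += 1
    return counts
```

Running `count_tls_hellos(open("/var/log/tcpdump_ingress.pcap", "rb").read(), "108.139.10.129")` on each capture gives two dictionaries to compare. A Client Hello counted on ingress but not on egress would point at the gateway, while a Client Hello on both with no Server Hello is consistent with an upstream problem.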

the_rock
MVP Diamond

Can you send the output here (if allowed)? That way, we can review and see what gives.

Best,
Andy
"Have a great day and if its not, change it"
RobJames
Explorer

This is a zip file with tcpdumps for ingress and egress trying to access community.checkpoint.com which at the time resolved to 108.139.10.75.

At that time we couldn't access it, but it now resolves to 108.139.10.85 and we can access it.

PhoneBoy
Admin

Since this is TLS and we perform SNI verification, the gateway will initiate a connection to explicitly verify the SNI against the destination IP's SAN.
I believe that connection will be initiated from the gateway's external IP.
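
A comparable check can be approximated from any host that can reach the destination: connect to the IP, present the SNI, and compare it with the DNS names in the certificate's SAN. This is only a sketch of the idea in Python and says nothing about how the gateway implements it internally; `san_matches` and `fetch_dns_sans` are hypothetical helper names, and the wildcard handling is deliberately simplified to a single leftmost label.

```python
import socket
import ssl


def san_matches(sni, san_names):
    """Return True if sni equals a SAN entry or matches a '*.domain' entry."""
    sni = sni.lower().rstrip(".")
    for name in san_names:
        name = name.lower().rstrip(".")
        if name == sni:
            return True
        if name.startswith("*."):
            # The wildcard stands for exactly one leftmost label.
            head, _, rest = sni.partition(".")
            if head and rest == name[2:]:
                return True
    return False


def fetch_dns_sans(ip, sni, port=443, timeout=10.0):
    """Connect to ip with the given SNI and return the certificate's DNS SANs."""
    ctx = ssl.create_default_context()
    # Keep chain validation on (getpeercert() needs it to return details),
    # but skip the built-in hostname check so names can be compared manually.
    ctx.check_hostname = False
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=sni) as tls:
            cert = tls.getpeercert()
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
```

For example, `san_matches("community.checkpoint.com", fetch_dns_sans("108.139.10.75", "community.checkpoint.com"))` would attempt the lookup against the address that was failing. If the call itself times out from a host behind the same upstream path, that again suggests the problem is upstream rather than in the SNI check.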

From what you've said so far, it sounds like an upstream issue.

emmap
MVP Gold CHKP

If it were an issue with the connection getting to the other gateway, we would expect a 'tcp out of state' drop in the traffic logs. That said, the system shouldn't work like that: all packets between the same two IPs should get to the same gateway.

As you said this is a perimeter gateway, I assume you're doing Hide NAT. It doesn't seem like this should be the issue, but it would be good to enable the NAT hotfix. You have it installed with JHF Take 44, but it needs enabling with part 2 here: https://support.checkpoint.com/results/sk/sk183481

The ultimate way to test whether this is a clustering issue when it comes to EXL is to set one of the gateways down. If it works with either gateway by itself but not with both, then it's an EXL issue. If it's still broken with just one gateway, then it's something else.

the_rock
MVP Diamond

Definitely a logical idea, Emma.

Best,
Andy
"Have a great day and if its not, change it"