Andy_Nguyen
Explorer

Long-lived TCP connection timed out ungracefully. First packet isn't SYN. TCP-Flag: PUSH-ACK

Checkpoint Next Generation FW: R80.10

Aggressive aging: enabled

Virtual session timeout: 3600(s)

We have a long-lived TCP connection through the Check Point gateway firewall. After 1 hour of idle time, Check Point timed the connection out, and on the Check Point we found the error: "First packet isn't SYN. TCP-Flag: PUSH-ACK"

Is this because Check Point doesn't drop the connection gracefully (it does not send a FIN to the source), so the source keeps sending data without initiating a new connection? If that is the case, how can we configure Check Point to send a FIN to the source when it drops the connection, and should we do that?
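
A toy model may help picture the situation described above. This is only a sketch, not Check Point's implementation: the class name ToyStateTable, its fields, and the verdict strings are hypothetical, and the only value taken from the post is the 3600-second timeout.

```python
from dataclasses import dataclass, field

IDLE_TIMEOUT_S = 3600  # the 3600-second session timeout from the question


@dataclass
class ToyStateTable:
    """Hypothetical model of a stateful firewall's connection table."""
    idle_timeout: int = IDLE_TIMEOUT_S
    last_seen: dict = field(default_factory=dict)  # conn id -> time of last packet

    def handle(self, conn, flags, now):
        """Return the verdict for one packet arriving at time `now` (seconds)."""
        seen = self.last_seen.get(conn)
        if seen is not None and now - seen >= self.idle_timeout:
            # Idle timer expired: the entry is removed without telling either end.
            del self.last_seen[conn]
            seen = None
        if seen is None and flags != {"SYN"}:
            # No table entry and the packet does not open a new connection.
            return "drop: first packet isn't SYN"
        self.last_seen[conn] = now
        return "accept"


if __name__ == "__main__":
    fw = ToyStateTable()
    conn = ("10.0.0.5", 50000, "10.0.1.9", 1433)  # made-up addresses and ports
    print(fw.handle(conn, {"SYN"}, now=0))
    print(fw.handle(conn, {"PSH", "ACK"}, now=10))
    print(fw.handle(conn, {"PSH", "ACK"}, now=10 + 3600))  # after one idle hour
    print(fw.handle(conn, {"SYN"}, now=3700))              # a fresh connection
```

In this model the PUSH-ACK sent after the idle hour is dropped, while a new SYN from the same endpoints is accepted again.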

1 Solution

Accepted Solutions
Timothy_Hall
Legend

You can't have the Check Point send a FIN upon state table connection timeout, but you can have it send a RST to both sides which will immediately notify them that the connection is dead.  From my book:

Sending a TCP RST upon Connection Expiration


Some applications (such as web servers backending into a database server with a SQL
connection) create a large number of TCP connections, yet never send any appreciable
amount of data in them. All stateful inspection firewalls (including Check Point) enforce
an idle timer on all open connections. Check Point’s TCP connection idle timer is set to
60 minutes by default. Unfortunately when the Check Point TCP idle timer expires a
TCP connection, by default that connection is silently removed from the firewall’s state
table, and no notification of any kind is sent to the two systems who established the TCP
connection. When one of them attempts to send some data in the now-dead connection,
it is promptly dropped by the firewall with a “TCP out of state” message. Regrettably
some applications are just too stupid to quickly realize the connection is dead; they
continue attempting to use it while their traffic continues to be dropped.
At some point, one or both of the end systems involved finally figures out their
connection is truly and irrevocably dead, launches a new TCP connection that is
immediately permitted by the firewall, and everything starts working again. Depending
on how long it takes one or both sides to figure out what happened, the application may
appear to be hung and the user’s perception of overall application performance will be
terrible. Undoubtedly the firewall will be blamed for the application’s behavior, and
once again the firewall administrators must exonerate themselves. There are three
solutions to this problem:


1. Enable TCP keepalives on one or both of the systems participating in the TCP
connections that keep getting idled out. The keepalives will need to be sent more
often than the firewall’s idle timeout, i.e. at intervals shorter than 60 minutes.
This is not a popular choice as it involves application
system changes to fix what is perceived as a “firewall problem”.


2. Increase the idle timeout for the Service object (SQL in our example) on the
Advanced tab of the service object from the default of 60 minutes up to a
maximum of 86400 seconds (24 hours). This will probably help but is not
foolproof.

3. Configure the firewall to send a TCP RST packet to both participants of a
connection that has been idled out. Upon receipt of the TCP RST, both
participants instantly realize their connection is gone and launch a new one
immediately to recover from the situation. To enable this feature “on the fly”, the
command is fw ctl set int fw_rst_expired_conn 1. This setting will
not survive a reboot, so to make it permanent, see
sk19746: How to force a Security Gateway to send a TCP RST
packet upon TCP connection expiration.


In some cases however #3 will not fully remediate the situation, and you will be
forced to go one step further with this: fw ctl set int fw_reject_non_syn 1.
A classic example of an application that requires this
firewall setting is SAP HANA traffic. This setting also handles client port reuse
out of state errors when RST packets from the server to the clients get lost (e.g.
due to policy install or packet loss). Bear in mind however that this setting is
quite likely to make your friendly auditor/penetration tester upset with you, since
the firewall will now issue a TCP RST for all received packets that are out of
state and have the ACK flag set. An auditor running a TCP ACK nmap scan will
have it light up like a Christmas tree with tens of thousands of ports showing up
as unfiltered instead of filtered. For this reason, using this setting is generally not
recommended on an Internet perimeter firewall, but may be acceptable on some
internal firewalls.
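
For option 1 above, a minimal sketch of what enabling keepalives can look like on an application socket. The function name enable_keepalive and the values 1800/60/5 are assumptions chosen for illustration; the only requirement taken from the text is that the idle time before the first keepalive is shorter than the firewall's 60-minute idle timeout. The per-socket tuning options exist on Linux and are skipped where the platform does not offer them.

```python
import socket


def enable_keepalive(sock, idle_s=1800, interval_s=60, probes=5):
    """Turn on TCP keepalives so an otherwise idle connection still carries traffic.

    idle_s must be shorter than the firewall idle timeout; 1800 seconds is an
    arbitrary illustrative value, not a recommendation from the thread.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Platform-specific tuning: only applied if this Python build exposes the option.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive enabled:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

Many database drivers and application servers expose an equivalent keepalive setting of their own, which may be easier to change than the socket code.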

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com


9 Replies
Andy_Nguyen
Explorer

This is also what we suspected. Thank you very much for your thorough explanation. 

Christopher_Rag
Participant

With option 3 (sk19746) enabled, I am looking for the best way to verify that the option is working. I have run tcpdump on the gateway but do not see any reset packets for the traffic that is timing out. Is there any debug command that could be run to verify the option is working? I have some traffic set to a custom idle timeout of 900 seconds (15 minutes). I can see the connections timing out by monitoring the connections table (fw tab -t connections -u), but I do not see any reset packets.

I have a support case open with Check Point, but so far no help.

Some supporting information: R80.30 gateways, open hardware, fwaccel off during the tcpdump. I monitored both interfaces in both directions of the traffic flow.

Any assistance will be greatly appreciated.
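
One way to cross-check from an endpoint rather than from the gateway is to send on a connection that has sat idle past the timeout and see how the socket reacts. This is a sketch under assumptions, not a Check Point debug tool: the function name probe_idle_connection and its return labels are made up, and it presumes the peer application answers the payload when the connection is healthy.

```python
import socket


def probe_idle_connection(sock, payload=b"ping", timeout_s=5.0):
    """Send on a connection that has been idle and classify what comes back."""
    sock.settimeout(timeout_s)
    try:
        sock.sendall(payload)
        data = sock.recv(4096)
    except socket.timeout:
        # Nothing came back: consistent with packets being dropped silently.
        return "no-response"
    except ConnectionError:
        # A RST (or broken pipe) reached this endpoint.
        return "reset"
    # An empty read means the peer closed the connection in an orderly way (FIN).
    return "alive" if data else "closed"
```

If the RST-on-expiry setting is effective, an endpoint that has already received the RST should see "reset" on its next send; "no-response" would instead suggest the connection was removed silently.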

Niels_Bijleveld
Explorer

I have had the same sort of issue. We have a customer who had a problem with an application that was not TCP compliant.
The traffic was dropped with the famous "First packet isn't SYN" message. This was for the TCP flags PSH and RST.
When I followed the stream, I saw that the connection had been removed from the connections table, but the application was still trying to reuse it.
There was a mismatch in the TCP communication. I saw this thread and thought about the options, but they are all global settings.

I found another solution. I trust the source and destination, and I found sk11088, which says the following:

Bypassing this mechanism for specific scenarios means that non-synchronized packets that do not belong to an established
connection in the Security Gateway's connections table OR non-TCP compliant traffic will not be dropped,
but instead matched against the Security Policy (Rule base).

So I made a bypass for this source and destination. It is not the nicest solution, but it's better than a global setting that affects all traffic passing through the firewall.

If you trust the traffic, then this can be a lifesaver.

Niels_Bijleveld
Explorer
I wanted to add the solution to my own problem and share it with you. It may be helpful for the "First packet isn't SYN" message. I monitored the connections table for a specific IP. At first I couldn't work out why the connection disappeared, but finally I saw the cause and tested it. The problem lay in aggressive aging on the https service port. After I removed it, the connection stayed stable, and the RST packet can now be sent because the connection is still in the table, so it is no longer dropped. So I reverted the bypass method described above and disabled aggressive aging for the https service. This works much better now.
Gerard_van_Lee1
Participant
Option 3 (the first part, "fw ctl set int fw_rst_expired_conn 1") solved it.
Thanks for your explanations on this subject.
Alexander_Wilke
Advisor

Will not work reliably if you upgrade to R80.20 and higher.

Keep this in mind.

Jelle_Hazenberg
Collaborator

Thanks for your answer, Tim. If I may ask: I find sk19746 a bit confusing. In the SK I see this part:

Important: due to code limitation, starting from R80.20, this feature does not work when SecureXL is enabled.
As a workaround, add "fw_reject_non_syn=1" to the $FWDIR/modules/fwkern.conf file.
This will provide the same result and will send RST packet post “First packet isn’t SYN”.

Does this mean the "on the fly" part only works with SecureXL off, or with the entry "fw_reject_non_syn=1" in the $FWDIR/modules/fwkern.conf file?

I know the best way is to test this in my lab (and I will), but I would like to know whether I am misreading this information.
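
For the fwkern.conf workaround quoted from the SK above, a small sketch of adding the entry without creating duplicates. It only manipulates text: the helper name set_kernel_param is hypothetical, and the one-setting-per-line name=value layout is inferred from the SK excerpt, so when the file is read and whether a reboot is needed should be checked against sk19746 itself.

```python
def set_kernel_param(conf_text, name, value):
    """Return conf_text with `name=value` present exactly once, as the last line."""
    # Drop any existing line that sets the same parameter, keep everything else.
    lines = [ln for ln in conf_text.splitlines()
             if ln.split("=", 1)[0].strip() != name]
    lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    existing = "some_other_param=2\nfw_reject_non_syn=0\n"  # made-up starting content
    print(set_kernel_param(existing, "fw_reject_non_syn", 1), end="")
```

Working on the file contents as a string keeps the sketch free of side effects; writing the result back to $FWDIR/modules/fwkern.conf is left to whatever change process the gateway is managed under.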

Alexander_Wilke
Advisor

fw_rst_expired_conn will not work for accelerated connections.

So if you have SecureXL enabled and are running R80.20 or higher, this will not work for most connections, because they are accelerated in some way (SXL, PXL, QXL, ...) and only very few are F2F.
