<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Performance issue generating VPN instability in Firewall and Security Management</title>
    <link>https://community.checkpoint.com/t5/Firewall-and-Security-Management/Performance-issue-generating-VPN-instability/m-p/217478#M41423</link>
    <description>&lt;P&gt;Have you tried vpn accel off? Also, any drop log?&lt;/P&gt;
&lt;P&gt;Andy&lt;/P&gt;</description>
    <pubDate>Thu, 13 Jun 2024 18:21:14 GMT</pubDate>
    <dc:creator>the_rock</dc:creator>
    <dc:date>2024-06-13T18:21:14Z</dc:date>
    <item>
      <title>Performance issue generating VPN instability</title>
      <link>https://community.checkpoint.com/t5/Firewall-and-Security-Management/Performance-issue-generating-VPN-instability/m-p/217352#M41387</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After a recent upgrade to R81.20 we are seeing recurring issues with some IPsec site-to-site VPNs. Roughly 30 tunnels are hosted on a 6200 appliance; traffic volume is modest, but the tunnels are critical, as they connect AWS, Azure, and on-prem environments for several web applications.&amp;nbsp;&lt;/P&gt;&lt;P&gt;The facts I have so far:&amp;nbsp;&lt;/P&gt;&lt;P&gt;[1] The issue never manifested on R80.40&lt;/P&gt;&lt;P&gt;[2] It has happened 5 times in the last 2 weeks since we went to R81.20&lt;/P&gt;&lt;P&gt;[3] The CPU slowly climbs to 100% and stays there, which probably impacts IKE negotiation.&amp;nbsp;&lt;/P&gt;&lt;P&gt;[4] Graphs show a steady increase in interface errors before the issue happens&lt;/P&gt;&lt;P&gt;[5] Graphs show a constant increase in F2F packets until, at some point, the issue occurs&lt;/P&gt;&lt;P&gt;[6] The only workaround is a reboot and a switchover to the other cluster member, then waiting for the ramp-up to bring it to the breaking point again&lt;/P&gt;&lt;P&gt;We opened a TAC ticket, but there has been no meaningful response other than that it may have happened to other customers and is being worked on, though usually on SND cores rather than FW workers (as in this particular case)&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Questions:&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;[A] Do you know of any issues on R81.20 JHF53 where traffic is not accelerated properly?&amp;nbsp;&lt;/P&gt;&lt;P&gt;[B] Should IPsec traffic terminating on the device be accelerated?&lt;/P&gt;&lt;P&gt;[C] What could be the source of the "Failed to get native SXL device" issue seen in CPVIEW &amp;gt; Advanced tab?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Long Details:&lt;/P&gt;&lt;P&gt;Upon investigation, spike detective shows that the top usage comes from two processes (spike_detective):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;BR /&gt;##head -30 perf_thread_452.lo&lt;BR /&gt;# To display the perf.data header info, please use 
--header/--header-only options.&lt;BR /&gt;#&lt;BR /&gt;#&lt;BR /&gt;# Total Lost Samples: 0&lt;BR /&gt;#&lt;BR /&gt;# Samples: 3K of event 'cycles'&lt;BR /&gt;# Event count (approx.): 3651993252&lt;BR /&gt;#&lt;BR /&gt;# Overhead Command Shared Object Symbol&lt;BR /&gt;# ........ ....... ..................... .....................................................&lt;BR /&gt;#&lt;BR /&gt;73.07% fwk0_2 [kernel.kallsyms] [k] fwmultik_do_seq_on_packet&lt;BR /&gt;17.79% fwk0_2 [kernel.kallsyms] [k] native_queued_spin_lock_slowpath&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;##head -30 perf_cpu_3.lo&lt;BR /&gt;# To display the perf.data header info, please use --header/--header-only options.&lt;BR /&gt;#&lt;BR /&gt;#&lt;BR /&gt;# Total Lost Samples: 0&lt;BR /&gt;#&lt;BR /&gt;# Samples: 3K of event 'cycles'&lt;BR /&gt;# Event count (approx.): 3307288381&lt;BR /&gt;#&lt;BR /&gt;# Overhead Command Shared Object Symbol&lt;BR /&gt;# ........ ....... ..................... ................................................................................&lt;BR /&gt;#&lt;BR /&gt;63.14% fwk0_0 [kernel.kallsyms] [k] fwmultik_do_seq_on_packet&lt;BR /&gt;18.13% fwk0_0 [kernel.kallsyms] [k] native_queued_spin_lock_slowpath&lt;/P&gt;&lt;P&gt;Looking at the stats on the firewall, there is almost no acceleration (IPS was enabled a day ago as a mitigation for SYN attacks based on other posts by Tim Hall and Heiko):&lt;/P&gt;&lt;P&gt;##enabled_blade&lt;BR /&gt;fw vpn ips identityServer mon&lt;/P&gt;&lt;P&gt;##fwaccel stats -&lt;BR /&gt;Accelerated conns/Total conns : 634/9357 (6%)&lt;BR /&gt;LightSpeed conns/Total conns : 0/9357 (0%)&lt;BR /&gt;&lt;STRONG&gt;Accelerated pkts/Total pkts : 500464710/5892962467 (8%)&lt;/STRONG&gt;&lt;BR /&gt;LightSpeed pkts/Total pkts : 0/5892962467 (0%)&lt;BR /&gt;&lt;STRONG&gt;F2Fed pkts/Total pkts : 5392497757/5892962467 (91%)&lt;/STRONG&gt;&lt;BR /&gt;F2V pkts/Total pkts : 4696613/5892962467 (0%)&lt;BR /&gt;CPASXL pkts/Total pkts : 0/5892962467 (0%)&lt;BR 
/&gt;PSLXL pkts/Total pkts : 274040788/5892962467 (4%)&lt;BR /&gt;CPAS pipeline pkts/Total pkts : 0/5892962467 (0%)&lt;BR /&gt;PSL pipeline pkts/Total pkts : 0/5892962467 (0%)&lt;BR /&gt;QOS inbound pkts/Total pkts : 0/5892962467 (0%)&lt;BR /&gt;QOS outbound pkts/Total pkts : 0/5892962467 (0%)&lt;BR /&gt;Corrected pkts/Total pkts : 0/5892962467 (0%)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Reasons for no acceleration: local traffic (IPsec terminates on the gateway) and "Native SXL device cannot be found".&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="offload.png" style="width: 999px;"&gt;&lt;img src="https://community.checkpoint.com/t5/image/serverpage/image-id/26218iCE2B88330107786B/image-size/large?v=v2&amp;amp;px=999" role="button" title="offload.png" alt="offload.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Interface errors increase steadily before the issue occurs (perhaps because the CPU handling packets is at 100% and discards erratically)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image (1).png" style="width: 999px;"&gt;&lt;img src="https://community.checkpoint.com/t5/image/serverpage/image-id/26219i1F8E2771640EE68B/image-size/large?v=v2&amp;amp;px=999" role="button" title="image (1).png" alt="image (1).png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jun 2024 07:25:18 GMT</pubDate>
      <guid>https://community.checkpoint.com/t5/Firewall-and-Security-Management/Performance-issue-generating-VPN-instability/m-p/217352#M41387</guid>
      <dc:creator>Cezar_Octavian_</dc:creator>
      <dc:date>2024-06-13T07:25:18Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue generating VPN instability</title>
      <link>https://community.checkpoint.com/t5/Firewall-and-Security-Management/Performance-issue-generating-VPN-instability/m-p/217445#M41415</link>
      <description>&lt;P&gt;A few points:&lt;/P&gt;
&lt;P&gt;1) The high SND load would appear to be caused by this known issue involving TCP Sequence Number validation:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://support.checkpoint.com/results/sk/sk181996" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;sk181996: Traffic outages may occur because of high utilization of CPU cores that run CoreXL SND instances&lt;/SPAN&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;2) When you say interface errors I assume you mean RX-DRP events?&amp;nbsp; Technically that is not an interface error but is usually just a side effect of overloaded SNDs.&amp;nbsp; If it is indeed RX-OVR or RX-ERR you need to get that fixed immediately.&lt;/P&gt;
&lt;P&gt;In response to your questions:&lt;/P&gt;
&lt;P&gt;A) Almost certainly due to the configuration of the IPS blade.&amp;nbsp; Please run and post the output of &lt;STRONG&gt;hcp -r "Threat Prevention"&lt;/STRONG&gt; and &lt;STRONG&gt;hcp -r "Performance"&lt;/STRONG&gt;; depending on your version of hcp, you may need to first run &lt;STRONG&gt;hcp --enable-product "Performance"&lt;/STRONG&gt; and/or &lt;STRONG&gt;hcp --enable-product "Threat Prevention"&lt;/STRONG&gt; to unlock these hidden reports.&amp;nbsp; In the meantime you can try running &lt;STRONG&gt;ips off&lt;/STRONG&gt; on the fly along with &lt;STRONG&gt;fwaccel stats -r&lt;/STRONG&gt; and &lt;STRONG&gt;fwaccel stats -s&lt;/STRONG&gt; and see if that gets most traffic out of the F2F/slowpath.&amp;nbsp; &lt;STRONG&gt;ips on&lt;/STRONG&gt; is used to resume IPS enforcement on the fly.&lt;/P&gt;
&lt;P&gt;B) Normally IPsec traffic terminating on the gateway can remain fully accelerated on the SND core in the fastpath, unless deeper inspection is called for by IPS, which requires the traffic to go to at least the Medium Path on a worker core. In your case, however, it looks like the traffic is going into the F2F/slowpath, which is not good.&amp;nbsp; See above; this is almost certainly caused by an IPS protection with a Critical performance impact rating being manually enabled, and the hcp reports will show that.&lt;/P&gt;
&lt;P&gt;C) No idea, never seen it.&amp;nbsp; Probably just cosmetic.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jun 2024 15:15:09 GMT</pubDate>
      <guid>https://community.checkpoint.com/t5/Firewall-and-Security-Management/Performance-issue-generating-VPN-instability/m-p/217445#M41415</guid>
      <dc:creator>Timothy_Hall</dc:creator>
      <dc:date>2024-06-13T15:15:09Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue generating VPN instability</title>
      <link>https://community.checkpoint.com/t5/Firewall-and-Security-Management/Performance-issue-generating-VPN-instability/m-p/217478#M41423</link>
      <description>&lt;P&gt;Have you tried vpn accel off? Also, any drop log?&lt;/P&gt;
&lt;P&gt;Andy&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jun 2024 18:21:14 GMT</pubDate>
      <guid>https://community.checkpoint.com/t5/Firewall-and-Security-Management/Performance-issue-generating-VPN-instability/m-p/217478#M41423</guid>
      <dc:creator>the_rock</dc:creator>
      <dc:date>2024-06-13T18:21:14Z</dc:date>
    </item>
  </channel>
</rss>

