First impressions R80.30 on gateway - one step forward one (or two back)

Ok, we were finally "forced" to go ahead and upgrade our gateways from R80.10 to R80.30 for fairly small things: we wanted to be able to use the O365 Updatable Object (instead of home-grown scripts) and fix the Domain (FQDN) object performance issue where all FWK cores were making DNS queries, causing a lot of alerts (see https://community.checkpoint.com/t5/General-Topics/Domain-objects-in-R80-10-spamming-DNS/m-p/19786)

Positive things - upgrades were smooth and painless - both on regular gateways and VSX.

All regular gateways seem to be performing as before, but I have to be honest that they are "over-dimensioned", with rather powerful HW for the job: 5900 with 16 cores.

VSX, though, threw a couple of surprises.

SXL medium path usage. CPU jumped from <30% to above 50% on the busiest VS that only has FW and IA blades enabled. Ok, there is also VPN but only one connection:

image.png

I haven't spent enough time digging into it, but for some reason 1/3 of all connections took the medium path, whereas in R80.10 nearly all of them were fully accelerated. Most of it was HTTPS (95%), with LDAP-SSL the next most used (2%).

I used the SXL fast accelerator feature (thanks @HeikoAnkenbrand  https://community.checkpoint.com/t5/General-Topics/R80-x-Performance-Tuning-Tip-SecureXL-Fast-Accele...) to exclude our proxies and some other nets, and you can see that on Friday CPU load dropped by 10%, but it is still nowhere near what it used to be.

I just find it impossible to explain why a gateway with only the FW blade enabled would start to throw all traffic (by the looks of it) via PXL. And the statistics are a bit funny too:

image.png

FQDN alerts in logs. I can definitely confirm that only one core is now doing DNS lookups (against all the DNS servers you have defined, in our case 2). But we are still getting a lot of alerts like this: Firewall - Domain resolving error. Check DNS configuration on the gateway (0)

image.png

Especially after I enabled the updatable object for O365 in the rulebase.

As said before - I have not spent too much time on this as we had other "fun" stuff to deal with on our chassis, so it's fairly "raw". I will report more once I have some answers.

 

Accepted Solutions

Re: First impressions R80.30 on gateway - one step forward one (or two back)

I got a "tip off" from inside CP! Verifying if I'm allowed to publish it here but seems like my PXL issue is resolved! Yeehaa! Power of community! Thanks to @Ilya_Yusupov 

 

And the "secret stuff" here:

  1. Regarding medium path – you see most traffic in the medium path due to a known bug we have had since R80.20: the TLS parser is enabled when the following combinations of blades are enabled
    1. FW + IDA and/or Monitoring and/or VPN (exactly our case!)
  2. You can validate this by running the following command: "fw ctl get int tls_parser_enable". It will return 1.
  3. As a WA you can disable it on the fly by running "fw ctl set int tls_parser_enable 0". To disable it permanently, put tls_parser_enable=0 in $FWDIR/boot/modules/fwkern.conf and reboot.
  4. The above will bring the traffic back to being fully accelerated, as in the previous version.
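
For anyone who wants to script the permanent part of the workaround above, here is a minimal sketch. It is not an official procedure: the helper name set_kernel_param is made up for illustration, and it assumes fwkern.conf holds one name=value pair per line. Only the "fw ctl" commands and the file path come from the tip itself, and they are left as comments so nothing is changed by accident.

```shell
#!/bin/sh
# Hypothetical helper (not a Check Point tool): set NAME=VALUE in a
# fwkern.conf-style file, replacing an existing entry or appending one.
# Assumption: the file uses one name=value pair per line.
set_kernel_param() {
    file=$1
    name=$2
    value=$3
    touch "$file"
    if grep -q "^${name}=" "$file"; then
        # Entry exists: rewrite it via a temp file, then swap it in.
        sed "s/^${name}=.*/${name}=${value}/" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        # Make sure the last line is terminated before appending.
        if [ -n "$(tail -c 1 "$file")" ]; then
            echo >> "$file"
        fi
        echo "${name}=${value}" >> "$file"
    fi
}

# On the gateway, following the steps from the tip:
#   fw ctl get int tls_parser_enable      # returns 1 when the parser is on
#   fw ctl set int tls_parser_enable 0    # on the fly, not kept after reboot
#   set_kernel_param "$FWDIR/boot/modules/fwkern.conf" tls_parser_enable 0
```

Running the helper twice leaves a single tls_parser_enable line in the file, which makes it easy to flip the value back later when the workaround has to be removed.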

11 Replies

Re: First impressions R80.30 on gateway - one step forward one (or two back)

Just had a closer look at the IPs that are being sent to the medium path and everything points to O365 / MS.

Strange, as the O365 object is now fully removed from the rules and DB.


Re: First impressions R80.30 on gateway - one step forward one (or two back)

Ouch! Penny just dropped. Not even sure how I overlooked the fact that the CPUSE upgrade changed our hyper-threading from OFF to ON but (!) kept the original manual affinity settings. So it's not surprising that CPU usage was screwed up, i.e. our multiqueue was running on 6 "half" cores instead of 6 "full" ones, etc.

Something to watch out for if you are using manual affinities on VSX!
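
A quick way to spot this kind of change is to record the hyper-threading state before and after the upgrade. The sketch below is only an illustration: smt_state is a made-up helper name, and it assumes the standard Linux sysfs topology files (thread_siblings_list) are present on the box, which I have not checked on every Gaia kernel.

```shell
#!/bin/sh
# Hypothetical helper: report whether any logical CPU shares a physical
# core with a sibling, based on the Linux sysfs topology files.
# The sysfs root is a parameter (assumed default: /sys/devices/system/cpu).
smt_state() {
    root=${1:-/sys/devices/system/cpu}
    seen=no
    for f in "$root"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue
        seen=yes
        # A value such as "0,8" or "0-1" lists more than one logical CPU.
        case $(cat "$f") in
            *[,-]*) echo on; return 0 ;;
        esac
    done
    if [ "$seen" = yes ]; then
        echo off
    else
        echo unknown
    fi
}

# Example: save the state next to the current affinity listing.
#   smt_state
#   fw ctl affinity -l
```

If the result flips from off to on across an upgrade, manual affinity settings should be reviewed, because core numbers that used to mean full cores may now be sibling threads.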


Re: First impressions R80.30 on gateway - one step forward one (or two back)

 

Hi @Kaspars_Zibarts 

The Fast Acceleration (picture 1 green) feature lets you define trusted connections to allow bypassing deep packet inspection on R80.20 JHF103 and above gateways. This feature significantly improves throughput for these trusted high volume connections and reduces CPU consumption.

During my tests, I could reduce CPU (core) usage by about 10%-30%. That is also logical, since no content inspection is executed any more.

I like that you showed that graphically.👍

fast_accel_3 (1).PNG

 


Re: First impressions R80.30 on gateway - one step forward one (or two back)

Hi Kaspars,

Thanks for your report, a few comments:

1) What do you have set in the Track field of your rules?  If using Detailed or Extended logging this can pull traffic into PSLXL to provide the extra detail being requested.  Found out about this one while writing the third edition of my book.

2) Do you have any services with Protocol Signature enabled in the Network/Firewall policy if using ordered layers, or in the top level of rules if using inline?  This can also cause some of what you are seeing and you should try to stick to simple services (just a port number) in those layers if possible, then call for Protocol Signatures and applications/URLs/content in subsequent layers.

3) As far as that wacky Accelerated Conns percentage, you must have a very large amount of stateless traffic, see sk109467: 'Accelerated conns' value is higher than 'Accelerated pkts' in the output of 'fwaccel stat....

4) As you noticed the gateway is much more dependent on speedy DNS starting in R80.20 due to Updatable objects, rad, wsdnsd and a lot of other daemons.

 

Book "Max Power 2020: Check Point Firewall Performance Optimization" Third Edition
Now Available at www.maxpowerfirewalls.com

Re: First impressions R80.30 on gateway - one step forward one (or two back)

Wow nice one Kaspars, don't think I would have ever figured that one out.  Will disabling the TLS parser as shown cause issues with other blades should they get enabled later?

 


Re: First impressions R80.30 on gateway - one step forward one (or two back)

@Timothy_Hall as far as I understood, R&D are working on a proper long-term solution to fix it.

As for the FQDN alerts, I can now confirm that the O365 updatable object is definitely causing them, but only on our busy VSX. I haven't seen the same issue on regular gateways.

According to CP, the alert is issued when the resolver cannot get a response to a checkpoint.com query. I took a tcpdump and confirmed that DNS is actually responding, but it does generate a wsdnsd log entry. Here's an example of the packet capture and the matching wsdnsd.elg entry:

image.png

[wsdnsd 32546]@vsx1-ext[20 Jan 9:10:33] Warning:cp_timed_blocker_handler: A handler [0xf6f213d0] blocked for 44 seconds.
[wsdnsd 32546]@vsx1-ext[20 Jan 9:10:33] Warning:cp_timed_blocker_handler: Handler info: Library [/opt/CPshrd-R80.30/lib/libResolver.so], Function offset [0x2b3d0].
[wsdnsd 32546]@vsx1-ext[20 Jan 9:10:33] Warning:cp_timed_blocker_handler: Handler info: Nearest symbol name [_Z10Sock_InputiPv], offset [0x2b3d0].
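
To see how bad these stalls get over time, the "blocked for N seconds" values can be pulled out of the log. This is a rough sketch only: max_block_seconds is a made-up helper name, and the pattern is based purely on the three log lines above, so other builds may format the message differently.

```shell
#!/bin/sh
# Hypothetical helper: print the longest "blocked for N seconds" value
# reported by cp_timed_blocker_handler in a wsdnsd.elg-style log file.
# Prints 0 when the file contains no such line.
max_block_seconds() {
    sed -n 's/.*cp_timed_blocker_handler: A handler .* blocked for \([0-9][0-9]*\) seconds.*/\1/p' "$1" |
        sort -n | tail -n 1 | grep . || echo 0
}

# Example (path is wherever wsdnsd.elg lives for the VS in question):
#   max_block_seconds /path/to/wsdnsd.elg
```

For the excerpt above this prints 44, which can then be compared against the timestamps of the "Domain resolving error" alerts.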

 

Still digging through my packet capture to see if I can find any strange names / responses etc.

Employee+

Re: First impressions R80.30 on gateway - one step forward one (or two back)

@Timothy_Hall  - Indeed, when you enable blades that require the TLS parser, you will need to remove the WA I suggested.

The WA is currently only for the combinations I sent.


Re: First impressions R80.30 on gateway - one step forward one (or two back)

That makes sense, thanks.  Will add this workaround to the upcoming R80.40 addendum but be careful to add caveats for which blades are enabled.

 

Ivory

Re: First impressions R80.30 on gateway - one step forward one (or two back)

👍


Re: First impressions R80.30 on gateway - one step forward one (or two back)

Just a quick update on FQDN object alerts.

It was all caused by a missing rule to permit DNS requests over TCP from the gateway. I have added full details in the corresponding thread about FQDN here:

https://community.checkpoint.com/t5/General-Management-Topics/Domain-Objects-FQDN-An-Unofficial-ATRG...