Copper

Question on traffic handling between Kernel Mode FW and Usermode FW

Hi,

 

As we switched from R80.10 to R80.30 (kernel 3.10) with UMFW, I am looking for in-depth information on whether packet handling changed, for example in how SecureXL works together with the user mode firewall. I found all the great information gathered by @HeikoAnkenbrand, but so far I could not find anything on the topic below.

After the upgrade we see much less CPU usage on the SNDs than before, but much higher load on the fw_worker instances.

Before the change we averaged 80% CPU on the SNDs and 30% on the fw_workers; now we have 10% on the SNDs but 70% on the fw_workers. We upgraded the hardware to faster CPUs (3.2 GHz as opposed to 2 GHz), but the core count stayed the same at 16 cores. We used Multi-Queue before, but with R80.30 we can now use Multi-Queue on all interfaces.

Nevertheless my feeling is that some processing that was done by SNDs before has now moved to the fw_workers.

Note that we currently have a high number of VPN clients connected due to Corona.

Thanks for any insights.


regards Thomas

 

 

31 Replies

This is an exciting question. I can confirm similar behavior on some firewalls. What surprises me is that the basic process is already producing about 10%-20% CPU load (without firewall traffic).

In UMFW the fw instances are threads of the fwk0_dev_0 process, so by default top shows the CPU utilization of all threads under the main thread. top also has an option to show the utilization per thread.

A small calculation sample for the utilization of process fwk0_dev_0:

\[
\text{fwk0\_dev\_0} = \sum_{x=0}^{\text{max\_CoreXL\_number}} \text{fwk0\_}x + \sum_{x=0}^{\text{max\_CoreXL\_number}} \text{fwk0\_dev\_}x + \text{fwk0\_kissd} + \text{fwk0\_hp}
\]

Threads of process fwk0_dev_0:

- fwk0_X -> fw instance thread that takes care of the packet processing
- fwk0_dev_X -> thread that takes care of communication between the fw instances and other CP daemons
- fwk0_kissd -> legacy Kernel Infrastructure (obsolete)
- fwk0_hp -> (high priority) cluster thread
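
The calculation above can be sketched in Python. This is only an illustration under stated assumptions: the helper name `process_utilization` is hypothetical, and the per-thread percentages are made-up values, not measurements from a real gateway.

```python
import re


def process_utilization(threads):
    """Add up per-thread CPU percentages following the formula above.

    `threads` maps a thread name to its CPU percentage (illustrative values).
    """
    # First sum: the fw instance threads fwk0_0 .. fwk0_N.
    instances = sum(v for k, v in threads.items() if re.fullmatch(r"fwk0_\d+", k))
    # Second sum: the communication threads fwk0_dev_0 .. fwk0_dev_N.
    dev = sum(v for k, v in threads.items() if re.fullmatch(r"fwk0_dev_\d+", k))
    # Remaining single threads named in the formula.
    others = threads.get("fwk0_kissd", 0.0) + threads.get("fwk0_hp", 0.0)
    return instances + dev + others


# Hypothetical readings for a gateway with three CoreXL instances.
sample = {
    "fwk0_0": 12.0, "fwk0_1": 9.5, "fwk0_2": 11.0,
    "fwk0_dev_0": 1.0, "fwk0_dev_1": 0.5, "fwk0_dev_2": 0.5,
    "fwk0_kissd": 0.0,
    "fwk0_hp": 0.5,
}

total = process_utilization(sample)
print(f"fwk0_dev_0 (all threads): {total:.1f}%")
```

With per-thread display switched off, top would report only this combined figure for the process.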

Read more here:
R80.x - Performance Tuning Tip – User Mode Firewall vs. Kernel Mode Firewall

Copper

 

Hi Heiko,

 

Yes, I also found the Shift+H option in top to display the individual fw_worker processes.

But if traffic distribution is now handled differently, then at least in my case the SND/CoreXL distribution configuration needs to change as well.

Before doing that, I wanted to understand why I see this (heavy) shift in load from the SNDs to the fw_worker CoreXL instances.

 

Regards Thomas 

Copper

I have to add that there are these new "snd" processes, which also consume CPU:

[Screenshot: 2020-04-27_13h48_05.png]

So at least there is another process which seems to be related to SND processing ...

Regards Thomas

Platinum

This is in R80.40:

...

8445 admin 0 -20 2295088 1.058g 126768 S 0.3 14.1 5:10.29 1 fwk0_dev_0
8479 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 0:00.00 3 fwk0_kissd
8593 admin 0 -20 2295088 1.058g 126768 S 0.3 14.1 4:22.95 3 fwk0_0
8594 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 4:49.87 1 fwk0_1
8595 admin 0 -20 2295088 1.058g 126768 S 0.7 14.1 4:11.93 2 fwk0_2
8617 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 0:01.29 3 fwk0_service
8618 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 1:06.37 1 fwk0_dev_1
8620 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 1:08.77 2 fwk0_dev_2
8621 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 0:29.05 2 fwk0_HeavyIoctl
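
A minimal Python sketch for grouping the accumulated CPU time in the listing above by thread type. The column positions (TIME+ in the 11th field, thread name in the 13th) are an assumption, because the top header line is not part of the paste; the function names are hypothetical.

```python
import re
from collections import defaultdict

# Lines copied from the listing above.
TOP_LINES = """\
8445 admin 0 -20 2295088 1.058g 126768 S 0.3 14.1 5:10.29 1 fwk0_dev_0
8479 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 0:00.00 3 fwk0_kissd
8593 admin 0 -20 2295088 1.058g 126768 S 0.3 14.1 4:22.95 3 fwk0_0
8594 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 4:49.87 1 fwk0_1
8595 admin 0 -20 2295088 1.058g 126768 S 0.7 14.1 4:11.93 2 fwk0_2
8617 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 0:01.29 3 fwk0_service
8618 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 1:06.37 1 fwk0_dev_1
8620 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 1:08.77 2 fwk0_dev_2
8621 admin 0 -20 2295088 1.058g 126768 S 0.0 14.1 0:29.05 2 fwk0_HeavyIoctl
""".splitlines()


def time_to_seconds(value):
    """Convert a TIME+ value such as '4:22.95' (minutes:seconds) to seconds."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + float(seconds)


def thread_kind(name):
    """Classify a thread name as fw instance, dev thread, or other."""
    if re.fullmatch(r"fwk\d+_\d+", name):
        return "instance"
    if re.fullmatch(r"fwk\d+_dev_\d+", name):
        return "dev"
    return "other"


def cpu_time_by_kind(lines):
    """Sum the TIME+ column per thread kind (assumed column layout)."""
    totals = defaultdict(float)
    for line in lines:
        fields = line.split()
        if len(fields) < 13:
            continue  # skip anything that is not a full thread line
        totals[thread_kind(fields[12])] += time_to_seconds(fields[10])
    return dict(totals)


totals = cpu_time_by_kind(TOP_LINES)
for kind, seconds in sorted(totals.items()):
    print(f"{kind:8s} {seconds:8.2f} s")
```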

Admin

Probably the biggest change between R80.10 and R80.30 is templating. It used to be handled in SecureXL. Now all templates are moved to FWK. 

If FWK matches a packet to a template (and the connection is accelerated), it re-injects the packet back to SXL, as shown in sk153832:

[Figure: packet flow diagram from sk153832]

 

 

Nickel

Does this mean the number of fw_worker cores has to be recalculated, i.e. more fw_workers and fewer SNDs?
Admin

Not really. However, in R80.40, there is something called "dynamic split" (sk164155).

With this feature, the system automatically balances the number of SNDs and FWKs to keep CPU utilization reasonable.

Highlighted

Definitely, which is why I documented these changes in my book.  There is a shift in responsibilities to Firewall Workers in R80.20+ which increases their CPU load.  Running in USFW mode instead of kernel mode for the Firewall Workers also causes additional overhead reaching the Firewall Workers, which incurs additional CPU load.  But now with the 40 core limit lifted by USFW you can have lots and lots of Firewall Worker cores to handle these new responsibilities. 

This shift may well require reducing the number of SND cores after an upgrade to R80.20+, but this is not a hard and fast rule and highly depends on how much traffic is fully accelerated by SecureXL (Packets/sec in fwaccel stats -s output).  With the R80.20 changes, in some cases much more traffic can be fully accelerated by SecureXL than before, thus increasing the load on the SND/IRQ cores...
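
As a rough sketch of how that share could be read off programmatically, the Python below parses lines of the form `label : part/total (percent)`. The sample text and its numbers are invented for illustration and only assume this line layout; they are not output copied from a real gateway, and `parse_ratios` is a hypothetical helper.

```python
import re

# Illustrative text only: assumed line layout, made-up numbers.
SAMPLE = """\
Accelerated conns/Total conns : 1200/4000 (30%)
Accelerated pkts/Total pkts   : 600000/1000000 (60%)
F2Fed pkts/Total pkts         : 150000/1000000 (15%)
PSLXL pkts/Total pkts         : 250000/1000000 (25%)
"""

# label, then ':', then 'part/total'; the trailing percentage is ignored.
LINE = re.compile(r"^\s*(?P<label>[^:]+?)\s*:\s*(?P<part>\d+)/(?P<total>\d+)")


def parse_ratios(text):
    """Return {label: part/total} for every line that matches the layout."""
    ratios = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match and int(match["total"]) > 0:
            ratios[match["label"]] = int(match["part"]) / int(match["total"])
    return ratios


ratios = parse_ratios(SAMPLE)
accelerated = ratios["Accelerated pkts/Total pkts"]
print(f"Fully accelerated share of packets: {accelerated:.0%}")
```

A high accelerated share points to load staying on the SND/IRQ cores, a low one to load landing on the Firewall Workers.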

 

Book "Max Power 2020: Check Point Firewall Performance Optimization" Third Edition
Now Available at www.maxpowerfirewalls.com
Platinum

Now, I am even more worried about my 4-core 3600 that has UMFW enabled by default 😄

Admin

Why do you have it in the first place? There is no point in enabling UMFW with fewer than 40 cores.

Platinum

That's how it came from Check Point. Would you mind asking R&D to confirm it is not enabled by mistake? It does not make sense to me either...

Admin

It should not be enabled, unless you are running R80.40. Do you?

Platinum

It came from the factory with R80.30 and UMFW enabled by default. I am sure about that. Now it is indeed running R80.40. Shall I leave it like that?

Admin

Yes, you can.

Platinum

All right, good to know. Thank you!

Iron

Just a quick FYI. I upgraded a 4400 series cluster to R80.40 and UMFW was enabled after installing via a BLINK upgrade. My firewall is now averaging about twice the CPU load it had when running R80.30. I have been following this and other UMFW threads closely because it sure sounds like we should be disabling UMFW on any firewall with fewer than 40 cores, but I'm hesitant to do that considering that Check Point obviously started enabling UMFW by default in R80.40 regardless of the number of cores you have (or they have a bug in their code that incorrectly enables it).
Platinum

Sorry to hijack this thread with something else. I will open another one about UMFW on appliances with fewer cores in the hope of getting it clarified.

Admin

USFW is default on any appliance released in 2020.
It’s also default in any appliance with more than 40 cores.
The overall performance impact of this (especially in R80.40) should be neutral to positive.
Iron


@PhoneBoy wrote:
The overall performance impact of this (especially in R80.40) should be neutral to positive.



After upgrading to R80.40 from R80.30 on my 4400 appliance, I can tell you that my CPU usage is WAY higher. I can't say if that's because of UMFW or not, but I can say UMFW was turned on as part of the upgrade on a 4400 appliance with a whopping 2 cores on it. For all I know, it could be that R80.40's HTTPS inspection is handling more HTTPS traffic than ever and the increase is due to that. I don't have the expertise or time to dig in and find out exactly why the CPU usage is so much higher than on R80.30; I just know I'm getting more CPU alerts than ever since upgrading. Thankfully we are due for hardware upgrades this year, so I'm just limping along until that can occur.

Admin

Not sure why USFW was enabled on a 4400 as it's definitely not necessary there.
Could be something else.
Iron

So is it worth looking into turning this off on my 4400 and 4800 series appliances, or should I just leave it alone?
Admin

Unless you're experiencing major issues, I would probably leave it alone.
Iron

Quick update on this.  I was working with CP support on an unrelated issue and they noticed that usermode FW was turned on for our 4000 series gateway and they turned it off.  My CPU usage was cut in half when they did this.  Clearly this should not be turned on for a box this small.  Just a reminder, I did not turn on usermode fw, it was turned on automatically by the R80.40 install.

Highlighted

That's not supposed to happen, see my response here:

https://community.checkpoint.com/t5/General-Topics/USFW-on-appliances-with-less-than-40-cores/m-p/86...

However, whether USFW is enabled by default has been in a bit of flux over time. Can you recall when you fresh-loaded R80.40 on your 4000?

Iron

Hehe, I know it's not supposed to happen, that was the entire reason I replied back to let you know that it is being enabled by default on an R80.40 install on a certified 4000 series appliance that does not meet the minimum requirements to have it enabled.

 

I wrote the following about a month ago:

Just a quick FYI. I upgraded a 4400 series cluster to R80.40 and UMFW was enabled after installing via a BLINK upgrade. My firewall is now averaging about twice the CPU load it had when running R80.30. I have been following this and other UMFW threads closely because it sure sounds like we should be disabling UMFW on any firewall with fewer than 40 cores, but I'm hesitant to do that considering that Check Point obviously started enabling UMFW by default in R80.40 regardless of the number of cores you have (or they have a bug in their code that incorrectly enables it).

 

Platinum

There will be a post from @shais that should give us more info and answers on why, when and where USFW is enabled, what the performance impact is, etc. So, stay tuned.

Highlighted

SecureXL underwent dramatic changes in R80.20, and these changes apply regardless of whether the Firewall Workers are in kernel mode or USFW.  More overall responsibilities were shifted to the Firewall Workers, and this is a bit more discernible when in USFW mode as you can see the CPU being used by the individual fwk* processes/threads mentioned by Heiko, instead of all the CPU time just being lumped into sy/si in kernel mode. 

All packets still come through SecureXL/sim/SND first after being emptied from interface ring buffers by SoftIRQ in R80.20+, but unless the packet matches an existing connection in SecureXL's state table, the packet is sent to a Firewall Worker instance which decides whether the connection matches an Accept template, which path the connection should be processed in, etc.  This shift in responsibilities is so important to tuning that I created these tables in the third edition of my book documenting the shift in tasks between SND/IRQ cores and Firewall Workers that occurred in R80.20, as well as how the processing paths changed.  Hopefully these tables will help...
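
To make that flow concrete, here is a deliberately simplified Python toy model. Everything in it (the class `ToyGateway`, the path labels, keying templates by a service string) is a hypothetical illustration of the description above, not how SecureXL or CoreXL are actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGateway:
    """Toy model only; the real decision logic is far more involved."""
    sxl_table: set = field(default_factory=set)         # connections SecureXL already knows
    accept_templates: set = field(default_factory=set)  # Accept templates held by the workers

    def handle(self, conn, service):
        # Every packet reaches SecureXL on an SND core first.
        if conn in self.sxl_table:
            return "SND"
        # No match in SecureXL's table: a Firewall Worker instance decides.
        if service in self.accept_templates:
            # Template match: the connection is handed back to SecureXL.
            self.sxl_table.add(conn)
            return "worker, then offloaded"
        return "worker"


gw = ToyGateway(accept_templates={"tcp/443"})
conn = ("10.0.0.5", "192.0.2.10", 443)

first = gw.handle(conn, "tcp/443")    # unknown connection, template matches
second = gw.handle(conn, "tcp/443")   # now present in SecureXL's table
other = gw.handle(("10.0.0.6", "192.0.2.20", 22), "tcp/22")  # no template

print(first, "|", second, "|", other)
```

The point of the model is only that the first packet of a new connection always costs Firewall Worker time in R80.20+, even when later packets stay on the SND.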

 

[Table: Comparison of Processing Paths: R80.10 vs. R80.20+]

[Table: SND/IRQ Core Tasks: R80.10 vs. R80.20+]

[Table: Firewall Worker Instance Tasks: R80.10 vs. R80.20+]

Copper

Hi Timothy,

thanks for the detailed information.

Especially table 6 was exactly what I was looking for.

 

This confirms my feeling that we'll need to re-evaluate our SND/fw_worker distribution setting.

 

Thanks again

Thomas

Admin

In R80.40, you’ll probably not want to mess with these settings as SND/worker distribution should happen dynamically.
However, if you manually adjust the settings, you lose the ability to leverage that.