USFW enablement not clear (SK needed)

Hello,

Any idea why the 3600 appliance has USFW enabled by default?

25 Replies
HeikoAnkenbrand
Champion

Hi @HristoGrigorov 

I think it is an R80.40 installation with a 3.10 kernel. In this case UMFW is enabled by default.

In Kernel Mode Firewall (KMFW), the maximum number of running cores is limited to 40. This follows from the Linux/Intel limit of 2GB of kernel memory combined with the CoreXL architecture, which has to load a large driver (~42MB) once per CPU core, up to 40 times. Newer platforms with more than 40 cores, e.g., the 23900 or an open server, are therefore not fully utilized.
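
As a rough sanity check of the kernel memory argument, the arithmetic can be sketched in a few lines of shell. The ~42 MB driver size and the 2 GB limit are the approximate figures quoted in this post; the function name `corexl_driver_mb` is made up for illustration, and the estimate ignores everything else that lives in kernel memory.

```shell
# Rough estimate only: memory taken by per-core CoreXL driver copies in KMFW.
# Both numbers are the approximate values quoted in the post, not measured ones.
driver_mb=42          # approximate size of one driver copy, in MB
kernel_limit_mb=2048  # 2 GB kernel memory limit, in MB

# Hypothetical helper: $1 = number of cores (one driver copy per core).
corexl_driver_mb() {
  echo $(( $1 * driver_mb ))
}

for cores in 4 8 40 56; do
  used=$(corexl_driver_mb "$cores")
  echo "$cores cores: ${used} MB of ${kernel_limit_mb} MB"
done
```

With these figures, 40 cores already take about 1680 MB for the driver copies alone, and 56 cores would need about 2352 MB, which is more than the 2 GB limit.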

The solution to this problem is to run the firewall in the user mode of the Linux operating system.

USFW ("User Space Firewall") and UMFW ("User Mode Firewall") are two names for the same mode, which is based on proven VSX code. This mode was introduced in R80.10.

According to the SK, UMFW is enabled by default from R80.30 onward and is customized via the installation process.

 

GAIA version / Kernel / Cores | Firewall mode | Check
R80.30, kernel 3.10, more than 35* cores | UMFW is enabled | checked on HP DL 380 G10, 2 * Platinum 8180M Processor, 28 cores each = 56 cores
R80.30, kernel 3.10, less than 35* cores | KMFW is enabled | checked on HP DL 380 G10, 1 * Platinum 8180M Processor, 28 cores
R80.30, kernel 2.6 | KMFW is enabled | checked on VMWare with 30 cores and with 46 cores
R80.40 (default 3.10 kernel) | UMFW is enabled by default | checked on VMWare with 4 cores
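
The lab observations in this table can be written down as a small lookup, purely as a reading aid. This is a sketch that restates the rows above and nothing more: the function name `expected_fw_mode` is hypothetical, it is not how Gaia decides the mode, and other posters in this thread report results that do not match it.

```shell
# Restates the lab table above; NOT an official rule.
# Hypothetical helper: expected_fw_mode <version> <kernel> <cores>
expected_fw_mode() {
  version=$1 kernel=$2 cores=$3
  case "$version/$kernel" in
    R80.40/*)
      echo UMFW ;;                 # observed on VMware with 4 cores
    R80.30/3.10)
      if [ "$cores" -gt 35 ]; then
        echo UMFW                  # observed with 56 cores
      elif [ "$cores" -lt 35 ]; then
        echo KMFW                  # observed with 28 cores
      else
        echo unknown               # exactly 35 cores is not covered by the table
      fi ;;
    R80.30/2.6)
      echo KMFW ;;                 # observed with 30 and with 46 cores
    *)
      echo unknown ;;              # combination not in the table
  esac
}

expected_fw_mode R80.30 3.10 56
expected_fw_mode R80.30 2.6 46
```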

 

To check whether UMFW or KMFW is active, or to switch between the modes, read this article:

R80.x - Performance Tuning Tip – User Mode Firewall vs. Kernel Mode Firewall 

Thank you for the detailed reply. However, I find this passage in the R80.40 release notes kind of strange:

Appliance support for User Space Firewall (USFW)

The following appliances run in USFW mode by default:

3600, 6200, 6600, 6900, 16000T, 26000, 26000T and 23900.

Note - All other Check Point appliances will boot in kernel mode by default.

Open Server / Cloud setup, VMware will boot in USFW when using 40 cores or more.

---

What makes the 3600 different enough to be on this list? I think there is something specific in the hardware and I am curious to know what it is 🙂


Ahh, or do you mean it comes with R80.40 pre-installed?

HeikoAnkenbrand
Champion

Hi @HristoGrigorov 

I have described everything in detail in this article for USFW vs KMFW:

R80.x - Performance Tuning Tip – User Mode Firewall vs. Kernel Mode Firewall 

 

R80.40 3.10 kernel > USFW default

R80.30 2.6/ 3.10 kernel with more than 35 cores > USFW

R80.20 2.6/ 3.10 kernel with more than 35 cores > USFW

 

 

Maarten_Sjouw
Champion

Since these models (3600, 6200, 6600, 6900, 16000T, 26000, 26000T and 23900) are all newer, do they come with the 3.10 kernel even in R80.30, and therefore with USFW enabled?
Regards, Maarten
PhoneBoy
Admin

My guess? Hyper-threading.

@PhoneBoy That's a very good guess and the only reasonable explanation so far. Thank you.

I have one of these on its way to me. Can't wait to play with it and will provide some more info later 😀

HeikoAnkenbrand
Champion

Hi @PhoneBoy,

We have been discussing this topic here in the forum for about 2 months.

R80.x - Performance Tuning Tip – User Mode Firewall vs. Kernel Mode Firewall  
High CPU utilization during process fwk0_dev_0 (UMFW vs. KMFW) 
- this article

This is the only information you can find in the KB! I think that is not enough.

sk149973: How to enable USFW (User-Space Firewall) on a 23900 appliance 

I have also spent some time here in the lab testing this on HP DL360 servers with 28 cores vs. 56 cores, for R80.20 / R80.30 / R80.40 with kernel 3.10 and R80.20 / R80.30 with kernel 2.6.

My request: please write an SK that describes USFW vs. KMFW. I think that would be very helpful for all of us.

Thanks in advance.

Heiko

 



HeikoAnkenbrand
Champion

I think the 3600 has four cores, so 8 cores with HT. Now the big question: why is USFW enabled with 8 (or 4) cores for R80.30?

On an open server, USFW is enabled only with more than 35 cores (I have checked this in the lab).

Exactly what I am thinking as well. The number of cores on the 3600 (4) and 3600-T (8) is just far below what is reasonable for enabling USFW by default, even with HT enabled. So it is either a mistake in the release notes, or there is something we are missing or unaware of...

@PhoneBoy I also second Heiko that we need an SK about this.

Wolfgang
Leader

To create some more confusion...

In the last few weeks I did new installs of R80.30 on different HP DL 380/360 G9 / G10 hardware. All have fewer than 28 cores, and all had USFW enabled after install.

@PhoneBoy I support @HeikoAnkenbrand's request for a knowledgebase article about both modes.

Wolfgang

Timothy_Hall
Champion

@PhoneBoy I'd like to also cast my vote for a full-disclosure SK specifying exactly under which conditions USFW will be enabled and why. While researching the third edition of my book I ran across this same question, and the only answer I could ever get was "it depends". So I had to kind of punt in my book to some degree: show how to determine whether USFW is enabled, and say not to change the default without consulting TAC.

I don't understand why USFW would be enabled by default on a 6900 or lower (16 cores or fewer). Perhaps USFW can take advantage of SMT much more efficiently, but I doubt it. On the 16000 and higher (32-48 cores), sure, USFW makes sense. It looks like the ability to disable SMT via cpconfig has been removed on certain Check Point appliances as well, so only TAC can disable it from the BIOS, which I don't understand either.

 

"Max Capture: Know Your Packets" Video Series
now available at http://www.maxpowerfirewalls.com

I renamed this thread as it is not only about 3600 appliances.

PhoneBoy
Admin

Some of the questions in this thread are answered here: https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...
This SK is missing all the appliances we announced this year, but they are similar to the 6500/6800 in the sense that SMT is enabled by default.
Further, expect USFW to be enabled by default in the most recent code versions for the newest appliances.

I've also asked the SK team to update sk93000 with the new appliances and to clarify the limitations section as it contradicts the "supported platforms" statements in a few areas.

(Edited to reflect confirmation from R&D on SMT on newer appliances). 

I hope for a technical explanation from R&D as to why USFW is enabled by default on appliances with a low number of CPU cores (even with SMT enabled).

PhoneBoy
Admin

I assume it's because it provides for better performance than without, particularly when you're using NGTP/NGTX Software Blades.

In terms of kernel memory utilization, USFW is a significant improvement as we're not having to load a large CoreXL driver for each core (real and virtual).
Even with a small number of cores, that adds up.

All right. I think we are getting into some kind of loop here, so I will just halt at this point without understanding why USFW is enabled by default on one 8-core appliance and disabled on another.

PhoneBoy
Admin

In the Check Point appliances where it is enabled by default, we have newer Intel processors in there.
There must clearly be a benefit to having SMT enabled by default on these appliances, thus we do it.
It has nothing to do with the number of cores available in this case.

For systems with older CPUs with more than 40 cores, USFW is required to utilize them all.
I assume on those appliances, we would also enable USFW by default.

I hope that clears things up.

My knowledge of computer architectures may be a bit rusty, but the last I remember is that HT/SMT helps when you have a multi-threaded program. If USFW is implemented as such, it definitely benefits from HT. My understanding, however, was that it runs as separate processes in the operating system, and if so, task switching will more or less kill any performance gain from HT.

As to why it is enabled on one appliance and disabled on another, then yes, your answer clears it up.

PhoneBoy
Admin

Userspace programs can more easily be multi-threaded than kernel-level processes, which tend to crash when interrupted. 😬

Mystery resolved 🙂

HeikoAnkenbrand
Champion

From a performance point of view, I could not see any difference between UMFW and KMFW. I did notice that the process fwk0_dev_0 generates a very high CPU load in UMFW. My guess is that fwk0_dev_0 acts as the liaison between the multiple fwk firewall worker processes (the fw instance threads that handle packet processing), the single fwmod kernel driver instance, and the high-priority cluster thread.

If you want to change the mode from UMFW to KMFW, this can be done by changing the registry parameter FwIsUsermode with the cpprod_util command. In UMFW, the fw instances are threads of fwk0_dev_0, so by default top shows the CPU utilization of all threads combined under the main thread. top also has an option to show the utilization per thread.

A small calculation sample for the utilization of process fwk0_dev_0:

$$\text{fwk0\_dev\_0} = \sum_{x=0}^{\text{max\_CoreXL\_number}} \text{fwk0\_x} + \sum_{x=0}^{\text{max\_CoreXL\_number}} \text{fwk0\_dev\_x} + \text{fwk0\_kissd} + \text{fwk0\_hp}$$

Threads of process fwk0_dev_0:

- fwk0_X -> fw instance thread that takes care of the packet processing
- fwk0_dev_X -> thread that takes care of communication between fw instances and other CP daemons
- fwk0_kissd -> legacy Kernel Infrastructure (obsolete)
- fwk0_hp -> (high priority) cluster thread

HeikoAnkenbrand
Champion

1) Find the process ID of process fwk0_dev_0

# top

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10219 admin 0 -20 1070m 449m 134m S 2 24.0 0:17.19 fwk0_dev_0

2) Now check the utilization of the threads. Here you can see the load of each CoreXL instance:

>>> fwk0_X              ->  fw instance thread that takes care for the packet processing

top -Hbn1 -p 10219

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10219 admin 0 -20 1070m 449m 134m S 0 24.0 0:03.49 fwk0_dev_0
10220 admin 0 -20 1070m 449m 134m S 0 24.0 0:00.00 fwk0_kissd
10436 admin 0 -20 1070m 449m 134m S 0 24.0 0:00.57 fwk0_0
10437 admin 0 -20 1070m 449m 134m S 0 24.0 0:00.64 fwk0_1
10438 admin 0 -20 1070m 449m 134m S 0 24.0 0:00.67 fwk0_2
10439 admin 0 -20 1070m 449m 134m S 0 24.0 0:00.80 fwk0_3
10440 admin RT -20 1070m 449m 134m S 0 24.0 0:00.76 fwk0_hp
10441 admin 0 -20 1070m 449m 134m S 0 24.0 0:00.15 fwk0_dev_1
10442 admin 0 -20 1070m 449m 134m S 0 24.0 0:00.09 fwk0_dev_2
10443 admin 0 -20 1070m 449m 134m S 0 24.0 0:00.09 fwk0_dev_3
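
To get a single number back out of the per-thread view, the %CPU values of the fwk0_* threads can be added up again. The following is a minimal sketch that assumes the batch output of top has a header line starting with PID and containing %CPU and COMMAND columns, as in the listing above; the function name `sum_fwk_thread_cpu` is made up for illustration.

```shell
# Sum the %CPU column over all fwk0_* thread rows of "top -H" batch output.
# Reads the top output from stdin and prints the total with one decimal place.
sum_fwk_thread_cpu() {
  awk '
    # Header line: remember which columns hold %CPU and COMMAND.
    $1 == "PID" {
      for (i = 1; i <= NF; i++) {
        if ($i == "%CPU")    cpu = i
        if ($i == "COMMAND") cmd = i
      }
      next
    }
    # Thread rows: only count threads whose name starts with fwk0_.
    cpu && cmd && $cmd ~ /^fwk0_/ { total += $cpu }
    END { printf "%.1f\n", total }
  '
}

# Usage (PID taken from the example above):
#   top -Hbn1 -p 10219 | sum_fwk_thread_cpu
```

Rows for other processes and the summary lines that top prints before the header are ignored, so the same function also works on a saved copy of the output.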

It is also interesting to see the magnitude of the context switches of the fwk0_dev_0 process:

# cat /proc/<pid>/status | grep ctx
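
The two counters that this grep returns can also be pulled out and added up. Below is a small sketch that reads a status file from stdin; it relies only on the standard Linux field names voluntary_ctxt_switches and nonvoluntary_ctxt_switches, and the function name `ctx_switches` is made up for illustration.

```shell
# Print "voluntary nonvoluntary total" context switches from a
# /proc/<pid>/status file supplied on stdin.
ctx_switches() {
  awk '
    /^voluntary_ctxt_switches/    { v = $2 }
    /^nonvoluntary_ctxt_switches/ { n = $2 }
    END { print v + 0, n + 0, v + n }
  '
}

# Usage (PID taken from the example above):
#   ctx_switches < /proc/10219/status
# Per-thread counters live under /proc/<pid>/task/<tid>/status:
#   ctx_switches < /proc/10219/task/10436/status
```

Since the fw instances are threads, the per-thread files are probably the more interesting ones to compare; running the command twice a few seconds apart gives a feel for the rate.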
