HeikoAnkenbrand
Champion

R80.x Performance Tuning Tip - Elephant Flows (Heavy Connections)

Elephant Flow (Heavy Connections)

In computer networking, an elephant flow (heavy connection) is a continuous flow, set up by TCP or another protocol, that is extremely large in total bytes as measured over a network link. Elephant flows, though not numerous, can occupy a disproportionate share of the total bandwidth over a period of time. The term comes from the observation that a small number of flows carry the majority of Internet traffic, while the remainder consists of a large number of flows that carry very little traffic (mice flows).

All packets associated with an elephant flow must be handled by the same firewall worker core (CoreXL instance), so a single heavy connection can saturate one core. The firewall may drop packets when the CPU cores it runs on are fully utilized. Such packet loss can occur regardless of the connection type.

What typically produces heavy connections:

  • System backups
  • Database backups
  • VMware sync

More interesting articles:

- R80.x Architecture and Performance Tuning - Link Collection
- Article list (Heiko Ankenbrand)

Evaluation of heavy connections


The big question is: how do you find elephant flows on an R80 gateway?

Tip 1
Evaluation of heavy connections (elephant flows)

A first indication is a high CPU load on one core while all other cores show a normal CPU load. This can be displayed very nicely with "top". OK, so one core is at 100% CPU usage. What can we do now? sk105762 describes how to activate "Firewall Priority Queues". This feature allows the administrator to monitor the heavy connections that consume the most CPU resources without interrupting the normal operation of the firewall. After enabling this feature, the relevant information is available in the CPView utility. The system saves heavy connection data for the last 24 hours, and CPDiag has a matching collector that uploads this data for diagnostic purposes.
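As a rough illustration of the "one hot core, all others normal" pattern described above, here is a minimal Python sketch. The function name and both thresholds are assumptions chosen for illustration, and the per-core percentages are assumed to have been read by hand from a tool such as top. Nothing here is a Check Point API.

```python
from statistics import median


def find_hot_cores(core_loads, hot_threshold=90.0, normal_threshold=50.0):
    """Return indices of cores that look saturated while the rest look normal.

    core_loads: per-core CPU utilization in percent (hypothetical input,
    e.g. copied from top). Both thresholds are illustrative assumptions.
    """
    hot = [i for i, load in enumerate(core_loads) if load >= hot_threshold]
    others = [load for i, load in enumerate(core_loads) if i not in hot]
    # Only flag the pattern when the remaining cores are not busy as well.
    # If every core is loaded, the gateway is busy in general and a single
    # elephant flow is a less likely explanation.
    if hot and others and median(others) <= normal_threshold:
        return hot
    return []


# Example: core 1 is pinned at 100% while the other cores idle along.
print(find_hot_cores([12.0, 100.0, 18.0, 9.0]))
```

A core flagged this way is only a hint that a heavy connection may be present. The Priority Queues data is still needed to identify the connection itself.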

Heavy connection flow system definition on Check Point gateways:

  • Specific instance CPU is over 60%
  • Suspected connection lasts more than 10s
  • Suspected connection utilizes more than 50% of the total work the instance does. In other words, connection CPU utilization must be > 30% (50% of the 60% instance threshold)
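The three criteria above can be sketched as a simple predicate. This is only an illustration of the thresholds listed in this post, not Check Point's implementation. The function and parameter names are hypothetical.

```python
def is_heavy_connection(instance_cpu_pct, duration_s, conn_share_pct):
    """Apply the three thresholds from the list above.

    instance_cpu_pct: CPU utilization of the firewall instance, in percent
    duration_s:       how long the suspected connection has lasted, in seconds
    conn_share_pct:   share of the instance's work done for this connection
    """
    return instance_cpu_pct > 60 and duration_s > 10 and conn_share_pct > 50


def min_connection_cpu_pct(instance_cpu_pct=60.0, conn_share_pct=50.0):
    """Connection CPU utilization implied by the two percentage thresholds."""
    return instance_cpu_pct * conn_share_pct / 100.0


print(is_heavy_connection(75, 12, 55))  # all three thresholds exceeded
print(min_connection_cpu_pct())         # 50% of 60% = 30.0
```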
CLI Commands


Tip 2

Enable the monitoring of heavy connections.

To enable the monitoring of heavy connections that consume high CPU resources:

# fw ctl multik prioq 1

# reboot

Tip 3
Find heavy connections on the gateway with "fw ctl multik print_heavy_conn"

On the system itself, heavy connection data is accessible using the command:

# fw ctl multik print_heavy_conn
[Screenshot: pq5.jpg]

Tip 4
Find heavy connections on the gateway with cpview

# cpview

In CPView, navigate to: CPU > Top-Connection > InstancesX
[Screenshot: pq3.png]

 

Links


sk105762 - Firewall Priority Queues in R77.30 / R80.10 and above

 

 
➜ CCSM Elite, CCME, CCTE

29 Replies
Josef_Pecher
Explorer

Hi @HeikoAnkenbrand,

Thank you for all the interesting articles about Performance Tuning you wrote.

You could write a book out of this link collection 😀.

R80.x Architecture and Performance Tuning - Link Collection 

 

 

Paul_Erez
Participant

Hi @HeikoAnkenbrand,

This article helped me a lot.

I followed the steps and actually found a database backup connection. The connection caused about 70% CPU load on one core. We have now limited the bandwidth of the connection via QoS.

Best Regards

Paul

 

Niroyec_Yerusha
Participant

We were able to identify a very similar problem.

thx

 

Patricia_OSulli
Participant

We also had the problem with the elephant flows. This is a good way to find them quickly and easily.

HeikoAnkenbrand
Champion

For years I had been looking for a way to find elephant flows. Check Point has built in a good solution.
Gaurav_B_
Explorer

I just tried that. This is a very interesting way to find elephant flows.

Thanks

Delia_Pele
Explorer

👌

Dirk_Wisbey
Explorer

We have several connections with 5-7% utilization.

What can we do here?

Timothy_Hall
Champion

So glad you asked this question.  🙂

I will be speaking at CPX New Orleans and Vienna on the CheckMates track with a presentation called "Big Game Hunting: Elephant Flows" that will go through how to track down elephant flows (a.k.a. heavy connections), all the different remediation options, and the pros and cons of each.  PhoneBoy will be delivering this presentation for me at CPX Bangkok because I'll be very busy that week, with, uh, something else...

 

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Igor_Szkaradkie
Participant

This is an interesting approach to detecting heavy connections. I checked this after reading the article and could identify some systems that were causing problems. We have now created QoS rules to limit the bandwidth. That worked well.

 

Martin_Raska
Advisor

Guys,

if you have a problem with elephant flows, you may try this:

SecureXL Fast Accelerator (fw fast_accel) for R80.20 and above - sk156672
Josh_Dillig
Participant
Do we have to enable PrioQ to use the "fw ctl multik print_heavy_conn" command? The article suggests it, but the Tip list is not a numbered sequence of execution steps.

Also, is this supported on R77.30 and R76SP.50?
CCMA
Timothy_Hall
Champion

Priority Queues must be in mode 1 (Evaluator-only) to use that command; mode 1 is the default on a firewall that does not have USFW enabled. I'll be speaking about this very topic in detail at CPX New Orleans and Vienna.

Support for fw ctl multik print_heavy_conn was added in R80.20; I doubt it can be backported into earlier releases since I'm pretty sure it relies on the major changes introduced to SecureXL in R80.20.

 

uror
Contributor

This does not work with R80.40.

HeikoAnkenbrand
Champion

R80.40 gateways use USFW by default.

Unfortunately this is no longer possible with R80.40 in USFW. @Timothy_Hall  has already described this well for R80.20.

Martin_Raska
Advisor

Could someone explain why the FW was moved from kernel space to user space by default? What is the benefit, other than being able to allocate more memory when you have more cores? What will be impacted, and what is behind this change? Thanks

_Val_
Admin

That was discussed here in several posts, I think.

In a nutshell, with more than 48 cores, kernel mode cannot utilise them all. To allow CoreXL to use more cores on high-performance boxes, User Mode is the only option. Plus, User Mode adds stability: if an FWK instance crashes, it does not affect the whole machine.

VSX has been running User Mode FWK instances for ages, actually.

HeikoAnkenbrand
Champion

Hi @Martin_Raska,

In "Kernel Mode Firewall" (KMFW), the maximum number of running cores is limited to 40 because of the Linux/Intel limitation of 2GB kernel memory, and because the CoreXL architecture needs to load a large driver (~42MB) dozens of times (depending on the number of CPU cores, up to 40 times). Newer platforms with more than 40 cores, e.g. the 23900 or open servers, are therefore not fully utilized.

The solution to this problem is to run the firewall in the user mode of the Linux operating system.
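To make the memory argument concrete, here is a small back-of-the-envelope calculation in Python. It uses only the two figures quoted above (about 42MB per driver copy and 2GB of kernel memory) and assumes, purely for illustration, that the driver copies are the only consumer of kernel memory. The constant and function names are hypothetical.

```python
KERNEL_MEMORY_MB = 2 * 1024  # 2GB kernel memory limit quoted above
DRIVER_SIZE_MB = 42          # approximate size of one driver copy quoted above


def driver_memory_mb(instances):
    """Kernel memory taken by the driver copies for a number of instances."""
    return instances * DRIVER_SIZE_MB


def theoretical_max_instances():
    """Upper bound if the driver copies were the only kernel memory consumer."""
    return KERNEL_MEMORY_MB // DRIVER_SIZE_MB


print(driver_memory_mb(40))         # 1680 MB of the 2048 MB
print(theoretical_max_instances())  # 48
```

In practice the kernel needs memory for much more than these driver copies, so this upper bound is an idealization rather than a supported limit.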

GAIA version / Kernel / Cores | Firewall mode | Check
R80.30, kernel 3.10, more than 35* cores | UMFW is enabled | checked on HP DL 380 G10, 2 * Platinum 8180M processor, 28 cores each = 56 cores
R80.30, kernel 3.10, less than 35* cores | KMFW is enabled | checked on HP DL 380 G10, 1 * Platinum 8180M processor, 28 cores
R80.30, kernel 2.6 | KMFW is enabled | checked on VMware with 30 cores and with 46 cores
R80.40 (default 3.10 kernel) | UMFW is enabled by default | checked on VMware with 4 cores

 

To make sure that UMFW is activated, run the following command:

# cpprod_util FwIsUsermode

1 = User Mode Firewall
0 = Kernel Mode Firewall
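If this check is scripted, the return value can be mapped to a readable mode name. The following Python sketch assumes that the command prints a bare 1 or 0 on standard output, as described above. The helper names are hypothetical, and query_firewall_mode() only works on a gateway where cpprod_util is available.

```python
import subprocess

# Mapping taken from the two values listed above.
MODES = {"1": "User Mode Firewall", "0": "Kernel Mode Firewall"}


def describe_firewall_mode(output):
    """Translate the text printed by 'cpprod_util FwIsUsermode'."""
    value = output.strip()
    if value not in MODES:
        raise ValueError(f"unexpected output: {output!r}")
    return MODES[value]


def query_firewall_mode():
    """Run the command on the gateway itself and describe its result."""
    result = subprocess.run(
        ["cpprod_util", "FwIsUsermode"],
        capture_output=True, text=True, check=True,
    )
    return describe_firewall_mode(result.stdout)


print(describe_firewall_mode("1\n"))
```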

For more information or to change the mode, read more in my article here:

R80.x - Performance Tuning Tip – User Mode Firewall vs. Kernel Mode Firewall  

Martin_Raska
Advisor
Thanks Heiko. Let me ask differently: what is the difference between the FW code running in kernel mode and in user mode, apart from memory allocation?
HristoGrigorov

Kernel mode - faster, direct access to hardware but in case of crash everything goes down

User mode - slower, limited access to hardware but in case of crash only app crashes

 

Also, writing and maintaining code in kernel mode is often a pure nightmare compared to user mode. With current hardware, performance really does not suffer that much if you do it well in user mode.

Martin_Raska
Advisor
That is what I was missing. Thanks.
TheGrave
Contributor
I'm not so convinced performance is acceptable in user mode. My bet is that once the kernel limitations are tackled, Check Point will be crawling back to KMFW.

Not to mention that if you put traffic through a FW with 40 cores and it can't handle it in kernel mode, either your design is obviously wrong or the software processing it is pure crap. Load balancing has existed for ages.
argur_007
Participant

Does that also exist for UMFW?

_Val_
Admin

Kernel or User Mode, elephant flows are problematic in both cases.
Timothy_Hall
Champion

True, but there are much better tools for detection and remediation of elephant flows when in kernel mode. With USFW enabled, detection and remediation tools for elephant flows are quite limited, but based on a recent conversation I learned that Check Point is working on closing that capability gap as we speak. My CPX 2020 presentation summarizes all this here:

https://community.checkpoint.com/fyrhh23835/attachments/fyrhh23835/member-exclusives/430/4/Cloud%20T...

Also the Solution Center has a new feature available that allows the processing of a single elephant flow to be spread across multiple Firewall Worker instances, but this capability is not mainlined yet.  This feature was alluded to at the end of my CPX presentation above.

TheGrave
Contributor

You sure this is the correct file? I'm either blind or not seeing what you are referring to.

Luis_Miguel_Mig
Advisor

Is there any way to detect elephant flows in fast path in R77.20 or earlier?

I have made the following summary from reading your posts, but I am missing how to capture elephant flows in the fast path in R77.20 or earlier.

Is this summary below correct? Am I missing anything? 



- In R77.20 or earlier, you can detect elephant flows with:

     * F2F traffic: /proc/cpkstats/fw_worker_x_stats, with or without cpview
     * Any traffic: enabling accounting in a number of rules and looking at SmartLog

- Between R77.30 and R80.40:

     * You can still use the above options
     * Any traffic: priority queues and connection load tracking (cpview and SmartLog), enabled with "fw ctl multik prioq 1"

- Between R80.20 Take 47 and R83.X:

     * You can still use all of the above
     * Any traffic: there is a new elephant flow detection mechanism for kernel mode, "fw ctl multik print_heavy_conn"
Timothy_Hall
Champion

I don't have an R77.20 gateway handy to test, but if the elephant flows are in the fastpath, the fw_worker stats will not show them.

Accounting is supported directly by SecureXL/fastpath and should work.

I don't think the fwaccel conns command will help much for finding elephant flows in the fastpath but give it a shot.  To my knowledge there are no direct elephant flow detection mechanisms in R77.20.

I can't remember if cpview has these screens and whether they will show elephant flows in the fastpath in R77.20, but look for these screens in cpview:

  • Network...Top Connections
  • CPU...Top Connections
  • Advanced...CoreXL...Instances...FW-Instance#...Top FW-Lock consumers

 

You can also try using the CPMonitor (sk103212: Traffic analysis using the 'CPMonitor' tool) and connstat (sk85780: How to use the 'connstat' utility) tools as described in my CPX 2020 presentation here: 

https://community.checkpoint.com/fyrhh23835/attachments/fyrhh23835/member-exclusives/432/3/CPX_Big_G...

Amoli
Participant

This no longer works with a 3.10 kernel.
