R80.x Performance Tuning Tip - Elephant Flows (Heavy Connections)
Elephant Flow (Heavy Connections) |
---|
In computer networking, an elephant flow (heavy connection) is a continuous flow, set up by TCP or another protocol, that is extremely large in total bytes as measured over a network link. Elephant flows, though not numerous, can occupy a disproportionate share of the total bandwidth over a period of time. The term stems from the observation that a small number of flows carry the majority of Internet traffic, while the remainder consists of a large number of flows that each carry very little traffic (mice flows).
All packets associated with an elephant flow must be handled by the same firewall worker core (CoreXL instance). The firewall can drop packets when the CPU cores it runs on are fully utilized. Such packet loss can occur regardless of the connection's type.
What typically produces heavy connections:
- System backups
- Database backups
- VMware sync
Chapter |
---|
More interesting articles:
- R80.x Architecture and Performance Tuning - Link Collection
- Article list (Heiko Ankenbrand)
Evaluation of heavy connections |
---|
The big question is: how do you find elephant flows on an R80 gateway?
Tip 1
Evaluation of heavy connections (elephant flows)
A first indication is a high CPU load on one core while all other cores show a normal CPU load. This can be displayed very nicely with "top". OK, so one core is at 100% CPU usage. What can we do now? sk105762 describes how to activate "Firewall Priority Queues". This feature allows the administrator to monitor the heavy connections that consume the most CPU resources without interrupting the normal operation of the firewall. After enabling this feature, the relevant information is available in the CPView utility. The system saves heavy connection data for the last 24 hours, and CPDiag has a matching collector that uploads this data for diagnosis purposes.
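The "one busy core, all others normal" pattern can be sketched in a few lines of Python. This is only an illustration: the function name `find_hot_cores` and the two thresholds are assumptions chosen for the example, not Check Point values, and the per-core percentages would have to be read off "top" or CPView by hand.

```python
from statistics import median


def find_hot_cores(core_loads, hot_threshold=90.0, gap=40.0):
    """Return indices of cores that are busy while the typical core is not.

    core_loads: per-core CPU utilisation in percent.
    hot_threshold, gap: illustrative values, not Check Point defaults.
    """
    typical = median(core_loads)
    return [
        index
        for index, load in enumerate(core_loads)
        if load >= hot_threshold and load - typical >= gap
    ]


# One core pinned at 100% while the rest idle along: core 2 stands out.
print(find_hot_cores([12.0, 15.0, 100.0, 9.0]))
```

If every core is busy, the list stays empty, which points to general overload rather than a single heavy connection.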
System definition of a heavy connection on Check Point gateways:
- The CPU load of the specific instance is over 60%
- The suspected connection lasts more than 10 s
- The suspected connection accounts for more than 50% of the total work the instance does. In other words, because the instance itself is above 60%, the connection's own CPU utilization must be > 30%
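The three criteria above can be written as a small predicate. This is a sketch for reasoning about the thresholds only; the function and parameter names are made up for this example, and it assumes all three conditions must hold at the same time.

```python
def connection_cpu_pct(instance_cpu_pct, conn_share_of_instance_pct):
    """CPU used by one connection, as a percentage of the whole core."""
    return instance_cpu_pct * conn_share_of_instance_pct / 100


def is_heavy_connection(instance_cpu_pct, conn_share_of_instance_pct, duration_s):
    """Apply the three criteria quoted in the list above (assumed to be ANDed)."""
    instance_busy = instance_cpu_pct > 60          # instance CPU over 60%
    long_lived = duration_s > 10                   # lasts more than 10 s
    dominant = conn_share_of_instance_pct > 50     # more than half of the instance's work
    return instance_busy and long_lived and dominant


# Boundary case: 50% of a core running at 60% is 30% of the core.
print(connection_cpu_pct(60, 50))
print(is_heavy_connection(80, 70, 30))
```

The first function shows where the "> 30%" figure comes from: it is simply the product of the two lower bounds.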
CLI Commands |
---|
Tip 2
Enable the monitoring of heavy connections.
To enable the monitoring of heavy connections that consume high CPU resources:
# fw ctl multik prioq 1
# reboot
Tip 3
Find heavy connections on the gateway with "print_heavy_conn"
On the system itself, heavy connection data is accessible using the command:
# fw ctl multik print_heavy_conn
Tip 4
Find heavy connections on the gateway with cpview
# cpview
Then navigate to: CPU > Top-Connection > InstancesX
Links |
---|
sk105762 - Firewall Priority Queues in R77.30 / R80.10 and above
Accepted Solutions
Hi @Martin_Raska,
In “Kernel Mode Firewall” (KMFW), the maximum number of running cores is limited to 40 because of the Linux/Intel limitation of 2 GB kernel memory, and because the CoreXL architecture needs to load a large driver (~42 MB) once per CPU core (up to 40 times). Newer platforms with more than 40 cores (e.g., the 23900 or an open server) are therefore not fully utilized.
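A quick back-of-the-envelope calculation with the figures quoted above shows why the driver copies are the limiting factor. This is only arithmetic on the numbers in this post; it assumes 2 GB means 2048 MB and treats the ~42 MB driver size as exact.

```python
KERNEL_MEMORY_MB = 2 * 1024   # 2 GB kernel memory limit quoted above (assumed 2048 MB)
DRIVER_SIZE_MB = 42           # approximate per-instance driver size quoted above


def driver_footprint_mb(instances):
    """Kernel memory taken by one driver copy per CoreXL instance."""
    return instances * DRIVER_SIZE_MB


footprint = driver_footprint_mb(40)
print(footprint)                              # memory used by 40 driver copies
print(KERNEL_MEMORY_MB - footprint)           # what is left for everything else
print(KERNEL_MEMORY_MB // DRIVER_SIZE_MB)     # naive ceiling if nothing else needed memory
```

With 40 instances, 1680 MB of the 2048 MB go to driver copies alone, so little headroom remains for the rest of the kernel.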
The solution to this problem is to run the firewall in user mode of the Linux operating system.
GAIA version / Kernel / Cores | Firewall mode | Checked on |
---|---|---|
R80.30, kernel 3.10, more than 35* cores | UMFW is enabled | HP DL 380 G10, 2 × Platinum 8180M processor with 28 cores each = 56 cores |
R80.30, kernel 3.10, fewer than 35* cores | KMFW is enabled | HP DL 380 G10, 1 × Platinum 8180M processor with 28 cores |
R80.30, kernel 2.6 | KMFW is enabled | VMware with 30 cores and with 46 cores |
R80.40 (default 3.10 kernel) | UMFW is enabled by default | VMware with 4 cores |
To make sure that UMFW is activated, run the following command:
# cpprod_util FwIsUsermode
1 = User Mode Firewall
0 = Kernel Mode Firewall
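If the check is scripted, the 1/0 result can be mapped to a readable label. The helper below is a hypothetical sketch: `describe_fw_mode` is a made-up name, and it assumes the command prints nothing but the digit.

```python
def describe_fw_mode(raw_output):
    """Translate the output of `cpprod_util FwIsUsermode` into a label."""
    value = raw_output.strip()
    if value == "1":
        return "User Mode Firewall"
    if value == "0":
        return "Kernel Mode Firewall"
    # Anything else is unexpected, so fail loudly instead of guessing.
    raise ValueError(f"unexpected FwIsUsermode output: {raw_output!r}")


print(describe_fw_mode("1\n"))
```

On a gateway, the string passed in could come from something like `subprocess.run(["cpprod_util", "FwIsUsermode"], capture_output=True, text=True).stdout`, again assuming the output format described above.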
For more information or to change the mode, read more in my article here:
R80.x - Performance Tuning Tip – User Mode Firewall vs. Kernel Mode Firewall
Hi @HeikoAnkenbrand,
Thank you for all the interesting articles about Performance Tuning you wrote.
You could write a book out of this link collection 😀.
R80.x Architecture and Performance Tuning - Link Collection
Hi @HeikoAnkenbrand,
This article has helped me very well.
I followed the steps and actually found a database backup connection. The connection caused about 70% CPU load on one core. We have now limited the bandwidth of the connection via QoS.
Best Regards
Paul
We were able to identify a very similar problem.
thx
We also had the problem with the elephant flows. This is a good way to find them quickly and easily.
In the past years I had always been looking for a solution to find elephant flows. Check Point has built in a good solution.
I just tried that. This is a very interesting solution and a good way to find elephant flows.
Thanks
👌
We have several connections with 5-7% utilization.
What can we do here?
So glad you asked this question. 🙂
I will be speaking at CPX New Orleans and Vienna on the CheckMates track with a presentation called "Big Game Hunting: Elephant Flows" that will go through how to track down elephant flows (a.k.a. heavy connections), all the different remediation options, and the pros and cons of each. PhoneBoy will be delivering this presentation for me at CPX Bangkok because I'll be very busy that week, with, uh, something else...
CET (Europe) Timezone Course Scheduled for July 1-2
This is an interesting approach to detect heavy connections. I had checked this after this article and could identify some systems that were causing problems. We have now created QoS rules to limit the bandwidth. That worked well.
Guys,
if you have a problem with elephant flow you may try this
SecureXL Fast Accelerator (fw fast_accel) for R80.20 and above - sk156672
Also is this supported on R77.30 and R76SP.50?
Priority Queues must be in mode 1 (Eviluator-only) to use that command; mode 1 is the default on a firewall that does not have USFW enabled. I'll be speaking about this very topic in detail at CPX New Orleans and Vienna.
Support for fw ctl multik print_heavy_conn was added in R80.20; I doubt it can be backported into earlier releases since I'm pretty sure it relies on the major changes introduced to SecureXL in R80.20.
This does not work with R80.40.
R80.40 gateways use USFW by default.
Unfortunately this is no longer possible with R80.40 in USFW. @Timothy_Hall has already described this well for R80.20.
Could someone explain why the firewall was moved from kernel space to user space by default? What is the benefit, apart from being able to allocate more memory when you have more cores? What is impacted, and what is the reasoning behind it? Thanks
That was discussed here in several posts, I think.
In a nutshell, with more than 48 cores, kernel mode cannot utilise them all. To allow CoreXL to use more cores on high-performance boxes, User Mode is the only option. Plus, user mode adds stability: if an FWK instance crashes, it does not affect the whole machine.
VSX has actually been running User Mode FWK instances for ages.
Kernel mode - faster, direct access to hardware but in case of crash everything goes down
User mode - slower, limited access to hardware but in case of crash only app crashes
Also, writing and maintaining code in kernel mode is often a pure nightmare compared to user mode. With current hardware, performance really does not suffer that much if you do it well in user mode.
Not to mention that if you put traffic through a FW with 40 cores and it can't handle it in kernel mode, your design is obviously wrong or the software processing it is pure crap. Load balancing has existed for ages.
Does that also exist for UMFW?
Kernel or User Mode, elephant flows are problematic in both cases.
True, but there are much better tools for detection and remediation of elephant flows when in kernel mode. With USFW enabled, the detection and remediation tools for elephant flows are quite limited, but based on a recent conversation I learned that Check Point is working on closing that capability gap as we speak. My CPX 2020 presentation summarizes all this here:
Also the Solution Center has a new feature available that allows the processing of a single elephant flow to be spread across multiple Firewall Worker instances, but this capability is not mainlined yet. This feature was alluded to at the end of my CPX presentation above.
You sure this is the correct file? I'm either blind or not seeing what you are referring to.
Is there any way to detect elephant flows in the fast path in R77.20 or earlier?
I have put together the following summary from reading your posts, but I am missing how to capture elephant flows in the fast path in R77.20 or earlier.
Is the summary below correct? Am I missing anything?
- In R77.20 or earlier, you can detect elephant flows with:
* F2F traffic: with /proc/cpkstats/fw_worker_x_stats with or without cpview
* Any traffic: enabling accounting in a number of rules and looking at smartlog.
- Between R77.30 and R80.40:
* you can still use the above options
* Any traffic: priority queues and connection load tracking - cpview and smartlog
"fw ctl multik prioq 1"
- Between R80.20 Take 47 and R83.X
* you can still use all the above
* Any traffic: there is a new elephant flow detection mechanism for kernel mode
"fw ctl multik print_heavy_conn"
I don't have an R77.20 gateway handy to test, but if the elephant flows are in the fastpath, the fw_worker stats will not show them.
Accounting is supported directly by SecureXL/fastpath and should work.
I don't think the fwaccel conns command will help much for finding elephant flows in the fastpath but give it a shot. To my knowledge there are no direct elephant flow detection mechanisms in R77.20.
I can't remember if cpview has these screens and whether they will show elephant flows in the fastpath in R77.20, but look for these screens in cpview:
- Network...Top Connections
- CPU...Top Connections
- Advanced...CoreXL...Instances...FW-Instance#...Top FW-Lock consumers
You can also try using the CPMonitor (sk103212: Traffic analysis using the 'CPMonitor' tool) and connstat (sk85780: How to use the 'connstat' utility) tools as described in my CPX 2020 presentation here:
This no longer works with a 3.10 kernel.
