HeikoAnkenbrand
Champion

High Performance Gateways and Tuning

Timothy Hall recently gave a very interesting presentation, "Security Gateway Performance Optimization with Tim Hall" (video). Thank you for the presentation. Let's now discuss the tuning possibilities here. I would like to hear your experiences on this topic in the CheckMates forum.

 

More Tuning Tips

More interesting articles about R80.x performance tuning and architecture can be found here:

- R80.x Architecture and Performance Tuning - Link Collection
- Article list (Heiko Ankenbrand)

 


➜ CCSM Elite, CCME, CCTE
29 Replies
HeikoAnkenbrand
Champion

This morning I wrote the following in another thread, "Show me yours", which gave me the idea to start this one.

I do a lot of performance tuning for customers and copied a few passages from my training material.

 

1) This is a typical firewall with many blades enabled. Here the PXL path (medium path) is used.

 

Blades: fw vpn cvpn av ips identityServer anti_bot ThreatEmulation mon
Cores: 16 (4xSND, 10xFWK, 2xfwd[Logging,...])
MultiQueue: on (4 Interface)
Interface: 4 x 10 GBit
Connections: approximately 500K, peak 700K
CPU: 50% over all cores

 

# fwaccel stats -s


Accelerated pkts/Total pkts   : 1052964458/159849848978 (0%)
F2Fed pkts/Total pkts   : 2764823456/159849848978 (1%)
PXL pkts/Total pkts   : 156032061194/159849848978 (97%)

 

---


2) This is a typical firewall with most blades disabled. Here the acceleration path (fast path) is used.

 

Blades: fw vpn
Cores: 16 (8xSND, 6xFWK, 2xfwd[Logging,...])
MultiQueue: on (4 Interface)
Interface: 4 x 10 GBit
Connections: approximately 500K, peak 700K
CPU: 30% over all cores

 


Accelerated pkts/Total pkts   : 191956815617/194432772885 (98%)
F2Fed pkts/Total pkts   : 3767408762/194432772885 (2%)
PXL pkts/Total pkts   : 0/194432772885 (0%)
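The two outputs above round every share down to a whole percent, which hides small paths. A minimal Python sketch for recomputing the shares with decimals from pasted `fwaccel stats -s` output could look like the following. The function name `path_shares` and the regular expression are illustrative assumptions; the line format is taken only from the outputs quoted in this thread and may differ in other versions.

```python
import re

# Matches counter lines as quoted in this thread, for example:
#   "PXL pkts/Total pkts   : 156032061194/159849848978 (97%)"
# Assumption: other versions may print additional or differently named lines.
COUNTER_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9 ]+?)/Total\s+(?:pkts|conns)\s*:\s*"
    r"(?P<num>\d+)/(?P<den>\d+)"
)


def path_shares(stats_text):
    """Return {counter name: share in percent} for each counter line found."""
    shares = {}
    for line in stats_text.splitlines():
        match = COUNTER_RE.match(line)
        if match is None:
            continue  # prompt lines, blank lines, anything else
        total = int(match.group("den"))
        if total == 0:
            continue  # avoid dividing by zero on freshly reset counters
        name = match.group("name").strip()
        shares[name] = 100.0 * int(match.group("num")) / total
    return shares


# Output of example 1 above (many blades enabled).
SAMPLE = """\
Accelerated pkts/Total pkts   : 1052964458/159849848978 (0%)
F2Fed pkts/Total pkts   : 2764823456/159849848978 (1%)
PXL pkts/Total pkts   : 156032061194/159849848978 (97%)
"""

if __name__ == "__main__":
    for name, share in path_shares(SAMPLE).items():
        print(f"{name:<20} {share:6.2f}%")
```

Running the same function against output captured before and after a change (for example a blade being switched off) gives a quick way to see whether traffic actually moved between paths.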

 

---

 

What else do I treat as tuning parameters?

- Interface cards 1G, 10G and 40G > (MQ, errors, interrupt distribution, more or fewer SNDs,...)

- Blades + CoreXL > (more or fewer FW workers, HTTPS inspection, deep inspection, PSL, CPAS, R77.30 VPN on FW_Worker_0 [R80.10 multicore VPN], CPU utilization,...)

- SecureXL > (NAT templates, Drop templates, rule optimization for Accept templates,...)

- Connection table > (many connections in TCP start state + timeout, UDP virtual session timeout,...)

- ClusterXL > (sync or not sync from services,...)

- Logging > (optimize logging in the rules, more or fewer fwd cores for logging,...)

- IPS > (Signatures with high performance impact,...)

- VPN > (3DES or AES with NI [high-speed hardware encryption],...)

- SecureXL > (SAM card or Falcon card (R80.20 and above) inside,...)

- and many more

 

I could find 100 more points that can be optimized.

 

I think performance tuning is a very individual process for each firewall. Here you should first talk about what you want to accomplish on the firewall. Like I said, I'd like to hear your opinion on tuning.

Regards,

Heiko


➜ CCSM Elite, CCME, CCTE
Amir_Arama
Advisor

"many connections in TCP start state + timeout, UDP virtual session timeout"

How can I get output for this?

HeikoAnkenbrand
Champion

THX, I think most of those who are involved in tuning know them. 

I find the experiences more interesting. There is always a lot of discussion about topics like medium path (PXL) and fast path (acceleration path). But from my point of view that's not all. I had already touched on a few other topics. And there's a few more. What do you check when you do performance tuning on a firewall?

I also look at the following things, for example:

- Interface cards 1G, 10G and 40G > (MQ, Layer 2 errors, interrupt distribution, ring buffer, more or fewer SNDs, Broadcom vs. Intel drivers [e1000, igb, ixgbe] and others such as i40e, mlx5_core,...)

- Blades + CoreXL > (more or fewer FW workers, HTTPS inspection, deep inspection, PSL, CPAS, R77.30 VPN on FW_Worker_0 [R80.10 multicore VPN], CPU utilization,...)

- SecureXL > (NAT templates, Drop templates, rule optimization for Accept templates,...)

- Connection table > (many connections in TCP start state + timeout, UDP virtual session timeout,...)

- ClusterXL > (sync or not sync from services,...)

- Logging > (optimize logging in the rules, more or fewer fwd cores for logging,...)

- IPS > (Signatures with high performance impact,...)

- VPN  > (3DES vs. AES with NI [high-speed hardware encryption],...)

- and many more

Regards,

Heiko


➜ CCSM Elite, CCME, CCTE
JozkoMrkvicka
Mentor

- number of VLANs configured on each interface (we have a limit of 250 VLANs per interface, 800 VLANs per box)

- number of configured DHCP relays, number of VLANs that have DHCP helpers enabled

- number of free interfaces (2 in case of creating an additional bond interface to offload some VLANs from an overloaded interface)

- periodically check /var/log/messages and dmesg

- monitoring of CPU, memory, PSU, RAID, traffic based on SNMP

Best Practices - Rulebase Construction and Optimization 

- RX/TX drops 

- Monitoring of top talkers

Kind regards,
Jozko Mrkvicka
HeikoAnkenbrand
Champion

- SecureXL > (SAM card or Falcon card (R80.20 and above) inside,...)

- Dynamic Dispatcher > (on or off)
- MultiQueue > (on or off)

- Hyperthreading (on or off)

- Fragmentation > (packets use F2F path,...)

- VLAN > (issues, for example all VLANs on one firewall interface,...)

- NAT > (NAT table full,...)

- Routing > (ICMP redirects,...)


➜ CCSM Elite, CCME, CCTE
Martin_Oles
Contributor

Interesting topic. I have one firewall, VSX R77.30, with one very heavily used VS on which only the Firewall and Monitoring blades are enabled. To my surprise, I found that a much higher share of traffic goes via PXL than I would expect.

[Expert@FW01A:22]# fwaccel stats -s
Accelerated conns/Total conns : 186543/218772 (85%)
Accelerated pkts/Total pkts   : 7051830336/12021550944 (58%)
F2Fed pkts/Total pkts   : 620021870/12021550944 (5%)
PXL pkts/Total pkts   : 4349698738/12021550944 (36%)
QXL pkts/Total pkts   : 0/12021550944 (0%)

The rulebase is rather heavy, but there is nothing unusual in it (no time or DNS objects). Accept templates are used up to some special rules at the very bottom, which are hit only very occasionally. What might cause so much traffic to be handled by the PXL path?

To add to the confusion, on the very same physical box I see this in another VS, again with only the Firewall and Monitoring blades enabled and a similar rulebase size:

[Expert@FW01A:7]# fwaccel stats -s
Accelerated conns/Total conns : 4410/4756 (92%)
Accelerated pkts/Total pkts   : 6637748854/6782160048 (97%)
F2Fed pkts/Total pkts   : 139702180/6782160048 (2%)
PXL pkts/Total pkts   : 4709014/6782160048 (0%)
QXL pkts/Total pkts   : 0/6782160048 (0%)

Now I really have to look into why there is such a big difference between presumably identical virtual systems.

HeikoAnkenbrand
Champion

Hi Martin,

Which blades are enabled? Is it really only fw and monitoring?

# enabled_blades

Can you find any conspicuous PXL connections with the following command?

# fwaccel conns |grep S

Regarding the packet flow: could one of the following points apply?

  • IPS (some protections) <<<
  • VPN (in some configurations) <<<
  • Application Control
  • Content Awareness
  • Anti-Virus
  • Anti-Bot
  • HTTPS Inspection <<<
  • Proxy mode <<<
  • Mobile Access <<<
  • VoIP <<<
  • Web Portals <<<

Regards,

Heiko


➜ CCSM Elite, CCME, CCTE
Martin_Oles
Contributor

Hi Heiko,

On the virtual systems, only the fw blade is enabled:

[Expert@FW01A:22]# enabled_blades
fw

Because of the rather high number of connections, I filtered for connections with the PXL flag set (-f S), to stay on the safe side.

[Expert@FW01A:22]# fw tab -t connections -s
HOST                  NAME                                ID #VALS #PEAK #SLINKS
localhost             connections                       8158 124669 306821  374170
[Expert@FW01A:22]# fwaccel conns -f S
Source          SPort Destination     DPort PR Flags     C2S i/f S2C i/f Inst Identity
--------------- ----- --------------- ----- -- ----------- ------- ------- ---- -------
   XX.XX.XX.XX 49202     XX.XX.XX.XX   445  6 .......S... 36/16   16/36    1        0
   XX.XX.XX.XX 49785     XX.XX.XX.XX   445  6 .......S... 36/29   29/36    2        0
   XX.XX.XX.XX 12278     XX.XX.XX.XX 49155  6 F......S... 25/29   29/25    2        0
   XX.XX.XX.XX 59576     XX.XX.XX.XX   445  6 .......S... 36/39   39/36    1        0
   XX.XX.XX.XX 55185     XX.XX.XX.XX   445  6 .......S... 36/16   16/36    2        0
   XX.XX.XX.XX 54593     XX.XX.XX.XX   445  6 .......S... 36/16   16/36    1        0
   XX.XX.XX.XX   135     XX.XX.XX.XX 52294  6 F......S... 36/29   29/36    1        0
   XX.XX.XX.XX   445     XX.XX.XX.XX 58537  6 .......S... 36/10   10/36    2        0
   XX.XX.XX.XX   445     XX.XX.XX.XX 60836  6 .......S... 36/12   12/36    1        0
   XX.XX.XX.XX   445     XX.XX.XX.XX 61173  6 .......S... 36/16   16/36    1        0
   XX.XX.XX.XX 64636     XX.XX.XX.XX   139  6 .......S... 36/29   29/36    2        0
   XX.XX.XX.XX 49726     XX.XX.XX.XX   445  6 .......S... 36/16   16/36    1        0
   XX.XX.XX.XX 50610     XX.XX.XX.XX   139  6 .......S... 36/29   29/36    1        0
   XX.XX.XX.XX 50606     XX.XX.XX.XX   445  6 .......S... 36/12   12/36    1        0
...
Total number of connections: 28762
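With close to 29,000 flagged connections, it helps to aggregate the listing rather than read it row by row. The following Python sketch counts the rows of a pasted `fwaccel conns -f S` listing per service port. The column layout is taken from the output above; the function name and the choice of the lower of the two ports as the "service port" are assumptions made for illustration, and that heuristic misfires when both ports are high (as in the 12278/49155 row).

```python
from collections import Counter


def count_by_service_port(conns_text, flag="S"):
    """Count connection rows per service port for rows carrying `flag`.

    Assumed column order (from the listing above):
    Source SPort Destination DPort PR Flags ...
    """
    counts = Counter()
    for line in conns_text.splitlines():
        fields = line.split()
        # Skip the header, the separator, "..." and the "Total number" line.
        if len(fields) < 6 or not (fields[1].isdigit() and fields[3].isdigit()):
            continue
        if flag not in fields[5]:
            continue
        # Heuristic (assumption): the lower port is the server side.
        counts[min(int(fields[1]), int(fields[3]))] += 1
    return counts


# A few rows copied from the listing above.
SAMPLE = """\
Source          SPort Destination     DPort PR Flags     C2S i/f S2C i/f Inst Identity
--------------- ----- --------------- ----- -- ----------- ------- ------- ---- -------
   XX.XX.XX.XX 49202     XX.XX.XX.XX   445  6 .......S... 36/16   16/36    1        0
   XX.XX.XX.XX   135     XX.XX.XX.XX 52294  6 F......S... 36/29   29/36    1        0
   XX.XX.XX.XX   445     XX.XX.XX.XX 58537  6 .......S... 36/10   10/36    2        0
   XX.XX.XX.XX 64636     XX.XX.XX.XX   139  6 .......S... 36/29   29/36    2        0
...
Total number of connections: 28762
"""

if __name__ == "__main__":
    for port, count in count_by_service_port(SAMPLE).most_common():
        print(f"port {port}: {count} connection(s)")
```

A result dominated by ports 445, 139 and 135 would point at file-share and AD traffic as the source of the PXL share.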

The packet chain looks pretty straightforward to me:

[Expert@FW01A:22]# fw ctl chain
in chain (11):
        0: -7f800000 (f5b04540) (ffffffff) IP Options Strip (in) (ipopt_strip)
        1: - 1fffff8 (f5b05c30) (00000001) Stateless verifications (in) (asm)
        2: - 1fffff7 (f5b484b0) (00000001) fw multik misc proto forwarding
        3: - 1000000 (f5bd8860) (00000003) SecureXL conn sync (secxl_sync)
        4:         0 (f5aa4150) (00000001) fw VM inbound  (fw)
        5:  10000000 (f5be30b0) (00000003) SecureXL inbound (secxl)
        6:  7f600000 (f5af8920) (00000001) fw SCV inbound (scv)
        7:  7f730000 (f5d0b810) (00000001) passive streaming (in) (pass_str)
        8:  7f750000 (f5f1eda0) (00000001) TCP streaming (in) (cpas)
        9:  7f800000 (f5b04250) (ffffffff) IP Options Restore (in) (ipopt_res)
        10:  7fb00000 (f6309080) (00000001) HA Forwarding (ha_for)
out chain (8):
        0: -7f800000 (f5b04540) (ffffffff) IP Options Strip (out) (ipopt_strip)
        1: - 1fffff0 (f5f1f030) (00000001) TCP streaming (out) (cpas)
        2: - 1ffff50 (f5d0b810) (00000001) passive streaming (out) (pass_str)
        3: - 1f00000 (f5b05c30) (00000001) Stateless verifications (out) (asm)
        4:         0 (f5aa4150) (00000001) fw VM outbound (fw)
        5:  10000000 (f5be30b0) (00000003) SecureXL outbound (secxl)
        6:  7f700000 (f5f1f270) (00000001) TCP streaming post VM (cpas)
        7:  7f800000 (f5b04250) (ffffffff) IP Options Restore (out) (ipopt_res)
[Expert@FW01A:22]#
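To compare chains between two virtual systems, the `fw ctl chain` output can be reduced to a list of handlers. The sketch below parses the line layout shown above and picks out the two streaming-related short names that appear in it (`pass_str`, `cpas`). All names in the code are illustrative, the line format is assumed from this single output, and whether the presence of these handlers explains the PXL share is not established in this thread.

```python
import re
from typing import NamedTuple, Optional


class ChainEntry(NamedTuple):
    direction: str             # "in" or "out"
    position: int              # index printed at the start of the line
    description: str           # free text after the two bracketed hex values
    short_name: Optional[str]  # trailing "(name)" if the line has one


SECTION_RE = re.compile(r"^\s*(in|out) chain \(\d+\):")
# Priority may be printed as "-7f800000", "- 1fffff8" or "7f730000".
ENTRY_RE = re.compile(
    r"^\s*(\d+):\s+-?\s*[0-9a-f]+\s+\([0-9a-f]+\)\s+\([0-9a-f]+\)\s+(.*\S)\s*$"
)
SHORT_RE = re.compile(r"\(([A-Za-z_]+)\)$")
# Short names of the streaming handlers seen in the output above.
STREAMING = {"pass_str", "cpas"}


def parse_chain(chain_text):
    """Parse pasted `fw ctl chain` output into a list of ChainEntry."""
    entries = []
    direction = None
    for line in chain_text.splitlines():
        section = SECTION_RE.match(line)
        if section:
            direction = section.group(1)
            continue
        entry = ENTRY_RE.match(line)
        if entry is None or direction is None:
            continue
        description = entry.group(2)
        short = SHORT_RE.search(description)
        entries.append(ChainEntry(direction, int(entry.group(1)), description,
                                  short.group(1) if short else None))
    return entries


def streaming_handlers(entries):
    """Return (direction, short_name) for streaming-related handlers."""
    return [(e.direction, e.short_name) for e in entries
            if e.short_name in STREAMING]


# Abridged from the output above.
SAMPLE = """\
in chain (3):
        0: -7f800000 (f5b04540) (ffffffff) IP Options Strip (in) (ipopt_strip)
        1: - 1fffff7 (f5b484b0) (00000001) fw multik misc proto forwarding
        2:  7f730000 (f5d0b810) (00000001) passive streaming (in) (pass_str)
out chain (1):
        0: - 1fffff0 (f5f1f030) (00000001) TCP streaming (out) (cpas)
"""

if __name__ == "__main__":
    for direction, name in streaming_handlers(parse_chain(SAMPLE)):
        print(direction, name)
```

Running the parser on the chain of each VS and comparing the two lists makes differences visible without reading the raw output side by side.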

Looking at the traffic that goes via the PXL path, only Microsoft AD-related traffic is visible, nothing else. Looking into the rulebase (now knowing what I am looking for), one of the most frequently hit rules permits workstations to reach the AD servers. I have not found an SK related to this so far, but it looks like DCE RPC traffic is accelerated by an Accept template yet is still handled via the PXL path.

With regards,

Martin

HeikoAnkenbrand
Champion

I don't quite understand that either.

sk32578 SecureXL Mechanism describes the following:

When SecureXL is enabled, all packets should be accelerated, except packets that match the following conditions:

  • Packets that are:

    • CIFS

Hmmmm!!!!

The service shows that the CIFS protocol is active here by default:

It's just an idea! I don't know whether I'm thinking along the right lines here.

Maybe duplicate the service and deactivate the protocol type. Then build a test rule and see if it still goes into the PXL path or not.

I'd limit the rule to two test systems. This of course also has effects on the firewall behavior.

Regards

Heiko


➜ CCSM Elite, CCME, CCTE
JozkoMrkvicka
Mentor

Or the better solution would be to move this particular rule to the end of the rulebase (just before the clean-up rule, of course).

What does the output of fwaccel stat say?

The "microsoft-ds" service (tcp/445) will stop Connection Templates from being created, because this condition is met:

  • Rule with a service that has a 'handler' (where a specific protocol is chosen in 'Protocol Type' field - instead of 'None' ; go to service object - right-click - click on "Edit..." - click on "Advanced..." button - refer to "Protocol Type:" field).
Kind regards,
Jozko Mrkvicka
HeikoAnkenbrand
Champion

Jozko Mrkvicka I don't think that's the problem. The question is, why does "microsoft-ds" service (tcp/445)  use the PXL path (medium path) and not the acceleration path (fast path)?

Hence the idea to set the protocol type to "none".

According to sk32578 CIFS should even use the F2F path.

I agree with Martin Oles, it's very strange.

Hmmm!


➜ CCSM Elite, CCME, CCTE
Timothy_Hall
Champion

Jozko Mrkvicka, moving rules around the rulebase has absolutely no impact on which path (SXL, PXL, F2F) the traffic takes; it only affects the formation of SecureXL Accept templates. The different paths of SecureXL/CoreXL are referred to as "throughput acceleration", and the templating function of SecureXL is called "session rate acceleration", "connection rate acceleration" or "rulebase lookup caching". Both functions are most definitely part of SecureXL but get confused with each other all the time.

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Timothy_Hall
Champion

Heiko you are correct about port 445 (microsoft-ds) not being handled in the Accelerated Path as I have seen it before, where customers have only the most basic blades enabled but they have a high amount of CIFS/SMB traffic traversing the firewall, and that causes high PXL numbers.  I'm not exactly sure why this happens, but I suspect CIFS/SMB traffic requires some kind of streaming inspection that SecureXL can't perform.   There is actually a way to force this traffic through SecureXL (SXL path) by whitelisting it via the spii_dport_white_list directive, and you can see me referencing it in this thread:

https://community.checkpoint.com/message/10308-re-enforce-securexl-template?commentID=10308#comment-... 

Basically this is a way to tell SecureXL: "When you see this port number, just forward it yourself and don't inspect it any further".  It involves some SMS *.def file changes and can open up some gigantic security holes if used improperly, so please contact Check Point TAC if you need it. Edit: The ability to whitelist traffic is now available in SecureXL, see this: sk139772: SecureXL Fast Accelerator (sim fastaccel) for Non Scalable Platforms

Interestingly if you search for "spii_dport_white_list" in SecureKnowledge it matches on this SK:

sk106062: CPU load and traffic latency after activating Anti-Bot and/or Anti-Virus blade on Security...

although "spii_dport_white_list" doesn't actually appear anywhere in the text of the SK, at least that I can see with my partner-level access.  🙂  The SK does say that this is "fixed" in R80.10 but I'm not sure if it applies in this context.  Edit: While in Israel this week I found out that when an SK matches a search term that does not actually seem to appear in the visible text of the SK itself, that there are extra hidden notes attached to the SK causing the match that are only visible to Check Point employees internally.

 

HeikoAnkenbrand
Champion

I agree with you, Timothy.

What I noticed is that the "Protocol Type" setting in the service determines whether the PXL path or the acceleration path is used, depending on whether it is set to "None" or to a specific protocol type. I was able to reproduce this in the lab with protocols such as CIFS and others. I think these services are detected and processed via PSL (protocol detection), and that is why they use the PXL path. I'm not sure about this either! According to sk32578, CIFS should even use the F2F path.

Hmmmm!


➜ CCSM Elite, CCME, CCTE
Kaspars_Zibarts
Employee

A while ago we had the same "problem" with the file share protocol and PXL on a firewall with only the fw blade enabled. We got a procedure from support for removing the PXL usage, but they strongly advised against it, so in the end we just accepted the fact that microsoft-ds will use PXL. I have to check with support whether it's OK to share the SK here.

Kaspars_Zibarts
Employee

Unfortunately the procedure is classified "internal", so I cannot re-post it here, but ask support for the procedure to disable CIFS traffic inspection if you wish to do so. Just remember, seeing some percentage of PXL is not the worst-case scenario as long as you can explain it and it's not consuming too much of your CPU time. :)

Timothy_Hall
Champion

Just wanted to add for future reference that in R80.10 take 177 and later, traffic can be more generally whitelisted for processing directly in SecureXL.  The command is sim fastaccel and is discussed in this thread:

Accelerate specific traffic with SecureXL 

Edit: In R80.20 Jumbo HFA 103+ the ability to force acceleration of certain traffic has returned via the fw ctl fast_accel command:

sk156672: SecureXL Fast Accelerator (fw fast_accel) for Non Scalable Platforms R80.20 and above

 

--
"IPS Immersion Training" Self-paced Video Class
Now Available at http://www.maxpowerfirewalls.com

Wolfgang
Authority

Hello CheckMates,

Are there any changes in this behaviour with R80.40 or R81?

sk32578 SecureXL Mechanism still shows no acceleration for the CIFS protocol. You can bring down your firewall with a few clients copying files 😞

Wolfgang

Timothy_Hall
Champion

R80.40 differences for SecureXL were documented in the R80.40 addendum for my book.  TCP/445 still will always go at least PSLXL in R80.40, and TCP/443 may also get inappropriately pulled into the TLS parser which keeps it from being accelerated as described here:  sk166700: High CPU after upgrade from R77.x to R80.x when running only Firewall and Monitoring blade....

As far as R81 I haven't had a chance to look at the SecureXL differences yet, but the big change is of course dynamic split adjustment being enabled by default in R81.  Other than that SecureXL doesn't seem to have changed a huge amount in R81 from R80.40 that I can see so far; there will be an R81 addendum for my book at some point documenting any differences.

 

HeikoAnkenbrand
Champion

Here is a good article about IPS performance analysis from Omer Shliva:

IPS Analyzer Tool - How to analyze IPS performance efficiently 

Regards,

Heiko


➜ CCSM Elite, CCME, CCTE
HeikoAnkenbrand
Champion

And the SK:

How to measure CPU time consumed by IPS protections 

Regards,

Heiko


➜ CCSM Elite, CCME, CCTE
Martin_Oles
Contributor

Hi,

Just some information following up on the previous discussion. I created a new TCP 445 service without a protocol type and used it in the policy instead of the default microsoft-ds (which has protocol type CIFS). During policy installation I saw a massive number of sessions cut, which was expected, since re-matching correctly identified a different protocol. And now the surprise: even though the traffic matches a service without a protocol type, where I would assume that no inspection takes place, the traffic is still not accelerated and still uses the PXL path.

[Expert@FW01A:22]# fwaccel stats -s
Accelerated conns/Total conns : 96391/113772 (84%)
Accelerated pkts/Total pkts   : 76129847/126861392 (60%)
F2Fed pkts/Total pkts   : 15605402/126861392 (12%)
PXL pkts/Total pkts   : 35126143/126861392 (27%)
QXL pkts/Total pkts   : 0/126861392 (0%)
[Expert@FW01A:22]#

The output of fwaccel conns -f S also shows that CIFS connections are not accelerated and are handled by PXL.

Rather surprising. So the node's CPU is still contributing heavily to global warming.

Kaspars_Zibarts
Employee

Yep, that's what I tried to say earlier, but I guess I was a little brief in my comments :) I tried and failed too.

Isabel_Brenner
Participant

Yep, I guess!

HeikoAnkenbrand
Champion

In R80.20 there's a new hack to turn CoreXL instances on and off on the fly. Here is an example with 3 FW workers.

Now run the command:

# fw ctl multik stop     -> stop CoreXL instance 1

# fw ctl multik stop    -> stop CoreXL instance 2

# fw ctl multik stop   -> The last instance cannot be disabled:-)

 

You can activate them again with:

# fw ctl multik start

 

Conclusion:
CoreXL 0 cannot really be deactivated under R80.10+:-)

 

Regards

Heiko


➜ CCSM Elite, CCME, CCTE
HeikoAnkenbrand
Champion

See here for more:

ATRG: CoreXL 


➜ CCSM Elite, CCME, CCTE
C__Kallfass
Explorer

interesting discussion

red_tomato
Participant

Is the information also correct for R80.40?
