Bart_Leysen
Contributor

Bad Performance

We recently moved from OpenServers to VSX Clusters and now performance is really bad. Traffic to the internet is terrible.

People working over VPN can hardly work. We just don't know where to look anymore. We are on R80.10 take 121.

Clusters are 15600 and 23500 models.

Does anybody have any idea where to look?

60 Replies
Hsu_Teddy
Participant

By default, a VS on a VSX cluster uses only 1 CPU core.

For more performance, you need to increase the number of cores.

To configure CoreXL on a Virtual System:

  1. Open SmartConsole.
  2. From the Gateways & Servers view or Object Explorer, double-click the Virtual System.

    The Virtual System General Properties window opens.

  3. From the navigation tree, select CoreXL.
  4. Select the number of firewall instances for the Virtual System.
  5. Click OK.

Check Point VSX R80.10 Administration Guide 
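
Once the policy is installed, a quick way to confirm the change took effect from the gateway CLI (a rough sketch, assuming expert mode; VSID 2 below is just an example, use your own):

vsenv 2              # switch to the Virtual System context
fw ctl multik stat   # should list one line per active CoreXL instance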

Bart_Leysen
Contributor

We assigned 10 firewall instances to that VS,

and we dedicated 10 CPUs to this VS.

AlekseiShelepov
Advisor

More information is required here.

Which blades are enabled? It might be something connected with additional blades being enabled for all traffic.

What can you see in the top command output? Which processes are using most of the CPU?

Try to start with the Super Seven commands from Tim Hall's presentation; they are integrated in Common Check Point Commands (ccc) under the Gateway Performance Optimization section.
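
For reference, the Super Seven are roughly these (from memory, so double-check against Tim's presentation; for the per-VS ones, switch context first with vsenv <vsid>):

fwaccel stat
fwaccel stats -s
grep -c ^processor /proc/cpuinfo
fw ctl affinity -l -r
fw ctl multik stat
cpstat os -f multi_cpu
netstat -ni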

Bart_Leysen
Contributor

The following blades are enabled: IPS, Anti-Virus/Anti-Bot, URL Filtering, Application Control and VPN.

I will look at the Super Seven commands, thanks.
Kaspars_Zibarts
Employee

Can you send output of 

fw ctl affinity -l

fw ctl multik stat

cpmq get

top (extended to see all individual cores)

Bart_Leysen
Contributor

top - 15:46:46 up 1 day, 23:20,  2 users,  load average: 4.14, 4.61, 5.00
Tasks: 519 total,   1 running, 518 sleeping,   0 stopped,   0 zombie
Cpu(s):  7.6%us,  1.5%sy,  0.0%ni, 89.6%id,  0.0%wa,  0.1%hi,  1.2%si,  0.0%st
Mem:  131774100k total, 19940328k used, 111833772k free,   389664k buffers
Swap: 33551672k total,        0k used, 33551672k free, 11931140k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6069 admin      0 -20 2653m 2.1g 178m S  379  1.6   5724:45 fwk2_dev
13600 admin     15   0  716m 227m  40m S    7  0.2   7:40.63 fw_full
13598 admin     15   0  292m  75m  39m S    1  0.1   0:50.09 cpd
 2952 admin      0 -20  710m 179m  31m S    1  0.1  22:39.70 fwk0_dev
 3845 admin     15   0  365m 342m  10m S    1  0.3   4:17.23 rad
 6071 admin      0 -20 1267m 734m  99m S    1  0.6  35:17.06 fwk1_dev
 3163 admin     15   0  286m  74m  44m S    0  0.1   0:24.01 cpd
 3503 admin     15   0  615m 100m  41m S    0  0.1   0:40.87 fw_full
12837 admin     15   0  292m  75m  39m S    0  0.1   0:30.06 cpd
12839 admin     15   0  563m  72m  40m S    0  0.1   0:04.51 fw_full
    1 admin     15   0  1976  724  624 S    0  0.0   0:04.91 init
    2 admin     RT  -5     0    0    0 S    0  0.0   0:00.04 migration/0
    3 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
    4 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/0
    5 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 migration/1
    6 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
    7 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/1
    8 admin     RT  -5     0    0    0 S    0  0.0   0:15.59 migration/2
    9 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/2
   10 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/2
   11 admin     RT  -5     0    0    0 S    0  0.0   3:14.68 migration/3
   12 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/3
   13 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/3
   14 admin     RT  -5     0    0    0 S    0  0.0   0:05.87 migration/4
   15 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/4
   16 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/4
   17 admin     RT  -5     0    0    0 S    0  0.0   0:08.82 migration/5
   18 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/5
   19 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/5
   20 admin     RT  -5     0    0    0 S    0  0.0   0:02.10 migration/6
   21 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/6
   22 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/6
   23 admin     RT  -5     0    0    0 S    0  0.0   0:04.03 migration/7
   24 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/7
   25 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/7
   26 admin     RT  -5     0    0    0 S    0  0.0   0:06.17 migration/8
   27 admin     15   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/8
   28 admin     RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/8
   29 admin     RT  -5     0    0    0 S    0  0.0   0:52.68 migration/9


[Expert@vsx-lvn-pub2:0]# fw ctl multik stat
fw: CoreXL is disabled

[Expert@vsx-lvn-pub2:0]# cpmq get

Active ixgbe interfaces:
eth1-01 [Off]
eth1-02 [Off]
eth1-03 [Off]
eth1-04 [Off]

Active igb interfaces:
Mgmt [Off]
Sync [Off]
eth2-01 [Off]

[Expert@vsx-lvn-pub2:0]# fw ctl affinity -l
Mgmt: CPU 0
Sync: CPU 0
eth1-01: CPU 1 2
eth1-02: CPU 3 4
eth1-03: CPU 5 6
eth1-04: CPU 7 8
eth2-01: CPU 9 10
VS_0: CPU 39
VS_0 fwk: CPU 39
VS_1: CPU 11 12 13 14 15 16 17 18 19 20
VS_1 fwk: CPU 11 12 13 14 15 16 17 18 19 20
VS_2: CPU 21 22 23 24 25 26 27 28 29 30
VS_2 fwk: CPU 21 22 23 24 25 26 27 28 29 30
VS_3 fwk: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
_Val_
Admin

The first question you need to ask yourself: how many cores per VS are enabled? If more than one, what are the affinity settings?

Bart_Leysen
Contributor

VS 2 is the VS with issues

Mgmt: CPU 0
Sync: CPU 0
eth1-01: CPU 1 2
eth1-02: CPU 3 4
eth1-03: CPU 5 6
eth1-04: CPU 7 8
eth2-01: CPU 9 10
VS_0: CPU 39
VS_0 fwk: CPU 39
VS_1: CPU 11 12 13 14 15 16 17 18 19 20
VS_1 fwk: CPU 11 12 13 14 15 16 17 18 19 20
VS_2: CPU 21 22 23 24 25 26 27 28 29 30
VS_2 fwk: CPU 21 22 23 24 25 26 27 28 29 30
VS_3 fwk: CPU 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39

Kaspars_Zibarts
Employee

Hi Bart! Your core allocation is set wrong. You are using hyperthreaded CPUs so you have to be mindful about numbering! 

You have allocated the same physical cores to SecureXL on the interfaces (0-10) and to VS2 (matching sibling cores 20-30).

Please have a look at the article I wrote not that long ago:

Security Gateway Performance Optimization - VSX 

Make your own spreadsheet and re-allocate cores correctly.

Note that having 2 CPUs per interface is not going to help you unless you use Multi-Queue - you may as well stick with a single CPU per interface.

In your case I would start like this and then tweak depending on CPU usage:

Mgmt: CPU 0
Sync: CPU 0
eth1-01: CPU 1 (21)
eth1-02: CPU 2 (22)
eth1-03: CPU 3 (23)
eth1-04: CPU 4 (24)
eth2-01: CPU 5 (25)
VS_0: CPU 6 26
VS_0 fwk: CPU 6 26
VS_1: CPU 10 11 12 13 14 30 31 32 33 34
VS_1 fwk: CPU 10 11 12 13 14 30 31 32 33 34
VS_2: CPU 15 16 17 18 19 35 36 37 38 39
VS_2 fwk: CPU 15 16 17 18 19 35 36 37 38 39
VS_3 fwk: CPU 7 8 9 27 28 29

Or design your own, but make sure you take the hyperthreaded numbering into account, i.e. physical core 0 also holds hyperthreaded core 20, so don't mix SecureXL and CoreXL on those!
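
If it helps, the actual re-allocation commands look roughly like this (a sketch based on the fw ctl affinity syntax in the VSX admin guide - please verify the exact syntax for R80.10 and substitute your own VSIDs and CPU lists):

fw ctl affinity -s -d -vsid 1 -cpu 10-14 30-34   # fwk affinity for VS1
fw ctl affinity -s -d -vsid 2 -cpu 15-19 35-39   # fwk affinity for VS2
fw ctl affinity -l                               # verify the result

Interface (SND/IRQ) affinities on VSX are normally kept in $FWDIR/conf/fwaffinity.conf (lines like "i eth1-01 1") so they survive reboot - again, check the guide for the exact procedure on your version.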

Kaspars_Zibarts
Employee

Let us know if you need more info or commands to set affinities

And remember to press 1 when you run the top command so you see all individual cores, not just the summary 🙂

Bart_Leysen
Contributor

Thanks Kaspars,

I've read your article and it was very helpful; I've modified the affinity following your guidelines.

I will keep you posted next week when production starts again on Monday.

Kaspars_Zibarts
Employee

Btw, I had to guess some things, so ideally send us the fw ctl multik stat output to confirm that the suggested config will work OK.

Also, if you have the possibility, set up some sort of SNMP graphs for all CPU cores to further fine-tune your CoreXL and SecureXL.
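
If a full monitoring system isn't available, even a simple scripted poll can do the job (a rough sketch; the OID below is from memory for the Check Point per-core multiProcTable, so please verify it against the MIB file shipped with the gateway before relying on it):

cpstat os -f multi_cpu                                            # per-core usage locally, no SNMP needed
snmpwalk -v2c -c <community> <gateway> 1.3.6.1.4.1.2620.1.6.7.5   # Check Point per-core CPU table (verify OID)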

Kaspars_Zibarts
Employee

You would also have to adjust the allocation depending on the total number of cores. My example was for 40 HT cores.

Bart_Leysen
Contributor

What is your experience with Multi-Queue? Is it advisable to enable it on some interfaces?

Timothy_Hall
Legend

If the RX-DRP rate on a busy interface is >0.1% (viewed with netstat -ni) even though enough SND/IRQ cores have been allocated such that the busy interface has its own dedicated SND/IRQ core (as shown by sim affinity -l), then Multi-Queue should be enabled.  Multi-Queue does cause some slight additional overhead on the SND/IRQ core to "stick" the packets associated with a single connection to the same queue every time to avoid out-of-order delivery, so enabling Multi-Queue is not always a no-brainer.  More SND/IRQ cores should be allocated first if possible.  Specifically, if all SND/IRQ cores are very busy (>75% utilization) and you can't allocate any more due to a limited number of cores, enabling Multi-Queue will actually make things worse.
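
A quick way to eyeball that drop percentage (just a sketch - netstat column positions can vary between versions, so sanity-check them on your gateway; eth1-01 is only an example interface):

netstat -ni
# assuming the classic Iface/MTU/Met/RX-OK/RX-ERR/RX-DRP layout, RX-OK is $4 and RX-DRP is $6:
netstat -ni | awk '$1=="eth1-01" && $4>0 {printf "%s RX-DRP: %.3f%%\n", $1, $6/$4*100}'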

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Bart_Leysen
Contributor

Output of fw ctl multik stat after the change:

[Expert@vsx-lvn-pub2:2]# fw ctl multik stat
ID | Active | CPU | Connections | Peak
----------------------------------------------
0 | Yes | 15-19+ | 1232 | 6691
1 | Yes | 15-19+ | 935 | 6653
2 | Yes | 15-19+ | 1120 | 9340
3 | Yes | 15-19+ | 932 | 6128
4 | Yes | 15-19+ | 1281 | 11506
5 | Yes | 15-19+ | 968 | 7107
6 | Yes | 15-19+ | 1073 | 8319
7 | Yes | 15-19+ | 1006 | 7035
8 | Yes | 15-19+ | 1072 | 6552
9 | Yes | 15-19+ | 1123 | 5669
[Expert@vsx-lvn-pub2:2]# vsenv 1
Context is set to Virtual Device vsx-lvn-pub2_fw-lvn-snx (ID 1).
[Expert@vsx-lvn-pub2:1]# fw ctl multik stat
ID | Active | CPU | Connections | Peak
----------------------------------------------
0 | Yes | 10-14+ | 391 | 115231
1 | Yes | 10-14+ | 183 | 76175
2 | Yes | 10-14+ | 312 | 116117
3 | Yes | 10-14+ | 860 | 114412
4 | Yes | 10-14+ | 445 | 90342
[Expert@vsx-lvn-pub2:1]# vsenv 3
Context is set to Virtual Device vsx-lvn-pub2_vlan802 (ID 3).
[Expert@vsx-lvn-pub2:3]# fw ctl multik stat
fw: CoreXL is disabled

[Expert@vsx-lvn-pub2:3]# vsenv 0
Context is set to Virtual Device vsx-lvn-pub2 (ID 0).
[Expert@vsx-lvn-pub2:0]# fw ctl multik stat
fw: CoreXL is disabled

Muazzam_Saeed
Participant

I am on R77.30 (going to R80.10 soon). I thought the problem was fixed in R80.10, but it seems like you are having the same issue that I see on R77.30. I am not sure we can ever get the same performance on VSX compared with regular gateways. You need to fine-tune your VSX environment to improve the performance.

Bart_Leysen
Contributor

I guess so, but even a policy push will make the cluster unstable, or at least that VS.

Pushing policy will make the VS unresponsive for a couple of minutes.

Muazzam_Saeed
Participant

Just want to share my experience. We have three 4-node VSX clusters, all on 23800 hardware. One cluster was upgraded to R80.10 (from R77.30) a couple of months ago, and one a few days ago.

The only issue I have is that the performance is not the same as on a normal (non-VSX) gateway. I also wish there were no downtime when changing the CoreXL value.

Other than that we have no other issues, it is stable and reliable. No issues ever noticed on pushing the policy.

Kaspars_Zibarts
Employee

If you upgrade to R80.20 then there will be no downtime in changing cores.

As for performance, you can tweak it to be fairly close now with the 64-bit kernel on the VS, but of course it will never be as fast as non-VSX.

Jerry_Lee
Participant

What do the interfaces look like - are you clocking up RX or TX errors? 

watch netstat -ni

and ethtool for the interfaces - look at the queueing.

Also dig into sk61143 and look at the input queue config; you may need to increase it (from old memory here) as it feeds all the VSs.
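
Something like this will usually show where the drops are happening (a sketch; ethtool counter names differ per driver, and eth1-01 is just an example interface):

ethtool -g eth1-01                              # ring buffer sizes (current vs. maximum)
ethtool -S eth1-01 | grep -iE 'drop|miss|fifo'  # driver-level drop counters
cat /proc/net/softnet_stat                      # per-CPU input queue stats (2nd column = drops)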

Good luck,

Jerry Lee

Kaspars_Zibarts
Employee

Normally I would enable MQ on a busy 10Gbps interface, as a single core cannot cope with that much traffic.

We tried to enable it on 4x1Gbps bond but for some reason it didn't work that well. Bond kept losing members and was going up and down as a result every minute so we rolled it back. Didn't have time to investigate it any further I'm afraid but I believe there was something wrong in our config. Too many things "to do" at the moment.
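
If you do give it a go, the rough workflow on R80.10 (from memory - check the Multi-Queue documentation for your exact version) is:

cpmq set       # interactive, choose which supported interfaces get Multi-Queue
cpmq get       # confirm it is now on (I believe a reboot is needed before it takes effect)
netstat -ni    # afterwards, keep an eye on RX-DRP to see whether it actually helped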

Timothy_Hall
Legend

The strange situation you encountered with Multi-Queue is why I am generally not a fan of turning on features that aren't enabled by default, unless you know for sure that you need them.  The KISS principle is your friend...  🙂

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Jerry_Lee
Participant

We worked with TAC and a Diamond engineer to troubleshoot and tweak this issue.

All inbound traffic to all of the VSs on the 21400 4-node cluster was being affected.

Jerry

Kris_Pellens
Collaborator

Hello Bart,

We have a similar setup to yours, but we're using the 23800 appliances. We've ditched R80.10 and replaced it with R80.20. Have you considered upgrading to R80.20? Reading between the lines, the performance on R80.10 could be better. From the R80.20 release notes:

  1. Significant boost to Virtual Systems performance, utilizing up to 32 CoreXL FW instances for each Virtual System.
  2. Dynamic Dispatcher - Packets are processed by different FW worker (FWK) instances based on the current instance load.
  3. Changes in the number of FW worker instances (FWK) in a VSLS setup do not require downtime.
  4. SecureXL Penalty Box supports the contexts of each Virtual System, see sk74520.

Has the user experience improved since you made the proposed changes?

Many thanks.

Regards,

Kris

Kaspars_Zibarts
Employee

Actually, 64-bit mode is already available in R80.10 🙂

VSX Enhancements:

  • 64-bit support for VSX Gateways, increasing concurrent connections capacity.
  • Content Awareness for VSX Gateways.
Bart_Leysen
Contributor

We do see improvement after setting the CPU affinity right; I mean, I can now definitely see that all CPUs are used.

I also enabled Multi-Queue on my busiest interfaces, and here too I see an improvement.

But it's still not what it should be; in particular, when we push a policy, all connections are just dropped for 10 minutes. This is not good. We don't see this on any other VSX cluster we have.

Kaspars_Zibarts
Employee

Which CPU cores are busy when it happens? Is it VS2's ten cores or the interface cores?

As suggested by Kris, switching to the 64-bit kernel will give you a lot more memory headroom if it's the FWK that's maxing out.

Check with vs_bits -stat.

Bart_Leysen
Contributor

[Expert@vsx-lvn-pub2:0]# vs_bits -stat
All VSs are at 32 bits
