Contributor

What processes can be safely moved on to an unlicensed CPU?

I have a 16-core open server HA Active/Passive cluster running R80.10, fully hotfixed. It is licensed for only 4 CPUs.

1 SND and 3 workers

The 1 SND sits around 30% utilisation 

The 3 workers are balanced around 15% utilisation.

affinity is shown here

fw ctl affinity -l -r
CPU 0: eth3 eth0 eth1 eth8 eth9 eth4 eth7
CPU 1: fw_2
rtmd fwd mpdaemon lpd in.ahclientd in.aclientd in.aftpd cprid cpd
CPU 2: fw_1
rtmd fwd mpdaemon lpd in.ahclientd in.aclientd in.aftpd cprid cpd
CPU 3: fw_0
rtmd fwd mpdaemon lpd in.ahclientd in.aclientd in.aftpd cprid cpd
CPU 4:
CPU 5:
CPU 6:
CPU 7:
CPU 8:
CPU 9:
CPU 10:
CPU 11:
CPU 12:
CPU 13:
CPU 14:
CPU 15:
All:
The current license permits the use of CPUs 0, 1, 2, 3 only.
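
For anyone wanting to compare layouts before and after a change, here is a rough Python sketch that parses output shaped like the `fw ctl affinity -l -r` listing above and reports which CPUs are empty and which daemons share a core with a firewall worker. The function name `parse_affinity` and the shortened sample are my own for illustration; it assumes daemon names wrap onto an unlabelled continuation line, as they do in the paste above.

```python
import re

def parse_affinity(text):
    """Parse `fw ctl affinity -l -r` style output into {label: [entries]}."""
    cpus = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = re.match(r"^(CPU \d+|All):\s*(.*)$", line)
        if match:
            current = match.group(1)
            cpus[current] = match.group(2).split()
        elif current is not None:
            # Continuation line: daemons listed under the previous CPU label.
            cpus[current].extend(line.split())
    return cpus

# Shortened sample in the same shape as the output pasted above.
sample = """\
CPU 0: eth3 eth0 eth1 eth8 eth9 eth4 eth7
CPU 1: fw_2
rtmd fwd mpdaemon lpd in.ahclientd in.aclientd in.aftpd cprid cpd
CPU 4:
All:
"""

layout = parse_affinity(sample)

# CPUs with nothing assigned to them.
idle = [label for label, entries in layout.items()
        if label != "All" and not entries]

# Daemons that share a core with a firewall worker instance (fw_N).
shared = {label: [e for e in entries if not e.startswith("fw_")]
          for label, entries in layout.items()
          if any(e.startswith("fw_") for e in entries)}

print(idle)
print(shared)
```

Running it against saved output from before and after a reboot makes it easy to diff where things ended up.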

We have a very sporadic issue that I believe is a load problem on policy deployment: at least one of the CPUs gets maxed out and can't cope, which has a knock-on effect on various processes, which then stop functioning properly.

I can see that the ksoftirqd/0 process has clearly been running. Further background is below, but having looked at top during a deployment (when there is much lower load and therefore no problem), I see several of the daemons highlighted in bold above run at a significant amount of CPU for a second or two.

Our initial thinking is to use fw ctl affinity -s -n ........ to move these processes onto the 12 CPUs that are not licensed, thus taking the strain off the SND and three workers at the point of policy deployment.

Can anyone confirm that these processes will work on unlicensed cores without issue?

(I'm also planning on dropping to 2 workers and splitting out the SND to 2 processors for further balancing in case anyone suggests it)

 

Background info for those interested:

The firewall used to run around 120,000 concurrent connections; since recent client device changes we are now averaging around 170,000 concurrent connections.

Occasionally on policy deployment we now see a problem whereby the Active member has significant stability issues for a period of time, anywhere from 10 minutes to 2.5 hours. During this time it logs nothing: no cpview data, nothing in messages, no access to the CLI, and DHCP relay fails. It continues to pass traffic, albeit with more latency, causing general slowness. When it recovers and starts responding again, it logs that it has restarted multiple fwd processes:

 [22 May 14:26:41] fwd: pid 4413 is not responding, killing process
 [22 May 14:26:41] fwd: pid 4436 is not responding, killing process
 [22 May 14:26:41] fwd: pid 4437 is not responding, killing process
 [22 May 14:26:41] fwd: pid 4483 is not responding, killing process
 [22 May 14:26:41] fwd: pid 4503 is not responding, killing process
 [22 May 14:26:41] fwd: pid 17942 is not responding, killing process
 [22 May 14:26:41] fwd: pid 20064 is not responding, killing process
 [22 May 14:26:41] fwd: pid 20065 is not responding, killing process
 [22 May 14:26:41] fwd: pid 20076 is not responding, killing process
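
If it helps when comparing incidents, a small Python sketch like the one below could tally how many of these "not responding" kills land on each timestamp. The function name `summarize_kills` is made up for illustration, and the regex only assumes the line format shown in the excerpt above.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above.
LINE = re.compile(
    r"\[(?P<stamp>[^\]]+)\]\s+fwd: pid (?P<pid>\d+) is not responding, killing process"
)

def summarize_kills(log_text):
    """Return (Counter of kills per timestamp, list of killed pids)."""
    per_stamp = Counter()
    pids = []
    for line in log_text.splitlines():
        m = LINE.search(line)
        if m:
            per_stamp[m.group("stamp")] += 1
            pids.append(int(m.group("pid")))
    return per_stamp, pids

# Three lines copied from the excerpt above.
sample = """\
 [22 May 14:26:41] fwd: pid 4413 is not responding, killing process
 [22 May 14:26:41] fwd: pid 4436 is not responding, killing process
 [22 May 14:26:41] fwd: pid 20076 is not responding, killing process
"""

per_stamp, pids = summarize_kills(sample)
print(per_stamp.most_common(1))
print(pids)
```

A single timestamp carrying all of the kills, as in the excerpt, suggests they were logged in one burst at the moment of recovery rather than spread across the outage.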

top shows all is normal, but the load average is massive and then quickly drops.

A TAC call has been raised several times, but with no logs output during the issue it is difficult to come up with any hard solutions. There is no appetite to deliberately pick a busy time and try a deployment with lots of debugging on (assuming it continued to respond), as the instability is very service-affecting and will go on for as long as it takes, which as I say can be hours.

Thanks all, hope you are having a good day! 

 

4 Replies
Contributor

Update: I have just tried the above process on a VM (admittedly running an EVAL license) and set the affinity using fw ctl affinity -s -n ....
All looked good. I then rebooted and it has dumped them back onto random cores again! Does anyone know how to make this persist? Our TAC engineer has indicated that using the fwaffinity.conf file can be unreliable and that the settings can be lost during reboot!
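
One way to check after a reboot whether a setting stuck, independent of the Check Point tooling, is to ask the Linux kernel which CPUs each daemon is currently allowed to run on. The sketch below is only that, a sketch: it assumes a Linux host with /proc mounted and a Python 3 interpreter available (which may not be the case on the gateway itself), and the helper name `allowed_cpus_by_name` is my own.

```python
import os

def allowed_cpus_by_name(names):
    """Map 'name(pid)' -> sorted list of CPUs the process may run on."""
    wanted = set(names)
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            # /proc/<pid>/comm holds the (possibly truncated) process name.
            with open(f"/proc/{entry}/comm") as handle:
                comm = handle.read().strip()
            if comm in wanted:
                cpus = os.sched_getaffinity(int(entry))
                result[f"{comm}({entry})"] = sorted(cpus)
        except OSError:
            # The process exited or is not readable; skip it.
            continue
    return result

if __name__ == "__main__":
    # Daemon names taken from the affinity listing in the original post.
    for proc, cpus in sorted(allowed_cpus_by_name(["fwd", "cpd", "cprid"]).items()):
        print(proc, cpus)
```

Comparing this output before and after a reboot shows whether the processes really ended up back on different cores.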
Champion

fwaffinity.conf is ignored at boot time if SecureXL is enabled.  What does the output of fwaccel stat say?

You could run $FWDIR/scripts/fwaffinity_apply after boot. 

Alternatively, in R80.10 the script $FWDIR/bin/taskset_us_all is what assigns process affinities at boot; check out that script.

 

R80.40 addendum for book "Max Power 2020" now available
for free download at http://www.maxpowerfirewalls.com
Admin

Technically, you're not allowed to use processors you are not licensed to use.
I suspect any issues you run into as a result of trying to circumvent this limitation would classify as an unsupported configuration.
Best to contact your local Check Point office or reseller to get the appropriate license.
Contributor

Thanks for the responses.
As I say, we are in discussion with TAC. It's more a question of making them stick on the CPUs we've affined them to. Will have a play and update.