
R81.10 New Recommended Jumbo - Take #130

eranzo
Employee


Hi All,

R81.10 Jumbo HF Take #130 is now our Recommended Jumbo take and is available for download to all via CPUSE (as recommended) and via the Jumbo documentation (R81.10 Jumbo Hotfix Accumulator).

A full list of resolved issues can be found in the Jumbo documentation (R81.10 Jumbo Hotfix Accumulator).

Note:

  • Central Deployment allows you to perform a batch deployment of Hotfixes on your Security Gateways and clusters from SmartConsole! For more information, see sk168597.
  • With Blink images, you can upgrade your environment to the required Major version, including its recommended Jumbo hotfix, in one step, using a single image file.

You can install Blink images using CPUSE – more details can be found in sk120193.
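For reference, a typical CPUSE clish flow for importing and installing a downloaded package looks roughly like this (command names as documented in sk92449 and sk120193; double-check the exact syntax for your version, and note the file name below is only a placeholder):

# Import a package that was copied to the gateway (placeholder file name)
installer import local /home/admin/<blink_or_jumbo_package>.tgz

# List the packages CPUSE knows about and note the package number
show installer packages

# Run pre-installation checks, then install by package number
installer verify <package number>
installer install <package number>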

 

Thanks,

Release Operations Group

23 Comments
the_rock
Legend

Fantastic news for Christmas!

Best,

Andy

maxtaan
Contributor

Great News!

the_rock
Legend

We are definitely asking all our customers who are on R81.10 to update the Jumbo take to 130.
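If you want to quickly confirm which take a gateway is currently running, cpinfo can show it:

# Lists installed hotfixes per product, including the Jumbo Hotfix Accumulator take
cpinfo -y all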

Andy

Henrik_Noerr1
Advisor

OK, sometimes I am sure we are the only customer finding all kinds of issues in the JHF.

I mostly see praise here on CheckMates, and I have no idea why, 3 weeks after GA, we are the first to report this.

 

After installing on several VSX clusters, we see high load. We saw cpcgroup and cxld using amounts of CPU that were not normal.

It seems new, undocumented functionality was added that monitors SND load; this itself caused the heightened load.

I guess customer installs have simply not reached Take 130 yet.

Attached is a screenshot after the JHF upgrade from Take 110 to Take 130. You can easily see when the fix was applied.

cpu load.png

Platform: Open Server, some 32 cores

VSX - 40 VSs

Fix:

fw ctl set int fwha_cpu_utlization_monitor_enable 0

Add it to fwkern.conf for persistence.
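For persistence, the usual convention (per sk26202) is a name=value line in $FWDIR/boot/modules/fwkern.conf, roughly like this; adjust for your own setup:

# Persist the parameter across reboots
echo 'fwha_cpu_utlization_monitor_enable=0' >> $FWDIR/boot/modules/fwkern.conf

# Confirm the value currently loaded in the kernel
fw ctl get int fwha_cpu_utlization_monitor_enable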

 

6-0003825377

 

/Henrik

 

the_rock
Legend

@Henrik_Noerr1 Just wondering, doesn't that kernel parameter simply "hide" the issue?

Best,

Andy

Henrik_Noerr1
Advisor

Hey Andy,

What do you mean, 'hide' the issue? It solved the high CPU load quite clearly.

 

/Henrik

the_rock
Legend

Seems to me that's what that kernel value does, but I could be mistaken.

Mattias_Jansson
Collaborator

Interesting! I hadn't noticed this behaviour before you posted this.

We installed Take 130 on December 28.


take130.JPG

 

Thanks for the tip!

 

 

Hugo_vd_Kooij
Advisor

To me the variable "fwha_cpu_utlization_monitor_enable" reads as "do (not) monitor the CPU load to determine if a cluster member is operating properly", so it should affect the pnote information.

I would expect that if the monitoring itself creates more CPU load, the issue would result in cluster flapping. Has anyone seen that happen? Or is the impact significant but just too low to cause a failover?
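For anyone who wants to check their own cluster, the standard ClusterXL commands should show whether a CPU-related pnote is registered and whether members actually failed over (nothing here is specific to this parameter):

# Overall cluster state of this member and its peers
cphaprob state

# Registered pnotes (critical devices) and their status
cphaprob -l list

# Failover history (available on recent releases)
cphaprob show_failover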

Hugo_vd_Kooij
Advisor

We have a major issue as well with one VSX cluster. But the suggested setting does not have any impact as far as we can tell. R&D is working on it.

Auke__
Explorer

Same problem here on VSX cluster. R&D working on it...

the_rock
Legend

@Hugo_vd_Kooij That's exactly how I understood that value as well.

Best,

Andy

Wolfgang
Authority

Same here, a 20% decrease in CPU utilization after disabling this monitoring at 10:40am.

This is a Maestro VSX 16600 appliance.

CPU.png

Has anyone gotten an answer from TAC about what exactly "fwha_cpu_utlization_monitor_enable" monitors?

MatanYanay
Employee

Hi all,

Thanks for raising this issue. We are aware of it and plan to fix it in the upcoming Jumbo, which we aim to release by EOM.

Matan.

 

Michael_Terlien
Explorer

I am planning to upgrade a VSX cluster to the latest version. Is R81.20 unaffected by this bug, or is it recommended to hold off on upgrading until the bug is resolved?

David_C1
Advisor

Is this issue only affecting VSX? We have a few clusters with Check Point appliances, and I have not observed an increase in CPU after applying Take 130.

Dave

MatanYanay
Employee

Hi,

The issue arises in VSX because each CXLD writes and reads the same file instead of a file per VS, leading to high CPU usage.

Thanks 

Hugo_vd_Kooij
Advisor

So this issue is also related to the number of virtual instances: you may see only a slight issue with 2 virtual systems but a big issue with 20 virtual systems, as more processes start to fight over access to the same file.

PNH
Participant

My findings after upgrading 2 days ago (I like trying to dissect stuff like this, and this may be inaccurate):

fw ctl set int fwha_cpu_utlization_monitor_enable 0

seems (probably among other things) to make the cxld process stop calling an affinity script twice a minute.

This script is told to write to a file, /cpus.txt, which is deleted immediately after it has (I guess) been read. (It looks like it lives in the actual root directory, as the process does not seem to be in a jail or anything; isn't /tmp better?)

(You can see these calls logged in cxld.elg.)

All the cxld processes I have seen going crazy started misbehaving at different intervals; they all have the /cpus.txt file open, but lsof marks it as deleted.

The cxld processes that have not gone crazy yet do not have this file open (all the time), but the rest looks the same. Since I could not find anything similar to "strace" to see what system calls the process makes, I do not know for sure.

Also, the broken cxld processes write all of their 8 x 21 MB .elg files within the same second, which possibly causes a little more load on the disks too.

This parameter apparently does not exist on Jumbo Take 110, and there I only see the call to the affinity script on startup. (It too seems to write to /cpus.txt, by the way, but if the calls don't happen in all contexts simultaneously, I guess it is OK.)

You may need to restart to make the already hung cxld processes recover.
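If you want to see whether your own cxld processes are stuck like this, something along these lines should be enough (just a rough sketch of the lsof check described above, not an official diagnostic):

# Any process still holding the deleted /cpus.txt open
lsof 2>/dev/null | grep cpus.txt | grep -i deleted

# Or per cxld instance
for pid in $(pidof cxld); do echo "== PID $pid =="; lsof -p $pid | grep cpus.txt; done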

 

Sorry for my ramblings.

Henrik_Noerr1
Advisor
Javi_San
Explorer

Thanks @Henrik_Noerr1
The solution from sk181891 worked like a charm in our VSX cluster!

MatanYanay
Employee

Hi all,

Please note we just released R81.10 Jumbo HF Take #132, which includes a fix for the issue discussed in this thread.

Thanks,

 

Matan. 

the_rock
Legend

Great news @MatanYanay 
