andy_currigan
Contributor

GW on Dell R730 freezing after upgrading to R80.20

Dear mates,

I have a big problem...

I just upgraded two clustered gateways on open server (Dell R730) to R80.20 Take 47. After 30-60 minutes the gateways freeze completely and I have to cold reboot them.

Fortunately, I created a snapshot image on R80.10 before upgrading them via CPUSE. Before I revert to the previous image, does anyone have a suggestion?

Where can I investigate?

Has anybody had a similar problem?

Many thanks.

ac

 

0 Kudos
16 Replies
Garrett_DirSec
Advisor

When you say "completely freeze", does that include no response from the local command terminal (i.e. local keyboard/video)?

If yes, can you track memory consumption immediately following a reboot to see if there is a particular process that is "going rogue" (i.e. insert Sarah Palin joke)?
0 Kudos
andy_currigan
Contributor

Yes, the gateways are completely stuck: the console is displayed, but no keyboard input is possible.

Which commands do you suggest for monitoring memory consumption?

 

0 Kudos
zsh
Participant

sk149413 ?

0 Kudos
Garrett_DirSec
Advisor

Hello -- while it sounds like the solution to your observed issue may be elsewhere, the specific answer to "how do I detect a memory leak?" is discussed in the following:

 

How to detect a kernel memory leak on Security Gateway with SecurePlatform OS / Gaia OS

sk35496
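
For the "which commands?" question, one option is to log memory usage at a fixed interval so that the last samples before a freeze are still on disk after the cold reboot. This is only a rough sketch and not the procedure from sk35496: the function names `mem_used_pct` and `log_mem_sample` and the log path in the usage note are made up for illustration, and it assumes a Linux-style `/proc/meminfo` and a procps `ps` that supports `--sort`.

```shell
#!/bin/bash
# Hypothetical helpers (not from any SK): periodic memory logging so the
# last samples before a freeze survive a cold reboot.

# Print used memory as a whole percentage, reading /proc/meminfo-style
# text on stdin. "Used" is approximated as
# MemTotal - MemFree - Buffers - Cached.
mem_used_pct() {
    awk '
        $1 == "MemTotal:" { total = $2 }
        $1 == "MemFree:"  { free  = $2 }
        $1 == "Buffers:"  { buf   = $2 }
        $1 == "Cached:"   { cache = $2 }
        END {
            if (total > 0)
                printf "%d\n", (total - free - buf - cache) * 100 / total
            else
                exit 1
        }'
}

# Append one timestamped sample to the log file given as $1.
log_mem_sample() {
    local logfile="$1"
    {
        printf '%s used_pct=%s\n' "$(date '+%F %T')" \
            "$(mem_used_pct < /proc/meminfo)"
        # Header line plus the five largest processes by resident memory (KB).
        ps -eo rss,pid,comm --sort=-rss | head -n 6
    } >> "$logfile"
    sync   # flush to disk so the sample is not lost in a hard freeze
}
```

From Expert mode it could be left running in the background right after a reboot, for example `while true; do log_mem_sample /var/log/mem_watch.log; sleep 30; done &`. After the next freeze, the tail of the log shows whether memory was climbing and which process held the most.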

0 Kudos
_Val_
Admin

Please open a TAC case

0 Kudos
Daniel_Taney
Advisor

This could be totally unrelated, but I had the same problem after upgrading a pair of 15400's to R80.20 last week. We worked extensively with TAC and identified a hotfix to resolve the issue. I was advised by TAC yesterday that this hotfix is included in the latest Ongoing Take (Take 73) of R80.20. 

Obviously there could be huge differences since your issue is on Open Server and mine was on an Appliance, but it couldn't hurt to consider the Ongoing Take before reverting back?

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

If you choose to open a TAC case (not the worst idea), you may want to familiarize yourself with this SK, as you will likely need to invoke kdb mode on the Gateway to capture the crash information. We were finding that in the event of a total system freeze, no relevant logs were being generated. 

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

 

R80 CCSA / CCSE
0 Kudos
andy_currigan
Contributor

I opened a TAC case and, as a first workaround, the SE disabled firewall prioritization (Priority Queues) with the following command:

 

a. Run: [Expert@HostName]# fw ctl multik prioq

 

b. Choose mode number 0 ("Off").

 

c. Reboot

 

0 Kudos
Daniel_Taney
Advisor

If disabling Priority Queues resolves your freezing issue, this is the same bug we encountered. In the long term, it is generally advisable to keep Priority Queues enabled.

So, if this workaround helps, applying the latest ongoing HFA should permanently resolve the issue and allow PQ to be turned back on.
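
Before turning PQ back on, it may be worth confirming which take is actually installed. The sketch below is an illustration only: it ASSUMES the installed Jumbo take appears in a line of the form `HOTFIX_R80_20_JUMBO_HF_MAIN Take: 73` (the kind of line `cpinfo -y all` is expected to print), and the helper names `jumbo_take` and `take_at_least` are hypothetical. Check the real output on your gateway before relying on it.

```shell
#!/bin/bash
# Hypothetical helpers. ASSUMPTION: the text on stdin contains a line such
# as "HOTFIX_R80_20_JUMBO_HF_MAIN Take: 73"; verify against real output.

# Print the take number from the first line mentioning JUMBO and "Take:".
jumbo_take() {
    awk '/JUMBO/ && /Take:/ {
        for (i = 1; i < NF; i++)
            if ($i == "Take:") { print $(i + 1); exit }
    }'
}

# Succeed (exit status 0) if the take found on stdin is >= the minimum in $1.
take_at_least() {
    local min="$1" take
    take="$(jumbo_take)"
    [ -n "$take" ] && [ "$take" -ge "$min" ]
}
```

Possible usage, again under the same assumption about the output format: `cpinfo -y all 2>/dev/null | take_at_least 73 && echo "Take 73 or later is installed"`.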

Good luck!

R80 CCSA / CCSE
0 Kudos
Garrett_DirSec
Advisor

I see that someone else already posted the SK reference, but here's more:

R80.20 Security Gateway freezes when Priority Queue enabled (sk149413)

 

0 Kudos
Daniel_Taney
Advisor

I believe this is the hotfix that is included in the latest Ongoing Take. 

R80 CCSA / CCSE
0 Kudos
Garrett_DirSec
Advisor

It's interesting that the comments say the fix is in Take #73 of the R80.20 HFA. However, it's somewhat mysterious that nothing in the list of HFA Take #73 fixes comes close to referencing it.

There are nine specific fixes listed in #73 for "Security Gateway". While the fix may not specifically reference "priority queue", since it could be related to code elsewhere, it's not going to be under "SmartConsole", etc.

R80.20 HFA

0 Kudos
Daniel_Taney
Advisor

For what it's worth, the TAC engineer who helped me actually pointed out that this fix isn't included in the list of fixes. He didn't know why, but he promised me it was there 😁

R80 CCSA / CCSE
Garrett_DirSec
Advisor

thanks for the insight. 

0 Kudos
Daniel_Taney
Advisor

@andy_currigan Did you make any further progress with your issue? After a week of things running great, I had the standby gateway in my cluster freeze again. It's strange, as we went a whole week without any issues. This was the longest we'd gone since upgrading.

I need to revisit my issue with TAC, but was just wondering what your findings have been since our problems sound somewhat similar?

Thanks! 

R80 CCSA / CCSE
0 Kudos
andy_currigan
Contributor

Got a hotfix from TAC. The problem seems fixed.

0 Kudos
Garrett_DirSec
Advisor

Per the earlier comments, this Priority Queue related freeze was resolved (but not documented) by a hotfix included in HFA #73.

Is what you describe a hotfix for the hotfix? 

0 Kudos
