Vincent_Bacher
Advisor

Segmentation fault

Hello mates, 

after upgrading several gateways to R80.10 T56, a customer always gets a segmentation fault when performing the "show configuration" command in clish.

I know that there are several SKs regarding R77, but none for R80.10.

Anybody who knows anything? 

Best regards

Vincent 

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
56 Replies
KennyManrique
Advisor

Hello Vincent,

Recently I had a similar problem on a 5200 Appliance upgraded to R80.10 T56 from R77.30.

According to sk106166, this is a database corruption (though that SK is for R77.30). The problem surfaced when I checked the status from central management: the connection was lost although the SIC status showed as communicating (about 18 hours after the upgrade).

Meanwhile, I was able to connect via SSH, but I couldn't perform any "fw" operations; even troubleshooting commands like tcpdump or netstat failed to execute and resulted in a core dump. From the core dumps, it seems two processes initially crashed the gateway: in.msd and esc_db_complete.

I rebooted the gateway and used the HW diagnostic tools embedded on the device to test the hard drive. After a second reboot, I entered maintenance mode to run a disk check with fsck, based on sk92442.

After this, the appliance works as expected (8 days without the error so far).

Regards.

Vincent_Bacher
Advisor

Hello Kenny,

thank you for sharing your experience regarding this issue.

Well, this led me to create an SR in UserCenter :)

We'll see.

best regards

Vincent

Vincent_Bacher
Advisor

Hello mates, 

just received an update from CP that this is a known issue and is resolved in T70.

Will test that and keep you posted.

Best regards

Vincent 

KennyManrique
Advisor

Great to hear that Vincent!

The R80.10 HFA document does not mention an issue similar to this one as fixed in HFA 70. Did the CP team tell you specifically which ID or problem is solved?

Regards.

Vincent_Bacher
Advisor

Hi Kenny,

Unfortunately not. 

Just asked and will keep you posted.

Best regards 

Vincent 

Vincent_Bacher
Advisor

Just installed T70 on a test node and I am still facing this issue.

And no answer from the case owner regarding a bug ID or similar helpful info.

I'll wait a bit and push the escalate button if needed.

KennyManrique
Advisor

Thanks for the update Vincent.

I was analyzing the data from my crash and found that all the strange crashes belong to functionality that is inactive on the gateway (Anti-Spam and HTTPS Inspection).

Do you have a similar issue?

Regards.

Vincent_Bacher
Advisor

Hm, I did not check whether I have a similar issue.

But my co-workers just notified me about a different issue.

My colleagues added a VLAN interface, and the virtual interface is not shown in cphaprob -a if.

We added eth1.30 and we don't see the virtual interface:

# fw getifs
localhost eth1 ******
localhost eth5 *****
localhost eth1.2001 *****
localhost eth1.30 ******
localhost eth1.20 ******
localhost eth1.10 ******

# cphaprob -a if

Required interfaces: 3
Required secured interfaces: 1

eth5       UP                    sync(secured), multicast
eth1       UP                    non sync(non secured), multicast  (eth1.2001 )
eth1       UP                    non sync(non secured), multicast  (eth1.10   )

Virtual cluster interfaces: 4

eth1            ****
eth1.2001       *****
eth1.20         *****
eth1.10         *****
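
A mismatch like this is easier to spot by diffing the two listings. What follows is only a minimal Python sketch, assuming the output layouts pasted in this post (masked columns included). The helper names are hypothetical and not part of any Check Point tooling; the code only parses text that was already captured from `fw getifs` and `cphaprob -a if`.

```python
def getifs_names(text):
    """Interface names from captured `fw getifs` output ("<host> <ifname> ...")."""
    names = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and pasted prompt lines such as "# fw getifs".
        if len(parts) >= 2 and not parts[0].startswith("#"):
            names.append(parts[1])
    return names


def cphaprob_names(text):
    """Return (sync, virtual) interface name sets from `cphaprob -a if` output."""
    sync, virtual = set(), set()
    in_virtual = False
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if line.strip().startswith("Virtual cluster interfaces"):
            in_virtual = True
        elif in_virtual:
            virtual.add(parts[0])
        elif len(parts) >= 3 and parts[2].startswith("sync("):
            # "sync(secured)" marks the sync link; "non sync(...)" lines are skipped.
            sync.add(parts[0])
    return sync, virtual


def missing_cluster_interfaces(getifs_text, cphaprob_text):
    """Interfaces the firewall sees that are neither sync nor virtual cluster interfaces."""
    sync, virtual = cphaprob_names(cphaprob_text)
    return [n for n in getifs_names(getifs_text) if n not in sync and n not in virtual]


# Sample data copied from the output above (addresses are masked in the post).
FW_GETIFS = """\
localhost eth1 ******
localhost eth5 *****
localhost eth1.2001 *****
localhost eth1.30 ******
localhost eth1.20 ******
localhost eth1.10 ******
"""

CPHAPROB_IF = """\
Required interfaces: 3
Required secured interfaces: 1

eth5       UP                    sync(secured), multicast
eth1       UP                    non sync(non secured), multicast  (eth1.2001 )
eth1       UP                    non sync(non secured), multicast  (eth1.10   )

Virtual cluster interfaces: 4

eth1            ****
eth1.2001       *****
eth1.20         *****
eth1.10         *****
"""

print(missing_cluster_interfaces(FW_GETIFS, CPHAPROB_IF))  # prints ['eth1.30']
```

With the data from this post, the only interface reported is eth1.30, which matches what we see by eye.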

KennyManrique
Advisor

Hello Vincent,

Do you have any news about your issue?

Regards.

Vincent_Bacher
Advisor

Not yet. CP is waiting for a cpinfo.

Will create one on Monday. Will keep you posted.

BR

Vincent

Vincent_Bacher
Advisor

Now i have News.

They told me that "other customer had this issue also had Contact Awareness blade enabled that cause this issue."

They suggest disabling "Contact awareness blade and monitor the FW's to see if the issue persist"

Disabling the IA blade? Funny!

And they want me to set
[Expert@:0]# fw ctl set int panic_on_stk_size 300
This kernel parameter will cause the device to crash once the issue reoccurs and will provide us with a vmcore file.

I asked the customer to decide if they want to do that.

best
Vincent

Libin_Thomas
Contributor

We are also facing the same issue. Initially it worked fine for a couple of days; after that, the primary firewall got disconnected from the cluster, and any command we run in the CLI shows a segmentation fault.

A TAC case has also been opened for this, but the engineers have not yet been able to find the root cause.

VENKAT_S_P
Collaborator

Did you try a downgrade?

Feeling lucky for not installing T56 after seeing this thread.

I have T42 installed and do not have any issues.

Vincent_Bacher
Advisor

Not yet. Will work a bit more on this open SR.

Cheers

Hugo_Frauches
Contributor

Hello guys,

I have a cluster of 5200 appliances with R80.10 and the latest hotfix (take 70) installed, and we are having the same problem. Sometimes one of the gateways stops, and when I try to run an fw command via clish, the output is always the same:

/bin/cpfw_start: line 12: 21612 Segmentation fault (core dumped) $FWDIR/bin/fw "$@"

error

I have an open service request with Check Point, but they haven't found the solution yet. Has anyone found what is causing this problem? Maybe it's a bug in R80.10?
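
When comparing reports like this one, it helps to pull the same fields out of each segfault line: which wrapper script, which PID, whether a core was dumped, and which command was running. Below is a minimal Python sketch, assuming the shell message has the "script: line N: PID Segmentation fault (core dumped) command" layout quoted above. The function name and the regular expression are illustrative assumptions, not Check Point tooling.

```python
import re

# Assumed layout: "<script>: line <N>: <PID> Segmentation fault [(core dumped)] <command>"
SEGFAULT_RE = re.compile(
    r"^(?P<script>\S+): line\s*(?P<line>\d+):\s*(?P<pid>\d+) Segmentation fault\s*"
    r"(?P<core>\(core dumped\))?\s*(?P<command>.*)$"
)


def parse_segfault(message):
    """Split a shell segfault message into its parts; return None if it does not match."""
    m = SEGFAULT_RE.match(message.strip())
    if m is None:
        return None
    return {
        "script": m.group("script"),
        "line": int(m.group("line")),
        "pid": int(m.group("pid")),
        "core_dumped": m.group("core") is not None,
        "command": m.group("command"),
    }


# Sample message based on the error quoted in this post.
sample = '/bin/cpfw_start: line 12: 21612 Segmentation fault (core dumped) $FWDIR/bin/fw "$@"'
info = parse_segfault(sample)
print(info["script"], info["pid"], info["core_dumped"], info["command"])
```

The `core_dumped` flag is worth keeping separate, since a crash that leaves a core file can be handed to TAC for analysis while one without a dump cannot.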

KennyManrique
Advisor

Hi Hugo.

Coincidentally, today one of my customers had the same segmentation fault crash (again) on a 5200 appliance with R80.10 HFA 70.

I have had a case open since last week and, like Vincent said, HFA 70 was apparently supposed to be the solution (also according to sk123153 and sk123154)... but it seems not to be. I'm waiting for a TAC response and will update with any news.

Regards.

Vincent_Bacher
Advisor

Hi Hugo, Kenny

I think the issue Hugo is facing is quite different from mine. He mentioned that a core was dumped, and in my case there is no user/core dump at all.
In my case, R&D is currently working on the issue, and they told me that they are testing a possible solution.
Hopefully I'll get some news soon.

A few minutes ago I tested this on a gateway running T85 and got no segmentation fault.
But I'll wait for an update from R&D to see whether this is generally solved in T85 or just due to a different config.

Hugo_Frauches
Contributor

Hello guys,

This problem is awful. My service request has been open since last month, and support told me to do a "fresh install" on both 5200 gateways. After I did this, it took only 5 days for the problem to occur again. When it happens, my gateway stays frozen, and after some time the node status shows as Down. The only thing I can do to recover the member is to reboot the gateway; after that the problem goes away (for a short time)...

William_Tavares
Participant

Hi Hugo

Have you heard anything back from Check Point about your service request?

Could you share something with us?

Thanks!

Vincent_Bacher
Advisor
Advisor

Well, my special issue is solved.

Solution was

# dbset snmp mode default

# dbset save

But I think that's just a solution for my special issue and all mentioned issues here are different.

Cheers

Vince

KennyManrique
Advisor

Great to hear that Vincent!

TAC provided me with sk123562, which is basically the commands you mentioned. I will verify it in the customer environment and see what happens.

Regards.

William_Tavares
Participant

Hi there

I have the same issue (segfault) in an R80.10 environment.

I've tried fresh install + T91 and fresh install + T70 (as suggested by the TAC engineer in an SR) without success.

It took less than 8 hours to occur again on both attempts.

I already took a look at:

sk123153 - It doesn't apply. The gateway already has T70.

sk123154 - It doesn't apply. The Content Awareness blade is disabled.

sk123562 - The described solution was applied without solving the problem.

Last Friday, TAC suggested installing T103, but the engineer wasn't confident in this solution.

He escalated to the development team to validate.

Does anyone have any idea or a different solution provided by TAC?

Thanks!

KennyManrique
Advisor

Hi William,

Still working with TAC to determine the cause. No news so far.

We enabled some parameters to generate vmcore files when the device fails, but the files are still under analysis.

Regards.

William_Tavares
Participant

Hi Kenny

Thanks for your fast reply.

How long have you been working on this SR with TAC?

I've been working on that for 3 weeks without any solution.

These vmcore files were requested from me today.

I have a scheduled maintenance window to enable these kernel parameters tomorrow.

Do you have any workaround to suggest while we wait for a solution from Check Point?

Regards.

KennyManrique
Advisor

I have had the case open since mid-March.

Unfortunately I do not have a workaround.

I only know that when the device fails, policy installation also fails and any fw command results in a segmentation fault. Also, VPN tunnels on the device go down and mail protocols (POP and SMTP) cannot traverse the firewall.

My theory is that the IPS blade is causing this, but for security reasons I am not able to disable it. As a next step, I was planning to disable fwaccel to verify whether it is related.

Regards.

PhoneBoy
Admin

Segmentation Faults are somewhat generic.

Executing which command under what circumstances?

If it is about the issue originally raised (which was the "show configuration" command in clish), it should stay in this thread.

If it is a different command that generated a segfault (e.g. the command Hugo Frauches mentioned above), please create a separate thread about this.

Hugo_Frauches
Contributor

Just to update this thread: TAC has requested a downgrade to take 70 on R80.10; the issue seems to be related to a bug in take 91+.

We have had the same problem for 3 months, and since we downgraded to take 70 (more stable), the gateway has gone 2 weeks without the problem.

KennyManrique
Advisor

Let us know how this goes for you.

For me, the problem persists even with take 70.

Regards.

Hugo_Frauches
Contributor

Yes, it's strange, but the problem happened again this morning on our gateway; the only solution was to reboot the appliance.

I just want to confirm with you guys: does anyone here with this issue also have a 5xxx appliance? Or does someone have a different series? I ask because TAC suspects that this issue is related to the 5xxx appliances...
