amdhim0004
Contributor

rtmd daemon causing high CPU utilization (99%)

Hello All,

We are facing CPU-related issues on multiple Check Point devices.

After a policy push or reboot, firewall CPU utilization spikes to 99-100%.

top shows that rtmd is responsible; this daemon alone accounts for the CPU utilization.

Once we stop the daemon, CPU utilization returns to normal.

We haven't changed anything on the gateway side and have been using the same configuration for a very long time.

The only recent change was on the CMA side: we installed a hotfix on the CMA only.

That means we have R80.10-T225 on the gateway and R80.30-T214 on the CMA.

We are not sure whether installing the hotfix on the CMA could cause any issue on the gateway.

Also, this issue does not affect all gateways; some devices are working fine.


Accepted Solutions
HeikoAnkenbrand
Champion

Hi @amdhim0004

the daemon is for real time traffic statistics.

Look at this sk:
skI2821 How to debug RTM daemon 

➜ CCSM Elite, CCME, CCTE ➜ www.checkpoint.tips


15 Replies
amdhim0004
Contributor

Hello @HeikoAnkenbrand 

Thank you so much for your reply. 

Yes, the daemon is for real-time traffic statistics. However, we didn't start real-time monitoring ourselves; the daemon starts automatically.

I was also thinking of running a debug, but CPU utilization is around 99-100% while the issue is occurring.

Would running the debug cause any problems in that state?

Timothy_Hall
Legend

Definitely take a look at any error messages getting barfed into $FWDIR/log/rtmd.elg when the high CPU is happening; you don't need to explicitly start a debug to hopefully gain insight into what that daemon is having a problem with.  Usually it is a problem with the monitoring database: sk112058: Gateways & Servers view in R80.xx SmartConsole does not show statuses of servers
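
As a starting point for the log check described above, here is a minimal sketch of a shell helper that counts warning and error lines in the rtmd log and prints its tail. The function name `rtmd_log_summary` is hypothetical, and it assumes `$FWDIR` is set as it normally is in an expert-mode shell on the gateway; only standard `grep` and `tail` are used.

```shell
# Hypothetical helper: summarize recent warnings/errors in the rtmd log.
# Assumes $FWDIR is set; pass a path as the first argument to override.
rtmd_log_summary() {
    log_file="${1:-$FWDIR/log/rtmd.elg}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # Count lines mentioning a warning or error (case-insensitive).
    # grep -c exits non-zero on zero matches, so ignore its exit status.
    matches=$(grep -ciE 'warning|error' "$log_file" || true)
    echo "matching lines: $matches"
    # Show the 20 most recent lines for context.
    tail -n 20 "$log_file"
}
```

Running `rtmd_log_summary` with no argument while the CPU is high reads the default log path; running it again a few minutes later shows whether new messages are still being written.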

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
amdhim0004
Contributor

Thanks @Timothy_Hall 

I will check this and get back to you.

amdhim0004
Contributor

Hi @Timothy_Hall 

 

Below are the logs during activity.

++

[RTM 8725]@U****F010[5 Aug 11:19:48] Warning:cp_timed_blocker_handler: A handler [0x80768a0] blocked for 3455 seconds.
[RTM 8725]@U****F010[5 Aug 11:19:48] Warning:cp_timed_blocker_handler: Handler info: Library [rtmd], Function offset [0x2e8a0].
[RTM 8725]@U****F010[5 Aug 11:19:48] Warning:cp_timed_blocker_handler: A handler [0x4e8ba0] blocked for 3455 seconds.
[RTM 8725]@U****F010[5 Aug 11:19:48] Warning:cp_timed_blocker_handler: Handler info: Library [/opt/CPshrd-R80/lib/libmessaging.so], Function offset [0x4ba0].
[RTM 8725]@U****F010[5 Aug 11:19:48] Warning:cp_timed_blocker_handler: A handler [0x4bb6e0] blocked for 3455 seconds.
[RTM 8725]@U****F010[5 Aug 11:19:48] Warning:cp_timed_blocker_handler: Handler info: Library [/opt/CPshrd-R80/lib/libComUtils.so], Function offset [0x156e0].

++
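
The warnings above all report handlers blocked for 3455 seconds, which is roughly 57 minutes. A small sketch like the following could condense such lines into unique pairs of handler address and duration. The function name `blocked_handlers` is hypothetical, and the pattern assumes the message format shown in the excerpt above.

```shell
# Hypothetical helper: list unique "handler-address seconds" pairs
# from cp_timed_blocker_handler warnings in the given log file.
blocked_handlers() {
    # Basic regular expressions only, so this also works with older sed versions.
    sed -n 's/.*A handler \[\(0x[0-9a-fA-F]*\)\] blocked for \([0-9][0-9]*\) seconds.*/\1 \2/p' "$1" |
        sort -u
}
```

For example, `blocked_handlers "$FWDIR/log/rtmd.elg"` prints one line per distinct handler and duration, which makes it easier to see whether the same handlers stay blocked across repeated checks.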

Timothy_Hall
Legend

Hmm when searching for "handler" and/or "blocked" the following SK is the first hit which looks very similar to your issue even though it is a different process:

sk130513: cpd process consuming high CPU (more than 90%) during production hours

This SK is specifically for R80.10 which is the version you are running on your gateway.  Interestingly the terms "handler" and "blocked" do not appear in the text of the SK itself that I can see with my partner access, which I've figured out means there are "hidden" notes accessible only by Check Point employees matching these terms.  Given the very very long duration of the block it sounds like something is getting stuck in rtmd for an extended period, perhaps consuming CPU in the process.  

Try viewing the individual execution threads of the rtmd process with the top -Hbn1 -p 8725 command, where 8725 is the PID of the rtmd process.  This may give you some further insight.
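
A minimal sketch of that check, assuming `pidof` is available on the gateway so the PID does not have to be looked up by hand. The function name `show_daemon_threads` is hypothetical, and the `top` flags are the same ones given above.

```shell
# Hypothetical helper: show per-thread CPU usage for a daemon by name.
show_daemon_threads() {
    name="${1:-rtmd}"
    # pidof -s prints a single PID and fails if no such process exists.
    pid=$(pidof -s "$name") || {
        echo "$name is not running" >&2
        return 1
    }
    # -H shows individual threads, -b is batch mode, -n 1 runs one iteration.
    top -H -b -n 1 -p "$pid"
}
```

Calling `show_daemon_threads rtmd` during the spike prints one row per thread, so a single thread pinned near 100% stands out from the rest.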

Beyond that a TAC case is probably in order, referencing sk130513 as a starting point for discussion.  However before that applying the latest GA take (275) of the R80.10 Jumbo HFA is probably in order even though I don't see your particular problem as "fixed", as that is probably the first thing TAC will recommend anyway. 😁

amdhim0004
Contributor

Hi @Timothy_Hall 

Thanks. Yes, a TAC ticket has already been opened.

But we're not sure why this is causing an issue now, as it was working before. Let's see what TAC has to say.

amdhim0004
Contributor

Hello @Timothy_Hall 

We have observed that rtmd gets started after a policy push or reboot. On some devices it causes the issue, while on others the CPU spikes for about 2 minutes and then drops back down.

Is there any specific reason why rtmd starts after a policy push or reboot?

In other words, is this normal behaviour?

Ido_Shoshana
Employee

Hi,

We would like to understand the issue more deeply.

We are missing some information, and I'd appreciate it if you could elaborate:

How is it related to cpstat_monitor? Wrong value?

Can you tell what the CPU utilization was before enabling RTM?

In addition, what is the current configuration? Did you enable all 3 checkboxes related to RTM history?

Are you sure this happened after installing the hotfix on R80.10?

 

Thanks,

Ido.

 

amdhim0004
Contributor

Hello @Ido_Shoshana 

Thanks for your response. 

Let me explain.

The thing is, we didn't start, restart, or reset this daemon, or make any changes related to it. It gets enabled automatically and the CPU spikes to 100%.

+++

In addition, what is the current configuration? Did you enable all 3 checkboxes related to RTM history?

(Only Monitoring is checked. Please refer to the attachment.)

++

Are you sure this happened after installing the hotfix on R80.10?

The hotfix was not installed on the gateway, only on the CMA.

__

We have installed the hotfix on the CMA only.

That means we have R80.10-T225 on the gateway and R80.30-T214 on the CMA.

We are not sure whether installing the hotfix on the CMA could cause any issue on the gateway.

++

After installing this hotfix we pushed the policy, and then the issue was observed.

Ido_Shoshana
Employee

Thanks!

Do you have an open SR for the issue?

If so, can we have the ID?

amdhim0004
Contributor

Hi @Ido_Shoshana 

Currently our operations team is working with TAC on an SR.

I don't have those details. 

Ido_Shoshana
Employee

Hi,

Can you send the customer's details to my e-mail address (idosh@checkpoint.com)?

I would like to find the SR and understand where it currently stands.

amdhim0004
Contributor

I have advised my operations team to do the same.

Ido_Shoshana
Employee

Thanks,

Please keep me posted once you have the details.

 

Thanks,

Ido
