the_rock
Legend
Legend

URL filtering blade (RAD process) causing high CPU tip

Hey guys,

Just a quick tip, though I'm sure most of you may already know it, referring to the post below:

https://community.checkpoint.com/t5/Security-Gateways/High-CPU-on-Security-Gateway-caused-by-RAD-ser...

The customer tried a value of 50000 in GuiDBedit with no change, but once we did a failover, everything worked fine. I asked them to install jumbo 99 on the cluster (currently on jumbo 98).

mgmt -> R81.20 jumbo 99

gw cluster -> R81.20 jumbo 98

Btw, if it's a standalone or just a single gateway, usually cprestart or a reboot would work, but keep in mind that since cpstop (which cprestart runs) unloads the policy, there would be a short disruption.

Something to keep in mind if you ever encounter the RAD process causing high CPU.

Cheers,

Andy

0 Kudos
1 Solution

Accepted Solutions
D_W
Advisor
(1)
36 Replies
D_W
Advisor

We had similar issues with high RAD CPU on some systems, but the solution was to disable the RAD "autodebug" feature.
sk182859 

the_rock
Legend
Legend

Awesome, thanks @D_W 

0 Kudos
the_rock
Legend
Legend

Hey @D_W 

I read that sk again and it shows the product as Anti-Bot. Did you apply it in the past for a URLF issue, and did it work okay?

Andy

0 Kudos
D_W
Advisor

Yes, I was looking into a URLF issue (a new rule was not being hit), then I saw that the CPU was high and there were a lot of these messages in the log (screenshot: grafik.png).

 

0 Kudos
the_rock
Legend
Legend

Ah, got it. I don't believe my customer had those logs; they just noticed that top was showing 180% CPU on the rad process.

Andy

0 Kudos
Timothy_Hall
Legend Legend
Legend

Right, CPU over 100% on RAD is not necessarily a problem, as it is multi-threaded. However, high RAD CPU accompanied by laggy Internet surfing performance may be related.
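
To put a figure like 180% in context: by default top reports a process's CPU relative to a single core, so a multi-threaded process can go above 100%. A minimal sketch of that arithmetic follows. The 180% comes from this thread, but the core count of 8 is an assumption made up for the example, not a detail of the customer's gateway.

```python
def busy_cores(process_cpu_percent: float) -> float:
    """Number of cores' worth of CPU time the process is consuming,
    given a top-style figure where 100% means one fully busy core."""
    return process_cpu_percent / 100.0


def share_of_total_cpu(process_cpu_percent: float, cores: int) -> float:
    """Convert the same top-style figure into a percentage of the
    whole machine's CPU capacity (cores * 100%)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return process_cpu_percent / cores


# 180% is the value reported for rad in this thread.
rad_cpu = 180.0
# Assumption for illustration only: an 8-core gateway.
assumed_cores = 8

print(f"{busy_cores(rad_cpu):.1f} cores busy")
print(f"{share_of_total_cpu(rad_cpu, assumed_cores):.1f}% of total capacity")
```

Under that assumption, 180% means a little under two busy cores, which may or may not matter depending on how many cores the gateway has and what else is competing for them.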

Attend my Gateway Performance Optimization R81.20 course
CET (Europe) Timezone Course Scheduled for July 1-2
the_rock
Legend
Legend

Yep, that's exactly the issue they had, laggy surfing.

Andy

0 Kudos
r1der
Advisor

Weird, I had the same issue - laggy surfing. No one had said anything though.
All along I thought it was my https inspection settings I was testing on myself.

0 Kudos
the_rock
Legend
Legend

That's why we all share ideas my friend ; )

Henkpoa
Contributor

With the new generation of Quantum Gateways which are running SecureXL in User Space, we've also seen a lot of CPU and stability issues.

UPPAK has created issues with Hyperflow where Dynamic Balancing stops working due to interface state mismatch when trying to reallocate back cores.

Also some known crashes with UPPAK.

So if you're running new Quantum GWs, I'd be on the lookout for performance issues and bugs.

the_rock
Legend
Legend

Thanks for that @Henkpoa 

Andy

0 Kudos
Jan_Kleinhans
Advisor

We also have such issues with the 19200 and UPPAK. Hyperflow also causes issues in KPPAK on one of our clusters. We disabled it for the moment, as we want to go back to UPPAK to use the hardware acceleration of the NVIDIA network cards.

the_rock
Legend
Legend

Thanks for the feedback guys, appreciated.

Andy

0 Kudos
Henkpoa
Contributor

Interesting.

We have yet to see a Hyperflow issue on our 9300.
But we'll be on the lookout.

I assume disabling it requires a restart with no issues to ClusterXL Sync?

Jan_Kleinhans
Advisor

We had issues with long lasting TCP sessions like RDP, Citrix HDX or MSSQL. We had entries like this:

 

[11 Mar 9:06:03][fw4_11];[vs_5];psl_send_data: send_cookie_f failed
[11 Mar 9:06:03][fw4_11];[vs_5];psl_send_flags: could not send cookie
[11 Mar 9:06:03][fw4_19];[vs_5];cphwd_send_cookie: cphwd_send_packet failed
[11 Mar 9:06:03][fw4_19];[vs_5];fwpslglue_send: cphwd_send_cookie failed
[11 Mar 9:06:03][fw4_19];[vs_5];psl_glue_send_cookie: error, fwpslglue_send failed to send cookie, conn=<dir 1, 172.18.64.73:62975 -> 172.16.80.107:3389 IPP 6>
[11 Mar 9:06:03][fw4_19];[vs_5];psl_send_data: send_cookie_f failed
[11 Mar 9:06:03][fw4_19];[vs_5];psl_send_flags: could not send cookie
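
If you have a lot of these entries, it can help to pull out which firewall instances and which connections are affected. Below is a rough sketch that parses entries in the format shown above. The field names (`ts`, `instance`, `vs`, `func`, `msg`) are labels chosen for this example rather than official Check Point terminology, and the regular expressions are based only on the sample lines in this post, so other message variants may need adjustments.

```python
import re
from collections import Counter

# Sample entries copied from the log excerpt in this post.
SAMPLE = """\
[11 Mar 9:06:03][fw4_11];[vs_5];psl_send_data: send_cookie_f failed
[11 Mar 9:06:03][fw4_11];[vs_5];psl_send_flags: could not send cookie
[11 Mar 9:06:03][fw4_19];[vs_5];cphwd_send_cookie: cphwd_send_packet failed
[11 Mar 9:06:03][fw4_19];[vs_5];psl_glue_send_cookie: error, fwpslglue_send failed to send cookie, conn=<dir 1, 172.18.64.73:62975 -> 172.16.80.107:3389 IPP 6>
"""

# One entry looks like: [timestamp][instance];[vs];function: message
ENTRY = re.compile(
    r"\[(?P<ts>[^\]]+)\]"         # [11 Mar 9:06:03]
    r"\[(?P<instance>[^\]]+)\];"  # [fw4_19];
    r"\[(?P<vs>[^\]]+)\];"        # [vs_5];
    r"(?P<func>\w+): "            # psl_send_data:
    r"(?P<msg>.*?)"               # message text
    # Stop at the next timestamp or at the end of the input, so entries
    # are found whether they are on separate lines or run together.
    r"(?=\s*\[\d{1,2} [A-Za-z]{3} \d|\s*\Z)",
    re.S,
)

# Connection details as they appear in the psl_glue_send_cookie message.
CONN = re.compile(
    r"conn=<dir (?P<dir>\d+), (?P<src>[\d.]+):(?P<sport>\d+) -> "
    r"(?P<dst>[\d.]+):(?P<dport>\d+) IPP (?P<proto>\d+)>"
)


def parse(text):
    """Yield one dict per log entry, with connection fields when present."""
    for match in ENTRY.finditer(text):
        entry = match.groupdict()
        conn = CONN.search(entry["msg"])
        entry["conn"] = conn.groupdict() if conn else None
        yield entry


entries = list(parse(SAMPLE))

# How many failure messages each firewall instance logged.
failures_per_instance = Counter(e["instance"] for e in entries)

# Distinct (source, destination, destination port) tuples that were hit.
affected = {
    (e["conn"]["src"], e["conn"]["dst"], e["conn"]["dport"])
    for e in entries
    if e["conn"]
}

print(failures_per_instance)
print(affected)
```

In the sample, the only connection that shows up is towards port 3389, which fits the RDP sessions mentioned above.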

We disabled it via 

connection_pipelining advanced prevent

This doesn't need a reboot and does not harm ClusterXL.

 

Regards,

 

Jan

0 Kudos
the_rock
Legend
Legend

Very good to know!

Andy

0 Kudos
Timothy_Hall
Legend Legend
Legend

Great tip, the RAD is discussed extensively in my Gateway Performance Optimization course.  The RAD was enhanced and also multi-threaded during version R80.40 which solved a lot of the old problems, but the amount of chatter between the gateway inspection code and the ThreatCloud just keeps on relentlessly increasing, and the RAD still gets in trouble occasionally.  Here is the most up-to-date content from my course:

[Images: rad1.png, rad2.png]

the_rock
Legend
Legend

I was waiting for a response from a TRUE LEGEND, aka @Timothy_Hall  : - )

Thank you!

Andy

0 Kudos
_Val_
Admin
Admin

Did you see sk110501 yet?

0 Kudos
GigaYang
Collaborator

We have also seen high CPU load caused by RAD Daemon in recent days, and it is confirmed to be related to Anti-Bot.

We found a large number of abnormal records in SmartLog (such as attachments), but we are currently unclear about the source of these connections. Any suggested analysis methods?

Thanks

0 Kudos
the_rock
Legend
Legend

I actually observed the same in an R82 lab as well after enabling the AB blade... mind you, I have not seen it since installing jumbo 14, so that's a good sign.

Andy

0 Kudos
GigaYang
Collaborator

Thanks for the assistance.

What I don't understand is that most of these records are related to AWS EC2 and Google GCP addresses, but when I captured packets with tcpdump, I couldn't find any relevant traffic, which leaves me unsure whether this is related to an attack. There are dozens of such records per minute.

I have checked some records through Virustotal, and most of them are fine.
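
One way to get a first picture of where such records come from is to export them and count which internal sources and which destination ranges appear most often. Here is a minimal sketch of that idea. It assumes a CSV export with columns named `src` and `dst` (hypothetical names, adjust them to the actual export), uses made-up sample rows, and uses documentation prefixes as placeholders where the published AWS and GCP ranges would go.

```python
import csv
import io
import ipaddress
from collections import Counter

# Made-up sample rows. In practice this would be a file exported from the logs.
# The column names "src" and "dst" are assumptions about the export format.
SAMPLE_CSV = """\
src,dst
10.1.1.20,198.51.100.7
10.1.1.20,198.51.100.9
10.1.1.35,203.0.113.50
10.1.1.20,192.0.2.14
"""

# Placeholder ranges (documentation prefixes), NOT real AWS or GCP ranges.
# Replace them with the ranges each provider publishes.
LABELLED_RANGES = {
    "provider-a (placeholder)": [ipaddress.ip_network("198.51.100.0/24")],
    "provider-b (placeholder)": [ipaddress.ip_network("203.0.113.0/24")],
}


def label_for(ip_text):
    """Return the label of the first range containing the address, else 'other'."""
    ip = ipaddress.ip_address(ip_text)
    for label, networks in LABELLED_RANGES.items():
        if any(ip in net for net in networks):
            return label
    return "other"


def summarise(csv_text):
    """Count records per source address and per destination range label."""
    by_source = Counter()
    by_label = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_source[row["src"]] += 1
        by_label[label_for(row["dst"])] += 1
    return by_source, by_label


by_source, by_label = summarise(SAMPLE_CSV)
print(by_source.most_common(3))
print(by_label.most_common())
```

If a handful of internal hosts account for most of the records, those are probably the ones to capture traffic on.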

0 Kudos
the_rock
Legend
Legend

I found in my lab that what I had to do to get around this was add a bunch of bypass rules, which is not really optimal, in my opinion.

Andy

0 Kudos
D_W
Advisor
(1)
the_rock
Legend
Legend

Excellent! Hey @GigaYang , worth doing that sk, seems 100% related.

Andy

0 Kudos
GigaYang
Collaborator

Thanks for the assistance.

I asked the company a few days ago for approval to apply this SK, hoping it will solve the problem.

0 Kudos
the_rock
Legend
Legend

I just checked my notes from last year (literally the day after the sk was written), when one customer had this problem and that sk also worked for them, so it's very likely it would fix it for you as well.

Andy

0 Kudos
GigaYang
Collaborator

After we made the changes according to the steps in sk182859, the CPU load recovered immediately. But we found that the firewall's memory usage rose to 70%, of which fwk0_dev_0 accounted for 50%. Has anyone encountered this situation before?

0 Kudos
the_rock
Legend
Legend

What do cpview and top show? Those would be best indicators, in my view.

Andy

0 Kudos
