High cpu utilization on RAD Process

Netadmin2020 · ‎2021-01-20

Hello!

We just updated to R80.40 with the latest hotfix (appliance: 15600).

Using the top command, I see a high CPU percentage on the rad process.

I checked and found that it is related to the URL cache.

We first hit the limit of 20000 (I checked it with the command below):

fw tab -t urlf_cache_tbl -s
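
For anyone who wants to track this over time, here is a minimal Python sketch that pulls the current and peak entry counts out of the summary output of the command above. The column layout (HOST, NAME, ID, #VALS, #PEAK, #SLINKS) is an assumption about what the summary prints, the function name is hypothetical, and the sample text, including its numbers, is invented for illustration and not taken from this gateway.

```python
def parse_fw_tab_summary(text, table="urlf_cache_tbl"):
    """Return (current_entries, peak_entries) for `table`, or None if absent.

    Assumes whitespace-separated columns: HOST NAME ID #VALS #PEAK #SLINKS.
    """
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any line that is not the requested table.
        if len(fields) >= 5 and fields[1] == table:
            return int(fields[3]), int(fields[4])
    return None


# Invented sample output, for illustration only.
sample = """\
HOST                  NAME                               ID #VALS #PEAK #SLINKS
localhost             urlf_cache_tbl                     29 19874 20000       0
"""

limit = 20000  # configured cache size (the default mentioned in this thread)
vals, peak = parse_fw_tab_summary(sample)
print(f"{vals} of {limit} entries in use, peak {peak}")
if peak >= limit:
    print("The table has reached its limit at least once.")
```

A peak equal to the limit only says the table filled up at some point; how often that happens is what matters.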

After upgrading to R80.40 we routed more users through the Check Point gateway (all with URL Filtering enabled).

Yesterday I changed the limits of the rad service with GuiDBedit.

I first raised the cache max hash size from 20000 (the default) to 40000.

Today I saw that we had reached the 40000 limit, so I set it to 70000.

These are the values we are hitting right now: (screenshot not included)

What do you suggest in this situation?

Timothy_Hall · ‎2021-01-20

How many internal users do you have? The table default of 20,000 entries assumes about 1,000 users with a "normal" level of web browsing activity.

Also check your policy and make sure APCL/URLF is not getting matched against inbound web requests heading to a DMZ web server farm or something.

New Book: "Max Power 2026" Coming Soon
Check Point Firewall Performance Optimization

Netadmin2020 · ‎2021-01-20

I will eventually have about 5000 users. Does this mean that I should change the limit to 100000?

Timothy_Hall · ‎2021-01-21

Occasionally having the cache fill up and get cleared is fine, it is when it happens constantly that it is a problem. If you have it set to 70000 and are now hitting 55000 without it constantly hitting the limit and getting cleared, I'd leave it at 70000 for now.
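
The arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration: the linear scaling from the "20,000 entries for about 1,000 users" rule of thumb is an assumption, not official sizing guidance, and the function names are hypothetical.

```python
DEFAULT_ENTRIES = 20_000  # default table size mentioned above
DEFAULT_USERS = 1_000     # user count that default is said to assume


def scaled_cache_size(users):
    """Scale the default table size linearly with the user count (assumption)."""
    return DEFAULT_ENTRIES * users // DEFAULT_USERS


def utilization(current_entries, limit):
    """Fraction of the configured limit currently in use."""
    return current_entries / limit


print(scaled_cache_size(5000))                # 100000
print(round(utilization(55000, 70000), 2))    # 0.79
```

With the table at roughly 79% of a 70000 limit and not repeatedly filling up, staying below the linearly scaled figure is consistent with the advice above.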

Netadmin2020 · ‎2021-01-21

Thank you. Is it normal for the rad process to go up to 250% without staying there permanently?

Timothy_Hall · ‎2021-01-21

A CPU spike in rad every now and then is fine.

Netadmin2020 · ‎2021-01-21

This morning: (screenshot not included)

Chris_Atkinson · ‎2021-01-21

As this is multi-threaded, you will see more RAD processes and CPU can exceed 100%. This is a normal behavior.

Source: sk163793

CCSM R77/R80/ELITE

Martin_Seeger · ‎2021-03-07

Yes, but RAD CPU usage topping the CPU usage of all virtual firewalls on a VSX?

the_rock · ‎2021-03-07

Doesn't sound like normal behavior to me, sorry. Personally, I had never seen anything like this in pre-R80 versions.

Best,
Andy
"Have a great day and if its not, change it"

Martin_Seeger · ‎2021-03-07

This seems absurdly high to me. I'm still looking into the source of the RAD load.

My impression:

a) The Malware blade generates a lot of queries to RAD.

b) The cache is too small, which causes RAD to query externally too often.

But that is only my current impression... My gut feeling has been wrong before.

Chris_Atkinson · ‎2021-03-07

The operation of the process was changed in JHFs applied over versions of R80.20 and above. That's not to say there isn't an issue warranting investigation however.

Martin_Seeger · ‎2021-03-07

Yes, our guess is that the problem started after applying JHF 219. But we are not sure, as we were slow to notice the problem. I need to rewrite our monitoring. Nobody was watching the RAD CPU usage ;-).
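
As a starting point for that kind of monitoring, here is a minimal Python sketch that sums the CPU percentage of rad from batch-mode top output and flags only sustained load, since occasional spikes were described as fine earlier in the thread. The process name `rad`, the standard top column header, the threshold, and the sample values are all assumptions for illustration; the function names are hypothetical.

```python
def rad_cpu_percent(top_output, command="rad"):
    """Sum %CPU over all processes named `command` in `top -b -n 1` output."""
    cpu_col = cmd_col = None
    total = 0.0
    for line in top_output.splitlines():
        fields = line.split()
        # Locate the column positions from the header line.
        if "%CPU" in fields and "COMMAND" in fields:
            cpu_col, cmd_col = fields.index("%CPU"), fields.index("COMMAND")
            continue
        if cpu_col is None or len(fields) <= cmd_col:
            continue
        if fields[cmd_col] == command:
            total += float(fields[cpu_col])
    return total


def sustained(samples, threshold, min_consecutive):
    """True if `min_consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Invented sample output, for illustration only.
sample = """\
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
12345 admin     20   0  123456  12345   1234 S 250.0  1.2  10:00.00 rad
23456 admin     20   0  654321  54321   4321 S  30.0  2.5  55:10.12 fwk
"""

print(rad_cpu_percent(sample))
print(sustained([250, 90, 260, 270, 280], threshold=200, min_consecutive=3))
```

Feeding `sustained` one reading per polling interval separates a brief spike from load that stays high across several intervals.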

Martin_Seeger · ‎2021-03-07

Do you know how R80.30 differs in regard to the RAD cache from R80.40?

With R80.30 the gateway (VSX) does not yield any result with the "fw tab -t urlf_cache_tbl -s" command: Failed to get table status for urlf_cache_tbl.

Since the update to the latest JHF (Take 219), the RAD CPU usage is through the roof. It often tops the combined CPU usage of all the virtual firewalls. The CPU usage alone would not worry us, but we see in tcpdumps that the firewall is sometimes "keeping" HTTP connections "on hold" for 5 seconds.

With some optimisations in the policy (exceptions in the Threat Prevention policy) the RAD load has been reduced a bit and the 5s delays now occur less often.

My impression is that the cache is too small. I asked support how to increase the cache, but I have not received a reply in 5 days.

Do DNS queries somehow end up in the RAD? RAD CPU usage seems to be the highest when we see a lot of DNS traffic.

Thanks, Martin

Martin_Seeger · ‎2021-03-07

The RAD statistics in cpview seem broken (screenshot not included).

If I do a "rad_admin stats print malware", I get the same statistics in the CSV file.

Chris_Atkinson · ‎2021-03-07

The procedure for increasing the URLF cache is documented in sk90422.

the_rock · ‎2021-03-07

With all due respect, it would not be the first time (and I'm sure it won't be the last either) that a CP SK is wrong.

Martin_Seeger · ‎2021-03-07

Thanks, I have been looking for such an SK for quite some time now. This is really helpful.

According to what I see, I rather need to increase the cache for malware instead of urlf, but that is easily adapted.

the_rock · ‎2021-03-07

What value exactly did you change in guidbedit?

Netadmin2020 · ‎2021-03-07

In my case I adjusted the URL cache to 120000. I still have rad spikes, but no issues.

the_rock · ‎2021-03-07

Okay, that's fair... BUT, personally (and I can't really speak for anyone else), to me that is really more masking an issue, unless it truly addressed the problem you had. I have always found that TAC may ask customers to increase certain values in GuiDBedit without truly understanding how to fix the issue more permanently.
