R81.10 Take 45 - RAD errors

Alex- · ‎2022-04-13

I've got a cluster of regular appliances (high-end 5000 series), freshly installed with R81.10 a few weeks back, that are suddenly reporting they can't connect to CP services anymore. All TP blades are red with a message that they can't contact the CP cloud for updates.

Now, traffic through the appliances is apparently OK. CPUSE also stopped working (can't connect to Check Point cloud).

The interesting bit is that it happened all of a sudden on a configuration that has been active and quite static for a long time.

The logs are full of RAD errors like RAD timeout or maximum RAD connections reached, stopping handling RAD requests.

Failover and reboot didn't help, and neither did raising the limit above the default of 1000 flows. With RAD stats activated, CPVIEW shows a lot of expired/missed RAD requests against a small number of successful ones.

The FWs have internal DNS servers defined and can quickly resolve public domains. They can also ping public IPs from themselves (for instance ping www.google.com).

curl_cli, however, takes ages and eventually times out. Turning SecureXL off doesn't help, and neither does removing and re-adding the TP blades.

FW has ample CPU and memory space.

Hostnames like cws.checkpoint.com resolve, and traffic to them is not dropped by the FW itself. The RAD daemon has been restarted.
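The pattern described here (DNS resolves, ping works, but curl_cli hangs) suggests separating the name lookup from the TCP connection when testing. Below is a minimal Python sketch of that idea, meant to be run from a host behind the same uplink rather than on the gateway itself. The function name `probe` and its parameters are hypothetical and not part of any Check Point tooling; port 443 is an assumption.

```python
import socket
import time


def probe(host, port=443, timeout=5.0,
          resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Report which stage fails when reaching host: 'dns', 'tcp', or None.

    resolve and connect are injectable so the logic can be exercised
    without touching the network.
    """
    start = time.monotonic()

    # Stage 1: name resolution only.
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except OSError:  # socket.gaierror is a subclass of OSError
        return {"failed": "dns", "address": None,
                "elapsed": time.monotonic() - start}

    # Take the first resolved address; keep only (ip, port).
    address = infos[0][4][:2]

    # Stage 2: TCP handshake with an explicit timeout.
    try:
        conn = connect(address, timeout=timeout)
    except OSError:  # covers timeouts, resets and unreachable errors
        return {"failed": "tcp", "address": address,
                "elapsed": time.monotonic() - start}

    conn.close()
    return {"failed": None, "address": address,
            "elapsed": time.monotonic() - start}
```

Calling `probe("cws.checkpoint.com")` returns a small dict. A result of `"tcp"` with an elapsed time close to the timeout would match the symptom above: the name resolves and ICMP gets through, but TCP sessions stall somewhere upstream.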

Contracts have been updated. Licenses are valid. In short I looked up quite a few things about RAD to no avail.

I have a TAC case which has yet to gain traction. In the meantime, any advice is welcome, as most RAD SKs indicate the issues are either fixed in hotfixes or apply to older versions.

Timothy_Hall · ‎2022-04-13

Strange that it seems to have started happening for no reason.

Any changes at your ISP? Strange ISP outages? Throttling? DoS attacks?

Are you located in a country that may be the target of some kind of upstream ISP blocking due to the current geopolitical situation in Europe?

New Book: "Max Power 2026" Coming Soon
Check Point Firewall Performance Optimization

Alex- · ‎2022-04-21

It turned out that the ISP, after repeatedly claiming everything was fine on their side, had enabled some sort of DDoS protection on part of their backbone and mistakenly included IP ranges assigned to end customers. That caused transient connectivity issues that RAD didn't like.
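Transient issues like this are easy to miss with a single test, since any one attempt may succeed. A rough way to surface them is to repeat the same check many times and look at the failure rate and the longest run of consecutive failures. The sketch below assumes a caller-supplied `check` function that returns True on success; the name `sample` and its parameters are hypothetical.

```python
import time


def sample(check, attempts=20, pause=1.0, sleep=time.sleep):
    """Run check() repeatedly and summarise how often it fails.

    check is any zero-argument callable returning a truthy value on
    success. sleep is injectable so the loop can be tested without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    results = []
    for i in range(attempts):
        results.append(bool(check()))
        if i < attempts - 1:
            sleep(pause)  # space the attempts out over time

    # Longest streak of consecutive failures, in attempts.
    longest = run = 0
    for ok in results:
        run = 0 if ok else run + 1
        longest = max(longest, run)

    failures = results.count(False)
    return {
        "attempts": attempts,
        "failures": failures,
        "failure_rate": failures / attempts,
        "longest_outage": longest,
    }
```

The `check` callable could wrap a TCP connect with a short timeout to one of the update servers. A non-zero failure rate on a path where ping and DNS look healthy would be something concrete to hand to the ISP.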

Timothy_Hall · ‎2022-04-21

Thanks for the follow-up; that must have been difficult to figure out.


the_rock · ‎2022-04-13

Just an idea, Alex... can you try changing DNS to Google's (8.8.8.8) and see if the issue is still there?

Andy

