Short communications disruptions

We are experiencing short (30 sec to 2 min) communications disruptions during which all connectivity is lost and the active cluster member stops responding (while the standby member still does).

Looking through /var/log/messages, we can see a recurring pattern. Every time, it starts with something like:

Starting CUL mode because CPU-02 usage (81%) on the local member increased above the configured threshold (80%).

Then multiple log entries like:

cerbero1 kernel: [fw4_1];[censored_public_ip:44288 -> censored_public_ip:53] [ERROR]: malware_res_rep_rad_query: rad_kernel_api_async_get_resource() failed with error: Service is down

And then:

cerbero1 kernel: [fw4_1];CLUS-120202-1: Stopping CUL mode after 80 sec (short CUL timeout), because no member reported CPU usage above the configured threshold (80%) during the last 10 sec.
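
To show how often this sequence occurs, a small script along these lines could group the log into CUL episodes. It is only a sketch: the regular expressions are derived from the three log lines quoted above, the function name summarize_cul_events is made up for illustration, and other versions may word these messages differently.

```python
import re

# Patterns are based only on the log lines quoted in this post
# (assumption: the message wording is the same on every occurrence).
CUL_START = re.compile(
    r"Starting CUL mode because (?P<cpu>CPU-\d+) usage \((?P<usage>\d+)%\)"
    r".*threshold \((?P<threshold>\d+)%\)"
)
CUL_STOP = re.compile(r"Stopping CUL mode after (?P<seconds>\d+) sec")
RAD_ERROR = re.compile(r"rad_kernel_api_async_get_resource\(\) failed")


def summarize_cul_events(lines):
    """Group log lines into CUL episodes (hypothetical helper).

    Each episode records which CPU crossed the threshold, the reported
    usage, how many RAD errors were logged while CUL mode was active,
    and the duration reported when CUL mode stopped.
    """
    episodes = []
    current = None  # episode that is currently open, if any
    for line in lines:
        start = CUL_START.search(line)
        if start:
            current = {
                "cpu": start["cpu"],
                "usage": int(start["usage"]),
                "threshold": int(start["threshold"]),
                "rad_errors": 0,
                "duration_sec": None,  # stays None if no stop line is seen
            }
            episodes.append(current)
            continue
        if current is None:
            continue  # ignore lines outside a CUL episode
        if RAD_ERROR.search(line):
            current["rad_errors"] += 1
            continue
        stop = CUL_STOP.search(line)
        if stop:
            current["duration_sec"] = int(stop["seconds"])
            current = None
    return episodes
```

It can be fed an open file, for example `summarize_cul_events(open("/var/log/messages", errors="replace"))`, and the resulting list printed one episode per line to compare the CPU number and error count across outages.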

What could be causing this problem?

Thanks in advance