Hello All,
For the last 2 days we have been facing a very strange issue every morning: commands are not getting executed on the management server, even though CPU and memory utilization are normal.
Rebooting the management server fixes the issue, but it comes back the next morning.
We collected a few outputs during the issue, as TAC suggested, and are attaching them here.
We have logged a ticket with Check Point TAC, but they have not been able to fix the issue either.
Kindly suggest any troubleshooting we can perform to fix this issue.
Which commands do not get executed? What is shown in the logs from the time of the issue?
Commands such as cpview, cpstat, cpinfo, and reboot are not getting executed.
[Expert@DSPMGMT:0]# tail -f /var/log/messages
Jan 24 08:58:29 2020 DSPMGMT PAM-tacplus[1819]: auth failed: 2
Jan 24 09:21:24 2020 DSPMGMT snmpd: Error: Timeout waiting for response from database server.
Jan 24 09:22:04 2020 DSPMGMT monitord[3873]: Error: Timeout waiting for response from database server.
Jan 24 09:22:24 2020 DSPMGMT snmpd: Error: Timeout waiting for response from database server.
Jan 24 09:38:01 2020 DSPMGMT PAM-tacplus[4844]: auth failed: 2
Jan 24 09:58:43 2020 DSPMGMT PAM-tacplus[6059]: auth failed: 2
Jan 24 10:49:33 2020 DSPMGMT PAM-tacplus[8861]: auth failed: 2
Jan 24 10:54:43 2020 DSPMGMT PAM-tacplus[9166]: auth failed: 2
Jan 24 10:54:49 2020 DSPMGMT PAM-tacplus[9166]: auth failed: 2
Jan 24 10:56:38 2020 DSPMGMT PAM-tacplus[9325]: auth failed: 2
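To pin down when the hang starts each day, it may help to pull only the database-timeout lines out of the log. Below is a minimal Python sketch, assuming the line layout seen in the excerpt above (timestamp, hostname, process name, message). The function names `find_db_timeouts`, `summarize`, and `summarize_file` are hypothetical helpers for illustration, not Check Point tools, and the script is best run against a copy of the log on another machine if the management server itself is unresponsive.

```python
import re
from collections import Counter

# Message text copied from the log excerpt in this thread.
TIMEOUT_TEXT = "Timeout waiting for response from database server"

# Assumed layout, based on the excerpt:
# "Jan 24 09:22:04 2020 DSPMGMT monitord[3873]: Error: Timeout waiting ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"(?P<host>\S+) (?P<proc>[^\s\[:]+)(?:\[\d+\])?: (?P<msg>.*)$"
)


def find_db_timeouts(lines):
    """Return (timestamp, process) pairs for every database-timeout line."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and TIMEOUT_TEXT in match.group("msg"):
            hits.append((match.group("stamp"), match.group("proc")))
    return hits


def summarize(lines):
    """Return the first timeout timestamp (or None) and a per-process count."""
    hits = find_db_timeouts(lines)
    first = hits[0][0] if hits else None
    return first, Counter(proc for _, proc in hits)


def summarize_file(path):
    """Run summarize() over a log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return summarize(handle)
```

Calling `summarize_file("/var/log/messages")` returns the timestamp of the first timeout and how many times each process reported it. Comparing that first timestamp across several mornings could show whether the problem lines up with a scheduled job or starts at a random time.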
Good afternoon. Was there a resolution to this? We are having identical problems with a Smart-1 5050, R80.30. The only difference is the power cords must be reseated. A warm reboot or shutdown -r does not help. Thank you for any info you can provide. I do have a case open with TAC.
Can't say I have ever seen that before... what did TAC say?
TAC is still working on it. Trying to duplicate the problem with our configuration.
Just curious, as I like to approach every problem logically. So, when you say this happened 2 days ago, is there anything you can think of that may have changed on the mgmt server 2 or 3 days ago? Can you check the audit logs to see if there is anything of interest from when this issue occurred? One thing that comes to mind is GuiDBedit, but unless someone inadvertently made changes there, I guess it might not be relevant. Just to be on the safe side, I would try doing "install database" on the server itself.
TAC has a valid idea... if they can import your config into their lab and try to fix it, they can provide a solution.
Thanks for your interest. I don't recall saying it happened two days ago though - it started about 12 days ago and is very intermittent. We're about 14 hours total into troubleshooting, reinstalling from R80.30 ISO (twice). Patch to latest hotfix, migrate export/import, etc, push policy, all is good. Wait x amount of minutes/hours/days, then same problem.
My gut says it's hardware sensor related - or maybe ILMI related because only reseating the power cables will bring it back to the point where the GAIA portal and the dashboard are useable again. But that's just my opinion. As soon as that database timeout message appears in /var/log/messages, that's it for the portal and dashboard.
Sorry, my apologies. I read the original post, which said "since last 2 days"... that's what I wanted to respond to, but I replied to you instead. Sorry about that. Though now that you've said all that, I agree 100% with your assessment... did you ask TAC for an RMA? I can't see what else they can ask you to do, except send a replacement.
I forgot to add I've had practically zero problems like this. For roughly 14 months it's been rock solid with regular operational rule changes, IPS, other blade updates, VPN stuff, regular hotfix updates, etc. No real negative work stopping events like this for a long time.
Well, an expensive machine like the Smart-1 5050 had better work way longer than 14 months 🙂
No worries. Agreed. Decision on RMA late tomorrow.