We have a ClusterXL cluster running Check Point R77.10. The primary member is up and running, but I am not able to SSH to it. It prompts me for the username and password, and when I enter the password I get the message below:
CLINFR0139 Couldn't open the tmp_env_vars: No such file or directory
In SmartView Monitor it shows as disconnected. The secondary is reachable via SSH and shows as OK in SmartView. I am new to Check Point and don't have a support contract. Can someone help?
It might be an HDD issue (out of space or corruption). Considering you are running an unsupported version and do not have a maintenance contract, your best option is to back up the machine to an external location and try rebooting.
If it does not come up after reboot, reinstall and restore from backup.
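To check the out-of-space theory on a member you can still log in to (the secondary here), the usual tool is `df -h` in expert mode. The same check can be sketched in Python, assuming a Python 3 interpreter is available on the host, which may not be the case on a Gaia R77.10 gateway. The function names, the 90% threshold and the mount points below are illustrative assumptions, not anything Check Point ships.

```python
import os
import shutil


def usage_percent(path):
    """Return the percentage of used space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Pseudo filesystems can report a total of zero; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def nearly_full(paths, threshold=90.0):
    """Return (path, percent) pairs for filesystems at or above `threshold`.

    Paths that do not exist on this host are skipped silently.
    """
    results = []
    for path in paths:
        if not os.path.exists(path):
            continue
        percent = usage_percent(path)
        if percent >= threshold:
            results.append((path, percent))
    return results


if __name__ == "__main__":
    # Hypothetical mount points; adjust to the partitions on your system.
    for path, percent in nearly_full(["/", "/var/log"]):
        print("%s is %.1f%% full" % (path, percent))
```

If this prints nothing, none of the listed filesystems is above the threshold, which would point more towards corruption or a failing drive than a full disk.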
Thanks for your reply, Valeri. That's what I am planning to do. One question: I read that running the fsck command in maintenance mode could sort out the issue. Any thoughts?
Yes, this is one more option, provided your drive has enough space and the damage is not physical.
You can do a backup to an external location from the WebUI. Is it still working?
No, that's not available. But as I mentioned, the secondary is reachable, and when I reboot the primary the traffic should fail over to the secondary. Configuration-wise, I can back up the secondary, as they should share the same configuration.
As expected, the primary did not come back up. It is failing at the filesystem check and asking me to reboot in maintenance mode to run fsck. The issue is that I can't get to maintenance mode; it only gives me the option to boot from Gaia or the boot manager. Can you please advise?
You are out of disk space or memory on that primary member, or its hard drive has failed, as Val said. Since the primary firewall is showing as disconnected, the cpd daemon is dead or impaired, and the fact that sshd can't get any resources for your incoming connection confirms that process/user space has insufficient resources to do anything. However, most traffic inspection operations take place in the kernel, which preallocates all its needed memory in RAM and has no dependence on the hard drive once successfully booted up. If the hard drive has indeed failed, be prepared for the primary to not boot back up when you power cycle it. Make sure your secondary member is ready to take over, and also anticipate that you'll need to do a restore on the primary after replacing the hard drive.
Thanks, Tim, for your input. I am planning to fail the traffic over to the secondary by increasing its priority via SmartDashboard prior to rebooting the primary. I will update the thread after the reboot. Fingers crossed it's not a hard drive failure.
Just to let you know, guys: increasing the standby member's priority in SmartDashboard and pushing the policy failed to make the standby active. I will go ahead and physically reboot the primary to force a failover; this is scheduled for tonight. Will keep you posted.
I have rebooted the primary. The traffic has failed over to the secondary, but as you both expected, the primary did not come back up. It is failing at the filesystem check, and I get the message below:
***please boot in the maintenance mode to repair the filesystems.
I rebooted the firewall and I can see "press any key to see the boot menu". After the reboot, it goes first to the memory check, and then I have the choice to boot from Gaia or the boot manager. Can someone help with how to boot into maintenance mode?