Justin_Hickey
Collaborator

Report Server won't boot, can't get into maintenance mode

I'm stuck in a catch-22 with my R80.10 VMware-based reporting server. When I boot it up, it gets stuck at a screen which says:

Checking filesystems FAILED
*** An error occurred during the file system check.

*** Please boot in the maintenance mode to repair the filesystem

When I reboot, I cannot get into maintenance mode, apparently because the boot menu timeout in /etc/grub.conf defaults to 0. I button-mash like crazy during boot, but no luck.

I'm stuck. Can I perhaps edit the OS files without having a successful boot?

Thanks,

Justin

Vladimir
Champion

You can try the "Easy way to mount and access vSphere VMDK files offline" guide

and use this gem I found on the CPUG forum:

In /boot/grub/menu.lst

change

timeout=0

to

timeout=4

Now you can hit Escape during the countdown and see the maintenance mode option.
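
A minimal sketch of that edit in Python, for anyone who wants to script it from a rescue environment. It assumes the file uses a GRUB-legacy style `timeout=N` (or `timeout N`) line. The function name `set_grub_timeout` and the sample text are made up for illustration and are not taken from a real Gaia config.

```python
import re

# Matches a GRUB-legacy style "timeout=0" or "timeout 0" line (assumed format).
_TIMEOUT_LINE = re.compile(
    r"^([ \t]*timeout)([ \t]*=[ \t]*|[ \t]+)\d+[ \t]*$", re.MULTILINE
)


def set_grub_timeout(config_text: str, seconds: int) -> str:
    """Return config_text with its first timeout line set to `seconds`."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    new_text, count = _TIMEOUT_LINE.subn(
        lambda m: f"{m.group(1)}{m.group(2)}{seconds}", config_text, count=1
    )
    if count == 0:
        # Refuse to guess where a timeout line should go.
        raise ValueError("no timeout line found; leaving the text alone")
    return new_text


# Illustrative input only, not a real Gaia boot config.
sample = "default=0\ntimeout=0\ntitle Example entry\n"
print(set_grub_timeout(sample, 4))
```

To apply it to a real file, read the config from wherever the appliance's disk is mounted, pass the text through the function, and write it back, keeping a copy of the original first.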

Justin_Hickey
Collaborator

I appreciate the response. The box won't boot, so having this value default to '0' seems like a bit of a design flaw; I might even go as far as to call it a bug. Now that I'm in a jam, I can't boot up to edit that file. '4' should be the default.

I'm left with trying to hack the VMDK, which is 2 TB, or scrapping it all, along with a whole lot of valuable logs.

Vladimir
Champion

Well, it may not be a bad idea to have a template with timeout=4 for future deployments.

As for hacking the VMDK, here is another way to get into it:

https://www.kjctech.net/mount-a-vmdk-image-file-in-windows/

But if you cannot get this VMDK moved or mapped to Windows, you may have to mount it on another, maybe temporary, Linux VM and try it from there.

PhoneBoy
Admin

I was going to suggest mounting the filesystem from another Linux VM.

Vladimir
Champion

Yep, that's an option. On a different topic: is there a way in the CheckMates portal to bookmark certain threads so it is easier to reference them later? Maybe even to attach personal tags to them?

PhoneBoy
Admin

See the "Actions" at the top of the post.

Bookmarking is an option.

Vladimir
Champion

OK. But I have trouble locating the way to retrieve them later. 

Where should I look?

Richard_Quick
Contributor

I have this issue with two different SmartEvent servers on VMware 6.5 running R80.10. I had to boot to a live CD and edit the grub file to boot the maintenance mode option and include the timeout.

I followed sk94671 before it was updated to exclude R80.10, and I think that was the issue. There is a known problem, discovered in the last few months, with LVM_Manager and R80.10 where it doesn't stop all the needed processes before expanding the disk. I believe the SK about this is internal only. I've been told the only solution is to get a DB and OS backup and rebuild the guest.

PhoneBoy
Admin

Which is why it's probably safe to run it in maintenance mode :)

Justin_Hickey
Collaborator

Hi Richard, thanks for the reply. So, I just boot with the install .iso? Would it be possible to add some more detail? I suppose I just instruct VMware to boot off the .iso, but do I have to do a new install over the old one, or is there some way I can drop into a CLI mode?

Richard_Quick
Contributor

I just booted off a live CD and then opened a terminal with admin permissions to edit the grub file.

I also ran fsck from the live CD against any of the R80.10 partitions that I could see.

Once I was able to boot into maintenance mode with R80.10, I ran fsck again.

Justin_Hickey
Collaborator

Sorry to belabor the point, but do you mean you are booting off the install DVD? I get to the point where it asks if I want to install Check Point Gaia R80.10. I worry about losing the config/OS. Did you reinstall, or did you perhaps have some other way to get to the CLI?

Thanks,

Justin

Justin_Hickey
Collaborator

So, I'm reading about Emergendisk. It talks about creating a physical USB disk. Is there no .iso to download for VMware?

Richard_Quick
Contributor

I used another Linux build like BSD live and mounted that ISO. You aren't going to use the Check Point ISOs to do this.

Don't install the BSD OS; just boot to the live CD and you should be able to get to a terminal within the Linux OS.

Justin_Hickey
Collaborator

So, we were finally able to get this box up and running again. I created a live CD from http://www.system-rescue-cd.org/, booted off the .iso in VMware, and mounted the drives. I edited /etc/grub.conf and set the maintenance mode timeout to 5. I then booted off the disk, and the boot options came up. The system detected disk issues and automatically ran fsck, which failed with the error "Unexpected Inconsistency, RUN fsck manually". I ran fsck -y manually and went home for the night. The next day it looked like it had finished; I rebooted and the system came up as it should.

What an ordeal. It didn't help that the log drive was 20 TB. I really hope Check Point fixes this issue so others using VMware don't run into the same problem. Many thanks for all the assistance.
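
For anyone retracing this, here is a rough dry-run sketch in Python of the live-CD sequence described in this thread: mount, edit the grub config, unmount, then fsck. It only prints the commands and runs nothing. The device names, the mount point, and the function name `rescue_plan` are placeholders I made up, not the real Gaia disk layout, and if the disk uses LVM the volumes would need activating first, which is not shown.

```python
import shlex


def rescue_plan(config_device, fsck_devices, mount_point="/mnt/rescue"):
    """Build (but do not run) the commands for a live-CD recovery."""
    steps = [
        ["mkdir", "-p", mount_point],
        ["mount", config_device, mount_point],
        # ... edit the grub config under mount_point here (timeout 0 -> 5) ...
        ["umount", mount_point],
    ]
    # fsck should only be run against filesystems that are not mounted.
    steps += [["fsck", "-y", dev] for dev in fsck_devices]
    return steps


# /dev/sdX1 and /dev/sdX2 are placeholders, not real device names.
for argv in rescue_plan("/dev/sdX1", ["/dev/sdX2"]):
    print(shlex.join(argv))
```

Printing the plan first makes it easy to check the device names against what the live CD actually shows before typing anything as root.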

Richard_Quick
Contributor

That's great. I'm glad you were able to get it running. If you have grub.conf changed to a 5-second timeout, then you can just enter maintenance mode without the live CD now. I've added that step to my build scripts for all my VMware devices.

RicPor
Explorer

Worked for us, too. Thanks.

Johan_Hillstrom
Contributor

I wish I had read this article BEFORE today...

Just got hit with this issue after some problems with our vm environment.

I second the opinion that this should be considered a bug, and that a default of 5 seconds should be implemented.
