Reboot no explanation. Checkpoint 3200.

Antonio_Madeira · ‎2020-08-20

Hi,

I recently had an issue at a customer site where an appliance rebooted without explanation.

The only thing showing is "Aug 20 00:07:18 2020 fw01 shutdown[1583]: shutting down for system halt".

Troubleshooting:

fw01# last -x |head |ta

runlevel (to lvl 0) 2.6.18-92cpx86_6 Thu Aug 20 00:07 - 00:09 (00:01)
reboot system boot 2.6.18-92cpx86_6 Thu Aug 20 00:09 (15:05)
runlevel (to lvl 3) 2.6.18-92cpx86_6 Thu Aug 20 00:09 - 15:14 (15:05)
admin pts/2 X.X.X.X Thu Aug 20 12:49 - 12:59 (00:10)
admin pts/2 X.X.X.X Thu Aug 20 13:07 - 13:17 (00:10)
admin pts/2 X.X.X.X Thu Aug 20 13:17 - 14:07 (00:49)
admin pts/2 X.X.X.X Thu Aug 20 14:35 still logged in
admin pts/3 X.X.X.X Thu Aug 20 14:41 - 14:51 (00:10)

fw01# less /var/log/dmesg

NOTHING RELEVANT

fw01# cat /var/log/messages.1

Aug 20 00:07:18 2020 fw01 shutdown[1583]: shutting down for system halt
Aug 20 00:07:20 2020 fw01 xinetd[5034]: Exiting...
Aug 20 00:07:20 2020 fw01 auditd[4412]: The audit daemon is exiting.
Aug 20 00:07:20 2020 fw01 kernel: audit(1597878440.910:616): audit_pid=0 old=4412 by auid=4294967295 subj=kernel
Aug 20 00:07:21 2020 fw01 kernel: Kernel logging (proc) stopped.
Aug 20 00:07:21 2020 fw01 kernel: Kernel log daemon terminating.
Aug 20 00:07:22 2020 fw01 exiting on signal 15
Aug 20 00:09:20 2020 fw01 syslogd 1.4.1: restart.
Aug 20 00:09:20 2020 fw01 syslogd: local sendto: Network is unreachable
Aug 20 00:09:20 2020 fw01 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Aug 20 00:09:20 2020 fw01 kernel: Linux version 2.6.18-92cpx86_64 (builder@Lnx30BccCmp5) (gcc version 4.1.1 20061011 (Red Hat 4.1.1-30)) #1 SMP Sun Sep 2 17:48:54 IDT 2018
Aug 20 00:09:20 2020 fw01 kernel: Command line: ro root=/dev/mapper/vg_splat-lv_current vmalloc=256M noht panic=15 console=ttyS0 crashkernel=128M@16M 3 quiet
Aug 20 00:09:20 2020 fw01 kernel: BIOS-provided physical RAM map:

--------------------------------------------------------------------------------------------

Where can I see what happened?

G_W_Albrecht · ‎2020-08-20

Check whether you see files with a corresponding date/time in:

32-bit: /var/crash/<date>/vmcore

64-bit: /var/log/crash/<date>/vmcore

/var/log/dump/usermode

and contact TAC for help.
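If it helps, here is a rough sketch for checking those directories in one pass. It assumes a Python 3 interpreter is available (on the appliance or on a host where the directories are mounted or copied); the function name `files_near` and the 30-minute window are my own choices, not anything Check Point provides.

```python
import os
from datetime import datetime, timedelta

# Directories named in this thread; adjust for your system.
CRASH_DIRS = ["/var/crash", "/var/log/crash", "/var/log/dump/usermode"]


def files_near(event_time, dirs=CRASH_DIRS, window=timedelta(minutes=30)):
    """Return (path, mtime) pairs for files modified within `window` of event_time."""
    hits = []
    for top in dirs:
        # os.walk simply yields nothing if the directory does not exist.
        for root, _subdirs, names in os.walk(top):
            for name in names:
                path = os.path.join(root, name)
                try:
                    mtime = datetime.fromtimestamp(os.path.getmtime(path))
                except OSError:
                    continue  # file vanished or is unreadable
                if abs(mtime - event_time) <= window:
                    hits.append((path, mtime))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Time of the shutdown message quoted in the first post (local time).
    reboot = datetime(2020, 8, 20, 0, 7, 18)
    for path, mtime in files_near(reboot):
        print(mtime.isoformat(sep=" "), path)
```

An empty result only means no dump files were written around that time; it does not rule out a hardware or power problem.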

CCSP - CCSE / CCTE / CTPS / CCME / CCSM Elite / SMB Specialist

Antonio_Madeira · ‎2020-08-20

Hi G_W_Albrecht,

Thank you for your help.

Nothing is reported for this date.

fw1# ls -lah /var/log/dump/usermode
total 42M
drwxr-xr-x 2 admin root 4.0K Dec 23 2019 .
drwxr-xr-x 3 admin root 4.0K Dec 21 2016 ..
-rw-r--r-- 1 admin root 11M Aug 26 2019 DAService.17940.core.gz
-rw-r--r-- 1 admin root 12M Sep 3 2019 DAService.5368.core.gz
-rw-r--r-- 1 admin root 1.4M Dec 23 2019 PostgreSQLCmd.5512.core.gz
-rw-r--r-- 1 admin root 19M Dec 23 2019 cpd.5231.core.gz

fw1# ls -lah /var/crash/
total 8.0K
drwxr-xr-x 2 admin root 4.0K Nov 24 2014 .
drwxr-xr-x 17 admin root 4.0K Jan 29 2018 ..

fw1# ls -lah /var/log/crash/
total 8.0K
drwxr-xr-x 2 admin root 4.0K Mar 16 2017 .
drwxr-xr-x 24 admin root 4.0K Aug 20 15:32 ..

Is there anything more I can check?

HristoGrigorov · ‎2020-08-20

Have you checked the hardware sensors? Maybe the system is overheating.

Antonio_Madeira · ‎2020-08-21

Hi Grigorov,

Thank you for your help.

I don't have that information in my monitoring tool; I will add it now.

I checked the health history in Gaia, but it only shows the daily average for the last month, so if there was a peak, it was hidden by the daily average.

In the sysenv output (attached) I see a maximum value of 87, but I don't know when it occurred.

Is there any way to check when this value occurred?

HristoGrigorov · ‎2020-08-21

I am sorry, I don't really know any built-in command to check this. I believe CPU temp of 87 degrees C is close to maximum of 100 so it is worth investigating it more thoroughly.
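Once the sensor is in your monitoring tool, keeping the raw timestamped samples rather than only a daily average is what lets you see when a peak happened. A minimal sketch of the idea, assuming the samples are already collected as (time, temperature) pairs; `summarize` is a hypothetical helper and the readings below are made up for illustration, apart from the 87 taken from your sysenv output:

```python
from datetime import datetime, timedelta
from statistics import mean


def summarize(samples):
    """samples: list of (datetime, temperature_celsius) pairs.

    Returns (average, peak_value, peak_time).
    """
    if not samples:
        raise ValueError("no samples")
    peak_time, peak_value = max(samples, key=lambda s: s[1])
    average = mean(value for _time, value in samples)
    return average, peak_value, peak_time


if __name__ == "__main__":
    # Illustrative data only: one reading per hour, 55 C all day
    # except a single 87 C reading at 00:00.
    start = datetime(2020, 8, 20, 0, 0)
    samples = [(start + timedelta(hours=h), 55) for h in range(24)]
    samples[0] = (start, 87)
    average, peak, when = summarize(samples)
    print(f"average {average:.1f} C, peak {peak} C at {when}")
```

With these made-up numbers the daily average stays near 56 even though one sample reached 87, which is the masking effect described above.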

PhoneBoy · ‎2020-08-24

That looks like someone typing reboot or shutdown at the CLI versus an actual crash.
But, to be clear, it’s only a guess.
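For anyone who wants to check this across several rotated messages files, here is a small sketch. It only looks at the two line patterns visible in the log posted above: the `shutdown[...]` message and the `kernel: Linux version` line logged at boot. The "reboot" variant of the shutdown message and the helper name `classify_boots` are my assumptions; treat the output as a hint, not proof.

```python
import re

# Patterns based on the messages.1 excerpt in the first post.
# The "reboot" alternative is an assumption; only "halt" appears in the thread.
SHUTDOWN_RE = re.compile(r"shutdown\[\d+\]: shutting down for system (halt|reboot)")
BOOT_RE = re.compile(r"kernel: Linux version")


def classify_boots(lines):
    """For each boot line, report whether a shutdown message was logged before it."""
    results = []
    saw_shutdown = False
    for line in lines:
        if SHUTDOWN_RE.search(line):
            saw_shutdown = True
        elif BOOT_RE.search(line):
            # Timestamp is the first four fields, e.g. "Aug 20 00:09:20 2020".
            stamp = " ".join(line.split()[:4])
            verdict = "clean shutdown logged" if saw_shutdown else "no shutdown message"
            results.append((stamp, verdict))
            saw_shutdown = False
    return results


if __name__ == "__main__":
    with open("/var/log/messages.1", errors="replace") as fh:
        for stamp, verdict in classify_boots(fh):
            print(stamp, verdict)
```

A boot with "no shutdown message" before it points more toward a crash, power loss, or watchdog reset. A "clean shutdown logged" result, as in the original post, says the shutdown command ran but not who or what started it.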

JanH · ‎2021-08-23

Hello, we see very similar behavior on one of our gateways. An unexpected reboot occurred on:

reboot system boot 2.6.18-92cpx86_6 Sat Aug 14 00:10 (2+17:05)
reboot system boot 2.6.18-92cpx86_6 Sat Aug 14 00:08 (2+17:07)

Unfortunately, no more logs were collected.
