Participant

Reboot with no explanation. Checkpoint 3200.

Hi, 

I recently ran into an issue at a customer site: an appliance rebooted without explanation.

The only thing the logs show is "Aug 20 00:07:18 2020 fw01 shutdown[1583]: shutting down for system halt".

Troubleshooting:

 

fw01# last -x |head |ta

runlevel (to lvl 0) 2.6.18-92cpx86_6 Thu Aug 20 00:07 - 00:09 (00:01)
reboot system boot 2.6.18-92cpx86_6 Thu Aug 20 00:09 (15:05)
runlevel (to lvl 3) 2.6.18-92cpx86_6 Thu Aug 20 00:09 - 15:14 (15:05)
admin pts/2 X.X.X.X Thu Aug 20 12:49 - 12:59 (00:10)
admin pts/2 X.X.X.X Thu Aug 20 13:07 - 13:17 (00:10)
admin pts/2 X.X.X.X Thu Aug 20 13:17 - 14:07 (00:49)
admin pts/2 X.X.X.X Thu Aug 20 14:35 still logged in
admin pts/3 X.X.X.X Thu Aug 20 14:41 - 14:51 (00:10)

 

 fw01# less /var/log/dmesg

NOTHING RELEVANT

 

 fw01# cat /var/log/messages.1

Aug 20 00:07:18 2020 fw01 shutdown[1583]: shutting down for system halt
Aug 20 00:07:20 2020 fw01 xinetd[5034]: Exiting...
Aug 20 00:07:20 2020 fw01 auditd[4412]: The audit daemon is exiting.
Aug 20 00:07:20 2020 fw01 kernel: audit(1597878440.910:616): audit_pid=0 old=4412 by auid=4294967295 subj=kernel
Aug 20 00:07:21 2020 fw01 kernel: Kernel logging (proc) stopped.
Aug 20 00:07:21 2020 fw01 kernel: Kernel log daemon terminating.
Aug 20 00:07:22 2020 fw01 exiting on signal 15
Aug 20 00:09:20 2020 fw01 syslogd 1.4.1: restart.
Aug 20 00:09:20 2020 fw01 syslogd: local sendto: Network is unreachable
Aug 20 00:09:20 2020 fw01 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Aug 20 00:09:20 2020 fw01 kernel: Linux version 2.6.18-92cpx86_64 (builder@Lnx30BccCmp5) (gcc version 4.1.1 20061011 (Red Hat 4.1.1-30)) #1 SMP Sun Sep 2 17:48:54 IDT 2018
Aug 20 00:09:20 2020 fw01 kernel: Command line: ro root=/dev/mapper/vg_splat-lv_current vmalloc=256M noht panic=15 console=ttyS0 crashkernel=128M@16M 3 quiet
Aug 20 00:09:20 2020 fw01 kernel: BIOS-provided physical RAM map:

--------------------------------------------------------------------------------------------

 

Where can I see what happened?

 
6 Replies
Champion

Check whether there are files with a corresponding date/time in:

32-bit: /var/crash/<date>/vmcore

64-bit: /var/log/crash/<date>/vmcore

/var/log/dump/usermode

and contact TAC for help.
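As a rough sketch of that check, the snippet below walks the three directories named above and lists any file whose modification time falls near the reboot. The one-hour window and the helper name `files_near` are assumptions made for illustration, and it presumes a Python 3 interpreter is available (otherwise copy the directories to a workstation and adjust `DUMP_DIRS`).

```python
import os
from datetime import datetime, timedelta

# Directories named in the reply above.
DUMP_DIRS = ["/var/crash", "/var/log/crash", "/var/log/dump/usermode"]

def files_near(dirs, when, window=timedelta(hours=1)):
    """Return sorted (path, mtime) pairs for files modified within `window` of `when`."""
    hits = []
    for top in dirs:
        # os.walk silently yields nothing for a missing or unreadable directory.
        for root, _subdirs, names in os.walk(top):
            for name in names:
                path = os.path.join(root, name)
                try:
                    mtime = datetime.fromtimestamp(os.path.getmtime(path))
                except OSError:
                    continue  # file vanished or cannot be read
                if abs(mtime - when) <= window:
                    hits.append((path, mtime))
    return sorted(hits)

if __name__ == "__main__":
    # Shutdown timestamp taken from /var/log/messages.1 in the original post.
    reboot = datetime(2020, 8, 20, 0, 7, 18)
    for path, mtime in files_near(DUMP_DIRS, reboot):
        print(mtime.isoformat(sep=" "), path)
```

If nothing is printed, no dump was written around that time, which is itself a hint that the kernel did not panic.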

Participant

Hi G_W_Albrecht,

Thank you for your help.

Nothing was reported for this date.

 

fw1# ls -lah /var/log/dump/usermode
total 42M
drwxr-xr-x 2 admin root 4.0K Dec 23 2019 .
drwxr-xr-x 3 admin root 4.0K Dec 21 2016 ..
-rw-r--r-- 1 admin root 11M Aug 26 2019 DAService.17940.core.gz
-rw-r--r-- 1 admin root 12M Sep 3 2019 DAService.5368.core.gz
-rw-r--r-- 1 admin root 1.4M Dec 23 2019 PostgreSQLCmd.5512.core.gz
-rw-r--r-- 1 admin root 19M Dec 23 2019 cpd.5231.core.gz

 

fw1# ls -lah /var/crash/
total 8.0K
drwxr-xr-x 2 admin root 4.0K Nov 24 2014 .
drwxr-xr-x 17 admin root 4.0K Jan 29 2018 ..


fw1# ls -lah /var/log/crash/
total 8.0K
drwxr-xr-x 2 admin root 4.0K Mar 16 2017 .
drwxr-xr-x 24 admin root 4.0K Aug 20 15:32 ..

 

Is there anything else I can check?

 


Have you checked the hardware sensors? Maybe the system is overheating.

Participant

Hi Grigorov,

Thank you for your help.

I don't have that information in my monitoring tool; I will add it now.

I checked the health history in Gaia, but for the last month it only shows the daily average, so if there was a peak it is hidden by the average.

In the sysenv output (attached) I see a maximum value of 87, but I don't know when it occurred.

Is there any way to check when this value occurred?
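To illustrate why a daily average can hide a short peak, here is a small worked example. The 50-degree baseline, the one-reading-per-minute rate, and the 10-minute duration of the spike are made-up values for illustration; only the peak of 87 comes from the sysenv output.

```python
# One reading per minute for 24 hours (1440 samples).
# Hypothetical data: a steady 50 with a single 10-minute excursion to 87.
samples = [50.0] * 1430 + [87.0] * 10

daily_average = sum(samples) / len(samples)
daily_maximum = max(samples)

print(f"average = {daily_average:.2f}, maximum = {daily_maximum:.0f}")
# prints: average = 50.26, maximum = 87
```

Under these assumptions the spike moves the daily average by about a quarter of a degree, so only a per-sample log with timestamps would show when it happened.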

 


I am sorry, I don't know of any built-in command to check this. I believe a CPU temperature of 87 degrees C is close to the maximum of 100, so it is worth investigating more thoroughly.

Admin

That looks more like someone typing reboot or shutdown at the CLI than an actual crash.
But, to be clear, it’s only a guess.
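One way to make that guess a little more systematic is sketched below. It scans a messages file and, for every boot, reports whether a `shutdown[...]` line was logged beforehand. Using the `kernel: Linux version` line as the boot marker and the function names are assumptions for illustration. The timestamp format is the one seen in the excerpt in the original post.

```python
from datetime import datetime

def parse_ts(line):
    """Parse the 'Aug 20 00:07:18 2020' prefix used in the log excerpt."""
    parts = line.split()
    return datetime.strptime(" ".join(parts[:4]), "%b %d %H:%M:%S %Y")

def classify_boots(lines):
    """Return (boot_time, verdict, gap) for each boot marker found in `lines`."""
    results = []
    last_shutdown = None  # time of the most recent shutdown[...] message
    for line in lines:
        if " shutdown[" in line:
            last_shutdown = parse_ts(line)
        elif "kernel: Linux version" in line:  # assumed boot marker
            boot = parse_ts(line)
            if last_shutdown is not None:
                results.append((boot, "shutdown logged", boot - last_shutdown))
            else:
                results.append((boot, "no shutdown logged", None))
            last_shutdown = None
    return results

# Lines taken from the post above (the kernel line is shortened).
SAMPLE = [
    "Aug 20 00:07:18 2020 fw01 shutdown[1583]: shutting down for system halt",
    "Aug 20 00:07:22 2020 fw01 exiting on signal 15",
    "Aug 20 00:09:20 2020 fw01 syslogd 1.4.1: restart.",
    "Aug 20 00:09:20 2020 fw01 kernel: Linux version 2.6.18-92cpx86_64",
]

if __name__ == "__main__":
    for boot, verdict, gap in classify_boots(SAMPLE):
        print(boot, verdict, gap)
```

For the excerpt in this thread it reports a logged shutdown 2 minutes and 2 seconds before the boot. That only shows the shutdown command was invoked; it does not say whether an administrator, a script, or some other process called it.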
