Debugging freeze, overload and performance issues on SMB devices

Document created by Günther W. Albrecht on Feb 27, 2018. Last modified by Günther W. Albrecht on Jun 13, 2018.
Version 2

When a customer complains about performance issues on an SMB unit and a reboot does not resolve them, TAC usually has to be involved. I hate to come empty-handed, so if possible I first perform the following debug procedure to collect everything TAC needs before opening an SR#:


1. Before testing or debugging, run # configload_status and make sure that all configurations have finished loading, there are no errors and all blades have loaded!


2. Connect to the device via SSH or serial console.


3. Set the terminal program (e.g. PuTTY) to capture all session output:
- open PuTTY > Session > Logging > Session logging
- select "All session output"


4. Boot the device in debug mode (option 2 in the boot menu).


5. Wait for the issue to occur again, then collect:
- from CLISH: # show diag (or from expert mode: # echo show diag | clish > diag.txt)
- collect cpinfo (# cpinfo -z -o <filename>)
- if there is a core file, collect it too (check with # ls -l /logs in expert mode)
- collect output from # dmesg
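
The collection in step 5 can be wrapped in a small helper so that every output lands in one directory. This is only a sketch for expert mode: the helper name capture and the OUT_DIR location are hypothetical, and it assumes the target directory has enough free space.

```shell
# Sketch only: capture and OUT_DIR are hypothetical names, not Check Point tools.
# OUT_DIR is an assumed location; pick one with enough free space on the device.
OUT_DIR="${OUT_DIR:-/tmp/tac_debug}"

# Run a command and save its stdout and stderr to a file in $OUT_DIR.
# $1 = output file name, remaining arguments = the command to run.
capture() {
    capture_name="$1"
    shift
    mkdir -p "$OUT_DIR" || return 1
    "$@" > "$OUT_DIR/$capture_name" 2>&1
}

# Example calls for the items in step 5 (run from expert mode):
#   capture diag.txt    sh -c 'echo show diag | clish'
#   capture dmesg.txt   dmesg
#   capture logs_ls.txt ls -l /logs
# cpinfo writes its own output file, so it is run directly:
#   cpinfo -z -o <filename>
```

The helper returns the exit status of the command it ran, and error messages end up in the same file as the regular output, so a failed command still leaves a trace for TAC.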


6. Before rebooting or restarting sfwd:
- Run "ps aux | grep sfwd" to get the sfwd PID
- Get the output of "# ls -l /proc/<PID>/fd/", replacing <PID> with the sfwd PID
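
Step 6 can be sketched as two small shell functions. The function names are hypothetical, and the sketch assumes that the ps output starts with a header line containing a PID column; the first matching line is taken, so check the result by hand if several processes mention sfwd.

```shell
# Sketch only: pid_from_ps and list_fds are hypothetical helper names.

# Read ps output on stdin and print the PID of the first line that
# mentions the given process name (default: sfwd), skipping the grep itself.
# The PID column is located from the header line, so both
# "USER PID ..." and "PID USER ..." layouts are handled.
pid_from_ps() {
    PS_NAME="${1:-sfwd}" awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "PID") col = i; next }
        col && index($0, ENVIRON["PS_NAME"]) && !index($0, "grep") { print $col; exit }
    '
}

# List the open file descriptors of the given PID.
list_fds() {
    ls -l "/proc/$1/fd/"
}

# Example (expert mode), before rebooting or restarting sfwd:
#   pid=$(ps aux | pid_from_ps sfwd)
#   [ -n "$pid" ] && list_fds "$pid" > sfwd_fds.txt
```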


7. Save all PuTTY output, the cpinfo file, core dumps etc. for CP TAC.


Please be aware that different issues need different debug procedures, so it is often preferable to open an SR# and ask TAC for the appropriate debug procedure rather than waste time! If there is not enough disk space to capture the issue in sfwd.elg, consult sk122156 "Disk space issues on Gaia Embedded appliance when running debugs" for a workaround.
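
Since disk space is tight on these appliances, it can help to check the free space before starting a debug. The following is a minimal sketch; the function names and the threshold in the example are hypothetical, and /logs is used only because it is the directory checked for core files above.

```shell
# Sketch only: avail_kb and enough_space are hypothetical helper names.

# Print the available space, in KB, of the filesystem holding the given path.
# -P forces one output line per filesystem, so column 4 is "Available".
avail_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the filesystem holding $1 has at least $2 KB available.
enough_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

# Example (expert mode); the 10240 KB threshold is an arbitrary assumption:
#   enough_space /logs 10240 || echo "low disk space, see sk122156"
```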