Wa -IO-wait

Marco_Valenti · ‎2018-06-05

Hey check mates

Sometimes on older appliances, the top command shows %wa as one of the highest values, causing high CPU load.

Most of the time it is a temporary issue that resolves on its own.

My question is whether anyone has been bothered enough by this issue to find a way to discover which process is causing the high %wa, or whether, in your experience, the time spent on resolving it is just wasted.

Thanks

Vladimir · ‎2018-06-05

Please check your memory utilization. If it is consistently high and you are swapping to disk, it may be better to simply upgrade the RAM on the appliance and, if they are in 32 bit mode, change them to 64 bit, to take advantage of it.

Marco_Valenti · ‎2018-06-05

sadly was not a memory issue

Timothy_Hall · ‎2018-06-05

> sadly was not a memory issue

Are you sure, Marco Valenti? Does running free -m report zero for swap utilization on the last line of the output? Lack of RAM is by far the most common cause of high wa values on a gateway. High wa can also be caused by a runaway process, a failing hard drive, or enabling gateway features that incur a large number of process space trips (such as HTTPS Inspection). Process space trips and what causes them are covered extensively in Chapter 10 of the second edition of my book.
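For anyone who wants to script the swap check described above, here is a minimal sketch. It assumes the classic `free -m` layout with a `Swap:` row whose columns are total, used, free. The helper name `swap_used_mb` is made up for illustration and is not a Check Point tool.

```shell
# Sketch: print the "used" swap value (in MB) from `free -m` style output
# read on stdin. swap_used_mb is a hypothetical helper name.
swap_used_mb() {
    # The "Swap:" row has the columns: total used free
    awk '$1 == "Swap:" { print $3; found = 1 } END { if (!found) exit 1 }'
}

# Usage on a live system:
#   free -m | swap_used_mb
```

A result of 0 suggests the box is not swapping at that moment; a non-zero value that keeps growing points back at RAM as the likely cause.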

The tool of choice for identifying other events that are causing high wa is iotop, but that tool is not available for the 2.6.18 kernel version that Gaia uses.
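Where iotop is not available, a common fallback is to sample `ps` repeatedly and look for processes in state D (uninterruptible sleep, usually waiting on disk I/O). The sketch below is only that, a sketch: `d_state_procs` is a hypothetical helper name, and it assumes the first column of the `ps` output is the process state.

```shell
# Sketch: keep only the rows of `ps` output whose state column starts with D.
# d_state_procs is a hypothetical helper name, not part of Gaia.
d_state_procs() {
    # Input columns: state pid command...
    # The header row ("STAT PID CMD") does not start with D and is dropped.
    awk '$1 ~ /^D/'
}

# Usage on a live system, sampling once per second for 10 seconds:
#   for i in 1 2 3 4 5 6 7 8 9 10; do
#       ps -eo stat,pid,cmd | d_state_procs
#       sleep 1
#   done
```

A process that shows up in most samples while wa is high is a reasonable suspect. A single sighting proves little, since short waits in D state are normal.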

Be careful trying to observe different types of CPU utilization with Check Point tools like the SmartView Monitor and cpview/cpstat, as these tools tend to roll up us/ni into a single "User CPU" number and sy/si/hi/wa/st into a single "Kernel CPU" number. I assume you are using top or sar to observe the high wa?
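If you would rather not rely on a tool that rolls the counters up, the iowait share can be computed directly from two samples of the aggregate `cpu` line in /proc/stat. This is a rough sketch, assuming the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); `iowait_pct` is a hypothetical helper name.

```shell
# Sketch: iowait_pct takes two aggregate "cpu" lines from /proc/stat
# (before and after an interval) and prints the iowait share of the
# elapsed CPU time as an integer percentage (truncated).
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2..9 are user nice system idle iowait irq softirq steal
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            t[NR] = total
            w[NR] = $6          # field 6 is iowait
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print 0; exit }
            printf "%d\n", (w[2] - w[1]) * 100 / dt
        }'
}

# Usage on a live system:
#   a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
#   iowait_pct "$a" "$b"
```

The number should roughly match what top reports as wa over the same interval.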

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

New Book: "Max Power 2026" Coming Soon
Check Point Firewall Performance Optimization

Marco_Valenti · ‎2018-06-06

Yep, top was used in that case, and it does not seem to be a memory issue, at least for the last case that was brought to my attention.

RickHoppe · ‎2018-06-06

Security Management R80.20 EA is running on kernel 3.10.0, so iotop is included in there (for future troubleshooting).

[Expert@FWMGMT:0]# fw ver
This is Check Point's software version R80.20 - Build 068

[Expert@FWMGMT:0]# uname -a
Linux FWMGMT 3.10.0-693cpx86_64 #1 SMP Tue Feb 6 12:13:02 IST 2018 x86_64 x86_64 x86_64 GNU/Linux

Security Gateway R80.20 EA is still running on kernel 2.6.18, so no iotop for now.

[Expert@GW2:0]# fw ver
This is Check Point's software version R80.20 - Build 068
[Expert@GW2:0]# uname -a
Linux GW2 2.6.18-92cpx86_64 #1 SMP Tue May 15 18:55:11 IDT 2018 x86_64 x86_64 x86_64 GNU/Linux

My blog: https://checkpoint.engineer

G_W_Albrecht · ‎2018-06-05

Which appliance model ? Which CP version ?

CCSP - CCSE / CCTE / CTPS / CCME / CCSM Elite / SMB Specialist

Marco_Valenti · ‎2018-06-05

gaia 4000 series r77.30

Kaspars_Zibarts · ‎2018-06-05

It can be waiting for disk access; I would check that too.

High CPU utilization after upgrade to R77.30

Marco_Valenti · ‎2018-06-05

Thanks. Although this is not an upgrade, it could be relevant anyway.

Kaspars_Zibarts · ‎2018-06-05

There are more articles if you search for io wait and R77.30; that was just to throw some ideas in the air.

Markus_Genser · ‎2018-06-06

Since %wa is caused by disk usage and you have already checked whether the gateway started swapping, here are a few other things to look at.

Did the GW lose the connection to the management server and have to log locally?

You can check $FWDIR/log/ for local logs.

Alternatively, what was the "top -S" output when you noticed the high %wa? Was gzip involved?

A classic would be a repeatedly crashing process with coredumps.
