S_E_
Advisor

Snapshot / Policy verification - high CPU

Hi,

We really like the 'add snapshot' feature on an MDS system and would like to run it frequently (2-4 times a week).
However, we noticed that taking a snapshot on an MDS slows down the whole box considerably.
According to cpview there are 24 CPUs, but only 1 is at 100% usage; the other 23 are idle.

During this time:
- login via SmartConsole fails
- a policy verification takes ages
- login via SSH to the MDS takes ages
The load average shown by top is around 50; in normal operation it is 2-5.
Healthcheck and other reports are fine.

During snapshot creation, this message appears frequently:

MDS1> show snapshots
Restore points:
---------------
SNAP1
AutoSnapShot885
Restore point now under creation:
---------------------------------
SNAP2_20092021 (32%)
NMSNAP0042 System is too busy, please try again in a few seconds.

Q:
- do you experience the same issue / behavior?
- does R81.10 have the same behavior, or will it solve the issue?

Currently running Smart-1 MDS R80.40 T91, but the same behavior was already seen with R80.30.
Not sure if sk104788 also applies to R80.40.

Regards

Timothy_Hall
Champion

This effect is almost certainly due to hard drive contention and not caused by only one CPU being utilized.  If you run top while the snapshot is executing, you are likely to see a high I/O wait percentage (the wa value in top's CPU summary line).  A process's CPU priority can be lowered with the nice command, and its I/O priority can be lowered in the same way with ionice.  Try this:

1) Start the snapshot 

2) Confirm slow SmartConsole performance

3) There should be two processes running that are related to the snapshot: xfsdump and xfsrestore.  Determine their two process IDs (PIDs) via top or ps.

4) Confirm that their current I/O priority is 0 (best effort FIFO): ionice -p PID1 ; ionice -p PID2

5) Set their I/O priority to idle (lowest possible): ionice -p PID1 -c 3; ionice -p PID2 -c 3

6) Retest performance and see if it has improved while snapshot is running

There is probably a way to have the snapshot I/O priority set to idle every time a snapshot is invoked but that will almost certainly need to be done by Check Point.  Note that this will make the snapshot take longer to complete (potentially MUCH longer).
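
Steps 3 to 5 can be sketched as a pair of small shell functions. This is only a minimal sketch, not a Check Point supported method: it assumes the snapshot workers really are named xfsdump and xfsrestore as described above, that pgrep is available in expert mode, and the function names snapshot_pids and set_idle_io are made up for illustration.

```shell
#!/bin/bash
# Sketch: move the snapshot-related processes to the idle I/O class.
# Assumption: the workers are named xfsdump and xfsrestore (see step 3).

# Print the PIDs of all running processes whose name matches exactly.
snapshot_pids() {
    for name in "$@"; do
        pgrep -x "$name"
    done
}

# For each PID: show the current I/O class, set class 3 (idle), show it again.
set_idle_io() {
    for pid in "$@"; do
        echo "PID $pid before: $(ionice -p "$pid")"
        ionice -c 3 -p "$pid"
        echo "PID $pid after:  $(ionice -p "$pid")"
    done
}

# Usage, in expert mode while the snapshot is running:
#   set_idle_io $(snapshot_pids xfsdump xfsrestore)
```

The change applies only to the processes that are running at that moment, so it has to be repeated for every snapshot. I/O classes are also only honoured by I/O schedulers that support them (CFQ on older kernels), so if nothing changes it may be worth checking which scheduler the disk uses.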

S_E_
Advisor

Hi,

Great. 

I have now started the first test in the lab and will run it on the prod devices later.

Thanks a lot.

Regards


MDS-R8040> add snapshot perftest
Taking snapshot. You can continue working normally.
You can use the command 'show snapshots' to monitor creation progress.
MDS-R8040> exit


[Expert@MDS-R8040:0]# ps -aef | grep xfsd
admin 659 30308 0 16:18 ttyS0 00:00:00 grep --color=auto xfsd
admin 32433 31708 10 16:17 ? 00:00:05 /sbin/xfsdump -l 0 -F - /dev/vg_splat/lv_current_snap

[Expert@MDS-R8040:0]# ionice -p 32433
unknown: prio 0

[Expert@MDS-R8040:0]# ionice -p 32433 -c 3

[Expert@MDS-R8040:0]# ionice -p 32433
idle
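
The pattern 'xfsd' in the ps command above matches only xfsdump, so an xfsrestore process would not show up in that listing. A minimal sketch that lists both workers together with their current I/O class, assuming the process names from the reply above (the function names filter_snapshot_procs and list_snapshot_io are hypothetical):

```shell
#!/bin/bash
# Keep only "PID COMMAND" input lines whose command is xfsdump or xfsrestore.
filter_snapshot_procs() {
    awk '$2 == "xfsdump" || $2 == "xfsrestore" { print $1, $2 }'
}

# Print PID, command name and current I/O class for each snapshot worker.
list_snapshot_io() {
    ps -eo pid=,comm= | filter_snapshot_procs | while read -r pid comm; do
        echo "$pid $comm $(ionice -p "$pid")"
    done
}

# Usage, in expert mode while the snapshot is running:
#   list_snapshot_io
```

Running it before and after the ionice change shows whether every snapshot process has actually been moved to the idle class.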