Cegeka_Networki
Participant

R80.10 SMS slowness

Hello,

One of our customers is running an R80.10 SMS in the Amazon cloud (AWS).

We are experiencing slowness in most operational tasks: connecting to SmartConsole takes more than a minute, moving between pages inside SmartConsole takes 5-10 seconds, loading a policy for display takes 5-10 seconds, loading objects in Object Explorer takes more than 10 seconds, and so on.

The virtual machine has 32 GB of RAM, 8 CPUs and 100 GB disk space. See more in the attached file.

There are 11 gateways managed from that SMS, and the SMS is the log server as well.

I'm wondering if that slowness can be reduced. 

Could you please advise how to approach investigating this slowness, and whether there are any tips or guidelines to check?

Thank you!

Adrian

8 Replies
Timothy_Hall
Legend

Your processor and memory specs look good, so it is probably an issue with the performance of the disk path.  While performing operations in the SmartConsole, run top.  Do you see a lot of CPU time expended in "wa"?  That means the CPU is blocked waiting for the disk path to respond.  You can also look at this historically by just running sar with no options; "wa" is labelled "iowait" in the sar output.
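As a rough sketch of the same check without top or sar, the iowait share can be computed from two samples of the aggregate "cpu" line in /proc/stat. The function name iowait_pct and the 5-second interval are illustrative assumptions; the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the standard Linux layout.

```shell
# Percentage of CPU time spent in iowait between two /proc/stat "cpu" lines.
# $1 = first sample, $2 = second sample (each a full "cpu ..." line).
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # Sum fields 2..9 (user..steal); field 6 is iowait.
        NR == 1 { for (i = 2; i <= 9 && i <= NF; i++) t1 += $i; w1 = $6 }
        NR == 2 { for (i = 2; i <= 9 && i <= NF; i++) t2 += $i; w2 = $6 }
        END {
            dt = t2 - t1
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (w2 - w1) / dt
        }'
}

# Usage on the SMS while clicking through slow SmartConsole operations:
# a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
# iowait_pct "$a" "$b"
```

A value that stays in the low single digits points away from the disk path; sustained double-digit values would be worth a closer look.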

Also I assume that your 8 CPUs are actually 8 discrete CPUs and not 4 CPUs hyperthreaded as it is not currently recommended to enable hyperthreading on an SMS for performance reasons.  May not matter or be relevant in AWS but I thought I'd throw it out there.

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Cegeka_Networki
Participant

Thank you, Tim, for the response.

I checked the wa values while performing some operations in SmartConsole. CPU 0 showed wa above zero the whole time, but never more than a few percent (I estimate the average at around 1.00 %). CPU 1 also showed wa above zero from time to time.

I attached the output of the "sar" command. Do those values suggest that the disk path is the cause of the slowness?

Also please advise if there are other things we should check.

Many thanks,

Adrian

Timothy_Hall
Legend

Those numbers on your SMS look fine, not sure why your access is so slow.  Most of the heavy lifting is being done on the SMS end, and the SmartConsole is just displaying and manipulating what is sent to it.  I assume the network between where the SmartConsole is running and the SMS is good?  Try running some continuous big pings to the SMS from the same workstation that runs SmartConsole while executing some particularly slow operations in the SmartConsole, as described in my book:

A great trick to help you determine whether a particular network path is experiencing
latency or loss is to send extra-large test packets with the ping command, which have a
knack for irritating any underlying network problems thus making them more
pronounced and easier to identify:


Gaia/Linux: ping -s 1400 129.82.102.32
Windows: ping -l 1400 129.82.102.32


Better yet, most Linux-based versions of the ping command also support a flood
option (-f) which instead of sending one echo request per second, will send a flood of
them as fast as it can and note how much loss and/or latency is encountered.
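To make loss and latency swings easier to compare between runs, the summary lines of a Linux ping can be condensed into one line. This is a minimal sketch that assumes the iputils-style summary format ("... packet loss ..." and "rtt min/avg/max/mdev = ..."); the function name ping_summary is made up for illustration, and Windows ping output uses a different layout that this does not parse.

```shell
# Condense Linux ping summary output (read from stdin) into a single line.
ping_summary() {
    awk '
        /packet loss/ {
            # The loss figure is the only field ending in "%".
            for (i = 1; i <= NF; i++) if ($i ~ /%$/) loss = $i
        }
        /min\/avg\/max/ {
            # Line looks like: rtt min/avg/max/mdev = a/b/c/d ms
            split($0, halves, " = ")
            split(halves[2], v, "/")
            avg = v[2]; max = v[3]
            sub(/ .*/, "", v[4]); mdev = v[4]
        }
        END { printf "loss=%s avg=%sms max=%sms mdev=%sms\n", loss, avg, max, mdev }'
}

# Usage (same test address as in the examples above):
# ping -c 100 -s 1400 129.82.102.32 | ping_summary
```

A large mdev relative to avg indicates jitter; any non-zero loss is worth investigating.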

Any packet loss or wild swings in latency?  Also please post output of the following run on the SMS:

free -m

netstat -ni

grep -c ^processor /proc/cpuinfo

netstat -s
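For convenience, the four commands above can be captured into a single file for attaching to a reply. This is only a sketch; the helper name run_labeled and the output file name sms_diag.txt are arbitrary choices for illustration.

```shell
# Run each command given as an argument and print its output under a header.
run_labeled() {
    for cmd in "$@"; do
        printf '===== %s =====\n' "$cmd"
        sh -c "$cmd" 2>&1
        printf '\n'
    done
}

# Usage on the SMS (expert mode):
# run_labeled 'free -m' 'netstat -ni' 'grep -c ^processor /proc/cpuinfo' 'netstat -s' > sms_diag.txt
```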

Cegeka_Networki
Participant

Tim,

I attached the output of those commands and also ping results.

Indeed, there is a considerable distance between the PCs where we run SmartConsole (one in the UK and another in Sydney, Australia) and the SMS, which is in the USA. No pings were dropped, as can be seen in the attached file.

I will follow up with the customer to see whether they have a server in AWS from which we can run SmartConsole, to check whether the SmartConsole experience is better from there.

Best Regards,

Adrian

Timothy_Hall
Legend

The network latency is a little high (~150ms) but seems to be relatively stable with low jitter.  No significant fragmentation and everything looks healthy at the network level.  There does seem to be a slightly unusual number of TCP RSTs but I doubt they matter:

251183 connection resets received

173534 resets sent

111605 connections reset due to unexpected data
136180 connections reset due to early user close

I suppose CPMI could be one of those protocols that does not handle relatively high latency very well, and is doing a lot of waiting around for application-level acknowledgements of operations.   Very curious to see what happens when you run SmartConsole from inside AWS to the SMS which is also inside AWS, instead of traversing a relatively high latency Internet with the CPMI traffic.

For grins do a tail -f on $FWDIR/log/cpm.elg and $FWDIR/log/fwm.elg on the SMS while attempting some slow SmartConsole operations to see if any interesting error messages or warnings are being barfed into these files at that time.
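Since these logs can be noisy, one option is to filter the live tail for likely trouble indicators. This is a sketch only: the keyword list is a guess at what might be interesting, not a documented set of cpm.elg or fwm.elg message formats, and the function name elg_filter is made up for illustration.

```shell
# Keep only lines that mention common trouble keywords (case-insensitive).
# The keyword list is an assumption, not a documented log format.
elg_filter() {
    grep -i -E 'error|warn|exception|timeout|fail'
}

# Usage on the SMS while reproducing a slow SmartConsole operation:
# tail -f "$FWDIR/log/cpm.elg" "$FWDIR/log/fwm.elg" | elg_filter
```

If nothing shows up with the filter, it is still worth skimming the unfiltered tail once, since the keywords may miss relevant messages.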

One last thing to look into *might* be increasing FWASYNC_MAXBUF but unless you are seeing the "fwasync_connbuf_realloc: Connection buffer overflow" warnings mentioned in sk109236: High CPU / process crashes / timeout due to large database / first time operations / load ... arbitrarily increasing this value is not recommended.

Cegeka_Networki
Participant

Thank you for your great support.

The SmartConsole experience from a server in AWS is much better than from the UK or Sydney.

There was no feeling of delay.

I attached the output of cpm.elg while doing some operational activities in Smart Console.

Please let me know if you see anything relevant in there. After that, from my point of view, we can close this thread.

Timothy_Hall
Legend

Nothing unusual going on in cpm.elg either, I'd say your SMS itself is fine.  As I theorized earlier CPMI must be one of those protocols whose operations can't be pipelined or parallelized, and as such high network latency will slow it down as data is sent and some kind of application-level acknowledgement must be received before proceeding.  I seem to recall the ability to use compressed connections with older copies of the SmartDashboard, but since the issue is not lack of bandwidth in this case I don't see how that would help.  Early versions of SMB/CIFS suffered performance issues over high latency networks as well and were mentioned briefly in my book.
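The arithmetic behind that theory can be sketched as follows. If an operation needs a number of request/acknowledgement exchanges that must happen one after another, the network wait is roughly the round-trip time multiplied by that count. The count of 40 exchanges below is purely hypothetical (the real number for any CPMI operation is not known here); only the ~150 ms RTT comes from this thread.

```shell
# Rough network wait for strictly sequential request/ack exchanges.
# $1 = round-trip time in ms, $2 = number of sequential exchanges (hypothetical)
serial_delay_ms() {
    echo $(( $1 * $2 ))
}

serial_delay_ms 150 40   # from the UK or Sydney: 6000 ms
serial_delay_ms 1 40     # from inside the same cloud region, assuming ~1 ms RTT: 40 ms
```

Under those assumptions the same operation goes from about 6 seconds to well under a tenth of a second, which matches the pattern reported above without any change on the SMS itself.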

I'll go ahead and tag Dameon Welch Abernathy here.  Dameon, are there any internal resources that talk about tuning CPMI for performance over high latency networks?  I suspect that the delays are caused by how CPMI itself works, and that trying to tweak TCP (smaller MSS, more RTT tolerance, etc.) would not help.

PhoneBoy
Admin

Not sure much can be done to tune CPMI for higher latency networks but can't hurt to ask around.
