Re: Best Practices to check the performance of the...

Hllrdm · ‎2022-06-23

We encountered a problem where users were having trouble accessing the Internet, and we were asked to test the Check Point perimeter cluster. Is there any Best Practice for checking the performance of the equipment, including the interfaces and the traffic they carry to the Internet? Maybe there is a script that will give us statistics and useful information on the interfaces? Please do not suggest HCP 🙂

G_W_Albrecht · ‎2022-06-24

Monitoring and analyzing Check Point devices with CPview and DiagnosticsView

CCSP - CCSE / CCTE / CTPS / CCME / CCSM Elite / SMB Specialist

Hllrdm · ‎2022-06-24

Is there some kind of coordinated plan for checking the interfaces, their traffic, and the presence of errors (e.g., some scripts)? We have already looked at the Administrator Guide. We started this thread to learn from our colleagues' experience, not to get links to SKs and the Administrator Guide.

G_W_Albrecht · ‎2022-06-24

Nice answer, thank you!

_Val_ · ‎2022-06-24

I do not think you should brush off valid recommendations. Admin guides and SKs are what you should look at first, especially if you do not have experience with the procedures.

sk167553 is exactly what you are looking for, with step-by-step elaborate procedures, starting from basic sanity checks.

For interface specifics, also look into sk166424.

Sorin_Gogean · ‎2022-06-24

hey,

Can you elaborate a bit more on "encountered a problem that users were having trouble accessing the Internet"?

(It would also help to state the HW model and setup, e.g., cluster and such.)

I want to see what problems you are encountering, as we also randomly have some HUGE load situations (search for my other post). In our case (and others here have seen the same) it seems to be a sort of DNS attack. It is still manageable from our side, and we're working to get a final protection/solution implemented.

So, if there is no HW issue (CPU, memory, or other), then it may be a high number of connections.

You can see that in the screenshot below from SmartView Monitor...

Ty,

Hllrdm · ‎2022-06-24

Hi! Answering the questions in more detail:
1. Our users periodically have trouble opening some sites -- sites load slowly or don't open at all. After some time everything returns to normal.
2. We are using a cluster solution. In HCP we don't see errors on the interfaces.
3. In ifconfig we see RX drops increasing (1-2 drops per second) on all interfaces, but on the switch in front of the Check Point equipment we don't see these drops.
4. We are not sure the problem is Check Point, but I was asked to check Check Point operation, and the most interesting thing I found was the RX drops with no corresponding drops on the switch.
Maybe there are some options for how we can solve the problem?
We don't see that the peaks are large.
At 8:30 we rebooted the device and then changed the cluster activity (it is now the active equipment). Our work day starts at 9am.

_Val_ · ‎2022-06-24

Several notes:

1. "Slow internet" may be related to NAT reaching capacity. If you are using a single NAT Hide IP address for all your internal networks, I would look into this first.
2. RX drops mean drops on the receiving side. They may be caused by too many frames in the buffer and not enough CPU time to dequeue them. Look at those interfaces to get more details. You will not see anything on the switch, because it is your FW interface, not the switch, that is dropping frames. If rising RX errors/drops coincide with the Internet issue, that may be the cause.
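
To check whether rising RX drops coincide with the times users report problems, something like the following could be used. It is only a minimal sketch in Python, not a Check Point tool: it assumes the standard Linux /proc/net/dev layout (receive columns in the order bytes, packets, errs, drop), and the function names are made up for illustration.

```python
import time


def parse_proc_net_dev(text):
    """Return {interface: {"rx_packets": int, "rx_drop": int}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        # Receive columns (assumed order): bytes, packets, errs, drop, ...
        stats[name.strip()] = {
            "rx_packets": int(fields[1]),
            "rx_drop": int(fields[3]),
        }
    return stats


def rx_drop_deltas(before, after):
    """Return how much rx_drop grew per interface between two snapshots."""
    return {
        name: now["rx_drop"] - before[name]["rx_drop"]
        for name, now in after.items()
        if name in before
    }


def sample_rx_drops(interval_seconds=10, path="/proc/net/dev"):
    """Take two snapshots interval_seconds apart and return the RX drop growth."""
    with open(path) as handle:
        before = parse_proc_net_dev(handle.read())
    time.sleep(interval_seconds)
    with open(path) as handle:
        after = parse_proc_net_dev(handle.read())
    return rx_drop_deltas(before, after)
```

Calling sample_rx_drops() in a loop and logging the result with a timestamp would give a per-interface drop rate that can be lined up against the periods when sites were slow.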

_Val_ · ‎2022-06-24

More notes:

3. You did not specify what HW model you are using. Depending on how many CPUs you have, a peak of 35% CPU utilization may be a symptom of one or several cores running at 100%, which would cause all kinds of traffic issues, including RX drops and errors, slow internet, and more.

4. I encourage you to look into the SK I mentioned earlier. It actually provides you with step by step analysis of ANY performance issue you might have.
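
To illustrate note 3, here is a small arithmetic sketch in Python of how a modest overall figure can hide saturated cores. The per-core numbers are invented for the example (an assumed 8-core box), and summarize_cores is a hypothetical helper, not a Check Point command.

```python
def summarize_cores(per_core_percent, saturation_threshold=95.0):
    """Return (average utilization, indexes of cores at or above the threshold)."""
    average = sum(per_core_percent) / len(per_core_percent)
    saturated = [
        index
        for index, percent in enumerate(per_core_percent)
        if percent >= saturation_threshold
    ]
    return average, saturated


# Invented example: two cores pegged, six mostly idle.
example_cores = [100, 100, 20, 15, 15, 10, 10, 10]
average, saturated = summarize_cores(example_cores)
print(f"average={average:.1f}% saturated_cores={saturated}")
```

With these made-up values the average works out to 35% even though cores 0 and 1 are at 100%, which is why per-core utilization is worth looking at rather than the overall number.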

Timothy_Hall · ‎2022-06-24

The RX-DRPs may be a red herring. As Val said, they have nothing to do with the quality of the physical cabling or interface.

1) What version are you running on the gateway? Upgrading to at least R81 and making sure Dynamic Balancing/Split is enabled will solve most gateway performance problems and help dynamically adjust to spikes in load. See sk164155: Dynamic Balancing for CoreXL.

2) First off, if the RX-DRP rate on an interface is < 0.1% of RX-OK, don't worry about it. If it is higher than that, the RX-DRPs may just indicate the reception of unknown/unsupported protocols. In the output of ethtool -S (interface), actual queuing frame drops due to insufficient CPU resources on your SNDs will increment counters that have something like "missed", "fifo", or "buffer" in their name. If the total RX-DRP count is far higher than the sum of those counters, the drops come from unknown/unsupported protocols, which do not increment any of the ethtool counters at all. See sk166424: Number of RX packet drops on interfaces increases on a Security Gateway R80.30 and higher ...

3) If you are running R80.40 or earlier, providing the output of the command enabled_blades as well as the output of the "Super Seven" will allow me to diagnose your situation and provide recommendations. See S7PAC - Super Seven Performance Assessment Commands.
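
The check in point 2 could be sketched roughly as follows in Python. This is an illustration under assumptions, not an official procedure: it expects ethtool -S output in the usual "name: value" form, counter names vary by NIC driver, the helper names are hypothetical, and skipping counters that start with "tx" is my own simplification.

```python
QUEUE_DROP_HINTS = ("missed", "fifo", "buffer")


def parse_ethtool_stats(text):
    """Return {counter_name: value} from 'ethtool -S <interface>' output."""
    counters = {}
    for line in text.splitlines():
        name, separator, value = line.rpartition(":")
        value = value.strip()
        if separator and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def queue_drop_total(counters):
    """Sum receive-side counters whose name hints at queue/buffer exhaustion."""
    return sum(
        value
        for name, value in counters.items()
        if not name.lower().startswith("tx")  # assumption: ignore transmit side
        and any(hint in name.lower() for hint in QUEUE_DROP_HINTS)
    )


def assess_rx_drops(rx_ok, rx_drp, ethtool_text, threshold=0.001):
    """Compare RX-DRP with RX-OK and with the queue-related ethtool counters."""
    ratio = rx_drp / rx_ok if rx_ok > 0 else 0.0
    queue_drops = queue_drop_total(parse_ethtool_stats(ethtool_text))
    return {
        "ratio": ratio,
        "negligible": ratio < threshold,  # the < 0.1% rule of thumb
        "queue_drops": queue_drops,
        # Drops not explained by queue counters (possibly unknown protocols).
        "unexplained_drops": max(rx_drp - queue_drops, 0),
    }
```

RX-OK and RX-DRP would come from the interface statistics (e.g. netstat -ni), and the text from ethtool -S on the same interface. A large unexplained_drops value next to a small queue_drops value points toward unknown/unsupported protocols rather than CPU starvation on the SNDs.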

Gaia 4.18 (R82) Immersion Tips, Tricks, & Best Practices Video Course
Now Available at https://shadowpeak.com/gaia4-18-immersion-course
