aner_sagi
Contributor

WSDNSD high cpu + Strange messages

up_manager_resume_chain: fwhold_send failed. chain will be dropped by the fwhold API
 [up_manager_perform_action: up_manager_resume_chain failed
network_classifiers_domain_async_timeout_cb: the 'perform_action' callback function failed

up_manager_resume_chain: _chain_packet_id 1 is not held

The customer is running a 5200 on R80.10 with Jumbo 103 (Full HA) and complains about web surfing problems.

All blades are enabled, plus HTTPS Inspection.

Have you seen such behavior? Thanks in advance.

aner

9 Replies
Timothy_Hall
Champion

Do you mean the wstlsd daemon instead of wsdnsd?

This process handles the initial HTTPS negotiation when HTTPS Inspection is enabled or "Categorize HTTPS websites" is checked. A 5200 has only two cores, so a Firewall Worker and an SND/IRQ instance will be executing on both cores, and it is probably going to struggle performing HTTPS Inspection with any reasonable amount of Internet traffic. Using a Full HA setup will only exacerbate this situation; at a minimum you will probably need more RAM.

How much Internet bandwidth does the firewall have? You may be able to do some tuning to improve the situation somewhat. Please provide the output of the following commands, run on your active firewall member during peak times:

fwaccel stat
fw ctl affinity -l -r
sim affinity -l
netstat -ni
fw ctl multik stat
free -m
enabled_blades
fw ver

cpstat os -f multi_cpu -o 1
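
If it helps, the outputs can be gathered into one file instead of being pasted one by one. This is only a rough sketch, not an official Check Point tool: `collect_outputs` is a made-up helper name and the report path is arbitrary. It assumes it is run from expert mode, where the commands above are on the PATH.

```shell
#!/bin/bash
# Hypothetical helper: run each command passed as an argument and append
# its labelled output (stdout and stderr) to a single report file.
collect_outputs() {
    local report="$1"; shift
    local cmd
    for cmd in "$@"; do
        {
            echo "===== $cmd ====="
            # bash -c lets every entry carry its own arguments
            bash -c "$cmd" 2>&1 || echo "(command failed or is unavailable)"
            echo
        } >> "$report"
    done
}

# Example call (the report path is an arbitrary choice):
# collect_outputs /var/tmp/perf_report.txt \
#     "fwaccel stat" "fw ctl affinity -l -r" "netstat -ni" "free -m"
```

Any command that keeps printing until it is interrupted would block the loop, so those are better run separately.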

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
aner_sagi
Contributor

Hi Tim

Thanks for helping me. This is not peak time right now. Please look at the attached info.

fw screenshot

enabled_blades
fw vpn cvpn urlf av appi ips identityServer SSL_INSPECT anti_bot vpn

fwaccel stat
Accelerator Status : on
Accept Templates : disabled by Firewall
Layer Network disables template offloads from rule #8
Throughput acceleration still enabled.
Drop Templates : enabled
NAT Templates : disabled by Firewall
Layer Network disables template offloads from rule #8
Throughput acceleration still enabled.
NMR Templates : enabled
NMT Templates : enabled

Accelerator Features : Accounting, NAT, Cryptography, Routing,
HasClock, Templates, Synchronous, IdleDetection,
Sequencing, TcpStateDetect, AutoExpire,
DelayedNotif, TcpStateDetectV2, CPLS, McastRouting,
WireMode, DropTemplates, NatTemplates,
Streaming, MultiFW, AntiSpoofing, Nac,
ViolationStats, AsychronicNotif, ERDOS,
McastRoutingV2, NMR, NMT, NAT64, GTPAcceleration,
SCTPAcceleration
Cryptography Features : Tunnel, UDPEncapsulation, MD5, SHA1, NULL,
3DES, DES, CAST, CAST-40, AES-128, AES-256,
ESP, LinkSelection, DynamicVPN, NatTraversal,
EncRouting, AES-XCBC, SHA256

fw ctl affinity -l -r
CPU 0: eth1-02 eth1-04 Mgmt
fw_1
CPU 1: eth1-01 eth1-03 eth1-05 eth1-06 eth1
fw_0
All: lpd dtlsd wsdnsd dtpsd vpnd mpdaemon usrchkd pepd fwd fwpushd fwm cpca pdpd rad in.msd cplmd cpstat_monitor in.acapd cpd cprid

netstat -ni
Kernel Interface table
Iface     MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
Mgmt     1500   0   494043      0      0      0   732621      0      0      0 BMRU
eth1-01  1500   0  2866409      0      0      0 16969676      0      0      0 BMRU
eth1-02  1500   0 44293052      0      0      0 44762235      0      0      0 BMRU
eth1-03  1500   0  1325240      0      0      0  1386822      0      0      0 BMRU
eth1-04  1500   0   752440      0      0      0   730159      0      0      0 BMRU
eth1-05  1500   0   232262      0      0      0   198449      0      0      0 BMRU
eth1-06  1500   0   642412      0      0      0   552055      0      0      0 BMRU
lo      16436   0   754282      0      0      0   754282      0      0      0 LRU

cpstat os -f multi_cpu -o 1

Processors load
---------------------------------------------------------------------------------
|CPU#|User Time(%)|System Time(%)|Idle Time(%)|Usage(%)|Run queue|Interrupts/sec|
---------------------------------------------------------------------------------
|   1|          12|            61|          26|      74|        ?|          6589|
|   2|          12|            52|          37|      63|        ?|          6589|
---------------------------------------------------------------------------------

Timothy_Hall
Champion

Looks like it is wsdnsd eating the CPU after all. This daemon provides DNS lookup services when the firewall is configured as a proxy. Are the clients in your network using the firewall as a web proxy server? That configuration isn't generally necessary to obtain the inspection benefits of the firewall.

The big thing to check here is that the DNS servers defined in the firewall's Gaia OS config are reachable and responding quickly. rad (Resource Advisor Daemon) is another firewall process that can have major problems if there are DNS configuration issues. Here is an excerpt from my book about how to manually test the firewall's DNS servers for responsiveness and reachability:

Special Case: DNS and the rad Daemon


The Resource Advisor Daemon (rad) is a key process for many of the commonly
used blades listed in the table above. The rad process handles interaction between the
firewall and the Check Point ThreatCloud for dynamic lookups of content such as URLs;
as such it needs reliable access to the Internet and timely DNS responses to avoid
potential delays of user traffic. To ensure that all DNS servers defined in Gaia are
reachable and delivering timely responses (which rad actively depends on), run this
quick test:


1. On the firewall from expert mode, run cat /etc/resolv.conf and note the
DNS servers listed there. For our example the listed DNS servers will be 8.8.8.8
and 4.2.3.2.


2. Now make sure that all DNS servers listed are reachable and responding promptly
with the nslookup command like this:

See that? Looks like the second DNS server’s IP address has a typo (it should be
4.2.2.2), get that fixed! Make sure all DNS servers in the firewall’s list are correct and
responding promptly, or DNS resolution delays experienced by the rad daemon could
pass through to user sessions.
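
The same check can be looped over every server in the list. This is a minimal sketch rather than anything from the book: `list_nameservers` and `check_dns_servers` are made-up names, `www.checkpoint.com` is just an example hostname, and it assumes the installed `nslookup` returns a non-zero exit status when a query fails. Timing is only to the nearest second.

```shell
#!/bin/bash
# Hypothetical helper: print the nameserver addresses found in a
# resolv.conf-style file (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Hypothetical helper: query each listed server directly and report
# roughly how long the lookup took, or that it failed.
check_dns_servers() {
    local test_name="${1:-www.checkpoint.com}"   # example hostname only
    local conf="${2:-/etc/resolv.conf}"
    local server start end
    for server in $(list_nameservers "$conf"); do
        start=$(date +%s)
        if nslookup "$test_name" "$server" > /dev/null 2>&1; then
            end=$(date +%s)
            echo "$server: answered in about $((end - start))s"
        else
            echo "$server: FAILED or timed out"
        fi
    done
}
```

Running `check_dns_servers` from expert mode prints one line per DNS server, so a mistyped or unreachable entry shows up as a failure or a long delay.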

aner_sagi
Contributor

Hi Tim,

We don’t use the gateway as a proxy. Besides the strange error messages in /var/log/messages, we can also see kernel traces, as attached.

The DNS server configured in Gaia is an internal one (172.29.x.x), but I can resolve www.cnn.com on the firewall using the nslookup command.

Aner.

Timothy_Hall
Champion

Do you have the setting "Use this gateway as an HTTP/HTTPS Proxy" checked for the gateway/cluster object on the "HTTP/HTTPS Proxy" screen?  I guess I don't understand why wsdnsd is running on your system in the first place, let alone why it is eating so much CPU.  Be warned however that if you do have this checkbox set and uncheck it, any users with explicit proxy settings defined in their browsers won't be able to reach the Internet.

Might be worth taking a look in the $FWDIR/log/wsdnsd.elg file. Is the wsdnsd daemon barfing any error messages into this file that could be helpful?

Also are there any core dumps for the daemon present in /var/log/dump/usermode?
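
Both checks can be done in one pass. This is a rough sketch under assumptions: `inspect_daemon` is a made-up helper name, and the 50-line default is arbitrary. The log path and dump directory are the ones mentioned above.

```shell
#!/bin/bash
# Hypothetical helper: show the tail of a daemon log and list any files
# sitting in a core dump directory.
inspect_daemon() {
    local log_file="$1" dump_dir="$2" lines="${3:-50}"
    if [ -r "$log_file" ]; then
        echo "Last $lines lines of $log_file:"
        tail -n "$lines" "$log_file"
    else
        echo "Log file not readable: $log_file"
    fi
    # ls -A prints nothing for an empty directory
    if [ -d "$dump_dir" ] && [ -n "$(ls -A "$dump_dir" 2>/dev/null)" ]; then
        echo "Core dumps in $dump_dir:"
        ls -lt "$dump_dir"
    else
        echo "No core dumps found in $dump_dir"
    fi
}

# Example call from expert mode:
# inspect_daemon "$FWDIR/log/wsdnsd.elg" /var/log/dump/usermode
```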

William_Tavares
Participant

Hi Timothy

I have exactly the same problem on a 2-node cluster (Open Server) working in Load Sharing mode.

The WSDNSD process shows high CPU even with HTTP/HTTPS Proxy DISABLED.

I found in the Check Point Processes and Daemons SK that this process is "activated when Security Gateway is configured as HTTP/HTTPS Proxy", but that's not my case...

I've bought your book, but didn't find any topic about the HTTP/HTTPS Proxy feature.

I've also searched for this behavior in many communities, user groups and forums without finding anything.

There are no core dump files...

The $FWDIR/log/wsdnsd.elg file (debug enabled) doesn't show anything relevant.

Could you explain why this process is running without HTTP/HTTPS Proxy checked?

Is it normal or is it a bug/problem?

Timothy_Hall
Champion

Could definitely be due to the use of domain objects in the policy. Curious to see what happens when you remove it.

Ofir_Shikolski
Employee

Are you using "Domain Objects"? If yes, please see: Best Practices - Working with Domain Objects (Pre R80.10)

William_Tavares
Participant

Yes. I'm using just one.

Going to deactivate it.
