Chris_Thuys
Nickel

Slow performance when Antivirus enabled.

I am looking for information on why the performance of a Java applet drops by a factor of 20 when I enable the Antivirus blade, and where I can look for the cause.

The environment: I have replaced an existing cluster of 5200 appliances with a cluster of 5600 appliances. The 5200 cluster is running R77.30 and the 5600 cluster is running R80.20. Both have exactly the same features/blades enabled and the same policy is applied to both.

When I run the Java applet (JNLP) from the web page through the 5200 cluster, logon to the application takes under 5 seconds after entering my credentials. However, after I replaced it with the 5600 cluster running R80.20, logon takes more than 2 minutes. If I disable the Antivirus blade, logon goes back to under 5 seconds.

I set up a test environment where I can run the 5600 cluster in parallel with the 5200 cluster, with the only traffic through the 5600 cluster being to the one server that the Java applet connects to. I have exactly the same experience with the applet: under 5 seconds logon with antivirus disabled and more than 2 minutes with it enabled. To me this indicates that it is not a capacity problem, but possibly something to do with the way R80.20 performs antivirus inspection.

I have checked the logs in SmartConsole but there are no antivirus logs recorded. Does anyone have any tips on how to see what the antivirus is doing and why it may be causing slow performance?
14 Replies

Re: Slow performance when Antivirus enabled.

A few things to try:

1) What happens on the R80.20 box when SecureXL is disabled?  Run fwaccel off, then close all browser windows (very important), launch a new browser window, and test your Java application. Run fwaccel on afterwards to re-enable SecureXL.

2) Run fw ctl zdebug drop on the gateway during the slow java test.  Any drops involving the client and web server systems?  (Or possibly backend web server to application/database server traffic) It sounds like some operation involved with the login process is getting blocked, and whatever it is "gives up" after two minutes and lets things proceed.

3) Anything interesting getting logged into $FWDIR/log/rad.elg or /var/log/messages around the time of the slow logins?

4) Are you sure that the Threat Prevention profile assigned to the traffic is exactly the same in its Anti-virus configuration for the R80.20 and R77.30 gateways?

5) Bit of a crapshoot, but try running ips off on the gateway, then close all browser windows (very important), launch a new browser window, and test your Java application.  Run ips on afterwards to re-enable IPS.

In a worst case scenario you'll need to obtain a packet capture of the subject traffic and try to figure out if the client is waiting for the server to do something during the delay or vice-versa; essentially you need to determine which side is slowing down the transaction when it is slow and whether the firewall is involved.

 

"IPS Immersion Training" Self-paced Video Class
Now Available at http://www.maxpowerfirewalls.com
Chris_Thuys
Nickel

Re: Slow performance when Antivirus enabled.

Thanks Tim,

 

1) fwaccel off made no difference.

2) fw ctl zdebug drop did not show any dropped traffic related to the application.

3) Neither log showed anything interesting, and no new entries were written during the test.

4) I am fairly sure that they have the same AV and IPS policy applied. They both have the Optimized policy applied for IPS and AV.

5) ips off made no difference.

 

I think I will have to place a support call with Check Point to look into it.

 

Chris.


Re: Slow performance when Antivirus enabled.

Yep, but at least you have conclusively identified that the performance issue is antivirus-related and can go from there with TAC.

 

Vladimir
Pearl

Re: Slow performance when Antivirus enabled.

1. Are any archives being loaded immediately after successful authentication?

2. Can you try failing authentication on purpose and see how long it takes to receive the failed auth message?

If the answer to [1] is "Yes", change the archive inspection parameters in the TP profile and try it again.

Chris_Thuys
Nickel

Re: Slow performance when Antivirus enabled.

Not sure how to check when the archives are loading. However, the app is still slow after logon; moving around the app is extraordinarily slow.

Re: Slow performance when Antivirus enabled.

Hi @Chris_Thuys 

To debug Anti-Virus, see ATRG: Anti-Bot and Anti-Virus.

More about the Anti-Virus Daemon can be found in my article: R80.x Security Gateway Architecture (Content Inspection)

Connection flow for Anti-Virus:

[Image: Screenshot_20190707-161311_Firefox.jpg]

 

Debug rad daemon:

a) Start debug:
rad_admin rad debug on all
b) Replicate the issue.
c) Stop debug:
rad_admin rad debug off ALL
d) Analyze:
$FWDIR/log/rad.elg*

Debug Anti-Virus:

fw ctl debug 0
fw ctl debug -buf 32000
fw ctl debug -m CI all
fw ctl kdebug -T -f > /var/log/debug.txt

Debug Anti-Virus archive scanning:

fw ctl debug 0
fw ctl debug -buf 32000
fw ctl debug -m CI all
fw ctl debug -m dlpk all
fw ctl kdebug -T -f > /var/log/debug.txt

 


Re: Slow performance when Antivirus enabled.

Sorry, I forgot this:

Debug malware:

fw ctl debug 0
fw ctl debug -buf 32000
fw ctl debug -m RAD_KERNEL all
fw ctl debug -m cmi_loader all
fw ctl debug -m fw + malware (since R80: fw ctl debug -m MALWARE + all)
fw ctl kdebug -T -f > /var/log/debug.txt
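
The kernel debug recipes above all follow the same pattern: reset, set the buffer, enable one or more modules, capture. As a rough sketch only, the Python helper below prints that checklist for a chosen recipe. The dictionary keys, the function name and the trailing reset step are my own additions for illustration; the commands themselves are copied from the posts above (using the R80 form of the MALWARE flag), and the capture command has to be stopped by hand once the issue has been replicated.

```python
# Module/flag sets quoted in the posts above. The keys are hypothetical labels.
DEBUG_SETS = {
    "anti-virus": [("CI", "all")],
    "archive-scanning": [("CI", "all"), ("dlpk", "all")],
    "malware": [("RAD_KERNEL", "all"), ("cmi_loader", "all"), ("MALWARE", "+ all")],
}


def build_commands(debug_set, buffer_size=32000, outfile="/var/log/debug.txt"):
    """Return the command lines for one kernel debug session, in order."""
    commands = ["fw ctl debug 0", f"fw ctl debug -buf {buffer_size}"]
    for module, flags in DEBUG_SETS[debug_set]:
        commands.append(f"fw ctl debug -m {module} {flags}")
    commands.append(f"fw ctl kdebug -T -f > {outfile}")
    # Assumption: run the same reset again once the capture has been stopped,
    # so the debug flags do not stay enabled on the gateway.
    commands.append("fw ctl debug 0")
    return commands


if __name__ == "__main__":
    for line in build_commands("archive-scanning"):
        print(line)
```

This only prints text to paste into expert mode; it does not run anything on the gateway.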

Chris_Thuys
Nickel

Re: Slow performance when Antivirus enabled.

Thanks Heiko.

I followed the ATRG and ran debug but did not as yet see anything indicating the cause. However I have done some further testing.

It was pointed out to me that all of the gateway's interfaces were defined as internal, and since the AV policy was defined to act only on traffic from a DMZ or external interface, there should actually be no AV checking on the traffic. I changed the gateway interface for the source server to DMZ. It made no difference to the performance of the app.

I removed all Threat Prevention blades from the gateway object and removed all references to the gateway in the Threat Prevention policy, then applied policy. This resulted in fast response from the application.

I then added the Antivirus blade on the gateway and set it to "detect only". I did not add it to any TP policy, and then installed policy. Performance of the app was fast.

I created a profile based on the optimised policy with only AV enabled and the activation mode set to detect for all 3 confidence levels, created a TP policy using this profile, and applied it to the gateway. I then installed the policy to the gateway. Performance of the app was fast.

I then updated the profile and enabled Threat Emulation, IPS and Anti-Bot. I then installed the policy to the gateway. Performance of the app was fast.

I then edited the gateway, disabled the AV blade, then re-enabled the AV blade and selected "According to the threat prevention policy". I then installed the policy to the gateway. Performance of the app was terribly slow.

I then edited the gateway, disabled the AV blade, then re-enabled the AV blade and selected "detect only". I then installed the policy to the gateway. Performance of the app was fast.

I then updated the profile and disabled Threat Emulation, IPS and Anti-Bot, and also edited the gateway, disabled the AV blade, then re-enabled the AV blade and selected "According to the threat prevention policy". I then installed the policy to the gateway. Performance of the app was terribly slow.

The only traffic through this firewall is me testing the application. No other users are using the firewall.

What is the difference between setting the gateway to detect only when enabling the AV blade and setting the activation mode to detect?
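
To make the pattern in these tests easier to see, here is a small Python sketch that tabulates them and groups the outcomes by one variable at a time. The short labels ("per policy", "AV", "TE", "AB" and so on) are my own shorthand for the settings described above, and the second and third rows assume the gateway was still on "detect only" as set in the preceding step.

```python
from collections import defaultdict

# (gateway AV blade setting, blades enabled in the TP profile, observed result)
TESTS = [
    ("blade removed", (), "fast"),
    ("detect only", (), "fast"),  # blade enabled, gateway not in any TP policy
    ("detect only", ("AV",), "fast"),
    ("detect only", ("AV", "TE", "IPS", "AB"), "fast"),
    ("per policy", ("AV", "TE", "IPS", "AB"), "slow"),
    ("detect only", ("AV", "TE", "IPS", "AB"), "fast"),
    ("per policy", ("AV",), "slow"),
]


def outcomes_by(tests, key_index):
    """Group the observed results by one column of the test table."""
    grouped = defaultdict(set)
    for row in tests:
        grouped[row[key_index]].add(row[2])
    return dict(grouped)


if __name__ == "__main__":
    print("By gateway setting:", outcomes_by(TESTS, 0))
    print("By profile blades: ", outcomes_by(TESTS, 1))
```

Grouped by gateway setting, every value maps to a single outcome, whereas the same profile contents appear with both fast and slow results. In these tests, then, the gateway-level activation setting is the only variable that tracks the slowdown.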

Chris_Thuys
Nickel

Re: Slow performance when Antivirus enabled.

Below is a section of the debug log from the AV policy debug. I can recognise the source and destination IP addresses for my servers and the URL. Does the rule ID mentioned in "match_data: rule_id 1, rule_uuid 3794, rule_name_id 3656, track 1, profile_id" relate directly to the rule as seen in SmartConsole for Threat Prevention?

Two of the lines seem to indicate that it has decided the URL is OK:

@;2313599;17Jul2019 17:23:35.777934;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_rad_process_response_ex: RAD returned 0 patterns, the resource [4.36.0.10.in-addr.arpa] is white;
@;2313599;17Jul2019 17:23:35.777936;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_classify_ex: rc 1, url: 4.36.0.10.in-addr.arpa, action 0, rad_fail_close 0 rad_status=Status is ok;

 

@;2313598;17Jul2019 17:23:35.210072;[cpu_2];[fw4_1];1:{policy} : malware_res_rep_is_url_whitelist: Check if should bypass url=enternetview01.ems.ops/servlet/AdapterServlet;
@;2313598;17Jul2019 17:23:35.210139;[cpu_2];[fw4_1];1:{policy} : mal_conn_table_get_conn_policy_ex: _include_local_connections=0;
@;2313598;17Jul2019 17:23:35.210141;[cpu_2];[fw4_1];1:{policy} : mal_conn_table_get_conn_policy_ex: dir 0, 10.0.38.132:44304 -> 10.0.34.140:8080 IPP 6 policy ffffc200343d1a50;
@;2313598;17Jul2019 17:23:35.210141;[cpu_2];[fw4_1];1:{policy} : ==>malware_policy_get_av_settings_by_conn_policy;
@;2313598;17Jul2019 17:23:35.210143;[cpu_2];[fw4_1];1:{policy} : malware_statistics_get_statistics_from_av_handler: failed: av_handler is NULL;
@;2313598;17Jul2019 17:23:35.210144;[cpu_2];[fw4_1];1:{policy} : malware_statistics_fast_path_increment: failed: _malware_statistics is NULL;
@;2313598;17Jul2019 17:23:35.210146;[cpu_2];[fw4_1];1:{policy} : mal_conn_table_get_conn_policy_ex: _include_local_connections=0;
@;2313598;17Jul2019 17:23:35.210148;[cpu_2];[fw4_1];1:{policy} : mal_conn_table_get_conn_policy_ex: dir 0, 10.0.38.132:44304 -> 10.0.34.140:8080 IPP 6 policy ffffc200343d1a50;
@;2313598;17Jul2019 17:23:35.210148;[cpu_2];[fw4_1];1:{policy} : ==>malware_policy_get_av_settings_by_conn_policy;
@;2313598;17Jul2019 17:23:35.210150;[cpu_2];[fw4_1];1:{policy} : ==>malware_policy_create_ci_blade_profile_list: blade: 5;
@;2313598;17Jul2019 17:23:35.210152;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 1, rule_uuid 3794, rule_name_id 3656, track 1, profile_id
1;
@;2313598;17Jul2019 17:23:35.210153;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 6, rule_uuid 3703, rule_name_id 3656, track 1, profile_id
2;
@;2313598;17Jul2019 17:23:35.210154;[cpu_2];[fw4_1];1:{policy} : ==>malware_policy_get_match_dir_by_matched_rules_lst;
@;2313598;17Jul2019 17:23:35.210156;[cpu_2];[fw4_1];1:{policy} : malware_av_conn_policy_dup: create bidirectional_list ffffc20017fd0c10;
@;2313598;17Jul2019 17:23:35.210159;[cpu_2];[fw4_1];1:{policy} : malware_policy_update_ref_count: policy ffffc200343d1a50, increase=1, ref count=15;
@;2313598;17Jul2019 17:23:35.210160;[cpu_2];[fw4_1];1:{policy} : ==>malware_policy_create_ci_blade_profile_list: blade: 5;
@;2313598;17Jul2019 17:23:35.210161;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 1, rule_uuid 3794, rule_name_id 3656, track 1, profile_id
1;
@;2313598;17Jul2019 17:23:35.210163;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 6, rule_uuid 3703, rule_name_id 3656, track 1, profile_id
2;
@;2313598;17Jul2019 17:23:35.210163;[cpu_2];[fw4_1];1:{policy} : ==>malware_policy_create_ci_blade_profile_list: blade: 6;
@;2313598;17Jul2019 17:23:35.210165;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 1, rule_uuid 3794, rule_name_id 3656, track 1, profile_id
1;
@;2313598;17Jul2019 17:23:35.210166;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 6, rule_uuid 3703, rule_name_id 3656, track 1, profile_id
2;
@;2313598;17Jul2019 17:23:35.210180;[cpu_2];[fw4_1];1:{policy} : malware_nrb_rb_get_layer_by_rule: rule is 1, layer is 1;
@;2313598;17Jul2019 17:23:35.210181;[cpu_2];[fw4_1];1:{policy} : malware_nrb_rb_get_layer_by_rule: rule is 6, layer is 2;
@;2313598;17Jul2019 17:23:35.210183;[cpu_2];[fw4_1];1:{policy} : ==>malware_policy_create_ci_blade_profile_list: blade: 5;
@;2313598;17Jul2019 17:23:35.210184;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 1, rule_uuid 3794, rule_name_id 3656, track 1, profile_id
1;
@;2313598;17Jul2019 17:23:35.210185;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 6, rule_uuid 3703, rule_name_id 3656, track 1, profile_id
2;
@;2313598;17Jul2019 17:23:35.210187;[cpu_2];[fw4_1];1:{policy} : malware_res_rep_is_url_whitelist: Check if should bypass url=enternetview01.ems.ops/servlet/AdapterServlet;
@;2313599;17Jul2019 17:23:35.777884;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_cmi_match_cb: =>;
@;2313599;17Jul2019 17:23:35.777887;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_cmi_match_cb: context_id 202;
@;2313599;17Jul2019 17:23:35.777888;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_cmi_match_cb: MALWARE_CMI_CONTEXT_DNS_QUESTION;
@;2313599;17Jul2019 17:23:35.777889;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_match_dns: =>;
@;2313599;17Jul2019 17:23:35.777892;[cpu_1];[fw4_2];1:{policy} : mal_conn_table_get_conn_policy_ex: _include_local_connections=0;
@;2313599;17Jul2019 17:23:35.777895;[cpu_1];[fw4_2];1:{policy} : mal_conn_table_get_conn_policy_ex: dir 1, 10.0.38.132:35657 -> 10.0.36.4:53 IPP 17 policy ffffc200312b2030;
@;2313599;17Jul2019 17:23:35.777900;[cpu_1];[fw4_2];1:{policy} : malware_rules_filter_by_engine: rule 1 is disabled for engine_mask 4;
@;2313599;17Jul2019 17:23:35.777901;[cpu_1];[fw4_2];1:{policy} : malware_rules_filter_by_engine: rule 6 is disabled for engine_mask 4;
@;2313599;17Jul2019 17:23:35.777902;[cpu_1];[fw4_2];1:{policy} : malware_policy_is_engine_active: engine DNS Reputation is inactive;
@;2313599;17Jul2019 17:23:35.777905;[cpu_1];[fw4_2];1:{policy} : malware_policy_get_engine_action_by_profile: engine=AV Reputation, action=1, confidence=1;
@;2313599;17Jul2019 17:23:35.777906;[cpu_1];[fw4_2];1:{policy} : malware_policy_is_engine_active: engine AV Reputation is active;
@;2313599;17Jul2019 17:23:35.777907;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_classify_ex: PM ffff8101a6b1a018;
@;2313599;17Jul2019 17:23:35.777909;[cpu_1];[fw4_2];1:{policy} : malware_policy_create_profile_list_by_matched_rules: match_data: rule_id 1, rule_uuid 3794, rule_name_id 3656, track 1, pro
file_id 1;
@;2313599;17Jul2019 17:23:35.777911;[cpu_1];[fw4_2];1:{policy} : malware_policy_create_profile_list_by_matched_rules: match_data: rule_id 6, rule_uuid 3703, rule_name_id 3656, track 1, pro
file_id 2;
@;2313599;17Jul2019 17:23:35.777912;[cpu_1];[fw4_2];1:{policy} : malware_policy_get_perf_impact_by_conn_policy: TP performance impact for the connection is 3;
@;2313599;17Jul2019 17:23:35.777913;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_exec_pm: _params->user_attribs: 3, profile perf impact 3;
@;2313599;17Jul2019 17:23:35.777919;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_classify_ex: indicator_match 0, action 0;
@;2313599;17Jul2019 17:23:35.777932;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_rad_query: resource is in cache;
@;2313599;17Jul2019 17:23:35.777934;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_rad_process_response_ex: RAD returned 0 patterns, the resource [4.36.0.10.in-addr.arpa] is white;
@;2313599;17Jul2019 17:23:35.777936;[cpu_1];[fw4_2];1:{policy} : malware_res_rep_classify_ex: rc 1, url: 4.36.0.10.in-addr.arpa, action 0, rad_fail_close 0 rad_status=Status is ok;
@;2313599;17Jul2019 17:23:35.915426;[cpu_3];[fw4_0];1:{policy} : mal_conn_table_match_conn_policy: is_http = 0;
@;2313599;17Jul2019 17:23:35.915435;[cpu_3];[fw4_0];1:{policy} : mal_conn_table_match_conn_policy: in_ifn=6 out_ifn=6;
@;2313599;17Jul2019 17:23:35.915439;[cpu_3];[fw4_0];1:{policy} : mal_conn_table_get_conn_policy_data: dir 0, 0.0.0.0:8116 -> 192.168.1.0:8116 IPP 17 policy ffffc2003493c7c8;
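
For anyone going through a larger debug file of this kind, the following Python sketch pulls out the distinct rule_id/profile_id pairs from the match_data records. It assumes every record starts with "@;" and that any other line is a wrapped continuation of the previous record, as in the excerpt above. The function names are made up for illustration, and it does not settle whether rule_id corresponds to the rule number shown in SmartConsole.

```python
import re

# Fields as they appear in the match_data records of the excerpt.
MATCH_RE = re.compile(
    r"rule_id (\d+), rule_uuid (\d+), rule_name_id (\d+), track (\d+), "
    r"profile_id\s*(\d+)"
)


def join_wrapped(lines):
    """Re-attach wrapped continuation lines to the record they belong to."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("@;") or not records:
            records.append(line)
        else:
            records[-1] += line
    return records


def matched_rules(lines):
    """Return the sorted, distinct (rule_id, profile_id) pairs in the log."""
    pairs = set()
    for record in join_wrapped(lines):
        found = MATCH_RE.search(record)
        if found:
            pairs.add((int(found.group(1)), int(found.group(5))))
    return sorted(pairs)


SAMPLE = """\
@;2313598;17Jul2019 17:23:35.210152;[cpu_2];[fw4_1];1:{policy} : malware_policy_create_ci_blade_profile_list: match_data: rule_id 1, rule_uuid 3794, rule_name_id 3656, track 1, profile_id
1;
@;2313599;17Jul2019 17:23:35.777911;[cpu_1];[fw4_2];1:{policy} : malware_policy_create_profile_list_by_matched_rules: match_data: rule_id 6, rule_uuid 3703, rule_name_id 3656, track 1, pro
file_id 2;
""".splitlines()

if __name__ == "__main__":
    print(matched_rules(SAMPLE))
```

On the two sample records copied from the excerpt, this reports rule 1 with profile 1 and rule 6 with profile 2.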


Re: Slow performance when Antivirus enabled.

Just saw your debug file, looks like some kind of reputation check is happening via RAD which is a little unusual for completely internal traffic,  and may indicate that your firewall topology is not completely and correctly defined.  Try inactivating "Links in email" and "URLs with malware" for sure as mentioned in my last post.
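
One detail in the debug output that may help here: the resource that RAD marked as white, 4.36.0.10.in-addr.arpa, appears in a DNS question context on the connection 10.0.38.132 -> 10.0.36.4:53. That name is simply the reverse-lookup (PTR) name for 10.0.36.4, which a short Python check confirms. The helper name is made up for illustration.

```python
import ipaddress


def ptr_name(ip):
    """Return the reverse-lookup (PTR) DNS name for an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer


if __name__ == "__main__":
    print(ptr_name("10.0.36.4"))
```

So that part of the excerpt shows a reputation check on a reverse DNS query for an internal address, not on the application's HTTP traffic to port 8080.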

 


Re: Slow performance when Antivirus enabled.

Great round of testing there.  Was the TP configuration you are using under R80.20 imported from R77.30 or configured anew?

Is HTTPS Inspection enabled?

Please post a screenshot of the Anti-Virus settings associated with the TP profile in use.  If "Enable archive scanning" is set, unset it, reinstall policy and try the app again.

Next do the following:

1) Put the app back into its slow state.

2) Go to the Threat Prevention > Threat Tools > Protections screen.

3) For the first category with a Blade of "Anti-virus", right-click and select "Inactive Selected".

4) Publish and Install both the Access Control and Threat Prevention policy.

5) Retest the application in a new browser. If it is still slow, go back to step 3 and inactivate the next category (some of the categories won't let you do this, skip those).  Rinse and repeat.

My guess is the application is putting together some kind of dynamic archive or jar file that the AV blade is trying to pull apart, and this procedure will help you narrow down which specific category of AV is messing with the application.  Secondary guess would be that the app is doing really long URLs with all kinds of interesting things embedded inside, and the "URLs with malware" category is having an issue with it.  Final guess is something in Malicious Activity or Unusual Activity is taking issue with the application.

 

Chris_Thuys
Nickel

Re: Slow performance when Antivirus enabled.

Thanks for your help Tim,

HTTPS Inspection is not enabled. Also, the protocol in use is HTTP, albeit on port 8080.

"Enable archive scanning" is not set.

I went through all your steps and ended up inactivating all the antivirus protections, with the same result: slow performance.

Also, with my previous testing I do not have to reload the Java app to see the difference. With the app running, as soon as a policy change is installed on the firewall you get an instantaneous change in performance.

[Attachments: av-settings.PNG, generalPolicySettings.PNG]


Re: Slow performance when Antivirus enabled.

OK, last four things to try before engaging TAC for a detailed debug:

0) Make sure your DNS servers defined on the firewall are correct for RAD to perform its reputation check.  Run nslookup from expert mode, look up a few site names,  and make sure DNS resolution is speedy and not delayed.

1) Leave anti-virus enabled but define a full anti-virus TP exception for the systems in question - This shouldn't change anything as an exception just changes the final decision/Action but all the regular inspection still occurs.  However if this does improve things it means there is some kind of block occurring that you can't see with fw ctl zdebug drop, and eventually whatever is being blocked times out and the application proceeds anyway which is the cause of the slow performance.

2) Define what I call a "null" TP profile in my book.  Leave anti-virus enabled on the gateway, then create a new TP profile with all five checkboxes (including anti-virus) unchecked.  Put a rule at the top of your TP policy with the null profile matching the application traffic and install policy.  This should make the application fast again as you are essentially disabling anti-virus completely for the application traffic.  If it doesn't improve things but unchecking anti-virus on the gateway object does, well that doesn't make sense...

3) When the application is in the fast state, take a packet capture on the firewall interface *facing the client* using tcpdump with time stamps (use -ttt to display inter-packet deltas instead of absolute timestamps) while the application is operating.  Note any unexpected delays of more than 200 milliseconds in the stream of packets that are not caused by waiting for user input (i.e. entering data and hitting submit).  Now do the same thing when the application is in the slow state.  When a >200ms delay occurs, which system (client or server) speaks first and causes everything to start rapidly moving again?  In other words, are there now regular occurrences where one side is waiting for something from the other to get things moving? Which side is holding things up?   Once you have characterized these delays in the packet flow when the app is slow, move the packet capture to the firewall interface *facing the server* while the app is slow and find the >200ms gaps again.  Compare your results with the capture from the client side.

Yes this process is painstaking, but necessary to figure out strange application delays like this.
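
To take some of the pain out of step 3, the gap hunting can be scripted. The Python sketch below scans saved tcpdump -ttt text output and lists the packets that arrive after a delay above a threshold. It assumes the delta is printed as HH:MM:SS.ffffff at the start of each line, as recent tcpdump versions do; older builds may format it differently, so adjust the pattern if needed. The sample packets are invented for illustration and only reuse the client and server addresses from the thread.

```python
import re

# Assumed "tcpdump -ttt" line layout: "<HH:MM:SS.ffffff delta> <rest of packet line>"
DELTA_RE = re.compile(r"^\s*(\d+):(\d{2}):(\d{2})\.(\d{6})\s+(.*)$")


def find_gaps(lines, threshold=0.2):
    """Return (delay_in_seconds, packet_text) for packets that follow a long pause."""
    gaps = []
    for line in lines:
        found = DELTA_RE.match(line)
        if not found:
            continue  # skip lines that do not start with a delta timestamp
        hours, minutes, seconds, micros, rest = found.groups()
        delta = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)
                 + int(micros) / 1_000_000)
        if delta > threshold:
            gaps.append((delta, rest))
    return gaps


# Hypothetical capture lines, not taken from a real trace.
SAMPLE = """\
 00:00:00.000000 IP 10.0.38.132.44304 > 10.0.34.140.8080: Flags [P.], length 312
 00:00:00.000850 IP 10.0.34.140.8080 > 10.0.38.132.44304: Flags [.], length 0
 00:00:01.500000 IP 10.0.34.140.8080 > 10.0.38.132.44304: Flags [P.], length 1024
""".splitlines()

if __name__ == "__main__":
    for delay, packet in find_gaps(SAMPLE):
        print(f"{delay:.3f}s pause, then: {packet}")
```

The source address of each reported packet is the side that "speaks first" after the pause. Running the same script over the client-side and server-side captures gives two lists to compare.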

 

jslimma
Ivory

Re: Slow performance when Antivirus enabled.

Hello guys,

I have the same problem, but the problematic blade is Anti-Bot. I configured a policy with Anti-Bot enabled in Threat Prevention, and when I access a web Java application on port 8080 it is very slow. When I disable this rule, everything goes back to normal (fast).

I will try all the steps you suggested and post the results.

If there are any more recommendations for this case, I would appreciate them.
