Patricio_Gavila
Participant

Mux error messages on a cluster (active-standby) in R80.20

Hi all,

I have a Lenovo System x3650 M5 (on the compatibility matrix) running Gaia R80.20 (Jumbo Hotfix Take 80) in a distributed deployment. The server firmware is updated to the latest level, and the same hardware works great with R77.30. With R80.20 I have many problems with Internet traffic: for example, images and Office 365 emails take too long to load, even when the user matches an unrestricted rule. This did not happen with R77.30. The active gateway shows error messages in /var/log/messages:

 

Jun 12 14:19:57 2019 FW-NODO1 kernel: [fw4_4];mux_task_handler: ERROR: Failed to handle task. task=ffffc20085221670, app_id=1, mux_state=ffffc20092970c00.
Jun 12 14:19:57 2019 FW-NODO1 kernel: [fw4_4];mux_soc_result_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc20092970c00.
Jun 12 14:19:57 2019 FW-NODO1 kernel: [fw4_4];tls_main_send_record_layer_message: mux_soc_result_handler failed
Jun 12 14:19:58 2019 FW-NODO1 kernel: [fw4_4];mux_task_handler: ERROR: Failed to handle task. task=ffffc2008275e530, app_id=1, mux_state=ffffc2005f6a5c00.
Jun 12 14:19:58 2019 FW-NODO1 kernel: [fw4_4];mux_soc_result_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc2005f6a5c00.
Jun 12 14:19:58 2019 FW-NODO1 kernel: [fw4_4];tls_main_send_record_layer_message: mux_soc_result_handler failed
Jun 12 14:19:58 2019 FW-NODO1 kernel: [fw4_4];mux_task_handler: ERROR: Failed to handle task. task=ffffc2011e77b7b0, app_id=1, mux_state=ffffc200d97bfc00.
Jun 12 14:19:58 2019 FW-NODO1 kernel: [fw4_4];mux_soc_result_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc200d97bfc00.
Jun 12 14:19:58 2019 FW-NODO1 kernel: [fw4_4];tls_main_send_record_layer_message: mux_soc_result_handler failed
Jun 12 14:19:59 2019 FW-NODO1 kernel: [fw4_3];mux_task_handler: ERROR: Failed to handle task. task=ffffc200a775bfb0, app_id=1, mux_state=ffffc2027cc1a420.
Jun 12 14:19:59 2019 FW-NODO1 kernel: [fw4_3];mux_soc_result_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc2027cc1a420.
Jun 12 14:19:59 2019 FW-NODO1 kernel: [fw4_3];tls_main_send_record_layer_message: mux_soc_result_handler failed
Jun 12 14:19:59 2019 FW-NODO1 kernel: [fw4_3];mux_task_handler: ERROR: Failed to handle task. task=ffffc200aa947b30, app_id=1, mux_state=ffffc200dffa5810.
Jun 12 14:19:59 2019 FW-NODO1 kernel: [fw4_3];mux_soc_result_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc200dffa5810.
Jun 12 14:19:59 2019 FW-NODO1 kernel: [fw4_3];tls_main_send_record_layer_message: mux_soc_result_handler failed
Jun 12 14:20:00 2019 FW-NODO1 kernel: [fw4_2];mux_task_handler: ERROR: Failed to handle task. task=ffffc2007f670b30, app_id=1, mux_state=ffffc200c6950420.
Jun 12 14:20:00 2019 FW-NODO1 kernel: [fw4_2];mux_soc_result_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc200c6950420.
Jun 12 14:20:00 2019 FW-NODO1 kernel: [fw4_2];tls_main_send_record_layer_message: mux_soc_result_handler failed
Jun 12 14:20:01 2019 FW-NODO1 kernel: [fw4_5];mux_task_handler: ERROR: Failed to handle task. task=ffffc20122ccdb70, app_id=1, mux_state=ffffc20068218810.
Jun 12 14:20:01 2019 FW-NODO1 kernel: [fw4_5];mux_soc_result_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc20068218810.
Jun 12 14:20:01 2019 FW-NODO1 kernel: [fw4_5];tls_main_send_record_layer_message: mux_soc_result_handler failed
Jun 12 14:20:02 2019 FW-NODO1 kernel: [fw4_5];cpas_newconn_ex : called upon something other than tcp SYN. Aborting
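
(For anyone who wants to gauge how often these errors occur and whether they line up with the slow browsing, a rough check in expert mode is to count them per minute with standard grep/awk. This is only a sketch based on the log format above, nothing Check Point-specific; adjust the pattern if your message format differs.)

# count mux task errors per minute in the messages log
grep 'mux_task_handler: ERROR' /var/log/messages | awk '{print $1, $2, substr($3,1,5)}' | uniq -c | tail -20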

 

My question is: does anyone know if it is possible to deactivate the mux? Otherwise I will roll back to R77.30.

My concern is why Check Point sells a poorly tested product and, on top of that, wants to force customers to migrate from R77.30 to R80, knowing that R77.30 is the best version they have had in many years. R80 has too many problems, and even in a cluster the number of failures in the product is frankly impressive.

 

Thanks,

Patricio G.

16 Replies
_Val_
Admin

Did you open an SR with TAC?

Patricio_Gavila
Participant

Dear Valeri,

 

TAC's solution is:

"Please try with the JHF ongoing Take_80. sk137592
There were improvements in that JHF and that error is covered in that patch.
First try this option, before going down to the R80.10 version, as they told me they have the intention to try."

Unacceptable for a production environment. 

 

Regards,

Patricio

PhoneBoy
Admin

Perhaps there is something in this hotfix, but it's not obvious from the release notes.
I'll contact you privately to get the SR number.
PhoneBoy
Admin

I double-checked with TAC and we indeed integrated a fix for a different customer into R80.20 JHF 80 for a similar issue.
G_W_Albrecht
Legend

My question is: how many production systems have you upgraded from R77.30 to R80.20 so far? Or is it this one installation alone that makes you so sure this is a poorly tested product?

Most customers have had some trouble during this transition, as a lot has changed at the core and much now works better - but mostly these issues stem from poor configuration, since there is a lot you have to revise for R80.20!

CCSE CCTE CCSM SMB Specialist
Patricio_Gavila
Participant

Dear,

 

My IT team and I first migrated to R80.20 in an isolated test environment, and everything went well in the tests, but once it went to production the recurring errors appeared in /var/log/messages. The test environment ran for a month without problems, and the physical servers are exactly the same brand, model, and PCI cards. There was no reason to expect so many problems in the production environment. I work at a finance company, so we cannot afford to experiment with risks on the production environment.

 

Regards,

Patricio G.

Jelle_Hazenberg
Collaborator

@Patricio_Gavila are you still facing the errors, or did TAC provide you with a solution? We are facing the same issues. We are also struggling with memory leaks in R80.20 and R80.30...

Thomas_Eichelbu
Advisor

Hello, 

 

Is there any news about this? We see the same thing on a freshly installed R80.30 Take 50 on an open server...

 

best regards
Thomas.

phlrnnr
Advisor

I currently have a TAC case open for a similar issue on R80.30 / Jumbo 111.

I see this in the messages log, repeated continually when HTTPS Inspection is enabled (if HTTPS Inspection is disabled, these messages go away):

Feb 14 11:49:16 2020 <removed> kernel: [fw4_10];tls_main_handle_ingress: malformed alert:
Feb 14 11:49:16 2020 <removed> kernel: [fw4_10]; 0: <00 00 00 00 00 00 00 01 d2 5f 8a fd b3 ac ed f4 ........._......
Feb 14 11:49:16 2020 <removed> kernel: [fw4_10]; 16: 0f 50 49 39 a7 d3 8b eb 0c 06> .PI9......
Feb 14 11:49:16 2020 <removed> kernel: [fw4_10];
Feb 14 11:49:16 2020 <removed> kernel: [fw4_10];mux_task_handler: ERROR: Failed to handle task. task=ffffc202095362b0, app_id=1, mux_state=ffffc2012eabc6f0.
Feb 14 11:49:16 2020 <removed> kernel: [fw4_10];mux_read_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc2012eabc6f0.
Feb 14 11:49:16 2020 <removed> kernel: [fw4_10];mux_active_read_handler_cb: ERROR: Failed to forward data to Mux.
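
(A quick way to double-check the HTTPS Inspection correlation is to count how many of these errors each firewall worker instance logs before and after toggling the blade. This is just a rough expert-mode one-liner over the format above, not an official diagnostic.)

# count mux task errors per fw worker instance
grep 'mux_task_handler: ERROR' /var/log/messages | grep -o 'fw4_[0-9]*' | sort | uniq -c | sort -rn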

TAC says these messages are cosmetic and there is a hotfix that can be applied to get rid of the error messages.

phlrnnr
Advisor

I've also been told by TAC that the fix is included in R80.30 JHFA 140, even though it is not listed in the 'resolved issues' section.

Kathleen_Murphy
Participant

We are on R80.30 JHF215 and still seeing this same syslog message.

abihsot__
Advisor

Did this ever go away in any incarnation of the gateway? These messages are still present on R81.10, which is a long way from the R80.20 and R80.30 mentioned in this thread.

Tal_Paz-Fridman
Employee

Can you paste the exact messages you are receiving?

Thanks

abihsot__
Advisor

Thanks for the reply. To an untrained eye they look the same. R81.10, latest JHF:

 

Oct 4 15:51:05 2023 HOSTNAME kernel: [fw4_2];ws_mux_host_only_active_finalize_read_handler: ERROR: stream[1] is empty. mux_stat=1.
Oct 4 15:51:05 2023 HOSTNAME kernel: [fw4_2];ws_mux_read_handler_from_main_ex: ERROR: Finalize callback failed. cdir=2, mux_stat=1.
Oct 4 15:51:05 2023 HOSTNAME kernel: [fw4_2];ws_mux_read_handler_from_main: ERROR: Failed to call read handler. ws_connection=ffffc90082a25370.
Oct 4 15:51:05 2023 HOSTNAME kernel: [fw4_2];mux_task_handler: ERROR: Failed to handle task. task=ffffc90096862558, app_id=4 (WS), mux_state=ffffc9008e213030, curr_side 0, prev_side 0.
Oct 4 15:51:05 2023 HOSTNAME kernel: [fw4_2];mux_read_handler: ERROR: Failed to handle task queue. mux_opaque=ffffc9008e213030.
Oct 4 15:51:05 2023 HOSTNAME kernel: [fw4_2];mux_active_read_handler_cb: ERROR: Failed to forward data to Mux.

 

enabled_blades
fw urlf appi identityServer SSL_INSPECT content_awareness mon
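
(One rough way to compare these with the older logs in this thread is to break the errors down by app_id: the R80.x messages above show app_id=1, while these show app_id=4 (WS). A quick expert-mode count, assuming the same /var/log/messages format shown here:)

# count mux task errors per app_id
grep 'mux_task_handler: ERROR' /var/log/messages | grep -o 'app_id=[0-9]*' | sort | uniq -c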

Tal_Paz-Fridman
Employee

Hi again,

I've talked to the relevant owner in R&D and we are aware of this issue. It will be handled in upcoming JHFs.

Thanks

abihsot__
Advisor

Thanks for the update! You saved me one support ticket 🙂

