Solved: R80.10 take 42 - Smartevent - log_indexer always crashed (often 100% CPU usage)

Simon_Drapeau · ‎2017-12-10

R80.10 distributed architecture

2 x 15600 appliances in a VSX VSLS cluster (one VS currently active, 7 more coming)
SMART-1 3150 appliance (management server)

Since last week, SmartEvent has not been able to index all the logs in time. Under heavy load, indexing runs 3-4 hours behind real time.

DIAGNOSTIC :

Just after the process crashes, when the logs are loaded, I get this verbose error message about log_indexer:

A message in the file: /opt/Cprt-R80/log_indexer.elg indicates that an error might have occurred. The message is: [log_indexer 13211 4053793680]@fw_name[Date time] SolrClient::Send: connection failure with 127.0.0.1:8210 (curl error: )(curl error number:56). This message indicates that the indexer process (log_indexer) couldn't send the logs to the Log Database engine.
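To see how often this failure occurs, the log file can be searched for the message text. Below is a minimal sketch, not an official Check Point procedure: the function names count_solr_failures and last_solr_failures are made up for illustration, and the path to log_indexer.elg has to be supplied by the caller. For reference, curl error number 56 is libcurl's CURLE_RECV_ERROR (failure receiving network data), which would be consistent with the process listening on 127.0.0.1:8210 dropping the connection.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a log_indexer.elg file.
# $1 is the path to the log file; pass the real location on your system.

# Print the number of lines that record a Solr connection failure.
count_solr_failures() {
    # -F: fixed-string match, -c: print the count of matching lines.
    grep -c -F 'SolrClient::Send: connection failure' "$1"
}

# Print the most recent failure lines (default: last 5), timestamps included.
# $2 is an optional number of lines to show.
last_solr_failures() {
    grep -F 'SolrClient::Send: connection failure' "$1" | tail -n "${2:-5}"
}
```

Usage would look like `count_solr_failures /path/to/log_indexer.elg`. Comparing the timestamps printed by last_solr_failures with the times the process crashed may show whether the two events line up.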

I tried changing the priority of this process so that the system would favor it.

The default nice value of the log_indexer process is 19, the lowest scheduling priority. I raised it to 0, the best priority available to me:

renice -n 0 -p <pid number>

No significant improvement.
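To confirm that a renice really took effect, the nice value can be read back from the kernel. This is a minimal sketch assuming a Linux /proc filesystem; nice_of is a hypothetical helper name, not a Check Point or system command. Nice values range from -20 (highest priority) to 19 (lowest), and values below 0 normally require root.

```shell
#!/bin/sh
# Hypothetical helper: print the nice value of a process.
# $1 is the PID to inspect.
nice_of() {
    # /proc/<pid>/stat holds "pid (comm) state ..." on one line.
    # Strip everything up to the last ") " so a command name containing
    # spaces cannot shift the fields; nice is field 19 of the full line,
    # which becomes field 17 after the first two fields are removed.
    sed 's/.*) //' "/proc/$1/stat" | awk '{print $17}'
}

# Example, assuming pidof is available and log_indexer is running:
# for pid in $(pidof log_indexer); do echo "$pid: $(nice_of "$pid")"; done
```

If the value still reads 19 after a crash and restart, the new process has come back with its default priority and the renice would have to be repeated.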

Ticket opened, TAC said :

Log_indexer process crashing every 5-20 minutes.
Log_indexer.elg shows “I’m sleep” / connection failed to 127.0.0.1
All other processes appear to be working correctly
Log_Indexer consuming 100% CPU
TROUBLESHOOTING:
Referenced previous tickets all point to fresh install.
R&D will be our next step.
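The crash interval noted above can be measured by sampling the PID of the process and logging each change. The following is a rough sketch under stated assumptions: watch_restarts is a hypothetical name, and it relies on pgrep being installed on the management server.

```shell
#!/bin/sh
# Hypothetical helper: print a timestamped line whenever the PID of a
# named process changes, which indicates a start, stop, or restart.
# $1: process name, $2: seconds between samples, $3: number of samples.
watch_restarts() {
    name=$1
    interval=${2:-30}
    samples=${3:-10}
    last=""
    i=0
    while [ "$i" -lt "$samples" ]; do
        # -x: exact name match, -o: oldest matching process only.
        # The result is empty if the process is not running.
        cur=$(pgrep -o -x "$name" || true)
        if [ "$cur" != "$last" ]; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') $name pid: ${last:-none} -> ${cur:-none}"
            last=$cur
        fi
        i=$((i + 1))
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `watch_restarts log_indexer 30 240` would sample every 30 seconds for two hours. The first line printed only records the initial PID; each later line marks a point where the process went away or came back.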

Any hints regarding this issue? I'm out of ideas.

regards

Simon

Simon_Drapeau · ‎2018-01-11

Final response from TAC : Fresh installation (on same appliance)

STEPS :

Before the fresh installation, take a snapshot of the Management server, collect the log files, and run a migrate export.

Perform the fresh installation, then run a migrate import of the exported database.

Follow "sk98894 - Run SmartEvent Offline Jobs for multiple log files"
Upgrade to the latest Jumbo Hotfix (Take 56)
Enable "Logs and Monitoring" only
Enable log indexing

Enable "SmartEvent and SmartEvent Correlation unit"

MONITORING:

*****Run debug if the problem still occurs****

Neil_ZInk · ‎2017-12-12

Simon

I have a similar setup and the same issue. We noticed that some automated reports were the main cause. We stopped using Network Activity and things settled down. I am also waiting for a final response from support.

Simon_Drapeau · ‎2017-12-13

Good to see your post about a similar issue. Please update the thread with the final response from support when you get it.

Luisnego · ‎2019-11-07

Thanks Simon, you helped me solve this problem.

Danny · ‎2017-12-12

I recommend upgrading to the latest GA version: R80.10 Jumbo Hotfix (Take 56).

Simon_Drapeau · ‎2017-12-13

I agree that we should install the latest Jumbo Hotfix in the future, but I see no relevant information or improvement in it concerning the log_indexer process.

Garrett_DirSec · ‎2019-11-11

I've been on various customer escalations (specific to gateway issues) where the final statement from R&D has been "upgrade to R80.30. R80.10 is inferior for multiple reasons.".

Granted, this issue is SmartEvent but I can't help but wonder if you can run a newer version (example: R80.30 SmartEvent install) against a down-rev R80.10 SMS?

Dror_Aharony · ‎2019-11-11

Yes, you can.

A SmartEvent server can run the same version as its Management server or a newer one.

Garrett_DirSec · ‎2019-11-11

Thanks for the quick answer. We appreciate it.
