Simon_Drapeau
Participant

R80.10 Take 42 - SmartEvent - log_indexer keeps crashing (often at 100% CPU usage)

R80.10 distributed architecture

  • 2 x 15600 appliances in a VSX VSLS cluster (one VS currently active, 7 more to come)
  • SMART-1 3150 appliance (management server)

Since last week, SmartEvent has not been able to index all the logs in time. Under heavy load, indexing runs 3-4 hours behind real time.

DIAGNOSTIC:

When I open the logs just after the process crashes, I get a detailed error message about log_indexer:

  • A message in the file /opt/Cprt-R80/log_indexer.elg indicates that an error might have occurred. The message is: [log_indexer 13211 4053793680]@fw_name[Date time] SolrClient::Send: connection failure with 127.0.0.1:8210 (curl error: )(curl error number:56). This message indicates that the indexer process (log_indexer) couldn't send the logs to the Log Database engine.
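
For anyone hitting the same message, here is a minimal sketch of two quick checks: how often the failure appears in the log, and whether anything is accepting connections on the port named in the message. The function names `count_solr_failures` and `port_open` are made up for illustration, the log path is the one given in the post, and `port_open` assumes bash (it relies on bash's `/dev/tcp` redirection). For reference, curl error number 56 is libcurl's `CURLE_RECV_ERROR`, a failure while receiving network data.

```shell
# Count "connection failure" entries in a log_indexer.elg file.
# The match string is copied from the message quoted in the post.
count_solr_failures() {
    grep -c 'SolrClient::Send: connection failure' "$1"
}

# Return success if something accepts TCP connections on host:port.
# Uses bash's /dev/tcp redirection, so it needs bash rather than plain sh.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Example (path as given in the post; adjust to your installation):
#   count_solr_failures /opt/Cprt-R80/log_indexer.elg
#   port_open 127.0.0.1 8210 && echo "listening" || echo "not reachable"
```

If the port check fails right after a crash, that would point at the Log Database engine side rather than at log_indexer itself, but that is only a guess to narrow things down.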

I tried raising the priority of this process so that the system would favor it.

The default nice value for the log_indexer process is 19 (the lowest priority). I raised the process to nice value 0.

  • renice -n 0 -p <pid number>
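
A small sketch of the same step without looking up the PID by hand. It assumes the process shows up under the exact name `log_indexer` and that `pgrep` is available; the helper name `renice_by_name` is hypothetical. Lowering a nice value (raising priority) normally requires root.

```shell
# Apply a nice value to every process with the given exact name.
# Usage: renice_by_name <process-name> <nice-value>
renice_by_name() {
    rn_pids=$(pgrep -x "$1") || {
        echo "no process named $1" >&2
        return 1
    }
    for rn_pid in $rn_pids; do
        # Same command as in the post, applied to each matching PID.
        renice -n "$2" -p "$rn_pid"
    done
}

# Example (run as root):
#   renice_by_name log_indexer 0
```

Note that the setting is lost each time the process restarts, so with a process that crashes regularly it has to be reapplied after every restart.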

No significant improvement.

Ticket opened. TAC said:

  • The log_indexer process crashes every 5-20 minutes.
  • log_indexer.elg shows “I’m sleep” / connection failed to 127.0.0.1.
  • All other processes appear to be working correctly.
  • log_indexer is consuming 100% CPU.

TROUBLESHOOTING:

  • Previous tickets referenced all point to a fresh install.
  • R&D will be our next step.
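
To put numbers on the crash interval, one rough approach is to record a timestamp whenever the PID of the process changes and then compute the gaps. This is only a sketch: `watch_restarts` and `restart_gaps` are hypothetical helper names, the 10-second polling interval is arbitrary, and the process name `log_indexer` is assumed to match what `pgrep` sees.

```shell
# Print an epoch timestamp each time the oldest PID of the named process
# changes (the first line is the initial observation). Runs until interrupted.
watch_restarts() {
    wr_last=""
    while sleep 10; do
        wr_pid=$(pgrep -x -o "$1")
        if [ -n "$wr_pid" ] && [ "$wr_pid" != "$wr_last" ]; then
            date +%s
            wr_last=$wr_pid
        fi
    done
}

# Read one epoch timestamp per line and print the gap to the previous
# timestamp in whole minutes.
restart_gaps() {
    awk 'NR > 1 { printf "%d\n", ($1 - prev) / 60 } { prev = $1 }'
}

# Example:
#   watch_restarts log_indexer > /tmp/restarts.txt   # stop with Ctrl-C
#   restart_gaps < /tmp/restarts.txt
```

A list of gaps like this is easy to attach to a support ticket alongside the .elg file.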

Any hints regarding this issue? I am out of ideas.

Regards,

Simon 

1 Solution

Accepted Solutions
Simon_Drapeau
Participant

Final response from TAC: fresh installation (on the same appliance)

STEPS:

  • Before the fresh installation, take a snapshot of the Management server, collect the log files, and take a migrate export.
  • Perform the fresh installation, then run migrate import (importing our migrate export database).
  • Follow "sk98894 - Run SmartEvent Offline Jobs for multiple log files".
  • Upgrade to the latest Jumbo Hotfix (Take 56).
  • Enable "Logs and Monitoring" only.
  • Enable log indexing.
  • Enable "SmartEvent and SmartEvent Correlation unit".

MONITORING:

  • Run debug if the problem still occurs.


9 Replies
Neil_ZInk
Collaborator

Simon

I have a similar setup and the same issue. We noticed that some automated reports were causing the main problem. We stopped using Network activity and things settled down. I too am waiting for a final response from support.

Simon_Drapeau
Participant

Great to see your post about a similar issue. Please update the thread with the final response from support when you get it.

Luisnego
Contributor
Thanks Simon, you helped me solve this problem.
Danny
Champion

I recommend upgrading to the latest GA version: R80.10 Jumbo Hotfix (Take 56).

Simon_Drapeau
Participant

I agree about installing the latest Jumbo Hotfix in the future, but I found no relevant information or improvement in it concerning the log_indexer process.

Garrett_DirSec
Advisor

I've been on various customer escalations (specific to gateway issues) where the final statement from R&D has been "upgrade to R80.30. R80.10 is inferior for multiple reasons."

Granted, this issue concerns SmartEvent, but I can't help wondering whether you can run a newer version (for example, an R80.30 SmartEvent install) against a down-rev R80.10 SMS.

Dror_Aharony
Employee Alumnus

Yes, you can.

A SmartEvent server can run the same version as its Management server or a newer one.

Garrett_DirSec
Advisor
Thanks for the quick answer. We appreciate it.