R80.10 distributed architecture
- 2 x 15600 appliances in a VSX VSLS cluster (one VS currently active, 7 more to come)
- SMART-1 3150 appliance (management server)
Since last week, SmartEvent has not been able to index all the logs in time. Under heavy load, indexing lags 3-4 hours behind real time.
DIAGNOSTIC:
I get a verbose error message about log_indexer when the logs are loaded just after the process crashes:
- A message in the file /opt/Cprt-R80/log_indexer.elg indicates that an error might have occurred. The message is: [log_indexer 13211 4053793680]@fw_name[Date time] SolrClient::Send: connection failure with 127.0.0.1:8210 (curl error: )(curl error number:56). This message indicates that the indexer process (log_indexer) couldn't send the logs to the Log Database engine.
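For reference, curl error number 56 is CURLE_RECV_ERROR (failure receiving network data), so the connection to 127.0.0.1:8210 was probably opened and then dropped by the other side rather than refused outright. To see how often this happens, the failures in the .elg file can be counted per curl error number. The following is only a rough sketch: count_curl_errors is a made-up helper name, and it assumes every failure line has the same shape as the message quoted above.

```shell
#!/bin/sh
# Count SolrClient connection failures in a log_indexer .elg file,
# grouped by curl error number.
# count_curl_errors is a hypothetical helper, not a Check Point tool.
count_curl_errors() {
    # $1: path to the .elg file
    grep 'SolrClient::Send: connection failure' "$1" \
        | sed -n 's/.*curl error number:\([0-9][0-9]*\).*/\1/p' \
        | sort -n | uniq -c \
        | awk '{ print "curl error " $2 ": " $1 " occurrence(s)" }'
}
```

Usage, with the file path given above: count_curl_errors /opt/Cprt-R80/log_indexer.elg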
I tried raising the priority of this process so that the system would favor it.
The default nice value for the log_indexer process is 19 (the lowest priority). I changed it to 0 (normal priority):
- renice -n 0 -p <pid number>
No significant improvement.
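One thing worth checking here: renice applies to a single PID, so if log_indexer crashes and is restarted, the new process would presumably come back at its default nice value and the change would be lost. A small sketch to verify the nice value before and after the change, assuming a procps-style ps and that pidof returns a single PID (nice_of is a made-up helper name):

```shell
#!/bin/sh
# Print the nice value of a PID (19 = lowest priority, -20 = highest).
# nice_of is a hypothetical helper name.
nice_of() {
    ps -o ni= -p "$1" | tr -d ' '
}

# Usage sketch (run as root; lowering a nice value needs privileges):
#   pid=$(pidof log_indexer)   # assumes exactly one matching process
#   nice_of "$pid"             # value before the change
#   renice -n 0 -p "$pid"
#   nice_of "$pid"             # value after the change
```

Running nice_of again after the next crash would show whether the restarted process kept the new value or not.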
I opened a ticket. TAC said:
- Log_indexer process crashing every 5-20 minutes.
- Log_indexer.elg shows “I’m sleep” / connection failed to 127.0.0.1
- All other processes appear to be working correctly
- Log_indexer consuming 100% CPU
TROUBLESHOOTING:
- The previous tickets referenced all point to a fresh install.
- R&D will be our next step.
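To put numbers on the crash interval and CPU usage TAC mentions, the process can be sampled periodically. This is only a sketch under assumptions: sample_proc is a made-up helper name, the process is assumed to show up under the exact name log_indexer, and the output file path in the usage comment is just an example.

```shell
#!/bin/sh
# Print one sample line "<epoch seconds> <pid> <%cpu>" for the oldest
# process with the given exact name, or "<epoch seconds> - -" if none runs.
# Note: ps reports %cpu as CPU time divided by the process lifetime,
# not as an instantaneous value.
# sample_proc is a hypothetical helper name.
sample_proc() {
    pid=$(pgrep -o -x "$1" || true)
    if [ -n "$pid" ]; then
        cpu=$(ps -o %cpu= -p "$pid" | tr -d ' ')
        echo "$(date +%s) $pid $cpu"
    else
        echo "$(date +%s) - -"
    fi
}

# Usage sketch: one sample every 30 seconds. A change in the PID column
# marks a restart, so the gaps between changes give the crash interval.
#   while true; do sample_proc log_indexer; sleep 30; done >> /var/tmp/log_indexer_samples.txt
```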
Any hints regarding this issue? I am out of ideas.
Regards,
Simon