Hi,
Memory seems to be OK:
Virtual System Capacity Summary:
Physical memory used: 42% (11446 MB out of 27074 MB) - below watermark
Kernel memory used: 12% (3484 MB out of 27074 MB) - below watermark
Virtual memory used: 6% (1636 MB out of 27074 MB) - below watermark
I've noticed the following pattern in /var/log/messages:
1- Before trying the policy install: nothing new in messages.
2- After the install fails with the message 'Installation failed. Reason: TCP connectivity failure ( port = 18191 )( IP = 198.18.0.20 )[ error no. 10 ]':
A lot of messages like these appear for approximately 8-10 minutes.
Sep 26 17:09:24 2024 mrtdca01vsxfw spike_detective: spike info: type: thread, thread id: 23018, thread name: cpd, start time: 26/09/24 17:09:17, spike duration (sec): 6, initial cpu usage: 99, average cpu usage: 99, perf taken: 1
Sep 26 17:09:53 2024 mrtdca01vsxfw spike_detective: spike info: type: cpu, cpu core: 5, top consumer: cpd, start time: 26/09/24 17:09:46, spike duration (sec): 6, initial cpu usage: 100, average cpu usage: 100, perf taken: 1
Sep 26 17:09:53 2024 mrtdca01vsxfw spike_detective: spike info: type: thread, thread id: 23018, thread name: cpd, start time: 26/09/24 17:09:46, spike duration (sec): 6, initial cpu usage: 99, average cpu usage: 99, perf taken: 0
Sep 26 17:10:04 2024 mrtdca01vsxfw spike_detective: spike info: type: thread, thread id: 23018, thread name: cpd, start time: 26/09/24 17:09:58, spike duration (sec): 6, initial cpu usage: 100, average cpu usage: 100, perf taken: 0
Sep 26 17:10:10 2024 mrtdca01vsxfw spike_detective: spike info: type: cpu, cpu core: 3, top consumer: cpd, start time: 26/09/24 17:09:52, spike duration (sec): 17, initial cpu usage: 95, average cpu usage: 74, perf taken: 0
And the node is marked as LOST in the MDS.
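To check whether cpd really is the only process behind these spikes, the spike_detective lines can be tallied per process. This is a minimal sketch that assumes the line layout shown in the excerpt above; the function name summarize_spikes and the regular expression are illustrative only and are not part of any Check Point tool.

```python
import re
from collections import defaultdict

# Pattern written against the spike_detective lines quoted above (assumed layout).
SPIKE_RE = re.compile(
    r"spike_detective: spike info: type: (?P<type>\w+), "
    r"(?:cpu core: (?P<core>\d+), top consumer: (?P<consumer>[^,]+)"
    r"|thread id: (?P<tid>\d+), thread name: (?P<thread>[^,]+)), "
    r"start time: (?P<start>[^,]+), "
    r"spike duration \(sec\): (?P<duration>\d+), "
    r"initial cpu usage: (?P<initial>\d+), "
    r"average cpu usage: (?P<average>\d+)"
)


def summarize_spikes(lines):
    """Return {process: {"count", "total_sec", "max_avg"}} for spike lines."""
    summary = defaultdict(lambda: {"count": 0, "total_sec": 0, "max_avg": 0})
    for line in lines:
        m = SPIKE_RE.search(line)
        if not m:
            continue  # not a spike_detective line, or an unexpected layout
        # "cpu" spikes name a top consumer, "thread" spikes name the thread.
        name = m.group("consumer") or m.group("thread")
        entry = summary[name]
        entry["count"] += 1
        entry["total_sec"] += int(m.group("duration"))
        entry["max_avg"] = max(entry["max_avg"], int(m.group("average")))
    return dict(summary)


if __name__ == "__main__":
    with open("/var/log/messages", errors="replace") as fh:
        for name, stats in summarize_spikes(fh).items():
            print(name, stats)
```

If every entry in the result is cpd, that supports the idea that the daemon itself is busy rather than being starved by something else.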
3- After 8-10 minutes I see this in messages:
Sep 26 17:11:04 2024 mrtdca01vsxfw xpand[14067]: show_asset CDK: asset_get_proc started.
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: init sensors
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Using /etc/hw_info/sensors.xml as active sensors data file (for thresholds and translation data)
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Loading driver name [nct7904]
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Loading driver name [lm63]
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Loading driver name [pac1014a]
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Loading driver name [i2c-i801]
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU0 Vcore
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU1 Vcore
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU0 DDR4-1
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU0 DDR4-2
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU1 DDR4-1
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU1 DDR4-2
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor VCC 12V
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor VCC 3V
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor VCC 5V
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor 3VSB
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor 5VSB
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor VBAT
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU0 Temp
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU1 Temp
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor Intake Temp
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor Outlet Temp
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor System Fan 1
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor System Fan 2
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor System Fan 3
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor System Fan 4
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Checking whether to add Power supply sensors
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor BIOS
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU0 Vcore
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU1 Vcore
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU0 DDR4-1
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU0 DDR4-2
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU1 DDR4-1
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU1 DDR4-2
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor VCC 12V
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor VCC 3V
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor VCC 5V
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor 3VSB
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor 5VSB
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor VBAT
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU0 Temp
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor CPU1 Temp
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor Intake Temp
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor Outlet Temp
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor System Fan 1
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor System Fan 2
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor System Fan 3
Sep 26 17:11:04 2024 mrtdca01vsxfw cpd: Adding sensor System Fan 4
Sep 26 17:11:04 2024 mrtdca01vsxfw xpand[14067]: show_asset CDK: asset_get_proc started.
The node is OK again in the MDS and policy install now works.
It seems that the CPD daemon restarts, and after that the installation works.
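To put numbers on this sequence, the timestamps of the cpd spikes can be compared with the first "cpd: init sensors" line that follows them. This is only a sketch under assumptions: it treats "cpd: init sensors" as the sign that cpd has re-initialised (my reading of the log, not something confirmed), it expects the timestamp layout shown in the excerpts, and the function name cpd_spike_window is made up for illustration. Month names are parsed with %b, so it assumes an English locale.

```python
import re
from datetime import datetime

# "Sep 26 17:09:24 2024 <hostname> <message>" as seen in the excerpts above.
LINE_RE = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) \S+ (.*)$")
# Spike lines that name cpd as the thread or as the top CPU consumer.
SPIKE_CPD_RE = re.compile(r"(?:thread name|top consumer): cpd\b")


def cpd_spike_window(lines):
    """Return (first_spike, last_spike, first_init_after_spikes).

    Each element is a datetime, or None if no matching line was found.
    """
    first_spike = last_spike = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%b %d %H:%M:%S %Y")
        msg = m.group(2)
        if msg.startswith("spike_detective:") and SPIKE_CPD_RE.search(msg):
            if first_spike is None:
                first_spike = stamp
            last_spike = stamp
        elif msg.startswith("cpd: init sensors") and last_spike is not None:
            # Assumed marker for cpd re-initialising after the spikes.
            return first_spike, last_spike, stamp
    return first_spike, last_spike, None


if __name__ == "__main__":
    with open("/var/log/messages", errors="replace") as fh:
        first, last, init = cpd_spike_window(fh)
    print("first spike:", first, "last spike:", last, "cpd init:", init)
```

Running this against the full log for each failed install would show whether the gap between the last spike and the init line is consistent from one failure to the next.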