Alexander_Wilke
Advisor

Custom Metrics - Failed to execute script - /bin/bashexceededtheCPUthreshold

Hello,

I created a script (or AI did it) to get the information from "orch_stat -p" of my MHO 140.

 

This is the script:

#!/bin/bash

# =========================
# RX Metrics Script for MHO
# =========================

# Load Maestro environment profiles (required for Maestro scripts)
source /opt/CPshrd-R81.20/tmp/.CPprofile.sh
. /opt/CPotlpAgent/cs_data_handler_is.bash

# Check if this is an MHO system; exit if not
if [[ ! -f /etc/.scalable_platform_mho ]]; then
    script_exit "system is not an MHO" 0
fi

# Use process substitution to avoid subshells
while IFS= read -r line; do
    # Read the 12 tab-separated fields into named variables
    IFS=$'\t' read -r Physical_Port Interface_Name Type SG QSFP_Mode Admin_State Link_State Transceiver_State Operating_Speed MTU RX_Frames TX_Frames <<< "$line"

    # Skip lines with missing critical fields (should not happen, but for safety)
    if [[ -z "$Physical_Port" || -z "$RX_Frames" ]]; then
        continue
    fi

    # Set RX_Frames as the metric value
    set_ot_object new value "${RX_Frames}"

    # Set all other columns as labels (explicitly, order is guaranteed)
    set_ot_object last label "Physical_Port"      "${Physical_Port}"
    set_ot_object last label "Interface_Name"     "${Interface_Name}"
    set_ot_object last label "Type"               "${Type}"
    set_ot_object last label "SG"                 "${SG}"
    set_ot_object last label "QSFP_Mode"          "${QSFP_Mode}"
    set_ot_object last label "Admin_State"        "${Admin_State}"
    set_ot_object last label "Link_State"         "${Link_State}"
    set_ot_object last label "Transceiver_State"  "${Transceiver_State}"
    set_ot_object last label "Operating_Speed"    "${Operating_Speed}"
    set_ot_object last label "MTU"                "${MTU}"
done < <(
    orch_stat -p | awk '
    BEGIN {FS="|"}
    /^\+/ { next }                              # Skip separator lines
    /^\|[ ]*Physical Port/ { next }             # Skip header line
    /^\|/ {
        if (NF != 14) next                      # Only process lines with 14 fields (12 data fields)
        row=""
        for(i=2; i<=13; i++) {                  # Extract fields 2 to 13 (data columns)
            f=$i
            gsub(/^[ \t]+|[ \t]+$/, "", f)      # Trim leading/trailing whitespace
            row = row f "\t"
        }
        sub(/\t$/, "", row)                     # Remove trailing tab
        print row
    }
    '
)

# Exit successfully
script_exit "Finished running" 0
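
One detail in the script above may be worth checking: tab is an IFS whitespace character in bash, so `IFS=$'\t' read` treats a run of adjacent tabs as a single delimiter. If any column of a row is empty, the remaining values would shift one label to the left. The sketch below uses a hypothetical row (not real `orch_stat -p` output) to show the effect, and one possible workaround that assumes the ASCII unit separator (0x1f) never occurs in the data:

```bash
#!/bin/bash
# Hypothetical row: the third field (Type) is empty, so two tabs are adjacent.
tab=$'\t'
row="1/1/1${tab}eth1-05${tab}${tab}1"

# Tab is IFS whitespace, so the two adjacent tabs collapse into one delimiter
# and the SG value ends up in the Type variable.
IFS=$tab read -r port name type sg <<< "$row"
tab_result="type='${type}' sg='${sg}'"
echo "tab separator:  ${tab_result}"

# The unit separator (0x1f) is not whitespace, so the empty field is preserved.
sep=$'\x1f'
IFS=$sep read -r port name type sg <<< "${row//$tab/$sep}"
us_result="type='${type}' sg='${sg}'"
echo "unit separator: ${us_result}"
```

If this turns out to matter for real output, the awk part could emit the same separator instead of `"\t"`, and the `read` in the loop would use it as IFS.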

 

Running it manually in the shell works. It takes approximately 10 seconds to finish, which is a long time. I am still testing, so maybe it can be improved, maybe not.
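
To see whether those 10 seconds are spent waiting or actually burning CPU, the run can be split into wall-clock and CPU time with bash's built-in `time` keyword and `TIMEFORMAT`. This is only a minimal sketch: `measure` is a hypothetical helper name, and `sleep 0.2` is a stand-in workload so the snippet runs anywhere. On the MHO it could be pointed at `orch_stat -p` alone and then at the whole script to compare the two.

```bash
#!/bin/bash
# measure: run a command and print "real user sys" in seconds (hypothetical helper).
measure() {
    local TIMEFORMAT='%R %U %S'          # wall clock, user CPU, system CPU
    # The command's own output is discarded; only the timing line is captured.
    { time "$@" >/dev/null 2>&1; } 2>&1
}

# Stand-in workload: sleep uses wall-clock time but almost no CPU.
# On the MHO this could be: measure orch_stat -p
read -r real user sys < <(measure sleep 0.2)
echo "real=${real}s user=${user}s sys=${sys}s"
```

A large `real` with small `user` and `sys` would suggest the run is mostly waiting, while large `user` or `sys` values would point at CPU-heavy work.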

However, when I add it with "sklnctl otlp add" and it runs after a service restart (it should run every 60 s), I get this error:

[Expert@yyyy-mho1_01:0]# tail -n 1 /opt/CPotlpAgent/otlp_agent.log
ts=2025-06-16T00:07:43.461+02:00 caller=level.go:63 ts=2025-06-16T00:07:43.461+02:00 caller=level.go:63 level=info msg="Collector: /config/skyline_custom_metrics/skyline_custom_orch_stat_p_rxhas disabled due to: " Script:/var/log/CPotlpAgent/backup/scripts/skyline_custom_orch_stat_p_rx.shchangethestatetodisableddueto:TheCommand:/bin/bashexceededtheCPUthreshold=(MISSING)


It looks like a CPU limit is in place. Where can I check it? How can I adjust or disable it? Are there any alternatives?
