Toolmaker
Participant

Skyline + VSX: "OpenTelemetry Components are not up yet"

Hi,

We are trying to use Skyline on R81.10, following sk178566. Our version is:

Product version Check Point Gaia R81.10
OS build 335
OS kernel version 3.10.0-957.21.3cpx86_64
OS edition 64-bit

 

Running '/opt/CPotelcol/REST.py --set_open_telemetry "$(cat payload.json)"' throws an

    Exception: OpenTelemetry Components are not up yet

This seems to be due to the result from '/opt/CPviewExporter/CPviewExporterCli.sh show':

{
"active_after_reboot":"false",
"status":"Agents are down"
}

Looking into CPviewExporterCli.sh, we see that the "show" parameter branches into the function product_status_json.

I assume that this function is meant to check each virtual system (looping over $(vslist)) and return either "Agents are down", "All agents are up" or "Agents are partially up". To that end, it sets the state variables FOUND_ONE_NON_ACTIVE and FOUND_ACTIVE:

function product_status_json() {
  for VS in $(vslist); do
    STAT=$(${COMPONENT_DIR}/otlp_wd.bash -o stat ${VS})
    FOUND_ONE_NON_ACTIVE=false
    FOUND_ACTIVE=false
    if [[ ${STAT} == "Agent is not running" ]]; then
      FOUND_ONE_NON_ACTIVE=true
    elif [[ ${STAT} == "Agent is running" ]]; then
      FOUND_ACTIVE=true
    fi
    if ${FOUND_ONE_NON_ACTIVE} && ${FOUND_ACTIVE}; then
      RC_EXPORTER="Agents are partially up"
    elif ${FOUND_ONE_NON_ACTIVE}; then
      RC_EXPORTER="Agents are down"
    elif ${FOUND_ACTIVE}; then
      RC_EXPORTER="All agents are up"
    else
      RC_EXPORTER="Unknown"
    fi
  done

  echo "{"
  if [[ $(is_product_active) -eq 1 ]]; then
    echo -e "\t\"active_after_reboot\":\"true\","
  else
    echo -e "\t\"active_after_reboot\":\"false\","
  fi
  echo -e "\t\"status\":\"${RC_EXPORTER}\""
  echo "}"
}

However, the state variables get set to default values inside the for loop (instead of before), and the evaluation logic setting RC_EXPORTER also runs inside the loop (instead of after). I wonder if this is really intended?

As a consequence, the final result only depends on the VS with the highest number, not on any of the other VSs; and "Agents are partially up" will never be a result.

At least, that is my interpretation... I might also have misunderstood this function's intent or how it works.
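
To make the effect easier to see outside a gateway, here is a minimal sketch that copies only the loop structure of the function. The `vslist` and `stat_for` functions below are stand-in stubs invented for illustration (they are not the real Check Point commands), and the three-VS layout is an assumption.

```shell
#!/bin/bash
# Stub (assumption): three virtual systems, where VS 2 is a virtual switch.
vslist() { echo "0 1 2"; }

# Stub (assumption) standing in for "otlp_wd.bash -o stat <VS>":
# the agent runs on VS 0 and VS 1 only.
stat_for() {
  if [[ $1 -eq 2 ]]; then
    echo "Agent is not running"
  else
    echo "Agent is running"
  fi
}

# Same structure as the loop in product_status_json: both flags are reset
# and evaluated on every iteration, so only the last VS decides the result.
status_as_shipped() {
  local VS STAT FOUND_ONE_NON_ACTIVE FOUND_ACTIVE RC_EXPORTER=""
  for VS in $(vslist); do
    STAT=$(stat_for "${VS}")
    FOUND_ONE_NON_ACTIVE=false
    FOUND_ACTIVE=false
    if [[ ${STAT} == "Agent is not running" ]]; then
      FOUND_ONE_NON_ACTIVE=true
    elif [[ ${STAT} == "Agent is running" ]]; then
      FOUND_ACTIVE=true
    fi
    if ${FOUND_ONE_NON_ACTIVE} && ${FOUND_ACTIVE}; then
      RC_EXPORTER="Agents are partially up"
    elif ${FOUND_ONE_NON_ACTIVE}; then
      RC_EXPORTER="Agents are down"
    elif ${FOUND_ACTIVE}; then
      RC_EXPORTER="All agents are up"
    else
      RC_EXPORTER="Unknown"
    fi
  done
  echo "${RC_EXPORTER}"
}

status_as_shipped   # prints "Agents are down" although two of three agents run
```

With these stubs, two of three agents are running, yet the result is "Agents are down", because the flags from VS 0 and VS 1 are discarded before VS 2 is evaluated.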

 

In our setup, the highest-numbered VS is a virtual switch, for which the check in line 3 of product_status_json()

   otlp_wd.bash -o stat ${VS}

returns "not running", so the above /opt/CPotelcol/REST.py call will always fail.

 

After that lengthy prologue, I wonder:

- Is the above behaviour of CPviewExporterCli.sh correct?
- Can/should we activate the CPviewExporter agent in the virtual switches?
- If so, how?
- Or is it safe to modify "CPviewExporterCli.sh show" to always return "All agents are up"?

 

Many thanks,

Bernhard

12 Replies
Chris_Atkinson
Employee

Can you confirm that the system in question is running Jumbo Take 79 or higher? To check, run:

cpinfo -y all

CCSM R77/R80/ELITE
Toolmaker
Participant

Take 81:

 

[Expert@fw-vsxa-01:0]# cpinfo -y FW1
This is Check Point CPinfo Build 914000231 for GAIA
[FW1]
HOTFIX_R81_10_JUMBO_HF_MAIN Take: 81
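
For anyone who wants to script this check, here is a rough sketch that extracts the take number and compares it with the minimum of 79 mentioned above. It assumes the output format matches the sample in this post; `jumbo_take` is a hypothetical helper name, and the sample text is embedded so the snippet runs anywhere.

```shell
#!/bin/bash
# Hypothetical helper: reads cpinfo-style output on stdin and prints the
# Jumbo take number from a line like "HOTFIX_R81_10_JUMBO_HF_MAIN Take: 81".
jumbo_take() {
  sed -n 's/.*JUMBO_HF_MAIN[[:space:]]*Take:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Sample output copied from this thread (assumed to be representative).
sample='This is Check Point CPinfo Build 914000231 for GAIA
[FW1]
HOTFIX_R81_10_JUMBO_HF_MAIN Take: 81'

take=$(printf '%s\n' "${sample}" | jumbo_take)
if [[ -n ${take} && ${take} -ge 79 ]]; then
  echo "Take ${take}: meets the minimum of 79"
else
  echo "Take ${take:-unknown}: below the minimum of 79 or not found"
fi
```

On a gateway, the sample could be replaced by piping `cpinfo -y FW1` into `jumbo_take`.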

JozkoMrkvicka
Mentor

I'm not sure who the original code owner is, but the logic explained above looks pretty clear, and the code should be revised.

Looping in @Arik_Ovtracht 

Kind regards,
Jozko Mrkvicka
the_rock
Legend

I followed the SK you mentioned, using the Grafana method, and it worked fine. I tested this on R81.10 Jumbo Take 87, which is the latest, but as @Chris_Atkinson mentioned, have you made sure you are on at least Take 79, as indicated in the article?

Just run cpinfo -y FW1 and it will give you the Jumbo version.

Andy

Elad_Chomsky
Employee

Hi @Toolmaker,

We have found the issue: it seems this happens when the last VS has a different status than the ones before it. We will fix the code, and the fix will be pushed as part of the next version. Out of curiosity, can you expand on why the last VS has a different status than the previous ones?

You can modify the script as follows to fix this issue. Find this loop:

 

  for VS in $(vslist); do
    STAT=$(${COMPONENT_DIR}/otlp_wd.bash -o stat ${VS})
    FOUND_ONE_NON_ACTIVE=false
    FOUND_ACTIVE=false
    if [[ ${STAT} == "Agent is not running" ]]; then
      FOUND_ONE_NON_ACTIVE=true
    elif [[ ${STAT} == "Agent is running" ]]; then
      FOUND_ACTIVE=true
    fi
    if ${FOUND_ONE_NON_ACTIVE} && ${FOUND_ACTIVE}; then
      RC_EXPORTER="Agents are partially up"
    elif ${FOUND_ONE_NON_ACTIVE}; then
      RC_EXPORTER="Agents are down"
    elif ${FOUND_ACTIVE}; then
      RC_EXPORTER="All agents are up"
    else
      RC_EXPORTER="Unknown"
    fi
  done

 

Change to:

 

  FOUND_ONE_NON_ACTIVE=false
  FOUND_ONE_ACTIVE=false
  for VS in $(vslist); do
    STAT=$(${COMPONENT_DIR}/otlp_wd.bash -o stat ${VS})
    if [[ ${STAT} == "Agent is not running" ]]; then
      FOUND_ONE_NON_ACTIVE=true
    elif [[ ${STAT} == "Agent is running" ]]; then
      FOUND_ONE_ACTIVE=true
    fi
  done
  if ${FOUND_ONE_NON_ACTIVE} && ${FOUND_ONE_ACTIVE}; then
    RC_EXPORTER="Agents are partially up"
  elif ${FOUND_ONE_NON_ACTIVE}; then
    RC_EXPORTER="Agents are down"
  elif ${FOUND_ONE_ACTIVE}; then
    RC_EXPORTER="All agents are up"
  else
    RC_EXPORTER="Unknown"
  fi
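
As a quick sanity check of the corrected logic away from a gateway, the same accumulate-then-evaluate pattern can be exercised with a small stand-alone sketch. `aggregate_status` is a hypothetical helper written for this illustration; it takes one status string per VS as arguments instead of calling otlp_wd.bash.

```shell
#!/bin/bash
# Hypothetical helper: one status string per VS is passed as an argument.
# Flags are initialised once, updated in the loop, and evaluated afterwards.
aggregate_status() {
  local found_non_active=false found_active=false stat
  for stat in "$@"; do
    if [[ ${stat} == "Agent is not running" ]]; then
      found_non_active=true
    elif [[ ${stat} == "Agent is running" ]]; then
      found_active=true
    fi
  done
  if ${found_non_active} && ${found_active}; then
    echo "Agents are partially up"
  elif ${found_non_active}; then
    echo "Agents are down"
  elif ${found_active}; then
    echo "All agents are up"
  else
    echo "Unknown"
  fi
}

aggregate_status "Agent is running" "Agent is running"          # All agents are up
aggregate_status "Agent is running" "Agent is not running"      # Agents are partially up
aggregate_status "Agent is not running" "Agent is not running"  # Agents are down
aggregate_status                                                # Unknown
```

Note that a mixed set of statuses now yields "Agents are partially up", regardless of which VS comes last.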

 

 

NUNO_C
Participant

Hi,

After changing the script, the result went from
"Agent is not running"

to
"Agents are partially up"

which is expected, since the VSX environment I'm testing has virtual switches.

But /opt/CPotelcol/REST.py expects

"All agents are up"

 

so "Agents are partially up" doesn't do the job:

if len(filter) == 0 or 'exporter' in filter:
    status.append(json.loads(out_cpview_exporter)["status"] == "All agents are up")
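
The strict comparison can be mimicked in shell to show why a partial status is rejected. This is only a sketch: `exporter_status` and `exporter_is_up` are hypothetical helpers, the JSON is parsed with a simple text match that assumes the one-key-per-line layout seen in this thread, and the sample object is made up to match the result described above.

```shell
#!/bin/bash
# Hypothetical helper: pull the "status" value out of the small JSON object
# printed by "CPviewExporterCli.sh show" (plain text match, not a JSON parser).
exporter_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Mirrors the exact-match test quoted above: anything other than
# "All agents are up" counts as not ready.
exporter_is_up() {
  [[ $(exporter_status) == "All agents are up" ]]
}

# Assumed sample, matching the status reported after the script change.
sample='{
  "active_after_reboot":"true",
  "status":"Agents are partially up"
}'

if printf '%s\n' "${sample}" | exporter_is_up; then
  echo "exporter check passes"
else
  echo "exporter check fails"
fi
```

With the partial status, the check fails, which matches the exception still being raised on systems with virtual switches.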

 

Cheers,

 

Nuno

Elad_Chomsky
Employee

Hi @NUNO_C,

Please download REST.py from here (the SHA1 should start with '55da7f....'; please contact me at eladch@checkpoint.com if the SHA1 on the website does not match), replace /opt/CPotelcol/REST.py with it, and run the script again. This is a newer version that is going to be pushed as part of an upcoming AutoUpdater release (the component name is CPotelcol).

NUNO_C
Participant

Hi @Elad_Chomsky ,

The new REST.py and the CPviewExporterCli.sh modification did the job; Skyline telemetry is working as expected.

It would be great if we could get a notification about which take the fix will be added to.

Thanks,
N

rizkyw
Explorer

Hi, I can't download the REST.py file; I get the following message:

Missing software subscription to download this file.

Thanks

Chris_Atkinson
Employee

Do you have an account associated with a valid support contract?

CCSM R77/R80/ELITE
rizkyw
Explorer

Hi Chris,

I guess my account is not associated with a valid support contract. Do you have any other solution? Grafana is not showing any data for VSX. I have attached screenshots for vsx0, vsx1, and vsx2.

Elad_Chomsky
Employee

Hi @rizo

Please see the official SK. We have moved to a new tool (sklnctl); please check whether using it resolves the issue.
