gianogli
Contributor

[skyline] system_uptime metric disappeared

Hi all,

Over the last few weeks our Skyline Grafana dashboard has lost all the information related to our appliances' uptime (we now see only "no data" messages). We checked the metrics in our environment (R81.20 and R82 clusters) with the command "cpview -m" and discovered that the metric "system_uptime" has disappeared!

What's happened? Could you check your environment?

Thanks...

 


Accepted Solutions
Elad_Chomsky
Employee

Hi @gianogli ,

Prior to the latest version, 'system.uptime' was pushed as part of the translation flow in otlp_cpview (which extracts and converts CPView data). As of the latest version, it has been moved to be its own metric on otlp_agent. Can you check which version of CPOtlpAgent you have on the machine? Try updating to the latest take as needed. If there are further issues, please reply here or contact me at eladch@checkpoint.com

https://support.checkpoint.com/results/sk/sk181615


10 Replies
Chris_Atkinson
Employee

To clarify: is no data displayed at all, or is only the uptime value missing?

Were there any recent changes to the environment?

The last time I encountered something like this there was a filter applied...

CCSM R77/R80/ELITE
gianogli
Contributor

Hi Chris,

just the uptime value. The other metrics we use are working.

As you can see from this command's output, we have no uptime metric:


 # cpview -m | jq '.' | grep -i metric\-id
     "metric-id": "hardware.temperature.state",
     "metric-id": "hardware.temperature.max",
     "metric-id": "hardware.temperature.min",
     "metric-id": "hardware.temperature",
     "metric-id": "hardware.voltage.state",
     "metric-id": "hardware.voltage.min",
     "metric-id": "hardware.voltage.max",
     "metric-id": "hardware.voltage",
     "metric-id": "hardware.power_supply.state",
     "metric-id": "hardware.power_supply",
     "metric-id": "hardware.fan.state",
     "metric-id": "hardware.fan.max",
     "metric-id": "hardware.fan.min",
     "metric-id": "hardware.fan",
     "metric-id": "hardware.bios.state",
     "metric-id": "hardware.bios",
     "metric-id": "system.cpu.count",
     "metric-id": "system.cpu.dynamic_balancing.state",
     "metric-id": "system.cpu.utilization",
     "metric-id": "system.cpu.interrupts",
     "metric-id": "system.memory.limit",
     "metric-id": "system.paging.limit",
     "metric-id": "system.paging.usage",
     "metric-id": "system.memory.usage",
     "metric-id": "system.filesystem.usage",
     "metric-id": "system.filesystem.limit",
     "metric-id": "system.gaia.os.edition",
     "metric-id": "system.gaia.os.role",
     "metric-id": "system.gaia.os.version",
     "metric-id": "system.gaia.module.version",
     "metric-id": "firewall.multik.state",
     "metric-id": "firewall.policy.time",
     "metric-id": "firewall.policy.name",
     "metric-id": "hardware.model",
     "metric-id": "system.network.interface.state",
     "metric-id": "system.network.interface.address",
     "metric-id": "system.network.dropped.receive",
     "metric-id": "system.network.dropped.transmit",
     "metric-id": "system.network.errors.receive",
     "metric-id": "system.network.errors.transmit",
     "metric-id": "system.network.packets.receive",
     "metric-id": "system.network.packets.transmit",
     "metric-id": "system.network.io.receive",
     "metric-id": "system.network.io.transmit",
     "metric-id": "system.io.utilization",
     "metric-id": "system.process.top.cpu.utilization",
     "metric-id": "system.process.top.fd.count",
     "metric-id": "system.process.top.memory.usage",
     "metric-id": "kernel.instances.count",
     "metric-id": "system.fw.memory.limit",
     "metric-id": "system.fw.memory.usage",
     "metric-id": "system.fw.memory.utilization",
     "metric-id": "system.traffic.packets.receive",
     "metric-id": "system.traffic.packets.transmit",
     "metric-id": "system.traffic.io.receive",
     "metric-id": "system.traffic.io.transmit",
     "metric-id": "system.traffic.connections",
     "metric-id": "system.traffic.dropped",
     "metric-id": "vpn.packets",
     "metric-id": "vpn.errors",
     "metric-id": "vpn.ike.concurrent",
     "metric-id": "vpn.restarts",
     "metric-id": "vpn.ioctls",
     "metric-id": "vpn.clients",
     "metric-id": "vpn.ipsec.fragmentation.count",
     "metric-id": "vpn.ipsec.fragmentation.drops",
     "metric-id": "vpn.ike.peers",
     "metric-id": "vpn.ike.max",
     "metric-id": "vpn.ike.count",
     "metric-id": "vpn.compression.bytes",
     "metric-id": "vpn.compression.packets",
     "metric-id": "vpn.kernel_traps",
     "metric-id": "vpn.ike.negotiations.max",
     "metric-id": "ida.authenticated",
     "metric-id": "ida.authenticated.count",
     "metric-id": "ida.logins.count",
     "metric-id": "ida.logins.successful",
     "metric-id": "ida.logged.unsuccessful",
     "metric-id": "ida.user_directory.count",
     "metric-id": "ida.components.disconnections",
     "metric-id": "ida.components.state",
     "metric-id": "ida.memory",
     "metric-id": "flow_profiler.entities",
     "metric-id": "flow_profiler.utilization",
     "metric-id": "sxl.state",
     "metric-id": "sxl.gtp.tunnels.created",
     "metric-id": "sxl.gtp.tunnels.count",
     "metric-id": "sxl.gtp.packets",
     "metric-id": "sxl.synatk.configuration",
     "metric-id": "sxl.synatk.status",
     "metric-id": "sxl.synatk.global_high_threshold",
     "metric-id": "sxl.synatk.interface_high_threshold",
     "metric-id": "sxl.synatk.low_threshold",
     "metric-id": "adv_prv.errors.count",
     "metric-id": "adv_prv.expired",
     "metric-id": "system.network.nat.ports",
     "metric-id": "system.network.nat.ports.limit",
     "metric-id": "system.network.nat.connections.count",
     "metric-id": "system.network.nat.connections.rate",
     "metric-id": "cluster_xl.mode",
     "metric-id": "cluster_xl.members.state",
     "metric-id": "cluster_xl.pnotes",
     "metric-id": "vsx.overview",
     "metric-id": "vsx.core_xl.count",
     "metric-id": "blades.update.time",
     "metric-id": "blades.update.state",
     "metric-id": "system.network.interface.packets.receive.rate.peak",
     "metric-id": "system.network.interface.packets.receive.rate",
     "metric-id": "system.network.interface.packets.transmit.rate.peak",
     "metric-id": "system.network.interface.packets.transmit.rate",
     "metric-id": "system.network.interface.io.receive.rate.peak",
     "metric-id": "system.network.interface.io.receive.rate",
     "metric-id": "system.network.interface.io.transmit.rate.peak",
     "metric-id": "system.network.interface.io.transmit.rate",
     "metric-id": "system.network.interface.rx_throughput_bs",
     "metric-id": "system.network.interface.tx_throughput_bs",
     "metric-id": "system.network.connections.rate",
     "metric-id": "system.network.connections",
     "metric-id": "system.network.tcp_out_of_state_drops.state",
     "metric-id": "system.network.blades.vpn.kernel_limit_reached_count",
     "metric-id": "system.network.blades.vpn.active_clients",
     "metric-id": "system.network.blades.vpn.ike_sas",
     "metric-id": "system.network.blades.vpn.max_ike_sas",
     "metric-id": "system.network.blades.vpn.total_sas",
     "metric-id": "system.network.blades.vpn.all_ike_errors",
     "metric-id": "voip.sip.multicore.state",
     "metric-id": "voip.sip.earlynat.capacity",
     "metric-id": "voip.sip.count",
     "metric-id": "system.traffic.dropped.rate",
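
The grep above lists every metric-id. To test for one specific metric, an exact match avoids false hits on longer names that merely contain the same text. A minimal sketch, assuming the JSON is pretty-printed with one "metric-id" key per line as in the output above; the helper name `has_metric_id` is hypothetical:

```shell
#!/bin/sh
# has_metric_id NAME: read pretty-printed JSON on stdin and succeed only if
# a line of the form  "metric-id": "NAME"  is present (exact name match).
# Hypothetical helper; assumes one "metric-id" key per line.
has_metric_id() {
    sed -n 's/^[[:space:]]*"metric-id":[[:space:]]*"\([^"]*\)".*$/\1/p' |
        grep -Fxq -- "$1"
}

# Usage on a gateway (commands taken from the post above):
#   cpview -m | jq '.' | has_metric_id system.uptime && echo present || echo missing
```

The function only reports presence in the CPView output; it says nothing about whether another component exports the metric.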

gianogli
Contributor

Hi @Chris_Atkinson ,

you were right, the problem was related to a filter... 😅

Thanks... 

gianogli
Contributor

Hi @Elad_Chomsky ,

Autoupdates! That would explain why the system_uptime values disappeared without any changes on our part.

I checked the agent version and both R82 and R81.20 are updated:

# cpinfo -y all 2>/dev/null | grep CPOTLPAGENT_AUTOUPDATE
       BUNDLE_CPOTLPAGENT_AUTOUPDATE   Take:  63
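
To compare the agent take across several gateways, the number can be pulled out of that line. A minimal sketch, assuming the line layout shown above; the helper name `otlp_take` is hypothetical:

```shell
#!/bin/sh
# otlp_take: read `cpinfo -y all` output on stdin and print the Take number
# of the CPOtlpAgent autoupdate bundle. Hypothetical helper; assumes the
# line layout shown above ("BUNDLE_CPOTLPAGENT_AUTOUPDATE   Take:  63").
otlp_take() {
    awk '/CPOTLPAGENT_AUTOUPDATE/ {
        for (i = 1; i < NF; i++)
            if ($i == "Take:") { print $(i + 1); exit }
    }'
}

# Usage on a gateway:
#   cpinfo -y all 2>/dev/null | otlp_take
```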

With this version, how can I extract the uptime values of our appliances? 😎

Thanks...

Elad_Chomsky
Employee

Hi @gianogli ,

If you use the Prometheus explorer, have you verified that the system.uptime metric (system_uptime) no longer appears there?

Vincent_Bacher
Advisor

Hi Elad,

we are facing the same behavior.

# sklnctl otelcol metrics --show | grep uptime
Please refer to the Skyline Metrics Repository for the list of available metrics

But since we can see it in Prometheus, we don't mind.

system_uptime.png

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
gianogli
Contributor

Hi @Elad_Chomsky ,

Sure, I verified it, as you can see in the attached screenshots.

I ran this command on all the GWs:

*********************************************
# sklnctl otelcol metrics --show
Please refer to the Skyline Metrics Repository for the list of available metrics
hardware.temperature.state
hardware.temperature.max
hardware.temperature.min
hardware.temperature
hardware.voltage.state
hardware.voltage.min
hardware.voltage.max
hardware.voltage
hardware.power_supply.state
hardware.power_supply
hardware.fan.state
hardware.fan.max
hardware.fan.min
hardware.fan
hardware.bios.state
hardware.bios
system.cpu.count
system.cpu.dynamic_balancing.state
system.cpu.utilization
system.cpu.interrupts
system.memory.limit
system.paging.limit
system.paging.usage
system.memory.usage
system.filesystem.usage
system.filesystem.limit
system.gaia.os.edition
system.gaia.os.role
system.gaia.os.version
system.gaia.module.version
firewall.multik.state
firewall.policy.time
firewall.policy.name
hardware.model
system.network.interface.state
system.network.interface.address
system.network.dropped.receive
system.network.dropped.transmit
system.network.errors.receive
system.network.errors.transmit
system.network.packets.receive
system.network.packets.transmit
system.network.io.receive
system.network.io.transmit
system.io.utilization
system.process.top.cpu.utilization
system.process.top.fd.count
system.process.top.memory.usage
kernel.instances.count
system.fw.memory.limit
system.fw.memory.usage
system.fw.memory.utilization
system.traffic.packets.receive
system.traffic.packets.transmit
system.traffic.io.receive
system.traffic.io.transmit
system.traffic.connections
system.traffic.dropped
vpn.packets
vpn.errors
vpn.ike.concurrent
vpn.restarts
vpn.ioctls
vpn.clients
vpn.ipsec.fragmentation.count
vpn.ipsec.fragmentation.drops
vpn.ike.peers
vpn.ike.max
vpn.ike.count
vpn.compression.bytes
vpn.compression.packets
vpn.kernel_traps
vpn.ike.negotiations.max
ida.authenticated
ida.authenticated.count
ida.logins.count
ida.logins.successful
ida.logged.unsuccessful
ida.user_directory.count
ida.components.disconnections
ida.components.state
ida.memory
flow_profiler.entities
flow_profiler.utilization
sxl.state
sxl.gtp.tunnels.created
sxl.gtp.tunnels.count
sxl.gtp.packets
sxl.synatk.configuration
sxl.synatk.status
sxl.synatk.global_high_threshold
sxl.synatk.interface_high_threshold
sxl.synatk.low_threshold
adv_prv.errors.count
adv_prv.expired
system.network.nat.connections.count
system.network.nat.connections.rate
cluster_xl.mode
cluster_xl.members.state
cluster_xl.pnotes
vsx.overview
vsx.core_xl.count
blades.update.time
blades.update.state
system.network.interface.packets.receive.rate.peak
system.network.interface.packets.receive.rate
system.network.interface.packets.transmit.rate.peak
system.network.interface.packets.transmit.rate
system.network.interface.io.receive.rate.peak
system.network.interface.io.receive.rate
system.network.interface.io.transmit.rate.peak
system.network.interface.io.transmit.rate
system.network.interface.rx_throughput_bs
system.network.interface.tx_throughput_bs
system.network.connections.rate
system.network.connections
system.network.tcp_out_of_state_drops.state
system.network.blades.vpn.kernel_limit_reached_count
system.network.blades.vpn.active_clients
system.network.blades.vpn.ike_sas
system.network.blades.vpn.max_ike_sas
system.network.blades.vpn.total_sas
system.network.blades.vpn.all_ike_errors
voip.sip.multicore.state
voip.sip.earlynat.capacity
voip.sip.count
system.traffic.dropped.rate

system.uptime
agent.gaia.os.role

*********************************************

Here I see the system.uptime metric, but the empty line before it is quite strange.

What do you think?

Thanks...
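
To see at a glance which metrics are printed after that empty line, the listing can be split there. A minimal sketch; the helper name `metrics_after_blank` is hypothetical, and reading the blank line as a separator between the CPView-derived metrics and those emitted by the agent itself is only an assumption, not something confirmed in this thread:

```shell
#!/bin/sh
# metrics_after_blank: read a metric list on stdin and print only the
# non-empty lines that follow the first blank line.
# Hypothetical helper for inspecting the listing shown above.
metrics_after_blank() {
    awk 'seen && NF { print } !NF { seen = 1 }'
}

# Usage on a gateway (command taken from the post above):
#   sklnctl otelcol metrics --show | metrics_after_blank
```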

gianogli
Contributor

Hi @Elad_Chomsky ,

Problem solved. I've just checked our Grafana dashboard again and found that all our queries were filtered on service_name "CPviewExporter". If I delete this filter or change it to "otlp_agent", I'm able to see the metric again.

It was my error... Thanks for your support. 🙏
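
For anyone hitting the same thing, grouping the series by service_name shows which exporter currently emits the metric, so a stale dashboard filter is easy to spot. A minimal sketch that only builds the PromQL text; the helper name `uptime_query` and the `PROM_URL` placeholder are hypothetical, while the metric and label names are the ones mentioned in this thread:

```shell
#!/bin/sh
# uptime_query [SERVICE]: print a PromQL expression for system_uptime.
# With no argument it groups the series by service_name, which shows
# which exporters currently emit the metric; with an argument it selects
# only that service_name. Hypothetical helper; the metric and label names
# come from the thread, everything else is illustrative.
uptime_query() {
    if [ -n "${1:-}" ]; then
        printf 'system_uptime{service_name="%s"}\n' "$1"
    else
        printf 'count by (service_name) (system_uptime)\n'
    fi
}

# Usage against a Prometheus server (PROM_URL is a placeholder):
#   curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$(uptime_query)"
```

If the grouped query returns a service_name other than the one used in the dashboard filter, the filter is the likely cause of the "no data" panels.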

Vincent_Bacher
Advisor

That's funny. I see both.

system_uptime2.png

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite