[skyline] system_uptime metric disappeared
Hi all,
over the last few weeks our Skyline Grafana dashboard has lost all the information about our appliances' uptime (we now see only "no data" messages). We checked the metrics in our environment (R81.20 and R82 clusters) with the command "cpview -m" and discovered that the "system_uptime" metric has disappeared!
What happened? Could you check your environments?
Thanks...
Accepted Solutions
Hi @gianogli ,
Prior to the latest version, 'system.uptime' was pushed as part of the translation flow in otlp_cpview (which extracts and converts CPView data). In the latest version it was moved to be its own metric on otlp_agent. Can you check which version of CPOtlpAgent you have on the machine? Try updating to the latest take if needed. If there are further issues, please reply here or contact me at eladch@checkpoint.com.
To clarify: is no data displayed at all, or is only the uptime value missing?
Were there any recent changes to the environment?
The last time I encountered something like this there was a filter applied...
Hi Chris,
just the uptime value. The other metrics we use are working.
As you can see from this command, we have no uptime metric:
# cpview -m | jq '.' | grep -i metric\-id
"metric-id": "hardware.temperature.state",
"metric-id": "hardware.temperature.max",
"metric-id": "hardware.temperature.min",
"metric-id": "hardware.temperature",
"metric-id": "hardware.voltage.state",
"metric-id": "hardware.voltage.min",
"metric-id": "hardware.voltage.max",
"metric-id": "hardware.voltage",
"metric-id": "hardware.power_supply.state",
"metric-id": "hardware.power_supply",
"metric-id": "hardware.fan.state",
"metric-id": "hardware.fan.max",
"metric-id": "hardware.fan.min",
"metric-id": "hardware.fan",
"metric-id": "hardware.bios.state",
"metric-id": "hardware.bios",
"metric-id": "system.cpu.count",
"metric-id": "system.cpu.dynamic_balancing.state",
"metric-id": "system.cpu.utilization",
"metric-id": "system.cpu.interrupts",
"metric-id": "system.memory.limit",
"metric-id": "system.paging.limit",
"metric-id": "system.paging.usage",
"metric-id": "system.memory.usage",
"metric-id": "system.filesystem.usage",
"metric-id": "system.filesystem.limit",
"metric-id": "system.gaia.os.edition",
"metric-id": "system.gaia.os.role",
"metric-id": "system.gaia.os.version",
"metric-id": "system.gaia.module.version",
"metric-id": "firewall.multik.state",
"metric-id": "firewall.policy.time",
"metric-id": "firewall.policy.name",
"metric-id": "hardware.model",
"metric-id": "system.network.interface.state",
"metric-id": "system.network.interface.address",
"metric-id": "system.network.dropped.receive",
"metric-id": "system.network.dropped.transmit",
"metric-id": "system.network.errors.receive",
"metric-id": "system.network.errors.transmit",
"metric-id": "system.network.packets.receive",
"metric-id": "system.network.packets.transmit",
"metric-id": "system.network.io.receive",
"metric-id": "system.network.io.transmit",
"metric-id": "system.io.utilization",
"metric-id": "system.process.top.cpu.utilization",
"metric-id": "system.process.top.fd.count",
"metric-id": "system.process.top.memory.usage",
"metric-id": "kernel.instances.count",
"metric-id": "system.fw.memory.limit",
"metric-id": "system.fw.memory.usage",
"metric-id": "system.fw.memory.utilization",
"metric-id": "system.traffic.packets.receive",
"metric-id": "system.traffic.packets.transmit",
"metric-id": "system.traffic.io.receive",
"metric-id": "system.traffic.io.transmit",
"metric-id": "system.traffic.connections",
"metric-id": "system.traffic.dropped",
"metric-id": "vpn.packets",
"metric-id": "vpn.errors",
"metric-id": "vpn.ike.concurrent",
"metric-id": "vpn.restarts",
"metric-id": "vpn.ioctls",
"metric-id": "vpn.clients",
"metric-id": "vpn.ipsec.fragmentation.count",
"metric-id": "vpn.ipsec.fragmentation.drops",
"metric-id": "vpn.ike.peers",
"metric-id": "vpn.ike.max",
"metric-id": "vpn.ike.count",
"metric-id": "vpn.compression.bytes",
"metric-id": "vpn.compression.packets",
"metric-id": "vpn.kernel_traps",
"metric-id": "vpn.ike.negotiations.max",
"metric-id": "ida.authenticated",
"metric-id": "ida.authenticated.count",
"metric-id": "ida.logins.count",
"metric-id": "ida.logins.successful",
"metric-id": "ida.logged.unsuccessful",
"metric-id": "ida.user_directory.count",
"metric-id": "ida.components.disconnections",
"metric-id": "ida.components.state",
"metric-id": "ida.memory",
"metric-id": "flow_profiler.entities",
"metric-id": "flow_profiler.utilization",
"metric-id": "sxl.state",
"metric-id": "sxl.gtp.tunnels.created",
"metric-id": "sxl.gtp.tunnels.count",
"metric-id": "sxl.gtp.packets",
"metric-id": "sxl.synatk.configuration",
"metric-id": "sxl.synatk.status",
"metric-id": "sxl.synatk.global_high_threshold",
"metric-id": "sxl.synatk.interface_high_threshold",
"metric-id": "sxl.synatk.low_threshold",
"metric-id": "adv_prv.errors.count",
"metric-id": "adv_prv.expired",
"metric-id": "system.network.nat.ports",
"metric-id": "system.network.nat.ports.limit",
"metric-id": "system.network.nat.connections.count",
"metric-id": "system.network.nat.connections.rate",
"metric-id": "cluster_xl.mode",
"metric-id": "cluster_xl.members.state",
"metric-id": "cluster_xl.pnotes",
"metric-id": "vsx.overview",
"metric-id": "vsx.core_xl.count",
"metric-id": "blades.update.time",
"metric-id": "blades.update.state",
"metric-id": "system.network.interface.packets.receive.rate.peak",
"metric-id": "system.network.interface.packets.receive.rate",
"metric-id": "system.network.interface.packets.transmit.rate.peak",
"metric-id": "system.network.interface.packets.transmit.rate",
"metric-id": "system.network.interface.io.receive.rate.peak",
"metric-id": "system.network.interface.io.receive.rate",
"metric-id": "system.network.interface.io.transmit.rate.peak",
"metric-id": "system.network.interface.io.transmit.rate",
"metric-id": "system.network.interface.rx_throughput_bs",
"metric-id": "system.network.interface.tx_throughput_bs",
"metric-id": "system.network.connections.rate",
"metric-id": "system.network.connections",
"metric-id": "system.network.tcp_out_of_state_drops.state",
"metric-id": "system.network.blades.vpn.kernel_limit_reached_count",
"metric-id": "system.network.blades.vpn.active_clients",
"metric-id": "system.network.blades.vpn.ike_sas",
"metric-id": "system.network.blades.vpn.max_ike_sas",
"metric-id": "system.network.blades.vpn.total_sas",
"metric-id": "system.network.blades.vpn.all_ike_errors",
"metric-id": "voip.sip.multicore.state",
"metric-id": "voip.sip.earlynat.capacity",
"metric-id": "voip.sip.count",
"metric-id": "system.traffic.dropped.rate",
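For anyone who wants to check for one specific metric instead of scanning the full list by eye, here is a minimal sketch. It assumes the `cpview -m` output contains `"metric-id": "..."` pairs as in the listing above; the function names `list_metric_ids` and `has_metric` are made up for illustration.

```shell
# list_metric_ids: read JSON on stdin and print every "metric-id" value,
# one per line. Assumes the key/value layout shown in the listing above.
list_metric_ids() {
    grep -o '"metric-id": *"[^"]*"' | sed 's/.*: *"\(.*\)"/\1/'
}

# has_metric NAME: succeed only if NAME is an exact metric-id in the
# JSON on stdin (whole-line, fixed-string match, so "system.uptime"
# does not match "system.uptime.something").
has_metric() {
    list_metric_ids | grep -qxF -- "$1"
}
```

Possible usage on a gateway: `cpview -m | has_metric system.uptime && echo present || echo missing`.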
Hi @Elad_Chomsky ,
Auto-updates! That would explain why the system_uptime values disappeared without any changes on our part.
I checked the agent version, and both R82 and R81.20 are up to date:
# cpinfo -y all 2>/dev/null | grep CPOTLPAGENT_AUTOUPDATE
BUNDLE_CPOTLPAGENT_AUTOUPDATE Take: 63
With this version, how can I extract the uptime values of our appliances? 😎
Thanks...
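If you need to run the same take check across several gateways, a small helper along these lines might save some typing. It is only a sketch: it assumes the `cpinfo` output line looks like the one above (`BUNDLE_CPOTLPAGENT_AUTOUPDATE Take: 63`), the function names are hypothetical, and the minimum take you pass in is your own choice, since the thread does not say which take moved the metric.

```shell
# agent_take: read `cpinfo -y all` output on stdin and print the take
# number from the first BUNDLE_CPOTLPAGENT_AUTOUPDATE line.
agent_take() {
    sed -n 's/.*BUNDLE_CPOTLPAGENT_AUTOUPDATE.*Take: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# take_at_least MIN: succeed if the take found on stdin is >= MIN.
# Fails if no take line is found at all.
take_at_least() {
    take=$(agent_take)
    [ -n "$take" ] && [ "$take" -ge "$1" ]
}
```

Possible usage: `cpinfo -y all 2>/dev/null | take_at_least 63 && echo "agent is at take 63 or later"`.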
Hi @gianogli ,
If you use the Prometheus explorer, have you verified that the system.uptime metric (system_uptime) no longer appears there?
Hi Elad,
we are facing the same behavior.
# sklnctl otelcol metrics --show | grep uptime
Please refer to the Skyline Metrics Repository for the list of available metrics
But since we can see it in Prometheus, we don't mind.
Hi @Elad_Chomsky ,
sure, I verified it, as you can see in the attached screenshots.
I ran this command on all the GWs:
*********************************************
# sklnctl otelcol metrics --show
Please refer to the Skyline Metrics Repository for the list of available metrics
hardware.temperature.state
hardware.temperature.max
hardware.temperature.min
hardware.temperature
hardware.voltage.state
hardware.voltage.min
hardware.voltage.max
hardware.voltage
hardware.power_supply.state
hardware.power_supply
hardware.fan.state
hardware.fan.max
hardware.fan.min
hardware.fan
hardware.bios.state
hardware.bios
system.cpu.count
system.cpu.dynamic_balancing.state
system.cpu.utilization
system.cpu.interrupts
system.memory.limit
system.paging.limit
system.paging.usage
system.memory.usage
system.filesystem.usage
system.filesystem.limit
system.gaia.os.edition
system.gaia.os.role
system.gaia.os.version
system.gaia.module.version
firewall.multik.state
firewall.policy.time
firewall.policy.name
hardware.model
system.network.interface.state
system.network.interface.address
system.network.dropped.receive
system.network.dropped.transmit
system.network.errors.receive
system.network.errors.transmit
system.network.packets.receive
system.network.packets.transmit
system.network.io.receive
system.network.io.transmit
system.io.utilization
system.process.top.cpu.utilization
system.process.top.fd.count
system.process.top.memory.usage
kernel.instances.count
system.fw.memory.limit
system.fw.memory.usage
system.fw.memory.utilization
system.traffic.packets.receive
system.traffic.packets.transmit
system.traffic.io.receive
system.traffic.io.transmit
system.traffic.connections
system.traffic.dropped
vpn.packets
vpn.errors
vpn.ike.concurrent
vpn.restarts
vpn.ioctls
vpn.clients
vpn.ipsec.fragmentation.count
vpn.ipsec.fragmentation.drops
vpn.ike.peers
vpn.ike.max
vpn.ike.count
vpn.compression.bytes
vpn.compression.packets
vpn.kernel_traps
vpn.ike.negotiations.max
ida.authenticated
ida.authenticated.count
ida.logins.count
ida.logins.successful
ida.logged.unsuccessful
ida.user_directory.count
ida.components.disconnections
ida.components.state
ida.memory
flow_profiler.entities
flow_profiler.utilization
sxl.state
sxl.gtp.tunnels.created
sxl.gtp.tunnels.count
sxl.gtp.packets
sxl.synatk.configuration
sxl.synatk.status
sxl.synatk.global_high_threshold
sxl.synatk.interface_high_threshold
sxl.synatk.low_threshold
adv_prv.errors.count
adv_prv.expired
system.network.nat.connections.count
system.network.nat.connections.rate
cluster_xl.mode
cluster_xl.members.state
cluster_xl.pnotes
vsx.overview
vsx.core_xl.count
blades.update.time
blades.update.state
system.network.interface.packets.receive.rate.peak
system.network.interface.packets.receive.rate
system.network.interface.packets.transmit.rate.peak
system.network.interface.packets.transmit.rate
system.network.interface.io.receive.rate.peak
system.network.interface.io.receive.rate
system.network.interface.io.transmit.rate.peak
system.network.interface.io.transmit.rate
system.network.interface.rx_throughput_bs
system.network.interface.tx_throughput_bs
system.network.connections.rate
system.network.connections
system.network.tcp_out_of_state_drops.state
system.network.blades.vpn.kernel_limit_reached_count
system.network.blades.vpn.active_clients
system.network.blades.vpn.ike_sas
system.network.blades.vpn.max_ike_sas
system.network.blades.vpn.total_sas
system.network.blades.vpn.all_ike_errors
voip.sip.multicore.state
voip.sip.earlynat.capacity
voip.sip.count
system.traffic.dropped.rate
system.uptime
agent.gaia.os.role
*********************************************
Here I see the system.uptime metric, but the empty line before it is quite strange.
What do you think?
Thanks...
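To see exactly which metrics one listing has and the other does not, the two lists can be compared mechanically. This is a rough sketch that assumes each list has been saved to a plain text file with one metric name per line; the function name and file names are hypothetical. Going by the two listings in this thread, system.uptime and agent.gaia.os.role show up only in the `sklnctl otelcol metrics --show` output (along with its header line), which fits the explanation that they no longer come from CPView.

```shell
# metrics_only_in_second FILE_A FILE_B: print lines that appear in
# FILE_B but not in FILE_A. Both files hold one metric name per line.
metrics_only_in_second() {
    a_sorted=$(mktemp)
    b_sorted=$(mktemp)
    # comm needs sorted input; -u also drops repeated names.
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    # -13 suppresses lines unique to A and lines common to both.
    comm -13 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

Possible usage: `metrics_only_in_second cpview_metrics.txt sklnctl_metrics.txt`.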
Hi @Elad_Chomsky ,
problem solved. I just rechecked our Grafana dashboard and found that all our queries were filtered on service_name "CPviewExporter". If I delete this filter or change it to "otlp_agent", I can see the metric again.
It was my error... Thanks for your support. 🙏
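If someone else runs into the same filter problem, one way to find out which service_name values actually carry the metric is to ask Prometheus directly through its HTTP query API. A minimal sketch, assuming curl is available and your Prometheus server is reachable at a URL you supply (the `http://localhost:9090` below is only a placeholder, and both function names are made up):

```shell
# promql_by_service METRIC: build a PromQL expression that counts the
# series of METRIC per service_name label value.
promql_by_service() {
    printf 'count by (service_name) (%s)' "$1"
}

# query_services PROM_URL METRIC: run that expression as an instant
# query against the Prometheus HTTP API and print the raw JSON reply.
query_services() {
    curl -sG "$1/api/v1/query" \
        --data-urlencode "query=$(promql_by_service "$2")"
}
```

Possible usage: `query_services http://localhost:9090 system_uptime`. Each entry in the returned result should carry one service_name value, so a dashboard filter can be matched against what is really being exported.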
That's funny. I see both.