Quantum SD-WAN Steering Deep Dive: How Decisions Are Made (and Where to Troubleshoot)
If you operate Check Point Quantum SD-WAN long enough, you’ll notice a pattern: when users report “wrong ISP used”, “overlay picked the bad path”, or “policy looks right but traffic disagrees”, the root cause is usually steering state—not the rule definition itself.
This post breaks down what the SD-WAN Steering process actually does, how it writes decisions into kernel tables, and what evidence to collect when steering doesn’t behave as expected.
Why this matters
Steering is the bridge between the SD-WAN policy you define and the path the gateway actually enforces for each connection.
When steering is healthy, the gateway consistently selects the best path based on probing measurements, thresholds, and the rule's prioritization or aggregation logic. When steering is unhealthy or out of sync, you get drift: the policy exists, but enforcement does something else.
1) SD-WAN Steering Responsibilities
Steering is responsible for making real-time path selection decisions for SD-WAN traffic.
Probing measurements it owns
Steering continuously collects measurements for:
Next hop probing
(health to upstream/next-hop per ISP interface)
Local Breakout probing
(health to “internet targets” defined in policy rules)
Overlay probing
(health of VPN peer interfaces / overlay paths)
Core outputs
Based on probing results, steering:
Decides which interface (ISP) or transport to use per SD-WAN rule
Writes the selected path into kernel tables used by the packet/connection processing flow
Updates iNext (via Nano Agent) with steering events (for portal visibility)
Updates cpview with selected ISPs and probing statistics (for telemetry/analytics)
Key point: steering is not just a calculation; it also holds state and installs that state into kernel tables.
2) Steering Decision Flow (Control vs Enforcement)
At a high level:
Step A — Steering computes the best path
Steering then records the selected ISPs/transports in the relevant kernel tables.
Step B — Packet/connection processing consumes those tables
When a new connection is created:
Local Breakout
The Firewall obtains the ISP to carry the connection from the relevant ISP table.
Overlay / Backhaul
VPN obtains the VPN Transport to carry the connection from the relevant VPN transport table.
Failure mode you should recognize
If a connection:
is not steered to the correct ISP/VPN transport, or
fails to forward properly,
you must inspect the contents of the kernel tables.
If the table is empty, there is no ISP/transport to carry the connection → it will fail by design.
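The lookup-or-fail behavior described above can be modeled in a few lines. This is only an illustrative sketch: the dictionary stands in for a steering-populated kernel table, and the names `path_for_new_connection` and `NoPathInstalled` are hypothetical, not Check Point APIs.

```python
from typing import Dict, List


class NoPathInstalled(Exception):
    """Raised when the table holds no ISP/transport for the matched rule."""


def path_for_new_connection(table: Dict[str, List[str]], rule_id: str) -> str:
    # `table` is a hypothetical stand-in for a steering-populated kernel table:
    # rule id -> ISPs (or VPN transports) that steering installed for that rule.
    candidates = table.get(rule_id, [])
    if not candidates:
        # Empty or missing entry: nothing can carry the connection, so it fails.
        raise NoPathInstalled(f"no path installed for rule {rule_id!r}")
    # Simplification for the sketch: take the first installed entry.
    return candidates[0]
```

The point of the model is that the failure comes from the table state, not from the rule definition: the same rule succeeds or fails depending only on what steering installed.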
3) Steering Decisions for Local Breakout
Steering uses thresholds and rule logic to determine which ISPs are eligible.
Eligibility logic (threshold gate)
ISPs whose probing measurements are below the rule's thresholds are allowed; the rest are not.
Selection logic (within allowed set)
From the allowed ISPs, steering selects according to the rule’s steering object configuration:
Prioritization
Link Aggregation
Steering selects all allowed ISPs
The Firewall chooses among them based on the aggregation method (hash, etc.)
Operational takeaway: In Link Aggregation, “allowed” can mean multiple active candidates, and the final per-connection decision depends on the aggregation algorithm, not only on probing rank.
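A minimal sketch of the two stages (threshold gate, then selection) may help make the difference between the modes concrete. Everything here is an assumption made for illustration: the field names, the "lower priority number wins" convention, the strict less-than comparison, and the CRC32-based hash are not taken from the product.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Probe:
    name: str
    loss_pct: float
    latency_ms: float
    jitter_ms: float
    priority: int  # assumption: lower number = more preferred


@dataclass(frozen=True)
class Thresholds:
    loss_pct: float
    latency_ms: float
    jitter_ms: float


def allowed(probes: Sequence[Probe], t: Thresholds) -> List[Probe]:
    # Threshold gate: keep only ISPs whose measurements are below every limit.
    return [
        p for p in probes
        if p.loss_pct < t.loss_pct
        and p.latency_ms < t.latency_ms
        and p.jitter_ms < t.jitter_ms
    ]


def select(probes: Sequence[Probe], t: Thresholds, mode: str) -> List[str]:
    ok = allowed(probes, t)
    if mode == "prioritization":
        # Assumed behavior: the most preferred ISP among the allowed ones.
        return [min(ok, key=lambda p: p.priority).name] if ok else []
    if mode == "aggregation":
        # All allowed ISPs are selected; the per-connection pick happens later.
        return sorted(p.name for p in ok)
    raise ValueError(f"unknown mode: {mode}")


def hash_pick(selected: Sequence[str], flow: Tuple) -> Optional[str]:
    # Stand-in for the aggregation method: a deterministic hash of the flow.
    if not selected:
        return None
    key = "|".join(str(part) for part in flow).encode()
    return selected[zlib.crc32(key) % len(selected)]
```

Note that in this model an ISP with the best priority is still excluded if it fails the gate, and that under aggregation two flows can land on different ISPs even though probing results are identical.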
4) Steering Decisions for VPN Overlay
Overlay steering has an extra dependency: VPN peer discovery and transport preparation.
Peer learning / installation pipeline
SD-WAN VPN peers are learned via GW Sharing
They are downloaded by the SD-WAN Nano service
They are installed with the SD-WAN policy into the Steering process
Steering performs additional checks on the installed peers. For each peer that passes, it:
classifies the peer as an SD-WAN VPN peer
prepares the VPN Transports configuration for that peer, for later use by VPN
Eligibility and preference
All VPN Transports whose measurements are below the thresholds (evaluated per rule and peer pair) are allowed
Among the allowed, some transports are selected as preferred (per rule + peer pair)
Selection is based on the same steering object settings:
With Link Aggregation:
steering marks all allowed VPN transports as eligible
VPN chooses among them based on aggregation method (hash, etc.)
Operational takeaway: Overlay issues often aren’t “VPN is broken”—they’re “steering never produced eligible transports for this peer pair”.
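Because overlay eligibility is evaluated per rule and peer pair, it can help to think of the result as a map keyed by that combination, where an empty entry is the symptom to look for. The sketch below is a toy model only; the `Sample` fields, the limits tuple, and both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple

PeerPair = Tuple[str, str]
Key = Tuple[str, PeerPair]            # (rule, peer pair)
Limits = Tuple[float, float, float]   # (loss %, latency ms, jitter ms)


class Sample(NamedTuple):
    rule: str
    peer_pair: PeerPair
    transport: str
    loss_pct: float
    latency_ms: float
    jitter_ms: float


def eligible_by_pair(samples: List[Sample],
                     limits_per_rule: Dict[str, Limits]) -> Dict[Key, List[str]]:
    result: Dict[Key, List[str]] = defaultdict(list)
    for s in samples:
        # Create the bucket even when nothing qualifies, so empty pairs show up.
        bucket = result[(s.rule, s.peer_pair)]
        max_loss, max_latency, max_jitter = limits_per_rule[s.rule]
        if (s.loss_pct < max_loss and s.latency_ms < max_latency
                and s.jitter_ms < max_jitter):
            bucket.append(s.transport)
    return dict(result)


def pairs_without_transport(eligible: Dict[Key, List[str]]) -> List[Key]:
    # These are the (rule, peer pair) keys where overlay traffic has no path.
    return sorted(key for key, transports in eligible.items() if not transports)
```

In this model a tunnel can be up and the rule can be correct while a specific peer pair still has an empty entry, which matches the takeaway above.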
5) Steering Commands (operational control)
Two key operational commands shown in the material:
Use cases:
Important: Treat this like restarting a control component—use it intentionally, and correlate with logs/events.
6) Troubleshooting: What to Collect Before Opening a TAC Case
When steering is wrong, the fastest path to resolution is to prove where the pipeline breaks:
A) Probing health and thresholds
If probing is missing/invalid, steering can’t generate allowed candidates.
B) Rule intent vs enforcement state
Confirm rule configuration (prioritization vs link aggregation)
Confirm thresholds per rule (loss/jitter/latency)
Confirm that at least one candidate is below thresholds
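One way to structure the threshold check in B is to record, per candidate, which metrics are at or above the rule's limit. The sketch below assumes you have already copied the measured values and thresholds out of your own telemetry into plain dictionaries; the metric names and both helper functions are illustrative, not product output.

```python
from typing import Dict, List


def breached_metrics(measured: Dict[str, float],
                     limits: Dict[str, float]) -> List[str]:
    # A metric missing from `measured` is treated as breached, mirroring the
    # idea that missing/invalid probing cannot produce an allowed candidate.
    return sorted(
        metric for metric, limit in limits.items()
        if measured.get(metric, float("inf")) >= limit
    )


def explain(candidates: Dict[str, Dict[str, float]],
            limits: Dict[str, float]) -> Dict[str, List[str]]:
    # candidate name -> list of breached metrics (empty list = allowed)
    return {name: breached_metrics(m, limits) for name, m in candidates.items()}
```

If every candidate maps to a non-empty list, no link passes the gate for that rule, and the next place to look is whether the thresholds are realistic for those links.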
C) Kernel tables content (the “enforcement truth”)
If the gateway is not using the correct ISP/transport, provide:
If tables are empty → steering did not install state.
D) Telemetry and events
E) Reproduction details
exact source/destination/service of a failing flow
whether it’s breakout vs overlay/backhaul
time of failure (to align with probing intervals and events)
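The reason the failure time matters is that it lets you narrow steering events to a window around the failing flow. A small sketch of that correlation, assuming events have already been exported as (timestamp, message) pairs; the 60-second default window is an arbitrary choice, not a product interval.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Event = Tuple[datetime, str]


def events_near(failure_time: datetime, events: List[Event],
                window_s: int = 60) -> List[Event]:
    # Keep events within +/- window_s seconds of the failure, oldest first.
    window = timedelta(seconds=window_s)
    return sorted(e for e in events if abs(e[0] - failure_time) <= window)
```

Widen the window if nothing matches; a threshold breach a few probing cycles before the failure is often more relevant than the event closest in time.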
Common Pitfalls (what usually bites people)
Thresholds too strict → all links become “not allowed” → empty tables → failures
Link aggregation misunderstood → multiple ISPs allowed, hash selects a non-obvious path
GW Sharing drift → overlay peers not learned/installed consistently
“Policy exists” assumption → policy in portal is not proof of steering state installed on the gateway
Closing
Steering is not magic. It is a deterministic pipeline:
Probing → eligibility (thresholds) → selection (prioritization/aggregation) → kernel tables → packet/connection processing → telemetry/events
If you troubleshoot it in that order, you’ll stop guessing—and you’ll fix issues much faster.
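The ordering can be turned into a simple checklist that reports the first stage you could not confirm. This is a sketch of the troubleshooting habit, not a tool: the stage names come from the pipeline above, and the pass/fail values are whatever you determined manually.

```python
from typing import Dict, Optional

# Stages in the order given by the pipeline above.
STAGES = [
    "probing",
    "eligibility",
    "selection",
    "kernel tables",
    "packet/connection processing",
    "telemetry/events",
]


def first_broken_stage(checks: Dict[str, bool]) -> Optional[str]:
    # A stage that was not checked counts as unconfirmed, so it is reported.
    for stage in STAGES:
        if not checks.get(stage, False):
            return stage
    return None
```

For example, if probing and eligibility check out but selection was never verified, the function returns "selection", which is where the investigation should continue.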
If you want, reply with:
“breakout or overlay?”,
a sample flow (src/dst/service), and
whether you’re using prioritization or link aggregation,
and I can suggest exactly which kernel tables/telemetry points to validate first.