WiliRGasparetto
MVP Diamond

Quantum SD-WAN Steering Deep Dive: How Decisions Are Made (and Where to Troubleshoot)

If you operate Check Point Quantum SD-WAN long enough, you’ll notice a pattern: when users report “wrong ISP used”, “overlay picked the bad path”, or “policy looks right but traffic disagrees”, the root cause is usually steering state—not the rule definition itself.

This post breaks down what the SD-WAN Steering process actually does, how it writes decisions into kernel tables, and what evidence to collect when steering doesn’t behave as expected.

 

Why this matters

Steering is the bridge between:

  • Policy intent (Infinity Portal SD-WAN rules)
    and

  • Enforcement reality (which ISP / VPN Transport carries each connection)

When steering is healthy, the gateway consistently selects the best path based on probing measurements + thresholds + prioritization/aggregation logic. When steering is unhealthy or out of sync, you get drift: policy exists, but enforcement does something else.

 

1) SD-WAN Steering Responsibilities

Steering is responsible for making real-time path selection decisions for SD-WAN traffic.

Probing measurements it owns

Steering continuously collects measurements for:

  • Next hop probing
    (health to upstream/next-hop per ISP interface)

  • Local Breakout probing
    (health to “internet targets” defined in policy rules)

  • Overlay probing
    (health of VPN peer interfaces / overlay paths)

Core outputs

Based on probing results, steering:

  1. Evaluates probing results and decides which interface (ISP) or transport to use per SD-WAN rule

  2. Writes the selected path into kernel tables used by the packet/connection processing flow

  3. Updates iNext (via Nano Agent) with steering events (for portal visibility)

  4. Updates cpview with selected ISPs and probing statistics (for telemetry/analytics)

Key point: steering isn’t just “calculation”—it’s state + installation into kernel tables.

 

2) Steering Decision Flow (Control vs Enforcement)

At a high level:

Step A — Steering computes the best path

  • For Local Breakout: per rule

  • For Overlay: per rule + VPN peer pair

Then, steering records the selected ISPs/transports in the relevant kernel tables.

Step B — Packet/connection processing consumes those tables

When a new connection is created:

  • Local Breakout
    The Firewall obtains the ISP to carry the connection from the relevant ISP table.

  • Overlay / Backhaul
    VPN obtains the VPN Transport to carry the connection from the relevant VPN transport table.

Failure mode you should recognize

If a connection:

  • is not steered to the correct ISP/VPN transport, or

  • fails to forward properly,

inspect the contents of the kernel tables first.
If the relevant table is empty, there is no ISP/transport to carry the connection, so the connection fails by design.
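The lookup in Step B can be pictured with a small model. This is only a sketch: the dictionary stands in for the kernel ISP table, and the rule and ISP names are hypothetical, not real table or object names on the gateway.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the kernel ISP table:
# SD-WAN rule name -> ISPs that steering installed for that rule.
IspTable = Dict[str, List[str]]


def pick_isp_for_new_connection(isp_table: IspTable, rule: str) -> Optional[str]:
    """Return the ISP installed for this rule, or None if nothing was installed."""
    installed = isp_table.get(rule, [])
    if not installed:
        # Empty entry: nothing can carry the connection, so it fails by design.
        return None
    return installed[0]


isp_table = {"saas-breakout": ["isp1"], "guest-breakout": []}
print(pick_isp_for_new_connection(isp_table, "saas-breakout"))   # isp1
print(pick_isp_for_new_connection(isp_table, "guest-breakout"))  # None
```

The point of the model is that the packet path never re-evaluates probing; it only reads what steering already wrote.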

 

3) Steering Decisions for Local Breakout

Steering uses thresholds and rule logic to determine which ISPs are eligible.

Eligibility logic (threshold gate)

  • All ISPs whose probing measurements are below the rule's thresholds (loss/jitter/latency) are allowed

  • All ISPs above the rule's thresholds are not used (as long as at least one ISP remains below)

Selection logic (within allowed set)

From the allowed ISPs, steering selects according to the rule’s steering object configuration:

  • Prioritization

    • Steering chooses the best/priority ISP that is allowed

  • Link Aggregation

    • Steering selects all allowed ISPs

    • The Firewall chooses among them based on the aggregation method (hash, etc.)

Operational takeaway: In Link Aggregation, “allowed” can mean multiple active candidates, and the final per-connection decision depends on the aggregation algorithm, not only on probing rank.
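A minimal sketch of the gate-then-select logic above, under several assumptions: only latency and loss are modeled (jitter is omitted), a lower priority number means more preferred, CRC32 stands in for whatever aggregation hash the gateway really uses, and the `fallback_to_all` switch exists only because the post does not spell out what happens when no ISP is below thresholds. All names are hypothetical.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Link:
    name: str
    priority: int       # assumption: lower number = more preferred
    latency_ms: float
    loss_pct: float


@dataclass(frozen=True)
class Thresholds:
    latency_ms: float
    loss_pct: float


def allowed_links(links: List[Link], th: Thresholds,
                  fallback_to_all: bool = False) -> List[Link]:
    """Threshold gate: keep links whose measurements are below the rule thresholds."""
    below = [l for l in links
             if l.latency_ms < th.latency_ms and l.loss_pct < th.loss_pct]
    if below or not fallback_to_all:
        return below
    # Assumed behavior, not confirmed by the post: use every link if none qualifies.
    return list(links)


def steering_selection(allowed: List[Link], mode: str) -> List[Link]:
    """Prioritization keeps the single best allowed link; aggregation keeps all."""
    if mode == "prioritization":
        return sorted(allowed, key=lambda l: l.priority)[:1]
    if mode == "aggregation":
        return list(allowed)
    raise ValueError(f"unknown mode: {mode}")


def pick_for_connection(selected: List[Link], flow_key: str) -> Optional[Link]:
    """Per-connection pick among selected links (CRC32 is a stand-in hash)."""
    if not selected:
        return None
    return selected[zlib.crc32(flow_key.encode()) % len(selected)]


links = [
    Link("isp1", priority=1, latency_ms=180.0, loss_pct=0.5),
    Link("isp2", priority=2, latency_ms=40.0, loss_pct=0.1),
    Link("isp3", priority=3, latency_ms=55.0, loss_pct=0.0),
]
rule_thresholds = Thresholds(latency_ms=100.0, loss_pct=1.0)

allowed = allowed_links(links, rule_thresholds)
print([l.name for l in allowed])                                        # ['isp2', 'isp3']
print([l.name for l in steering_selection(allowed, "prioritization")])  # ['isp2']
print([l.name for l in steering_selection(allowed, "aggregation")])     # ['isp2', 'isp3']
```

Note that isp1 has the best priority but is filtered out by the gate before priority is ever considered, which is the usual explanation for "the preferred ISP is not being used".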

 

4) Steering Decisions for VPN Overlay

Overlay steering has an extra dependency: VPN peer discovery and transport preparation.

Peer learning / installation pipeline

  1. SD-WAN VPN peers are learned via GW Sharing

  2. They are downloaded by the SD-WAN Nano service

  3. They are installed with the SD-WAN policy into the Steering process

  4. Steering performs additional checks on the installed peers. For each peer that passes them, it:

    • classifies the peer as an SD-WAN VPN peer

    • prepares the VPN Transports configuration for that peer, for later use by VPN

Eligibility and preference

  • All VPN Transports below thresholds (per rule + peer pair) are allowed

  • Among the allowed, some transports are selected as preferred (per rule + peer pair)

Selection is based on the same steering object settings:

  • Prioritization vs Link Aggregation

With Link Aggregation:

  • steering marks all allowed VPN transports as eligible

  • VPN chooses among them based on aggregation method (hash, etc.)

Operational takeaway: Overlay issues often aren’t “VPN is broken”—they’re “steering never produced eligible transports for this peer pair”.
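To make the "per rule + peer pair" keying concrete, here is a hedged sketch that separates two situations that look identical from the user's side: steering does not know the peer pair at all, versus the pair is known but no transport is below thresholds. The state layout, gateway names, and transport labels are all hypothetical.

```python
from typing import Dict, List, Tuple

PeerPair = Tuple[str, str]       # (local gateway, remote peer), hypothetical names
Key = Tuple[str, PeerPair]       # (SD-WAN rule, peer pair)
# For each key: transport label -> True if probing is below the rule thresholds.
OverlayState = Dict[Key, Dict[str, bool]]


def eligible_transports(state: OverlayState, rule: str, pair: PeerPair) -> List[str]:
    """Transports that passed the threshold gate for this rule and peer pair."""
    return sorted(t for t, ok in state.get((rule, pair), {}).items() if ok)


def diagnose(state: OverlayState, rule: str, pair: PeerPair) -> str:
    """Name the stage that most likely explains a missing overlay path."""
    if (rule, pair) not in state:
        return "peer pair unknown to steering: check peer learning/installation"
    if not eligible_transports(state, rule, pair):
        return "peer pair known, but no transport is below thresholds"
    return "eligible transports exist: look at VPN and the aggregation method"


state: OverlayState = {
    ("voice", ("branch-gw", "hq-gw")): {"isp1-isp1": True, "isp2-isp1": False},
    ("voice", ("branch-gw", "dr-gw")): {"isp1-isp1": False},
}

print(eligible_transports(state, "voice", ("branch-gw", "hq-gw")))  # ['isp1-isp1']
print(diagnose(state, "voice", ("branch-gw", "dr-gw")))
print(diagnose(state, "voice", ("branch-gw", "lab-gw")))
```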

 

5) Steering Commands (operational control)

Two key operational commands shown in the material:

  • sdwan_steering_stop — stops the steering process

  • sdwan_steering_start — starts the steering process

Use cases:

  • controlled restart after troubleshooting changes

  • forcing re-initialization of steering state after verifying policy/probing inputs

Important: Treat this like restarting a control component—use it intentionally, and correlate with logs/events.
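One way to keep a restart intentional is to wrap it so that every step is timestamped for later correlation with logs and events. The sketch below assumes the two commands take no arguments and are on the PATH of the shell you run it from, which the post does not confirm; the example run uses a fake runner so nothing is executed.

```python
import subprocess
from datetime import datetime, timezone
from typing import Callable, List, Tuple

Step = Tuple[str, str, int]  # (UTC timestamp, command, exit code)


def shell_runner(cmd: str) -> int:
    """Run a command with no arguments (assumption) and return its exit code."""
    return subprocess.run([cmd], check=False).returncode


def restart_steering(run: Callable[[str], int]) -> List[Step]:
    """Stop, then start steering, recording when each step ran."""
    history: List[Step] = []
    for cmd in ("sdwan_steering_stop", "sdwan_steering_start"):
        stamp = datetime.now(timezone.utc).isoformat()
        rc = run(cmd)
        history.append((stamp, cmd, rc))
        if rc != 0:
            # Stop here so the operator can inspect the failure before going on.
            break
    return history


# Dry run with a fake runner; pass shell_runner instead to execute for real.
for stamp, cmd, rc in restart_steering(lambda cmd: 0):
    print(stamp, cmd, rc)
```

Keeping the timestamps next to the exit codes gives you something concrete to line up against cpview and portal events afterwards.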

 

6) Troubleshooting: What to Collect Before Opening a TAC Case

When steering is wrong, the fastest path to resolution is to prove where the pipeline breaks:

A) Probing health and thresholds

  • Are probing results present for:

    • next hop?

    • breakout targets?

    • overlay peers?

If probing is missing/invalid, steering can’t generate allowed candidates.

B) Rule intent vs enforcement state

  • Confirm rule configuration (prioritization vs link aggregation)

  • Confirm thresholds per rule (loss/jitter/latency)

  • Confirm that at least one candidate is below thresholds

C) Kernel tables content (the “enforcement truth”)

If the gateway is not using the correct ISP/transport, provide:

  • the relevant kernel tables showing selected ISP(s) for breakout

  • the relevant VPN transport tables showing allowed/preferred transports for overlay

If tables are empty → steering did not install state.

D) Telemetry and events

  • cpview probing statistics (what the gateway thinks the link quality is)

  • iNext/Nano steering events (what the cloud believes is happening)

E) Reproduction details

  • exact source/destination/service of a failing flow

  • whether it’s breakout vs overlay/backhaul

  • time of failure (to align with probing intervals and events)
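The checklist above can be walked in pipeline order so that the first missing piece of evidence is the one you report. This is a sketch of that triage order only: the stage names are shorthand for items A through D, and the True/False observations are filled in by the operator, not read from the gateway.

```python
from typing import Dict, List, Optional, Tuple

# Stage name -> what "healthy" means, in the order the pipeline runs.
STAGES: List[Tuple[str, str]] = [
    ("probing", "probing results present for next hop, breakout targets, overlay peers"),
    ("eligibility", "at least one candidate is below the rule thresholds"),
    ("kernel_tables", "kernel tables contain a selected ISP or VPN transport"),
    ("telemetry", "cpview statistics and steering events match the kernel tables"),
]


def first_broken_stage(observed: Dict[str, bool]) -> Optional[str]:
    """Return the first stage that was not confirmed healthy, or None if all were."""
    for name, description in STAGES:
        if not observed.get(name, False):
            return f"{name}: {description}"
    return None


observed = {"probing": True, "eligibility": True, "kernel_tables": False}
print(first_broken_stage(observed))
# kernel_tables: kernel tables contain a selected ISP or VPN transport
```

A stage that was never checked counts as broken here, which nudges you to collect the evidence before moving further down the pipeline.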

Common Pitfalls (what usually bites people)

  • Thresholds too strict → all links become “not allowed” → empty tables → failures

  • Link aggregation misunderstood → multiple ISPs allowed, hash selects a non-obvious path

  • GW Sharing drift → overlay peers not learned/installed consistently

  • “Policy exists” assumption → policy in portal is not proof of steering state installed on the gateway

Closing

Steering is not magic. It is a deterministic pipeline:

Probing → eligibility (thresholds) → selection (prioritization/aggregation) → kernel tables → packet/connection processing → telemetry/events

If you troubleshoot it in that order, you’ll stop guessing—and you’ll fix issues much faster.

If you want, reply with:

  • “breakout or overlay?”,

  • a sample flow (src/dst/service), and

  • whether you’re using prioritization or link aggregation,
    and I can suggest exactly which kernel tables/telemetry points to validate first.

3 Replies
the_rock
MVP Diamond

Another great one!

Best,
Andy
"Have a great day and if its not, change it"
Dibzera
Explorer

great!

WiliRGasparetto
MVP Diamond

Thanks!
